From: Andi Kleen <ak@suse.de>
To: linux-arch@vger.kernel.org
Subject: reordering pgdat list
Date: Wed, 17 Aug 2005 02:32:45 +0200
Message-ID: <20050817003245.GD3996@wotan.suse.de>


Hallo,

I had some problems with users of the bootmem allocator that need to
allocate memory in the first 4GB.  On a NUMA system with enough memory,
alloc_bootmem just goes over the nodes with for_each_pgdat and tries
them in turn. When the nodes are added to bootmem in the straightforward
order, beginning from node 0, they end up reversed on the pgdat_list
because init_bootmem_node always inserts the new node at the head of
the list. As a result, alloc_bootmem looks into the last node first and,
if there is enough memory there, allocates from it, which can be
beyond 4GB.

Anyway, I pondered a few solutions. The best one seems to be to just
reorder the list. I see that IA64 had some magic
code to do the same, but it looked so hackish that I didn't want
to duplicate it. So I just changed init_bootmem to insert at the tail.
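
To make the ordering issue concrete, here is a small userspace sketch,
not kernel code. struct node, struct node_list, insert_head, insert_tail
and walk are hypothetical names made up for this illustration; struct
node only stands in for the pgdat_next link of struct pglist_data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pglist_data; only the link matters. */
struct node {
	int id;
	struct node *next;
};

/* Hypothetical list head; the kernel only keeps a global pgdat_list. */
struct node_list {
	struct node *head;
	struct node *tail;
};

/* Head insertion: the most recently added node is visited first. */
void insert_head(struct node_list *l, struct node *n)
{
	n->next = l->head;
	l->head = n;
	if (!l->tail)
		l->tail = n;
}

/* Tail insertion: nodes are visited in the order they were added. */
void insert_tail(struct node_list *l, struct node *n)
{
	n->next = NULL;
	if (l->tail)
		l->tail->next = n;
	else
		l->head = n;
	l->tail = n;	/* the tail has to advance on every insertion */
}

/* Walk the list front to back, storing ids in out; returns the count. */
size_t walk(const struct node_list *l, int *out, size_t max)
{
	size_t i = 0;
	const struct node *n;

	assert(out != NULL);
	for (n = l->head; n && i < max; n = n->next)
		out[i++] = n->id;
	return i;
}
```

Adding nodes 0, 1, 2 with insert_head gives a walk order of 2, 1, 0,
while insert_tail gives 0, 1, 2, which is the order a first-fit search
over the nodes would want when low memory lives in node 0.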

I think the generic code doing for_each_pgdat is all ok and doesn't
care about the order, but several architectures do their own
for_each_pgdat() and they might in theory break. 

If your architecture does funky things with for_each_pgdat, testing this
patch might be good. I plan to submit it when 2.6.14 opens.

-Andi


Index: linux/mm/bootmem.c
===================================================================
--- linux.orig/mm/bootmem.c
+++ linux/mm/bootmem.c
@@ -61,9 +61,17 @@ static unsigned long __init init_bootmem
 {
 	bootmem_data_t *bdata = pgdat->bdata;
 	unsigned long mapsize = ((end - start)+7)/8;
+	static struct pglist_data *pgdat_last;
 
-	pgdat->pgdat_next = pgdat_list;
-	pgdat_list = pgdat;
+	pgdat->pgdat_next = NULL;
+	/* Add new nodes last so that bootmem always starts
+	   searching in the first nodes, not the last ones */
+	if (pgdat_last)
+		pgdat_last->pgdat_next = pgdat;
+	else
+		pgdat_list = pgdat;
+	/* Remember the new tail for the next insertion */
+	pgdat_last = pgdat;
 
 	mapsize = ALIGN(mapsize, sizeof(long));
 	bdata->node_bootmem_map = phys_to_virt(mapstart << PAGE_SHIFT);
