public inbox for linux-arch@vger.kernel.org
* reordering pgdat list
@ 2005-08-17  0:32 Andi Kleen
From: Andi Kleen @ 2005-08-17  0:32 UTC
  To: linux-arch


Hello,

I had some problems with bootmem users that need to allocate memory in
the first 4GB.  On a NUMA system with enough memory, alloc_bootmem just
walks the nodes with for_each_pgdat and tries them in turn.  When the
nodes are added to bootmem in the straightforward order, starting from
node 0, they end up reversed on the pgdat_list because init_bootmem_node
always inserts the new node at the head of the list.  As a result,
alloc_bootmem looks at the last node first and, if there is enough
memory there, allocates from it.  That memory can be beyond 4GB.
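
The ordering effect described above can be reproduced in a small
userspace model.  This is only a sketch: struct node, insert_head() and
first_fit() are hypothetical stand-ins for pglist_data, the head
insertion done at bootmem init, and the node walk in alloc_bootmem.
They are not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pglist_data; only what the model needs. */
struct node {
	int id;                    /* NUMA node number */
	unsigned long long start;  /* first physical address of the node */
	unsigned long long free;   /* bytes still available on the node */
	struct node *next;         /* models pgdat_next */
};

/* Models the old behaviour: every new node becomes the list head. */
static void insert_head(struct node **list, struct node *n)
{
	assert(n != NULL);
	n->next = *list;
	*list = n;
}

/* Models a first-fit walk over the list in list order. */
static struct node *first_fit(struct node *list, unsigned long long size)
{
	struct node *n;

	for (n = list; n != NULL; n = n->next)
		if (n->free >= size)
			return n;
	return NULL;
}
```

Registering node 0 (starting at address 0) and then node 1 (starting at
4GB) with insert_head() leaves node 1 at the head, so first_fit()
returns node 1 for any request it can satisfy, even though node 0 also
has room.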

Anyway, I considered a few solutions.  The best one seems to be to just
reorder the list.  I see that IA64 had some magic code to do the same,
but it looked so hackish that I didn't want to duplicate it.  So I just
changed init_bootmem to insert at the tail.
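
As an illustration of tail insertion, here is a minimal userspace
sketch.  The names struct node, struct node_list and insert_tail() are
hypothetical and exist only for this example.  The kernel code keeps a
global pgdat_list rather than a list structure like this one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pglist_data. */
struct node {
	int id;             /* NUMA node number */
	struct node *next;  /* models pgdat_next */
};

/* Head and tail of a singly linked list; the tail makes appends O(1). */
struct node_list {
	struct node *head;
	struct node *tail;
};

/* Append n so that the list keeps registration order. */
static void insert_tail(struct node_list *l, struct node *n)
{
	assert(l != NULL && n != NULL);
	n->next = NULL;
	if (l->tail)
		l->tail->next = n;  /* link behind the current tail */
	else
		l->head = n;        /* first node: it is also the head */
	l->tail = n;                /* the new node is the tail in both cases */
}
```

The tail pointer has to advance on every insert, not only on the first
one.  Otherwise a third node would overwrite the link to the second.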

I think the generic code doing for_each_pgdat is all OK and doesn't
care about the order, but several architectures have their own
for_each_pgdat() loops, and those might in theory break.

If your architecture does funky things with for_each_pgdat, testing this
patch might be good.  I plan to submit it when 2.6.14 opens.

-Andi


Index: linux/mm/bootmem.c
===================================================================
--- linux.orig/mm/bootmem.c
+++ linux/mm/bootmem.c
@@ -61,9 +61,17 @@ static unsigned long __init init_bootmem
 {
 	bootmem_data_t *bdata = pgdat->bdata;
 	unsigned long mapsize = ((end - start)+7)/8;
+	static struct pglist_data *pgdat_last;
 
-	pgdat->pgdat_next = pgdat_list;
-	pgdat_list = pgdat;
+	pgdat->pgdat_next = NULL;
+	/* Add new nodes last so that bootmem always starts 
+	   searching in the first nodes, not the last ones */
+	if (pgdat_last)
+		pgdat_last->pgdat_next = pgdat;
+	else
+		pgdat_list = pgdat;
+	/* The new node is now the tail in both cases */
+	pgdat_last = pgdat;
 
 	mapsize = ALIGN(mapsize, sizeof(long));
 	bdata->node_bootmem_map = phys_to_virt(mapstart << PAGE_SHIFT);
