* [PATCH 0/8] Reducing fragmentation using zones v6
From: Mel Gorman @ 2006-05-05 17:34 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
This is V6 of the zone-based anti-fragmentation patches. These patches require
the architecture-independent zone-sizing patches posted under the subject
"[PATCH 0/7] Sizing zones and holes in an architecture independent manner V5"
at http://marc.theaimsgroup.com/?l=linux-kernel&m=114649066331962&w=2 .
In this version, all the zone sizing is done in one place rather
than per-architecture as in older versions. Any architecture that uses
add_active_range() and free_area_init_nodes() to initialise its zones
can trivially support ZONE_EASYRCLM by parsing a kernelcore= command-line
parameter and calling set_required_kernelcore().
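To make that hook concrete, here is a minimal sketch modelled on the arch
patches later in this series (the helper name parse_kernelcore is mine and
purely illustrative; each architecture wires this into its own command-line
parsing):

	/*
	 * Minimal sketch of the per-arch hook: find "kernelcore=" on the
	 * command line, convert the requested size to pages and hand it
	 * to set_required_kernelcore() before free_area_init_nodes() runs.
	 */
	static void __init parse_kernelcore(char *cmdline)
	{
		char *opt = strstr(cmdline, "kernelcore=");
		unsigned long core_pages;

		if (!opt)
			return;

		opt += strlen("kernelcore=");
		core_pages = memparse(opt, &opt) >> PAGE_SHIFT;
		set_required_kernelcore(core_pages);
	}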
As these patches require the zone-sizing work to be merged first, I am not
looking to merge these now. I just wanted to show that zone-based
anti-fragmentation looks a bit nicer when the zone sizing is not
architecture-specific. Some of this patch set is still a bit rough and ready :)
Changelog since v5
  o Rebase on top of arch-independent zone and hole sizing code
  o Calculate ZONE_EASYRCLM boundaries in an arch-independent manner
Changelog since v4
  o Move x86_64 from one patch to another
  o Fix for oops bug on ppc64
Changelog since v3
  o Minor bug fixes
  o ppc64 can specify kernelcore
  o Ability to disable use of ZONE_EASYRCLM at boot time
  o HugeTLB uses ZONE_EASYRCLM
  o Add drain-percpu caches for testing
  o boot-parameter documentation added
This is a zone-based approach to anti-fragmentation, posted in light of the
discussions on the list-based (sometimes dubbed sub-zones) approach, where
the prevailing opinion was that zones were the answer. The patches have been
boot-tested on top of linux-2.6.17-rc3-mm1 with x86, x86_64 and ppc64 in a
variety of configurations. On ia64, booting blew up before the serial console
was initialised; lacking early_printk, I have not figured out what is going
wrong there yet. If anyone with physical access to an IA64 can send me a bug
report, I'd appreciate it.
Ordinarily, I would include performance results, but as I'm not looking to
merge this time, they are omitted.
The diffstat for all the patches is:
 Documentation/kernel-parameters.txt |   16 ++
 arch/i386/kernel/setup.c            |   12 ++
 arch/i386/kernel/srat.c             |    2
 arch/ia64/kernel/efi.c              |    5
 arch/powerpc/kernel/prom.c          |    9 +
 arch/ppc/mm/init.c                  |    9 +
 arch/x86_64/kernel/setup.c          |    6 +
 arch/x86_64/mm/init.c               |    2
 fs/compat.c                         |    2
 fs/exec.c                           |    2
 fs/inode.c                          |   11 ++
 fs/ntfs/malloc.h                    |    3
 include/asm-i386/page.h             |    3
 include/linux/gfp.h                 |    3
 include/linux/highmem.h             |    2
 include/linux/mm.h                  |    1
 include/linux/mmzone.h              |   12 +-
 mm/hugetlb.c                        |    4
 mm/mem_init.c                       |  198 +++++++++++++++++++++++++++++++++---
 mm/memory.c                         |    4
 mm/page_alloc.c                     |   10 +
 mm/shmem.c                          |    4
 mm/swap_state.c                     |    2
 23 files changed, 288 insertions(+), 34 deletions(-)
-- 
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* [PATCH 1/8] Add __GFP_EASYRCLM flag and update callers
From: Mel Gorman @ 2006-05-05 17:35 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
This creates a zone modifier, __GFP_EASYRCLM, and a set of GFP flags called
GFP_RCLMUSER. The only difference between GFP_HIGHUSER and GFP_RCLMUSER is the
zone that is used. Callers for which ZONE_EASYRCLM is appropriate are updated.
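One caveat worth noting: the modifier only makes sense for allocations that
really are easy to reclaim. Any path that can feed the mask into a
lowmem-only allocator has to strip the flag again, just as __GFP_HIGHMEM is
stripped; the NTFS hunk in the next patch uses exactly this idiom:

	gfp_mask & ~(__GFP_HIGHMEM | __GFP_EASYRCLM)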
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/fs/compat.c linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/fs/compat.c
--- linux-2.6.17-rc3-mm1-zonesizing-v9/fs/compat.c	2006-05-03 09:41:32.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/fs/compat.c	2006-05-03 09:45:54.000000000 +0100
@@ -1437,7 +1437,7 @@ static int compat_copy_strings(int argc,
 			page = bprm->page[i];
 			new = 0;
 			if (!page) {
-				page = alloc_page(GFP_HIGHUSER);
+				page = alloc_page(GFP_RCLMUSER);
 				bprm->page[i] = page;
 				if (!page) {
 					ret = -ENOMEM;
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/fs/exec.c linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/fs/exec.c
--- linux-2.6.17-rc3-mm1-zonesizing-v9/fs/exec.c	2006-05-03 09:41:32.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/fs/exec.c	2006-05-03 09:45:54.000000000 +0100
@@ -238,7 +238,7 @@ static int copy_strings(int argc, char _
 			page = bprm->page[i];
 			new = 0;
 			if (!page) {
-				page = alloc_page(GFP_HIGHUSER);
+				page = alloc_page(GFP_RCLMUSER);
 				bprm->page[i] = page;
 				if (!page) {
 					ret = -ENOMEM;
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/fs/inode.c linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/fs/inode.c
--- linux-2.6.17-rc3-mm1-zonesizing-v9/fs/inode.c	2006-05-03 09:41:32.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/fs/inode.c	2006-05-05 17:34:36.000000000 +0100
@@ -147,7 +147,18 @@ static struct inode *alloc_inode(struct 
 		mapping->a_ops = &empty_aops;
  		mapping->host = inode;
 		mapping->flags = 0;
+#ifndef CONFIG_IA64
+		mapping_set_gfp_mask(mapping, GFP_RCLMUSER);
+#else
+		/*
+		 * This is obviously a rubbish check to be making. On ia64,
+		 * the machine crashes before the console inits. If someone
+		 * has an ia64 whose screen they can watch, please remove
+		 * this #ifdef, boot the machine and send mel@csn.ul.ie a
+		 * report on what breaks during boot
+		 */
 		mapping_set_gfp_mask(mapping, GFP_HIGHUSER);
+#endif
 		mapping->assoc_mapping = NULL;
 		mapping->backing_dev_info = &default_backing_dev_info;
 
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/include/asm-i386/page.h linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/asm-i386/page.h
--- linux-2.6.17-rc3-mm1-zonesizing-v9/include/asm-i386/page.h	2006-05-03 09:41:32.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/asm-i386/page.h	2006-05-03 09:45:54.000000000 +0100
@@ -36,7 +36,8 @@
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
-#define alloc_zeroed_user_highpage(vma, vaddr) alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vma, vaddr)
+#define alloc_zeroed_user_highpage(vma, vaddr) \
+	alloc_page_vma(GFP_RCLMUSER | __GFP_ZERO, vma, vaddr)
 #define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
 
 /*
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/include/linux/gfp.h linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/linux/gfp.h
--- linux-2.6.17-rc3-mm1-zonesizing-v9/include/linux/gfp.h	2006-05-03 09:41:33.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/linux/gfp.h	2006-05-03 09:45:54.000000000 +0100
@@ -21,6 +21,7 @@ struct vm_area_struct;
 #else
 #define __GFP_DMA32	((__force gfp_t)0x04)	/* Has own ZONE_DMA32 */
 #endif
+#define __GFP_EASYRCLM  ((__force gfp_t)0x08u)
 
 /*
  * Action modifiers - doesn't change the zoning
@@ -67,6 +68,8 @@ struct vm_area_struct;
 #define GFP_USER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
 #define GFP_HIGHUSER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL | \
 			 __GFP_HIGHMEM)
+#define GFP_RCLMUSER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL | \
+			__GFP_EASYRCLM)
 
 /* Flag - indicates that the buffer will be suitable for DMA.  Ignored on some
    platforms, used as appropriate on others */
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/include/linux/highmem.h linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/linux/highmem.h
--- linux-2.6.17-rc3-mm1-zonesizing-v9/include/linux/highmem.h	2006-05-03 09:41:33.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/linux/highmem.h	2006-05-03 09:45:54.000000000 +0100
@@ -59,7 +59,7 @@ static inline void clear_user_highpage(s
 static inline struct page *
 alloc_zeroed_user_highpage(struct vm_area_struct *vma, unsigned long vaddr)
 {
-	struct page *page = alloc_page_vma(GFP_HIGHUSER, vma, vaddr);
+	struct page *page = alloc_page_vma(GFP_RCLMUSER, vma, vaddr);
 
 	if (page)
 		clear_user_highpage(page, vaddr);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/mm/memory.c linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/memory.c
--- linux-2.6.17-rc3-mm1-zonesizing-v9/mm/memory.c	2006-05-03 09:41:33.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/memory.c	2006-05-03 09:45:54.000000000 +0100
@@ -1495,7 +1495,7 @@ gotten:
 		if (!new_page)
 			goto oom;
 	} else {
-		new_page = alloc_page_vma(GFP_HIGHUSER, vma, address);
+		new_page = alloc_page_vma(GFP_RCLMUSER, vma, address);
 		if (!new_page)
 			goto oom;
 		cow_user_page(new_page, old_page, address);
@@ -2091,7 +2091,7 @@ retry:
 
 		if (unlikely(anon_vma_prepare(vma)))
 			goto oom;
-		page = alloc_page_vma(GFP_HIGHUSER, vma, address);
+		page = alloc_page_vma(GFP_RCLMUSER, vma, address);
 		if (!page)
 			goto oom;
 		copy_user_highpage(page, new_page, address);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/mm/shmem.c linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/shmem.c
--- linux-2.6.17-rc3-mm1-zonesizing-v9/mm/shmem.c	2006-05-03 09:41:33.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/shmem.c	2006-05-03 09:45:54.000000000 +0100
@@ -968,6 +968,8 @@ shmem_alloc_page(gfp_t gfp, struct shmem
 	pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, idx);
 	pvma.vm_pgoff = idx;
 	pvma.vm_end = PAGE_SIZE;
+	if (gfp & __GFP_HIGHMEM)
+		gfp = (gfp & ~__GFP_HIGHMEM) | __GFP_EASYRCLM;
 	page = alloc_page_vma(gfp | __GFP_ZERO, &pvma, 0);
 	mpol_free(pvma.vm_policy);
 	return page;
@@ -988,6 +990,8 @@ shmem_swapin(struct shmem_inode_info *in
 static inline struct page *
 shmem_alloc_page(gfp_t gfp,struct shmem_inode_info *info, unsigned long idx)
 {
+	if (gfp & __GFP_HIGHMEM)
+		gfp = (gfp & ~__GFP_HIGHMEM) | __GFP_EASYRCLM;
 	return alloc_page(gfp | __GFP_ZERO);
 }
 #endif
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-v9/mm/swap_state.c linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/swap_state.c
--- linux-2.6.17-rc3-mm1-zonesizing-v9/mm/swap_state.c	2006-05-03 09:41:33.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/swap_state.c	2006-05-03 09:45:54.000000000 +0100
@@ -343,7 +343,7 @@ struct page *read_swap_cache_async(swp_e
 		 * Get a new page to read into from swap.
 		 */
 		if (!new_page) {
-			new_page = alloc_page_vma(GFP_HIGHUSER, vma, addr);
+			new_page = alloc_page_vma(GFP_RCLMUSER, vma, addr);
 			if (!new_page)
 				break;		/* Out of memory */
 		}
* [PATCH 2/8] Create the ZONE_EASYRCLM zone
From: Mel Gorman @ 2006-05-05 17:35 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
This patch adds the ZONE_EASYRCLM zone and updates the relevant constants and
helper functions. After this patch is applied, memory that is hot-added on
the x86 will be placed in ZONE_EASYRCLM. Memory hot-added on the ppc64 still
goes to ZONE_DMA.
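As a quick sanity check on the constants touched below: adding ZONE_EASYRCLM
brings the zone count to five, and ceil(log2(5)) == 3, which is why
ZONES_SHIFT grows from 2 to 3. Likewise, the zone modifiers now occupy four
bits (__GFP_DMA 0x01, __GFP_HIGHMEM 0x02, __GFP_DMA32 0x04 and the new
__GFP_EASYRCLM 0x08), so GFP_ZONEMASK widens from 0x07 to 0x0f.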
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/arch/i386/kernel/srat.c linux-2.6.17-rc3-mm1-zonesizing-102_addzone/arch/i386/kernel/srat.c
--- linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/arch/i386/kernel/srat.c	2006-05-03 09:42:16.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-102_addzone/arch/i386/kernel/srat.c	2006-05-03 09:46:37.000000000 +0100
@@ -43,7 +43,7 @@
 #define PXM_BITMAP_LEN (MAX_PXM_DOMAINS / 8) 
 static u8 pxm_bitmap[PXM_BITMAP_LEN];	/* bitmap of proximity domains */
 
-#define MAX_CHUNKS_PER_NODE	4
+#define MAX_CHUNKS_PER_NODE	5
 #define MAXCHUNKS		(MAX_CHUNKS_PER_NODE * MAX_NUMNODES)
 struct node_memory_chunk_s {
 	unsigned long	start_pfn;
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/arch/x86_64/mm/init.c linux-2.6.17-rc3-mm1-zonesizing-102_addzone/arch/x86_64/mm/init.c
--- linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/arch/x86_64/mm/init.c	2006-05-03 09:42:16.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-102_addzone/arch/x86_64/mm/init.c	2006-05-03 09:46:37.000000000 +0100
@@ -482,7 +482,7 @@ int memory_add_physaddr_to_nid(u64 start
 int arch_add_memory(int nid, u64 start, u64 size)
 {
 	struct pglist_data *pgdat = NODE_DATA(nid);
-	struct zone *zone = pgdat->node_zones + MAX_NR_ZONES-2;
+	struct zone *zone = pgdat->node_zones + MAX_NR_ZONES-3;
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int ret;
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/fs/ntfs/malloc.h linux-2.6.17-rc3-mm1-zonesizing-102_addzone/fs/ntfs/malloc.h
--- linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/fs/ntfs/malloc.h	2006-05-03 09:41:32.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-102_addzone/fs/ntfs/malloc.h	2006-05-05 14:34:38.000000000 +0100
@@ -44,7 +44,8 @@ static inline void *__ntfs_malloc(unsign
 	if (likely(size <= PAGE_SIZE)) {
 		BUG_ON(!size);
 		/* kmalloc() has per-CPU caches so is faster for now. */
-		return kmalloc(PAGE_SIZE, gfp_mask & ~__GFP_HIGHMEM);
+		return kmalloc(PAGE_SIZE, gfp_mask & 
+					~(__GFP_HIGHMEM | __GFP_EASYRCLM));
 		/* return (void *)__get_free_page(gfp_mask); */
 	}
 	if (likely(size >> PAGE_SHIFT < num_physpages))
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/linux/mm.h linux-2.6.17-rc3-mm1-zonesizing-102_addzone/include/linux/mm.h
--- linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/linux/mm.h	2006-05-03 09:42:16.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-102_addzone/include/linux/mm.h	2006-05-03 09:46:37.000000000 +0100
@@ -949,6 +949,7 @@ extern int early_pfn_to_nid(unsigned lon
 extern void free_bootmem_with_active_regions(int nid,
 						unsigned long max_low_pfn);
 extern void sparse_memory_present_with_active_regions(int nid);
+extern void set_required_kernelcore(unsigned long kernelcore_pfn);
 #endif
 extern void memmap_init_zone(unsigned long, int, unsigned long, unsigned long);
 extern void setup_per_zone_pages_min(void);
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/linux/mmzone.h linux-2.6.17-rc3-mm1-zonesizing-102_addzone/include/linux/mmzone.h
--- linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/include/linux/mmzone.h	2006-05-03 09:42:16.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-102_addzone/include/linux/mmzone.h	2006-05-03 09:46:37.000000000 +0100
@@ -74,9 +74,10 @@ struct per_cpu_pageset {
 #define ZONE_DMA32		1
 #define ZONE_NORMAL		2
 #define ZONE_HIGHMEM		3
+#define ZONE_EASYRCLM		4
 
-#define MAX_NR_ZONES		4	/* Sync this with ZONES_SHIFT */
-#define ZONES_SHIFT		2	/* ceil(log2(MAX_NR_ZONES)) */
+#define MAX_NR_ZONES		5	/* Sync this with ZONES_SHIFT */
+#define ZONES_SHIFT		3	/* ceil(log2(MAX_NR_ZONES)) */
 
 
 /*
@@ -104,7 +105,7 @@ struct per_cpu_pageset {
  *
  * NOTE! Make sure this matches the zones in <linux/gfp.h>
  */
-#define GFP_ZONEMASK	0x07
+#define GFP_ZONEMASK	0x0f
 /* #define GFP_ZONETYPES       (GFP_ZONEMASK + 1) */           /* Non-loner */
 #define GFP_ZONETYPES  ((GFP_ZONEMASK + 1) / 2 + 1)            /* Loner */
 
@@ -364,7 +365,7 @@ static inline int populated_zone(struct 
 
 static inline int is_highmem_idx(int idx)
 {
-	return (idx == ZONE_HIGHMEM);
+	return (idx == ZONE_HIGHMEM || idx == ZONE_EASYRCLM);
 }
 
 static inline int is_normal_idx(int idx)
@@ -380,7 +381,8 @@ static inline int is_normal_idx(int idx)
  */
 static inline int is_highmem(struct zone *zone)
 {
-	return zone == zone->zone_pgdat->node_zones + ZONE_HIGHMEM;
+	return zone == zone->zone_pgdat->node_zones + ZONE_HIGHMEM ||
+		zone == zone->zone_pgdat->node_zones + ZONE_EASYRCLM;
 }
 
 static inline int is_normal(struct zone *zone)
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/mem_init.c linux-2.6.17-rc3-mm1-zonesizing-102_addzone/mm/mem_init.c
--- linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/mem_init.c	2006-05-03 09:42:17.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-102_addzone/mm/mem_init.c	2006-05-05 08:30:46.000000000 +0100
@@ -24,7 +24,8 @@
 #include <linux/cpu.h>
 #include <linux/stop_machine.h>
 
-static char *zone_names[MAX_NR_ZONES] = { "DMA", "DMA32", "Normal", "HighMem" };
+static char *zone_names[MAX_NR_ZONES] = { "DMA", "DMA32", "Normal",
+						"HighMem", "EasyRclm" };
 int percpu_pagelist_fraction;
 
 #ifdef CONFIG_ARCH_POPULATES_NODE_MAP
@@ -37,6 +38,8 @@ int percpu_pagelist_fraction;
   struct node_active_region __initdata early_node_map[MAX_ACTIVE_REGIONS];
   unsigned long __initdata arch_zone_lowest_possible_pfn[MAX_NR_ZONES];
   unsigned long __initdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
+  unsigned long __initdata required_kernelcore;
+  unsigned long __initdata easyrclm_pfn[MAX_NUMNODES];
 #endif /* CONFIG_ARCH_POPULATES_NODE_MAP */
 
 /*
@@ -49,13 +52,17 @@ static int __meminit build_zonelists_nod
 {
 	struct zone *zone;
 
-	BUG_ON(zone_type > ZONE_HIGHMEM);
+	BUG_ON(zone_type > ZONE_EASYRCLM);
 
 	do {
 		zone = pgdat->node_zones + zone_type;
 		if (populated_zone(zone)) {
 #ifndef CONFIG_HIGHMEM
-			BUG_ON(zone_type > ZONE_NORMAL);
+			/*
+			 * On architectures with only ZONE_DMA, it is still
+			 * valid to have a ZONE_EASYRCLM
+			 */
+			BUG_ON(zone_type == ZONE_HIGHMEM);
 #endif
 			zonelist->zones[nr_zones++] = zone;
 			check_highest_zone(zone_type);
@@ -69,6 +76,8 @@ static int __meminit build_zonelists_nod
 static inline int highest_zone(int zone_bits)
 {
 	int res = ZONE_NORMAL;
+	if (zone_bits & (__force int)__GFP_EASYRCLM)
+		res = ZONE_EASYRCLM;
 	if (zone_bits & (__force int)__GFP_HIGHMEM)
 		res = ZONE_HIGHMEM;
 	if (zone_bits & (__force int)__GFP_DMA32)
@@ -773,6 +782,44 @@ void __init get_pfn_range_for_nid(unsign
 	}
 }
 
+/* Return the highest zone that can be used for EASYRCLM pages */
+unsigned long __init highest_usable_zone(void)
+{
+	int zone_index;
+	for (zone_index = MAX_NR_ZONES - 1; zone_index >= 0; zone_index--) {
+		if (arch_zone_highest_possible_pfn[zone_index] >
+				arch_zone_lowest_possible_pfn[zone_index])
+			break;
+	}
+
+	BUG_ON(zone_index == -1);
+	return zone_index;
+}
+
+void __init adjust_zone_range_for_easyrclm(int nid,
+					unsigned long zone_type,
+					unsigned long *zone_start_pfn,
+					unsigned long *zone_end_pfn)
+{
+	/* Adjust the zone range to take EASYRCLM into account */
+	if (easyrclm_pfn[nid]) {
+		unsigned long usable_zone = highest_usable_zone();
+		/* Size ZONE_EASYRCLM */
+		if (zone_type == ZONE_EASYRCLM) {
+			*zone_start_pfn = easyrclm_pfn[nid];
+			*zone_end_pfn =
+				arch_zone_highest_possible_pfn[usable_zone];
+
+		/* Adjust for ZONE_EASYRCLM starting within this range */
+		} else if (*zone_start_pfn < easyrclm_pfn[nid] &&
+				*zone_end_pfn > easyrclm_pfn[nid]) {
+			*zone_end_pfn = easyrclm_pfn[nid];
+
+		/* Check if this whole range is within ZONE_EASYRCLM */
+		} else if (*zone_start_pfn >= easyrclm_pfn[nid])
+			*zone_start_pfn = *zone_end_pfn;
+	}
+}
 unsigned long __init zone_present_pages_in_node(int nid,
 					unsigned long zone_type,
 					unsigned long *ignored)
@@ -784,6 +831,8 @@ unsigned long __init zone_present_pages_
 	get_pfn_range_for_nid(nid, &node_start_pfn, &node_end_pfn);
 	zone_start_pfn = arch_zone_lowest_possible_pfn[zone_type];
 	zone_end_pfn = arch_zone_highest_possible_pfn[zone_type];
+	adjust_zone_range_for_easyrclm(nid, zone_type,
+			&zone_start_pfn, &zone_end_pfn);
 
 	/* Check that this node has pages within the zone's required range */
 	if (zone_end_pfn < node_start_pfn || zone_start_pfn > node_end_pfn)
@@ -794,6 +843,8 @@ unsigned long __init zone_present_pages_
 	zone_start_pfn = max(zone_start_pfn, node_start_pfn);
 
 	/* Return the spanned pages */
+	printk("zone_present_pages_in_node(%d, %lu) = %lu\n", nid,
+					zone_type, zone_end_pfn - zone_start_pfn);
 	return zone_end_pfn - zone_start_pfn;
 }
 
@@ -805,8 +856,6 @@ unsigned long __init __absent_pages_in_r
 	unsigned long prev_end_pfn = 0, hole_pages = 0;
 	unsigned long start_pfn;
 
-	printk("__absent_pages_in_range(%d, %lu, %lu) = ", nid,
-					range_start_pfn, range_end_pfn);
 	/* Find the end_pfn of the first active range of pfns in the node */
 	i = first_active_region_index_in_nid(nid);
 	if (i == MAX_ACTIVE_REGIONS)
@@ -833,8 +882,6 @@ unsigned long __init __absent_pages_in_r
 		prev_end_pfn = early_node_map[i].end_pfn;
 	}
 
-	printk("%lu\n", hole_pages);
-
 	return hole_pages;
 }
 
@@ -848,9 +895,13 @@ unsigned long __init zone_absent_pages_i
 					unsigned long zone_type,
 					unsigned long *ignored)
 {
-	return __absent_pages_in_range(nid,
-				arch_zone_lowest_possible_pfn[zone_type],
-				arch_zone_highest_possible_pfn[zone_type]);
+	unsigned long zone_start_pfn, zone_end_pfn;
+	zone_start_pfn = arch_zone_lowest_possible_pfn[zone_type];
+	zone_end_pfn = arch_zone_highest_possible_pfn[zone_type];
+	adjust_zone_range_for_easyrclm(nid, zone_type,
+			&zone_start_pfn, &zone_end_pfn);
+
+	return __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
 }
 #else
 static inline unsigned long zone_present_pages_in_node(int nid,
@@ -1121,18 +1172,122 @@ unsigned long __init find_min_pfn_with_a
 	return find_min_pfn_for_node(MAX_NUMNODES);
 }
 
-unsigned long __init find_max_pfn_with_active_regions(void)
+/* Find the highest pfn for a node. This depends on a sorted early_node_map */
+unsigned long __init find_max_pfn_for_node(unsigned long nid)
 {
 	int i;
 	unsigned long max_pfn = 0;
 
-	for (i = 0; early_node_map[i].end_pfn; i++)
+	/* Walk all active ranges in this node and track the highest end_pfn */
+	for_each_active_range_index_in_nid(i, nid)
 		max_pfn = max(max_pfn, early_node_map[i].end_pfn);
 
-	printk("find_max_pfn_with_active_regions() == %lu\n", max_pfn);
 	return max_pfn;
 }
 
+unsigned long __init find_max_pfn_with_active_regions(void)
+{
+	return find_max_pfn_for_node(MAX_NUMNODES);
+}
+
+/*
+ * Find the PFN at which the EasyRclm zone begins in each node. Assumes
+ * that early_node_map is already sorted. An attempt is made to spread
+ * the kernel memory evenly between the nodes
+ */
+void __init find_easyrclm_pfns_for_nodes(unsigned long *easyrclm_pfn)
+{
+	int i, j, nid, map_index;
+	int nids_seen[MAX_NUMNODES];
+	unsigned long num_active_nodes = 0;
+	unsigned long usable_startpfn;
+
+	/* If kernelcore was not specified, just return */
+	if (!required_kernelcore)
+		return;
+
+	/* Count the number of unique nodes in the system */
+	for (i = 0; early_node_map[i].end_pfn; i++) {
+		for (j = 0; j < num_active_nodes; j++) {
+			if (nids_seen[j] == early_node_map[i].nid)
+				break;
+		}
+
+		if (j == num_active_nodes) {
+			nids_seen[j] = early_node_map[i].nid;
+			num_active_nodes++;
+		}
+	}
+	printk("num_active_nodes = %lu\n", num_active_nodes);
+	printk("highest_usable_zone = %lu\n", highest_usable_zone());
+
+	usable_startpfn = arch_zone_lowest_possible_pfn[highest_usable_zone()];
+	printk("usable_startpfn = %lu\n", arch_zone_lowest_possible_pfn[highest_usable_zone()]);
+
+	/* Walk the early_node_map, finding where easyrclm pfn is */
+	for (map_index = 0; early_node_map[map_index].end_pfn; map_index++) {
+		unsigned long size_pages;
+		unsigned long start_pfn, end_pfn;
+		unsigned long node_max_pfn;
+		unsigned long kernelcore_required_node = 0;
+
+		nid = early_node_map[map_index].nid;
+		start_pfn = early_node_map[map_index].start_pfn;
+		end_pfn = early_node_map[map_index].end_pfn;
+		node_max_pfn = find_max_pfn_for_node(nid);
+
+		/* Check if this range is usable only by the kernel */
+		if (end_pfn < usable_startpfn) {
+			required_kernelcore -= min( (end_pfn - start_pfn),
+							required_kernelcore);
+			continue;
+		}
+
+recalc:
+		/* Divide the remaining required_kernelcore between nodes */
+		if (num_active_nodes > 0) {
+			kernelcore_required_node =
+				required_kernelcore / num_active_nodes;
+		}
+
+		/* Calculate usable pages for kernelcore in this range */
+		size_pages = end_pfn - start_pfn;
+		if (size_pages > kernelcore_required_node)
+			size_pages = kernelcore_required_node;
+
+		/* Set the easyrclm_pfn and update required_kernelcore */
+		if (required_kernelcore)
+			easyrclm_pfn[nid] = start_pfn + size_pages;
+		required_kernelcore -= size_pages;
+
+		/*
+		 * If easyrclm_pfn fell below the start of the usable
+		 * zone, move it up to usable_startpfn, account for the
+		 * additional pages consumed by kernelcore and
+		 * recalculate the requirements
+		 */
+		if (easyrclm_pfn[nid] < usable_startpfn) {
+			size_pages = usable_startpfn - easyrclm_pfn[nid];
+			required_kernelcore -=
+				min(required_kernelcore, size_pages);
+			easyrclm_pfn[nid] = usable_startpfn;
+			start_pfn = usable_startpfn;
+			goto recalc;
+		}
+
+
+		/* If this node is now empty, stop counting it as active */
+		if (node_max_pfn == end_pfn)
+			num_active_nodes--;
+	}
+
+	/* Make sure all easyrclm pfns are within the usable zone */
+	for (nid = 0; nid < MAX_NUMNODES; nid++) {
+		if (easyrclm_pfn[nid] < usable_startpfn)
+			easyrclm_pfn[nid] = usable_startpfn;
+	}
+}
+
 void __init free_area_init_nodes(unsigned long arch_max_dma_pfn,
 				unsigned long arch_max_dma32_pfn,
 				unsigned long arch_max_low_pfn,
@@ -1154,7 +1309,8 @@ void __init free_area_init_nodes(unsigne
 					find_min_pfn_with_active_regions();
 	arch_zone_highest_possible_pfn[ZONE_DMA] = arch_max_dma_pfn;
 	arch_zone_highest_possible_pfn[ZONE_DMA32] = arch_max_dma32_pfn;
-	arch_zone_highest_possible_pfn[ZONE_NORMAL] = arch_max_low_pfn;
+	arch_zone_highest_possible_pfn[ZONE_NORMAL] = 
+				max(arch_max_dma32_pfn, arch_max_low_pfn);
 	arch_zone_highest_possible_pfn[ZONE_HIGHMEM] = arch_max_high_pfn;
 	for (zone_index = 1; zone_index < MAX_NR_ZONES; zone_index++) {
 		arch_zone_lowest_possible_pfn[zone_index] =
@@ -1163,13 +1319,26 @@ void __init free_area_init_nodes(unsigne
 
 	printk("free_area_init_nodes(): find_min_pfn = %lu\n", find_min_pfn_with_active_regions());
 
+	arch_zone_lowest_possible_pfn[ZONE_EASYRCLM] = 0;
+	arch_zone_highest_possible_pfn[ZONE_EASYRCLM] = 0;
+
 	/* Regions in the early_node_map can be in any order */
 	sort_node_map();
 
+	/* Find the PFNs that EasyRclm begins at in each node */
+	memset(easyrclm_pfn, 0, sizeof(easyrclm_pfn));
+	find_easyrclm_pfns_for_nodes(easyrclm_pfn);
+
 	for_each_online_node(nid) {
 		pg_data_t *pgdat = NODE_DATA(nid);
 		free_area_init_node(nid, pgdat, NULL,
 				find_min_pfn_for_node(nid), NULL);
 	}
 }
+
+void __init set_required_kernelcore(unsigned long size_pfn)
+{
+	required_kernelcore = size_pfn;
+	printk("required_kernelcore = %lu\n", size_pfn);
+}
 #endif /* CONFIG_ARCH_POPULATES_NODE_MAP */
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/page_alloc.c linux-2.6.17-rc3-mm1-zonesizing-102_addzone/mm/page_alloc.c
--- linux-2.6.17-rc3-mm1-zonesizing-101_antifrag_flags/mm/page_alloc.c	2006-05-03 09:42:17.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-102_addzone/mm/page_alloc.c	2006-05-03 09:46:37.000000000 +0100
@@ -68,7 +68,7 @@ static void __free_pages_ok(struct page 
  * TBD: should special case ZONE_DMA32 machines here - in those we normally
  * don't need any ZONE_NORMAL reservation
  */
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = { 256, 256, 32 };
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = { 256, 256, 32, 32 };
 
 EXPORT_SYMBOL(totalram_pages);
 
@@ -231,6 +231,8 @@ static inline void prep_zero_page(struct
 	int i;
 
 	VM_BUG_ON((gfp_flags & (__GFP_WAIT | __GFP_HIGHMEM)) == __GFP_HIGHMEM);
+	VM_BUG_ON((gfp_flags & (__GFP_WAIT | __GFP_EASYRCLM))
+							== __GFP_EASYRCLM);
 	/*
 	 * clear_highpage() will use KM_USER0, so it's a bug to use __GFP_ZERO
 	 * and __GFP_HIGHMEM from hard or soft interrupt context.
@@ -1271,7 +1273,7 @@ unsigned int nr_free_buffer_pages(void)
  */
 unsigned int nr_free_pagecache_pages(void)
 {
-	return nr_free_zone_pages(gfp_zone(GFP_HIGHUSER));
+	return nr_free_zone_pages(gfp_zone(GFP_RCLMUSER));
 }
 
 #ifdef CONFIG_HIGHMEM
@@ -1280,8 +1282,10 @@ unsigned int nr_free_highpages (void)
 	pg_data_t *pgdat;
 	unsigned int pages = 0;
 
-	for_each_online_pgdat(pgdat)
+	for_each_online_pgdat(pgdat) {
 		pages += pgdat->node_zones[ZONE_HIGHMEM].free_pages;
+		pages += pgdat->node_zones[ZONE_EASYRCLM].free_pages;
+	}
 
 	return pages;
 }
* [PATCH 3/8] x86 - Specify amount of kernel memory at boot time
From: Mel Gorman @ 2006-05-05 17:35 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
This patch adds the kernelcore= parameter for x86.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-102_addzone/arch/i386/kernel/setup.c linux-2.6.17-rc3-mm1-zonesizing-103_x86coremem/arch/i386/kernel/setup.c
--- linux-2.6.17-rc3-mm1-zonesizing-102_addzone/arch/i386/kernel/setup.c	2006-05-03 09:42:16.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-103_x86coremem/arch/i386/kernel/setup.c	2006-05-03 09:47:22.000000000 +0100
@@ -943,6 +943,18 @@ static void __init parse_cmdline_early (
 		else if (!memcmp(from, "vmalloc=", 8))
 			__VMALLOC_RESERVE = memparse(from+8, &from);
 
+		/*
+		 * kernelcore=size sets the amount of memory for use for
+		 * kernel allocations that cannot be reclaimed easily.
+		 * The remaining memory is set aside for easy reclaim
+		 * for features like memory remove or huge page allocations
+		 */
+		else if (!memcmp(from, "kernelcore=",11)) {
+			unsigned long core_pages;
+			core_pages = memparse(from+11, &from) >> PAGE_SHIFT;
+			set_required_kernelcore(core_pages);
+		}
+
 	next_char:
 		c = *(from++);
 		if (!c)
* [PATCH 4/8] ppc64 - Specify amount of kernel memory at boot time
From: Mel Gorman @ 2006-05-05 17:36 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
This patch adds the kernelcore= parameter for ppc64.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-103_x86coremem/arch/powerpc/kernel/prom.c linux-2.6.17-rc3-mm1-zonesizing-104_ppc64coremem/arch/powerpc/kernel/prom.c
--- linux-2.6.17-rc3-mm1-zonesizing-103_x86coremem/arch/powerpc/kernel/prom.c	2006-05-03 09:41:31.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-104_ppc64coremem/arch/powerpc/kernel/prom.c	2006-05-03 09:48:06.000000000 +0100
@@ -1064,6 +1064,15 @@ static int __init early_init_dt_scan_cho
 		}
 	}
 
+	/* Check if ZONE_EASYRCLM should be populated */
+	if (strstr(cmd_line, "kernelcore=")) {
+		unsigned long core_pages;
+		char *opt = strstr(cmd_line, "kernelcore=");
+		opt += 11;
+		core_pages = memparse(opt, &opt) >> PAGE_SHIFT;
+		set_required_kernelcore(core_pages);
+	}
+
 	/* break now */
 	return 1;
 }
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-103_x86coremem/arch/ppc/mm/init.c linux-2.6.17-rc3-mm1-zonesizing-104_ppc64coremem/arch/ppc/mm/init.c
--- linux-2.6.17-rc3-mm1-zonesizing-103_x86coremem/arch/ppc/mm/init.c	2006-05-03 09:42:16.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-104_ppc64coremem/arch/ppc/mm/init.c	2006-05-03 09:48:06.000000000 +0100
@@ -213,6 +213,15 @@ void MMU_setup(void)
 		}
 		__max_memory = maxmem;
 	}
+
+	/* Check if ZONE_EASYRCLM should be populated */
+	if (strstr(cmd_line, "kernelcore=")) {
+		unsigned long core_pages;
+		char *opt = strstr(cmd_line, "kernelcore=");
+		opt += 11;
+		core_pages = memparse(opt, &opt) >> PAGE_SHIFT;
+		set_required_kernelcore(core_pages);
+	}
 }
 
 /*
* [PATCH 5/8] x86_64 - Specify amount of kernel memory at boot time
From: Mel Gorman @ 2006-05-05 17:36 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
This patch adds the kernelcore= parameter for x86_64.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-104_ppc64coremem/arch/x86_64/kernel/setup.c linux-2.6.17-rc3-mm1-zonesizing-105_x8664coremem/arch/x86_64/kernel/setup.c
--- linux-2.6.17-rc3-mm1-zonesizing-104_ppc64coremem/arch/x86_64/kernel/setup.c	2006-05-03 09:42:16.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-105_x8664coremem/arch/x86_64/kernel/setup.c	2006-05-03 09:48:56.000000000 +0100
@@ -372,6 +372,12 @@ static __init void parse_cmdline_early (
 		if (!memcmp(from, "mem=", 4))
 			parse_memopt(from+4, &from); 
 
+		if (!memcmp(from, "kernelcore=", 11)) {
+			unsigned long core_pages;
+			core_pages = memparse(from+11, &from) >> PAGE_SHIFT;
+			set_required_kernelcore(core_pages);
+		}
+
 		if (!memcmp(from, "memmap=", 7)) {
 			/* exactmap option is for used defined memory */
 			if (!memcmp(from+7, "exactmap", 8)) {
* [PATCH 6/8] ia64 - Specify amount of kernel memory at boot time
From: Mel Gorman @ 2006-05-05 17:36 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
This patch adds the kernelcore= parameter for ia64.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-105_x8664coremem/arch/ia64/kernel/efi.c linux-2.6.17-rc3-mm1-zonesizing-106_ia64coremem/arch/ia64/kernel/efi.c
--- linux-2.6.17-rc3-mm1-zonesizing-105_x8664coremem/arch/ia64/kernel/efi.c	2006-05-03 09:41:30.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-106_ia64coremem/arch/ia64/kernel/efi.c	2006-05-04 13:53:14.000000000 +0100
@@ -25,6 +25,7 @@
 #include <linux/types.h>
 #include <linux/time.h>
 #include <linux/efi.h>
+#include <linux/mm.h>
 
 #include <asm/io.h>
 #include <asm/kregs.h>
@@ -420,6 +421,10 @@ efi_init (void)
 			mem_limit = memparse(cp + 4, &cp);
 		} else if (memcmp(cp, "max_addr=", 9) == 0) {
 			max_addr = GRANULEROUNDDOWN(memparse(cp + 9, &cp));
+		} else if (memcmp(cp, "kernelcore=",11) == 0) {
+			unsigned long core_pages;
+			core_pages = memparse(cp+11, &cp) >> PAGE_SHIFT;
+			set_required_kernelcore(core_pages);
 		} else {
 			while (*cp != ' ' && *cp)
 				++cp;
* [PATCH 7/8] Allow HugeTLB allocations to use ZONE_EASYRCLM
From: Mel Gorman @ 2006-05-05 17:37 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
On ppc64 at least, a HugeTLB page is the same size as a memory section, so it
causes no fragmentation worth caring about, because a section can still be
offlined.
Once HugeTLB is allowed to use ZONE_EASYRCLM, the size of the zone becomes a
"soft" area in which HugeTLB allocations may be satisfied. For example, take
a situation where a system administrator is not willing to reserve HugeTLB
pages at boot time. He can instead use kernelcore to size the EasyRclm zone,
which remains usable by normal processes. If a job that needs HugeTLB pages
starts, one could dd a file the size of physical memory, delete it and have a
good chance of getting a number of HugeTLB pages. To get all of EasyRclm as
HugeTLB pages, the ability to drain the per-cpu caches is required.
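To make the scenario concrete, an illustrative sequence (sizes made up; a
4GB machine booted with kernelcore=1G):

	dd if=/dev/zero of=/tmp/fill bs=1M count=4096
	rm /tmp/fill
	echo 100 > /proc/sys/vm/nr_hugepages

The dd cycles easily-reclaimed page cache through ZONE_EASYRCLM, the delete
frees it again, and the nr_hugepages write then stands a reasonable chance
of assembling contiguous HugeTLB pages there.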
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-106_ia64coremem/mm/hugetlb.c linux-2.6.17-rc3-mm1-zonesizing-107_hugetlb_use_easyrclm/mm/hugetlb.c
--- linux-2.6.17-rc3-mm1-zonesizing-106_ia64coremem/mm/hugetlb.c	2006-05-03 09:41:33.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-107_hugetlb_use_easyrclm/mm/hugetlb.c	2006-05-03 09:50:27.000000000 +0100
@@ -73,7 +73,7 @@ static struct page *dequeue_huge_page(st
 
 	for (z = zonelist->zones; *z; z++) {
 		nid = (*z)->zone_pgdat->node_id;
-		if (cpuset_zone_allowed(*z, GFP_HIGHUSER) &&
+		if (cpuset_zone_allowed(*z, GFP_RCLMUSER) &&
 		    !list_empty(&hugepage_freelists[nid]))
 			break;
 	}
@@ -103,7 +103,7 @@ static int alloc_fresh_huge_page(void)
 {
 	static int nid = 0;
 	struct page *page;
-	page = alloc_pages_node(nid, GFP_HIGHUSER|__GFP_COMP|__GFP_NOWARN,
+	page = alloc_pages_node(nid, GFP_RCLMUSER|__GFP_COMP|__GFP_NOWARN,
 					HUGETLB_PAGE_ORDER);
 	nid = next_node(nid, node_online_map);
 	if (nid == MAX_NUMNODES)
* [PATCH 8/8] Add documentation for extra boot parameters
From: Mel Gorman @ 2006-05-05 17:37 UTC
  To: linux-mm; +Cc: Mel Gorman, linux-kernel, lhms-devel
Once all patches are applied, two new command-line parameters exist -
kernelcore and noeasyrclm. This patch adds the necessary documentation.
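As a worked example of the intended semantics (illustrative numbers only):
on a two-node machine with 2GB in each node, booting with kernelcore=1G
should leave each node with roughly 512MB of kernel-usable memory and 1.5GB
of EasyRclm. If one node had only 256MB, that whole node would be kernelcore
and the other node would absorb the remaining 768MB.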
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.17-rc3-mm1-zonesizing-107_hugetlb_use_easyrclm/Documentation/kernel-parameters.txt linux-2.6.17-rc3-mm1-zonesizing-108_docs/Documentation/kernel-parameters.txt
--- linux-2.6.17-rc3-mm1-zonesizing-107_hugetlb_use_easyrclm/Documentation/kernel-parameters.txt	2006-05-03 09:41:30.000000000 +0100
+++ linux-2.6.17-rc3-mm1-zonesizing-108_docs/Documentation/kernel-parameters.txt	2006-05-03 09:51:12.000000000 +0100
@@ -724,6 +724,22 @@ running once the system is up.
 	js=		[HW,JOY] Analog joystick
 			See Documentation/input/joystick.txt.
 
+	kernelcore=nn[KMG]	[KNL,IA-32,IA-64,PPC,X86-64] This parameter
+			specifies the amount of memory usable by the kernel.
+			The requested amount is spread evenly throughout
+			all nodes in the system. The remaining memory
+			in each node is used for EasyRclm pages. In the
+			event that a node is too small to hold both
+			kernelcore and EasyRclm pages, kernelcore takes
+			priority and other nodes will be assigned a
+			larger number of kernelcore pages.  The EasyRclm zone
+			is used for the allocation of pages on behalf
+			of a process and for HugeTLB pages. On ppc64,
+			it is likely that memory sections in this zone
+			can be offlined. Note that allocations like
+			PTEs-from-HighMem still use the HighMem zone if
+			it exists, and the Normal zone if it does not.
+
 	keepinitrd	[HW,ARM]
 
 	kstack=N	[IA-32,X86-64] Print N words from the kernel stack