* [PATCH 0/3] Unmapped Page Cache Control (v4)
From: Balbir Singh @ 2011-01-25  5:04 UTC
  To: linux-mm, akpm
  Cc: npiggin, kvm, linux-kernel, kosaki.motohiro, cl, kamezawa.hiroyu

The following series implements page cache control. It is a split-out
version of patch 1 of version 3 of the page cache optimization patches
posted earlier.

Previous posting: http://lwn.net/Articles/419564/

The previous few revisions received a lot of comments; I've tried to
address as many of those as possible in this revision.

Detailed Description
====================
This patch implements unmapped page cache control via preferred
page cache reclaim. The current patch hooks into kswapd and reclaims
page cache if the user has requested unmapped page control.
This is useful in the following scenario:

- In a virtualized environment with cache=writethrough, we see
  double caching (one copy in the host and one in the guest). As
  we try to scale guests, cache usage across the system grows.
  The goal of this patch is to reclaim page cache when Linux is running
  as a guest and get the host to hold the page cache and manage it.
  There might be temporary duplication, but in the long run, memory
  in the guests would be used for mapped pages.
- The option is controlled via a boot option and the administrator
  can selectively turn it on, on a need-to-use basis.

A lot of the code is borrowed from the zone_reclaim_mode logic for
__zone_reclaim(). One might argue that with ballooning and KSM this
feature is not very useful, but even with ballooning we need extra
logic to balloon multiple VMs, and it is hard to figure out the
correct amount of memory to balloon. With these patches applied,
each guest has a sufficient amount of free memory available that
can be easily seen and reclaimed by the balloon driver. The
additional memory in the guest can be reused for additional
applications or used to start additional guests/balance memory in
the host.

KSM currently does not de-duplicate host and guest page cache. The goal
of this patch is to help automatically balance unmapped page cache when
instructed to do so.

The sysctl for min_unmapped_ratio provides further control from
within the guest on the amount of unmapped pages to reclaim. A similar
max_unmapped_ratio sysctl is added and helps decide when reclaim
should occur. This is tunable and set by default to 16 (based on
tradeoffs seen between aggressiveness in balancing versus size of
unmapped pages). Distros and administrators can further tweak this
for desired control.

Data from the previous patchsets can be found at
https://lkml.org/lkml/2010/11/30/79

---

Balbir Singh (3):
      Move zone_reclaim() outside of CONFIG_NUMA
      Refactor zone_reclaim code
      Provide control over unmapped pages

 Documentation/kernel-parameters.txt |    8 ++
 include/linux/mmzone.h              |    9 ++-
 include/linux/swap.h                |   23 +++++--
 init/Kconfig                        |   12 +++
 kernel/sysctl.c                     |   29 ++++++--
 mm/page_alloc.c                     |   31 ++++++++-
 mm/vmscan.c                         |  122 +++++++++++++++++++++++++++++++----
 7 files changed, 202 insertions(+), 32 deletions(-)

-- 
Balbir Singh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
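To make the role of the two ratios concrete, here is a minimal user-space sketch. It is not code from this series. It assumes, purely for illustration, that reclaim is triggered once a zone's unmapped page cache exceeds the max_unmapped_ratio threshold and then trims it back towards the min_unmapped_ratio threshold. The struct and function names (`zone_sketch`, `ratio_to_pages`, `zone_sketch_init`, `unmapped_pages_to_reclaim`) are hypothetical; only the ratio-to-pages conversion follows the `(realsize * ratio) / 100` form used in patch 1/3.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the per-zone state discussed above. */
struct zone_sketch {
	unsigned long present_pages;      /* pages managed by the zone */
	unsigned long min_unmapped_pages; /* assumed reclaim target (floor) */
	unsigned long max_unmapped_pages; /* assumed reclaim trigger (ceiling) */
};

/* Convert a percentage of the zone into a page count: (pages * ratio) / 100. */
static unsigned long ratio_to_pages(unsigned long present_pages,
				    unsigned int ratio_percent)
{
	assert(ratio_percent <= 100);
	return (present_pages * ratio_percent) / 100;
}

/* Derive both thresholds from the two ratios. */
static void zone_sketch_init(struct zone_sketch *z,
			     unsigned long present_pages,
			     unsigned int min_ratio, unsigned int max_ratio)
{
	assert(min_ratio <= max_ratio);
	z->present_pages = present_pages;
	z->min_unmapped_pages = ratio_to_pages(present_pages, min_ratio);
	z->max_unmapped_pages = ratio_to_pages(present_pages, max_ratio);
}

/*
 * Assumed policy (not confirmed by the cover letter): do nothing until
 * unmapped page cache exceeds the ceiling, then request enough reclaim
 * to get back down to the floor.
 */
static unsigned long unmapped_pages_to_reclaim(const struct zone_sketch *z,
					       unsigned long nr_unmapped)
{
	if (nr_unmapped <= z->max_unmapped_pages)
		return 0;
	return nr_unmapped - z->min_unmapped_pages;
}
```

For a zone of 100000 pages with ratios of 1 and 16, the thresholds come out at 1000 and 16000 pages, so 20000 unmapped pages would produce a reclaim request of 19000 pages under this assumed policy, while 16000 or fewer would produce none.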
* [PATCH 1/3] Move zone_reclaim() outside of CONFIG_NUMA (v4)
From: Balbir Singh @ 2011-01-25  5:05 UTC
  To: linux-mm, akpm
  Cc: npiggin, kvm, linux-kernel, kosaki.motohiro, cl, kamezawa.hiroyu

This patch moves zone_reclaim and associated helpers
outside CONFIG_NUMA. This infrastructure is reused
in the patches for page cache control that follow.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---
 include/linux/mmzone.h |    4 ++--
 include/linux/swap.h   |    4 ++--
 kernel/sysctl.c        |   18 +++++++++---------
 mm/page_alloc.c        |    6 +++---
 mm/vmscan.c            |    2 --
 5 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 02ecb01..2485acc 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -303,12 +303,12 @@ struct zone {
 	 */
 	unsigned long		lowmem_reserve[MAX_NR_ZONES];
 
-#ifdef CONFIG_NUMA
-	int node;
 	/*
 	 * zone reclaim becomes active if more unmapped pages exist.
 	 */
 	unsigned long		min_unmapped_pages;
+#ifdef CONFIG_NUMA
+	int node;
 	unsigned long		min_slab_pages;
 #endif
 	struct per_cpu_pageset __percpu *pageset;
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 5e3355a..7b75626 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -255,11 +255,11 @@ extern int vm_swappiness;
 extern int remove_mapping(struct address_space *mapping, struct page *page);
 extern long vm_total_pages;
 
+extern int sysctl_min_unmapped_ratio;
+extern int zone_reclaim(struct zone *, gfp_t, unsigned int);
 #ifdef CONFIG_NUMA
 extern int zone_reclaim_mode;
-extern int sysctl_min_unmapped_ratio;
 extern int sysctl_min_slab_ratio;
-extern int zone_reclaim(struct zone *, gfp_t, unsigned int);
 #else
 #define zone_reclaim_mode 0
 static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order)
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index bc86bb3..12e8f26 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1224,15 +1224,6 @@ static struct ctl_table vm_table[] = {
 		.extra1		= &zero,
 	},
 #endif
-#ifdef CONFIG_NUMA
-	{
-		.procname	= "zone_reclaim_mode",
-		.data		= &zone_reclaim_mode,
-		.maxlen		= sizeof(zone_reclaim_mode),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec,
-		.extra1		= &zero,
-	},
 	{
 		.procname	= "min_unmapped_ratio",
 		.data		= &sysctl_min_unmapped_ratio,
@@ -1242,6 +1233,15 @@ static struct ctl_table vm_table[] = {
 		.extra1		= &zero,
 		.extra2		= &one_hundred,
 	},
+#ifdef CONFIG_NUMA
+	{
+		.procname	= "zone_reclaim_mode",
+		.data		= &zone_reclaim_mode,
+		.maxlen		= sizeof(zone_reclaim_mode),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+		.extra1		= &zero,
+	},
 	{
 		.procname	= "min_slab_ratio",
 		.data		= &sysctl_min_slab_ratio,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index aede3a4..7b56473 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4167,10 +4167,10 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 
 		zone->spanned_pages = size;
 		zone->present_pages = realsize;
-#ifdef CONFIG_NUMA
-		zone->node = nid;
 		zone->min_unmapped_pages = (realsize*sysctl_min_unmapped_ratio)
 						/ 100;
+#ifdef CONFIG_NUMA
+		zone->node = nid;
 		zone->min_slab_pages = (realsize * sysctl_min_slab_ratio) / 100;
 #endif
 		zone->name = zone_names[j];
@@ -5084,7 +5084,6 @@ int min_free_kbytes_sysctl_handler(ctl_table *table, int write,
 	return 0;
 }
 
-#ifdef CONFIG_NUMA
 int sysctl_min_unmapped_ratio_sysctl_handler(ctl_table *table, int write,
 	void __user *buffer, size_t *length, loff_t *ppos)
 {
@@ -5101,6 +5100,7 @@ int sysctl_min_unmapped_ratio_sysctl_handler(ctl_table *table, int write,
 	return 0;
 }
 
+#ifdef CONFIG_NUMA
 int sysctl_min_slab_ratio_sysctl_handler(ctl_table *table, int write,
 	void __user *buffer, size_t *length, loff_t *ppos)
 {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 47a5096..5899f2f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2868,7 +2868,6 @@ static int __init kswapd_init(void)
 
 module_init(kswapd_init)
 
-#ifdef CONFIG_NUMA
 /*
  * Zone reclaim mode
  *
@@ -3078,7 +3077,6 @@ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 
 	return ret;
 }
-#endif
 
 /*
  * page_evictable - test whether a page is evictable
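The effect of the move on the zone structure and its initialisation can be sketched in isolation. The following is a stand-alone illustration, not kernel code: `zone_layout_sketch`, `zone_layout_init` and the `CONFIG_NUMA_SKETCH` macro are hypothetical names standing in for `struct zone`, the relevant part of `free_area_init_core()` and `CONFIG_NUMA`. It shows the unmapped-page threshold being available in every build, while the node id and slab threshold remain behind the guard.

```c
#include <assert.h>

/*
 * Build with -DCONFIG_NUMA_SKETCH to mimic a NUMA configuration.
 * Hypothetical stand-in for the fields touched in struct zone.
 */
struct zone_layout_sketch {
	unsigned long present_pages;
	/* Outside the guard after the move: present in every build. */
	unsigned long min_unmapped_pages;
#ifdef CONFIG_NUMA_SKETCH
	/* Still NUMA-only. */
	int node;
	unsigned long min_slab_pages;
#endif
};

/* Mirrors the ordering of the initialisation in the page_alloc.c hunk. */
static void zone_layout_init(struct zone_layout_sketch *z,
			     unsigned long realsize, int nid,
			     unsigned int min_unmapped_ratio,
			     unsigned int min_slab_ratio)
{
	/* The sysctl table bounds the ratio to 0..100 (extra1/extra2). */
	assert(min_unmapped_ratio <= 100);

	z->present_pages = realsize;
	z->min_unmapped_pages = (realsize * min_unmapped_ratio) / 100;
#ifdef CONFIG_NUMA_SKETCH
	z->node = nid;
	z->min_slab_pages = (realsize * min_slab_ratio) / 100;
#else
	/* Unused when the NUMA-only fields are compiled out. */
	(void)nid;
	(void)min_slab_ratio;
#endif
}
```

Built either with or without `-DCONFIG_NUMA_SKETCH`, initialising a 20000-page zone with a min_unmapped_ratio of 1 yields a `min_unmapped_pages` of 200; only the NUMA-specific fields differ between the two builds.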
* Re: [PATCH 1/3] Move zone_reclaim() outside of CONFIG_NUMA (v4)
From: Christoph Lameter @ 2011-01-26 16:56 UTC
  To: Balbir Singh
  Cc: linux-mm, akpm, npiggin, kvm, linux-kernel, kosaki.motohiro, kamezawa.hiroyu

Reviewed-by: Christoph Lameter <cl@linux.com>
* Re: [PATCH 1/3] Move zone_reclaim() outside of CONFIG_NUMA (v4)
From: Balbir Singh @ 2011-01-26 17:43 UTC
  To: Christoph Lameter
  Cc: linux-mm, akpm, npiggin, kvm, linux-kernel, kosaki.motohiro, kamezawa.hiroyu

* Christoph Lameter <cl@linux.com> [2011-01-26 10:56:56]:

>
> Reviewed-by: Christoph Lameter <cl@linux.com>
>

Thanks for the review!

-- 
	Three Cheers,
	Balbir