From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d01relay06.pok.ibm.com (d01relay06.pok.ibm.com [9.56.227.116]) by e3.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id m1DFF0Es021885 for ; Wed, 13 Feb 2008 10:15:00 -0500 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d01relay06.pok.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1DFErdn716876 for ; Wed, 13 Feb 2008 10:14:53 -0500 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1DFEq6d017116 for ; Wed, 13 Feb 2008 08:14:53 -0700 From: Balbir Singh Date: Wed, 13 Feb 2008 20:42:01 +0530 Message-Id: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> Subject: [RFC] [PATCH 0/4] Add soft limits to the memory controller Sender: owner-linux-mm@kvack.org Return-Path: To: linux-mm@kvack.org Cc: Nick Piggin , Paul Menage , Hugh Dickins , YAMAMOTO Takashi , Herbert Poetzl , Peter Zijlstra , Lee Schermerhorn , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Balbir Singh , Rik Van Riel , Andrew Morton , KAMEZAWA Hiroyuki List-ID: This patchset implements the basic changes required to implement soft limits in the memory controller. A soft limit is a variation of the currently supported hard limit feature. A memory cgroup can exceed its soft limit provided there is no contention for memory. These patches were tested under KVM and on a PowerPC box, by running programs in parallel, and checking their behaviour for various soft limit values. These patches were developed on top of 2.6.24-mm1. Comments, suggestions, criticism are all welcome! TODOs: 1. Currently there is no ordering of memory cgroups over their limit. We use a simple linked list to maintain a list of groups over their limit. In the future, we might want to create a heap of objects ordered by the amount by which they exceed their soft limit. 2.
Distribute the excessive (non-contended) resources between groups in the ratio of their soft limits series ------ memory-controller-res_counters-soft-limit-setup.patch memory-controller-add-soft-limit-interface.patch memory-controller-reclaim-on-contention.patch memory-controller-add-soft-limit-documentation.patch -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e2.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id m1DFFZ7f015107 for ; Wed, 13 Feb 2008 10:15:35 -0500 Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1DFFZsm233864 for ; Wed, 13 Feb 2008 10:15:35 -0500 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1DFFYhJ000668 for ; Wed, 13 Feb 2008 10:15:35 -0500 From: Balbir Singh Date: Wed, 13 Feb 2008 20:42:42 +0530 Message-Id: <20080213151242.7529.79924.sendpatchset@localhost.localdomain> In-Reply-To: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> Subject: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Sender: owner-linux-mm@kvack.org Return-Path: To: linux-mm@kvack.org Cc: Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Balbir Singh , Rik Van Riel , Andrew Morton , KAMEZAWA Hiroyuki List-ID: The global list of all cgroups over their soft limit is scanned under memory pressure. 
We call mem_cgroup_pushback_groups_over_soft_limit() from __alloc_pages() prior to calling try_to_free_pages(), in an attempt to rescue memory from groups that are using memory above their soft limit. If this attempt is unsuccessful, we call try_to_free_pages() and take the normal global reclaim path. Signed-off-by: Balbir Singh --- include/linux/memcontrol.h | 9 +++++ include/linux/res_counter.h | 11 +++++++ include/linux/swap.h | 4 +- mm/memcontrol.c | 67 ++++++++++++++++++++++++++++++++++++++++---- mm/page_alloc.c | 10 +++++- mm/vmscan.c | 12 +++++-- 6 files changed, 101 insertions(+), 12 deletions(-) diff -puN mm/vmscan.c~memory-controller-reclaim-on-contention mm/vmscan.c --- linux-2.6.24/mm/vmscan.c~memory-controller-reclaim-on-contention 2008-02-13 20:16:04.000000000 +0530 +++ linux-2.6.24-balbir/mm/vmscan.c 2008-02-13 20:16:04.000000000 +0530 @@ -1440,22 +1440,26 @@ unsigned long try_to_free_pages(struct z #ifdef CONFIG_CGROUP_MEM_CONT unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont, - gfp_t gfp_mask) + gfp_t gfp_mask, + unsigned long nr_pages, + struct zone **zones) { struct scan_control sc = { .gfp_mask = gfp_mask, .may_writepage = !laptop_mode, .may_swap = 1, - .swap_cluster_max = SWAP_CLUSTER_MAX, + .swap_cluster_max = nr_pages, .swappiness = vm_swappiness, .order = 0, .mem_cgroup = mem_cont, .isolate_pages = mem_cgroup_isolate_pages, }; - struct zone **zones; int target_zone = gfp_zone(GFP_HIGHUSER_MOVABLE); - zones = NODE_DATA(numa_node_id())->node_zonelists[target_zone].zones; + if (!zones) + zones = + NODE_DATA(numa_node_id())->node_zonelists[target_zone].zones; + if (do_try_to_free_pages(zones, sc.gfp_mask, &sc)) return 1; return 0; diff -puN include/linux/memcontrol.h~memory-controller-reclaim-on-contention include/linux/memcontrol.h --- linux-2.6.24/include/linux/memcontrol.h~memory-controller-reclaim-on-contention 2008-02-13 20:16:04.000000000 +0530 +++ linux-2.6.24-balbir/include/linux/memcontrol.h 2008-02-13
20:19:20.000000000 +0530 @@ -76,6 +76,8 @@ extern long mem_cgroup_calc_reclaim_acti struct zone *zone, int priority); extern long mem_cgroup_calc_reclaim_inactive(struct mem_cgroup *mem, struct zone *zone, int priority); +extern unsigned long +mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones, gfp_t gfp_mask); #else /* CONFIG_CGROUP_MEM_CONT */ static inline void mm_init_cgroup(struct mm_struct *mm, @@ -184,6 +186,13 @@ static inline long mem_cgroup_calc_recla { return 0; } + +static inline unsigned long +mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones, gfp_t gfp_mask) +{ + return 0; +} + #endif /* CONFIG_CGROUP_MEM_CONT */ #endif /* _LINUX_MEMCONTROL_H */ diff -puN mm/memcontrol.c~memory-controller-reclaim-on-contention mm/memcontrol.c --- linux-2.6.24/mm/memcontrol.c~memory-controller-reclaim-on-contention 2008-02-13 20:16:04.000000000 +0530 +++ linux-2.6.24-balbir/mm/memcontrol.c 2008-02-13 20:16:04.000000000 +0530 @@ -35,7 +35,7 @@ struct cgroup_subsys mem_cgroup_subsys; static const int MEM_CGROUP_RECLAIM_RETRIES = 5; -static spinlock_t mem_cgroup_sl_list_lock; /* spin lock that protects */ +static rwlock_t mem_cgroup_sl_list_lock; /* spin lock that protects */ /* the list of cgroups over*/ /* their soft limit */ static struct list_head mem_cgroup_sl_exceeded_list; @@ -646,7 +646,8 @@ retry: if (!(gfp_mask & __GFP_WAIT)) goto out; - if (try_to_free_mem_cgroup_pages(mem, gfp_mask)) + if (try_to_free_mem_cgroup_pages(mem, gfp_mask, + SWAP_CLUSTER_MAX, NULL)) continue; /* @@ -692,11 +693,11 @@ retry: * cgroups over their soft limit */ if (!res_counter_check_under_limit(&mem->res, RES_SOFT_LIMIT)) { - spin_lock_irqsave(&mem_cgroup_sl_list_lock, flags); + write_lock_irqsave(&mem_cgroup_sl_list_lock, flags); if (list_empty(&mem->sl_exceeded_list)) list_add_tail(&mem->sl_exceeded_list, &mem_cgroup_sl_exceeded_list); - spin_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags); + write_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags); } mz = 
page_cgroup_zoneinfo(pc); @@ -928,7 +929,55 @@ out: return ret; } +/* + * Free all control groups, which are over their soft limit + */ +unsigned long mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones, + gfp_t gfp_mask) +{ + struct mem_cgroup *mem; + unsigned long nr_pages; + long long nr_bytes_over_sl; + unsigned long ret = 0; + unsigned long flags; + struct list_head reclaimed_groups; + INIT_LIST_HEAD(&reclaimed_groups); + read_lock_irqsave(&mem_cgroup_sl_list_lock, flags); + while (!list_empty(&mem_cgroup_sl_exceeded_list)) { + mem = list_first_entry(&mem_cgroup_sl_exceeded_list, + struct mem_cgroup, sl_exceeded_list); + list_move(&mem->sl_exceeded_list, &reclaimed_groups); + read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags); + + nr_bytes_over_sl = res_counter_sl_excess(&mem->res); + if (nr_bytes_over_sl <= 0) + goto next; + nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT); + ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages, + zones); +next: + read_lock_irqsave(&mem_cgroup_sl_list_lock, flags); + } + + while (!list_empty(&reclaimed_groups)) { + /* + * Check again to see if we've gone below the soft + * limit. XXX: Consider giving up the &mem_cgroup_sl_list_lock + * before calling res_counter_sl_excess. 
+ */ + mem = list_first_entry(&reclaimed_groups, struct mem_cgroup, + sl_exceeded_list); + nr_bytes_over_sl = res_counter_sl_excess(&mem->res); + if (nr_bytes_over_sl <= 0) + list_del_init(&mem->sl_exceeded_list); + else + list_move(&mem->sl_exceeded_list, + &mem_cgroup_sl_exceeded_list); + } + read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags); + return ret; +} int mem_cgroup_write_strategy(char *buf, unsigned long long *tmp) { @@ -1124,8 +1173,7 @@ mem_cgroup_create(struct cgroup_subsys * if (unlikely((cont->parent) == NULL)) { mem = &init_mem_cgroup; init_mm.mem_cgroup = mem; - INIT_LIST_HEAD(&mem->sl_exceeded_list); - spin_lock_init(&mem_cgroup_sl_list_lock); + rwlock_init(&mem_cgroup_sl_list_lock); INIT_LIST_HEAD(&mem_cgroup_sl_exceeded_list); } else mem = kzalloc(sizeof(struct mem_cgroup), GFP_KERNEL); @@ -1155,7 +1203,14 @@ static void mem_cgroup_pre_destroy(struc struct cgroup *cont) { struct mem_cgroup *mem = mem_cgroup_from_cont(cont); + unsigned long flags; + mem_cgroup_force_empty(mem); + + write_lock_irqsave(&mem_cgroup_sl_list_lock, flags); + if (!list_empty(&mem->sl_exceeded_list)) + list_del_init(&mem->sl_exceeded_list); + write_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags); } static void mem_cgroup_destroy(struct cgroup_subsys *ss, diff -puN include/linux/res_counter.h~memory-controller-reclaim-on-contention include/linux/res_counter.h --- linux-2.6.24/include/linux/res_counter.h~memory-controller-reclaim-on-contention 2008-02-13 20:16:04.000000000 +0530 +++ linux-2.6.24-balbir/include/linux/res_counter.h 2008-02-13 20:16:04.000000000 +0530 @@ -142,4 +142,15 @@ static inline bool res_counter_check_und return ret; } +static inline long long res_counter_sl_excess(struct res_counter *cnt) +{ + unsigned long flags; + long long ret; + + spin_lock_irqsave(&cnt->lock, flags); + ret = cnt->usage - cnt->soft_limit; + spin_unlock_irqrestore(&cnt->lock, flags); + return ret; +} + #endif diff -puN 
kernel/res_counter.c~memory-controller-reclaim-on-contention kernel/res_counter.c diff -puN mm/page_alloc.c~memory-controller-reclaim-on-contention mm/page_alloc.c --- linux-2.6.24/mm/page_alloc.c~memory-controller-reclaim-on-contention 2008-02-13 20:16:04.000000000 +0530 +++ linux-2.6.24-balbir/mm/page_alloc.c 2008-02-13 20:16:04.000000000 +0530 @@ -1635,7 +1635,15 @@ nofail_alloc: reclaim_state.reclaimed_slab = 0; p->reclaim_state = &reclaim_state; - did_some_progress = try_to_free_pages(zonelist->zones, order, gfp_mask); + /* + * First reclaim from all memory control groups over their + * soft limit + */ + did_some_progress = mem_cgroup_pushback_groups_over_soft_limit( + zonelist->zones, gfp_mask); + if (!did_some_progress) + did_some_progress = + try_to_free_pages(zonelist->zones, order, gfp_mask); p->reclaim_state = NULL; p->flags &= ~PF_MEMALLOC; diff -puN include/linux/swap.h~memory-controller-reclaim-on-contention include/linux/swap.h --- linux-2.6.24/include/linux/swap.h~memory-controller-reclaim-on-contention 2008-02-13 20:16:04.000000000 +0530 +++ linux-2.6.24-balbir/include/linux/swap.h 2008-02-13 20:16:04.000000000 +0530 @@ -184,7 +184,9 @@ extern void swap_setup(void); extern unsigned long try_to_free_pages(struct zone **zones, int order, gfp_t gfp_mask); extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem, - gfp_t gfp_mask); + gfp_t gfp_mask, + unsigned long nr_pages, + struct zone **zones); extern int __isolate_lru_page(struct page *page, int mode); extern unsigned long shrink_all_memory(unsigned long nr_pages); extern int vm_swappiness; _ -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e32.co.us.ibm.com (8.13.8/8.13.8) with ESMTP id m1DFG4ms020168 for ; Wed, 13 Feb 2008 10:16:04 -0500 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1DFFpN1151716 for ; Wed, 13 Feb 2008 08:15:52 -0700 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1DFFlBP023341 for ; Wed, 13 Feb 2008 08:15:49 -0700 From: Balbir Singh Date: Wed, 13 Feb 2008 20:42:56 +0530 Message-Id: <20080213151256.7529.59791.sendpatchset@localhost.localdomain> In-Reply-To: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> Subject: [RFC] [PATCH 4/4] Add soft limit documentation Sender: owner-linux-mm@kvack.org Return-Path: To: linux-mm@kvack.org Cc: Hugh Dickins , Paul Menage , YAMAMOTO Takashi , Peter Zijlstra , Lee Schermerhorn , Herbert Poetzl , David Rientjes , Andrew Morton , Pavel Emelianov , Nick Piggin , Balbir Singh , Rik Van Riel , "Eric W. Biederman" , KAMEZAWA Hiroyuki List-ID: Add documentation for the soft limit feature. Signed-off-by: Balbir Singh --- Documentation/controllers/memory.txt | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff -puN Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation Documentation/controllers/memory.txt --- linux-2.6.24/Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation 2008-02-13 18:45:40.000000000 +0530 +++ linux-2.6.24-balbir/Documentation/controllers/memory.txt 2008-02-13 18:49:58.000000000 +0530 @@ -201,6 +201,22 @@ The memory.force_empty gives an interfac will drop all charges in cgroup. Currently, this is maintained for test. 
+The file memory.soft_limit_in_bytes allows users to set soft limits. A soft +limit is set in a manner similar to limit. The limit feature described +earlier is a hard limit, a group can never exceed it's hard limit. A soft +limit on the other hand can be exceeded. A group will be shrunk back +to it's soft limit, when there is memory pressure/contention. + +Ideally the soft limit should always be set to a value smaller than the +hard limit. However, the code does not force the user to do so. The soft +limit can be greater than the hard limit; then the soft limit has +no meaning in that setup, since the group will alwasy be restrained to its +hard limit. + +Example setting of soft limit + +# echo -n 100M > memory.soft_limit_in_bytes + 4. Testing Balbir posted lmbench, AIM9, LTP and vmmstress results [10] and [11]. _ -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e3.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id m1DFFMdH022908 for ; Wed, 13 Feb 2008 10:15:22 -0500 Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay02.pok.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1DFFLfH096924 for ; Wed, 13 Feb 2008 10:15:22 -0500 Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1DFFLIZ014798 for ; Wed, 13 Feb 2008 10:15:21 -0500 From: Balbir Singh Date: Wed, 13 Feb 2008 20:42:29 +0530 Message-Id: <20080213151229.7529.17894.sendpatchset@localhost.localdomain> In-Reply-To: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> Subject: [RFC] [PATCH 2/4] Add the soft limit interface Sender: owner-linux-mm@kvack.org Return-Path: To: linux-mm@kvack.org Cc: Hugh Dickins , Paul Menage , YAMAMOTO Takashi , Herbert Poetzl , Peter Zijlstra , Lee Schermerhorn , Nick Piggin , David Rientjes , Andrew Morton , Pavel Emelianov , Balbir Singh , Rik Van Riel , "Eric W. Biederman" , KAMEZAWA Hiroyuki List-ID: A new configuration file called soft_limit_in_bytes is added. The parsing and configuration rules remain the same as for the limit_in_bytes user interface. A global list of all memory cgroups over their soft limit is maintained. This list is then used to reclaim memory on global pressure. A cgroup is removed from the list when the cgroup is deleted. The global list is protected with a read-write spinlock. 
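
The list handling described above can be sketched in userspace as follows. This is only an illustration, not the patch itself: the demo_* names are hypothetical, the list is a minimal stand-in for the kernel's list_head, and locking is left out entirely (the real list is protected by a lock).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct demo_list {
	struct demo_list *prev, *next;
};

void demo_list_init(struct demo_list *l)
{
	l->prev = l->next = l;
}

bool demo_list_empty(const struct demo_list *l)
{
	return l->next == l;
}

void demo_list_add_tail(struct demo_list *node, struct demo_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

void demo_list_del_init(struct demo_list *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	demo_list_init(node);
}

/* Hypothetical stand-in for struct mem_cgroup. */
struct demo_group {
	unsigned long long usage;
	unsigned long long soft_limit;
	struct demo_list over_sl; /* on the global list while over the soft limit */
};

/*
 * Called after a charge: add the group to the global list once it is no
 * longer under its soft limit, unless it is already on the list.
 */
void demo_track_if_over(struct demo_group *g, struct demo_list *global)
{
	if (g->usage >= g->soft_limit && demo_list_empty(&g->over_sl))
		demo_list_add_tail(&g->over_sl, global);
}

/* Called when the group is deleted: drop it from the global list. */
void demo_untrack(struct demo_group *g)
{
	if (!demo_list_empty(&g->over_sl))
		demo_list_del_init(&g->over_sl);
}
```

An empty (self-pointing) node doubles as the "not on the list" marker, which is why the group's own node must be initialised before the first charge.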
Signed-off-by: Balbir Singh --- mm/memcontrol.c | 33 ++++++++++++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-) diff -puN mm/memcontrol.c~memory-controller-add-soft-limit-interface mm/memcontrol.c --- linux-2.6.24/mm/memcontrol.c~memory-controller-add-soft-limit-interface 2008-02-13 19:50:27.000000000 +0530 +++ linux-2.6.24-balbir/mm/memcontrol.c 2008-02-13 19:50:27.000000000 +0530 @@ -35,6 +35,10 @@ struct cgroup_subsys mem_cgroup_subsys; static const int MEM_CGROUP_RECLAIM_RETRIES = 5; +static spinlock_t mem_cgroup_sl_list_lock; /* spin lock that protects */ + /* the list of cgroups over*/ + /* their soft limit */ +static struct list_head mem_cgroup_sl_exceeded_list; /* * Statistics for memory cgroup. @@ -136,6 +140,10 @@ struct mem_cgroup { * statistics. */ struct mem_cgroup_stat stat; + /* + * List of all mem_cgroup's that exceed their soft limit + */ + struct list_head sl_exceeded_list; }; /* @@ -679,6 +687,18 @@ retry: goto retry; } + /* + * If we exceed our soft limit, we get added to the list of + * cgroups over their soft limit + */ + if (!res_counter_check_under_limit(&mem->res, RES_SOFT_LIMIT)) { + spin_lock_irqsave(&mem_cgroup_sl_list_lock, flags); + if (list_empty(&mem->sl_exceeded_list)) + list_add_tail(&mem->sl_exceeded_list, + &mem_cgroup_sl_exceeded_list); + spin_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags); + } + mz = page_cgroup_zoneinfo(pc); spin_lock_irqsave(&mz->lru_lock, flags); /* Update statistics vector */ @@ -736,13 +756,14 @@ void mem_cgroup_uncharge(struct page_cgr if (atomic_dec_and_test(&pc->ref_cnt)) { page = pc->page; mz = page_cgroup_zoneinfo(pc); + mem = pc->mem_cgroup; /* * get page->cgroup and clear it under lock. * force_empty can drop page->cgroup without checking refcnt. 
*/ unlock_page_cgroup(page); + if (clear_page_cgroup(page, pc) == pc) { - mem = pc->mem_cgroup; css_put(&mem->css); res_counter_uncharge(&mem->res, PAGE_SIZE); spin_lock_irqsave(&mz->lru_lock, flags); @@ -1046,6 +1067,12 @@ static struct cftype mem_cgroup_files[] .name = "stat", .open = mem_control_stat_open, }, + { + .name = "soft_limit_in_bytes", + .private = RES_SOFT_LIMIT, + .write = mem_cgroup_write, + .read = mem_cgroup_read, + }, }; static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *mem, int node) @@ -1097,6 +1124,9 @@ mem_cgroup_create(struct cgroup_subsys * if (unlikely((cont->parent) == NULL)) { mem = &init_mem_cgroup; init_mm.mem_cgroup = mem; + INIT_LIST_HEAD(&mem->sl_exceeded_list); + spin_lock_init(&mem_cgroup_sl_list_lock); + INIT_LIST_HEAD(&mem_cgroup_sl_exceeded_list); } else mem = kzalloc(sizeof(struct mem_cgroup), GFP_KERNEL); @@ -1104,6 +1134,7 @@ mem_cgroup_create(struct cgroup_subsys * return NULL; res_counter_init(&mem->res); + INIT_LIST_HEAD(&mem->sl_exceeded_list); memset(&mem->info, 0, sizeof(mem->info)); diff -puN include/linux/memcontrol.h~memory-controller-add-soft-limit-interface include/linux/memcontrol.h _ -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d03relay03.boulder.ibm.com (d03relay03.boulder.ibm.com [9.17.195.228]) by e4.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id m1DFFKuS005457 for ; Wed, 13 Feb 2008 10:15:20 -0500 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay03.boulder.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1DFFJwY137142 for ; Wed, 13 Feb 2008 08:15:19 -0700 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1DFFGrh014615 for ; Wed, 13 Feb 2008 08:15:19 -0700 From: Balbir Singh Date: Wed, 13 Feb 2008 20:42:14 +0530 Message-Id: <20080213151214.7529.3954.sendpatchset@localhost.localdomain> In-Reply-To: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> Subject: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support Sender: owner-linux-mm@kvack.org Return-Path: To: linux-mm@kvack.org Cc: Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Nick Piggin , "Eric W. Biederman" , David Rientjes , Andrew Morton , Pavel Emelianov , Balbir Singh , Rik Van Riel , Herbert Poetzl , KAMEZAWA Hiroyuki List-ID: The resource counter member limit is split into soft and hard limits. The same locking rules apply to both limits.
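
A rough userspace sketch of the split, for illustration only: the demo_* names are hypothetical, a pthread mutex stands in for the counter's spinlock, and only the hard limit is enforced at charge time, as in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct res_counter. */
struct demo_counter {
	unsigned long long usage;
	unsigned long long soft_limit; /* may be exceeded when there is no contention */
	unsigned long long hard_limit; /* usage can never exceed this */
	unsigned long long failcnt;    /* number of refused charges */
	pthread_mutex_t lock;          /* one lock covers both limits */
};

enum { DEMO_SOFT_LIMIT, DEMO_HARD_LIMIT };

void demo_counter_init(struct demo_counter *c)
{
	c->usage = 0;
	c->failcnt = 0;
	c->soft_limit = (unsigned long long)LLONG_MAX;
	c->hard_limit = (unsigned long long)LLONG_MAX;
	pthread_mutex_init(&c->lock, NULL);
}

/* A charge is refused only when it would cross the hard limit. */
int demo_counter_charge(struct demo_counter *c, unsigned long long val)
{
	int ret = 0;

	pthread_mutex_lock(&c->lock);
	if (c->usage + val > c->hard_limit) {
		c->failcnt++;
		ret = -ENOMEM;
	} else {
		c->usage += val;
	}
	pthread_mutex_unlock(&c->lock);
	return ret;
}

/* Check usage against either limit under the same lock. */
bool demo_counter_under_limit(struct demo_counter *c, int member)
{
	bool ret;

	pthread_mutex_lock(&c->lock);
	if (member == DEMO_SOFT_LIMIT)
		ret = c->usage < c->soft_limit;
	else
		ret = c->usage < c->hard_limit;
	pthread_mutex_unlock(&c->lock);
	return ret;
}
```

With a soft limit of 100 and a hard limit of 200, a charge of 150 succeeds and leaves the counter over its soft limit but under its hard limit; a further charge of 100 is refused and only bumps the failure count.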
Signed-off-by: Balbir Singh --- include/linux/res_counter.h | 34 ++++++++++++++++++++++++++-------- kernel/res_counter.c | 11 +++++++---- mm/memcontrol.c | 10 +++++----- 3 files changed, 38 insertions(+), 17 deletions(-) diff -puN mm/vmscan.c~memory-controller-res_counters-soft-limit-setup mm/vmscan.c diff -puN mm/memcontrol.c~memory-controller-res_counters-soft-limit-setup mm/memcontrol.c --- linux-2.6.24/mm/memcontrol.c~memory-controller-res_counters-soft-limit-setup 2008-02-13 19:50:24.000000000 +0530 +++ linux-2.6.24-balbir/mm/memcontrol.c 2008-02-13 19:50:24.000000000 +0530 @@ -568,7 +568,7 @@ unsigned long mem_cgroup_isolate_pages(u * Charge the memory controller for page usage. * Return * 0 if the charge was successful - * < 0 if the cgroup is over its limit + * < 0 if the cgroup is over its hard limit */ static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm, gfp_t gfp_mask, enum charge_type ctype) @@ -632,7 +632,7 @@ retry: /* * If we created the page_cgroup, we should free it on exceeding - * the cgroup limit. + * the cgroup hard limit. */ while (res_counter_charge(&mem->res, PAGE_SIZE)) { if (!(gfp_mask & __GFP_WAIT)) @@ -645,10 +645,10 @@ retry: * try_to_free_mem_cgroup_pages() might not give us a full * picture of reclaim. Some pages are reclaimed and might be * moved to swap cache or just unmapped from the cgroup. 
- * Check the limit again to see if the reclaim reduced the + * Check the hard limit again to see if the reclaim reduced the * current usage of the cgroup before giving up */ - if (res_counter_check_under_limit(&mem->res)) + if (res_counter_check_under_limit(&mem->res, RES_HARD_LIMIT)) continue; if (!nr_retries--) { @@ -1028,7 +1028,7 @@ static struct cftype mem_cgroup_files[] }, { .name = "limit_in_bytes", - .private = RES_LIMIT, + .private = RES_HARD_LIMIT, .write = mem_cgroup_write, .read = mem_cgroup_read, }, diff -puN kernel/res_counter.c~memory-controller-res_counters-soft-limit-setup kernel/res_counter.c --- linux-2.6.24/kernel/res_counter.c~memory-controller-res_counters-soft-limit-setup 2008-02-13 19:50:24.000000000 +0530 +++ linux-2.6.24-balbir/kernel/res_counter.c 2008-02-13 19:50:24.000000000 +0530 @@ -16,12 +16,13 @@ void res_counter_init(struct res_counter *counter) { spin_lock_init(&counter->lock); - counter->limit = (unsigned long long)LLONG_MAX; + counter->soft_limit = (unsigned long long)LLONG_MAX; + counter->hard_limit = (unsigned long long)LLONG_MAX; } int res_counter_charge_locked(struct res_counter *counter, unsigned long val) { - if (counter->usage + val > counter->limit) { + if (counter->usage + val > counter->hard_limit) { counter->failcnt++; return -ENOMEM; } @@ -65,8 +66,10 @@ res_counter_member(struct res_counter *c switch (member) { case RES_USAGE: return &counter->usage; - case RES_LIMIT: - return &counter->limit; + case RES_SOFT_LIMIT: + return &counter->soft_limit; + case RES_HARD_LIMIT: + return &counter->hard_limit; case RES_FAILCNT: return &counter->failcnt; }; diff -puN include/linux/res_counter.h~memory-controller-res_counters-soft-limit-setup include/linux/res_counter.h --- linux-2.6.24/include/linux/res_counter.h~memory-controller-res_counters-soft-limit-setup 2008-02-13 19:50:24.000000000 +0530 +++ linux-2.6.24-balbir/include/linux/res_counter.h 2008-02-13 19:50:24.000000000 +0530 @@ -27,7 +27,13 @@ struct res_counter { /* * 
the limit that usage cannot exceed */ - unsigned long long limit; + unsigned long long hard_limit; + /* + * the limit that usage can exceed, but under memory + * pressure, we will reclaim back memory above the + * soft limit mark + */ + unsigned long long soft_limit; /* * the number of unsuccessful attempts to consume the resource */ @@ -64,7 +70,8 @@ ssize_t res_counter_write(struct res_cou enum { RES_USAGE, - RES_LIMIT, + RES_SOFT_LIMIT, + RES_HARD_LIMIT, RES_FAILCNT, }; @@ -101,11 +108,21 @@ int res_counter_charge(struct res_counte void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val); void res_counter_uncharge(struct res_counter *counter, unsigned long val); -static inline bool res_counter_limit_check_locked(struct res_counter *cnt) +static inline bool res_counter_limit_check_locked(struct res_counter *cnt, + int member) { - if (cnt->usage < cnt->limit) - return true; - + switch (member) { + case RES_HARD_LIMIT: + if (cnt->usage < cnt->hard_limit) + return true; + break; + case RES_SOFT_LIMIT: + if (cnt->usage < cnt->soft_limit) + return true; + break; + default: + BUG_ON(1); + } return false; } @@ -113,13 +130,14 @@ static inline bool res_counter_limit_che * Helper function to detect if the cgroup is within it's limit or * not. It's currently called from cgroup_rss_prepare() */ -static inline bool res_counter_check_under_limit(struct res_counter *cnt) +static inline bool res_counter_check_under_limit(struct res_counter *cnt, + int member) { bool ret; unsigned long flags; spin_lock_irqsave(&cnt->lock, flags); - ret = res_counter_limit_check_locked(cnt); + ret = res_counter_limit_check_locked(cnt, member); spin_unlock_irqrestore(&cnt->lock, flags); return ret; } diff -puN include/linux/memcontrol.h~memory-controller-res_counters-soft-limit-setup include/linux/memcontrol.h _ -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Wed, 13 Feb 2008 07:59:29 -0800 From: Randy Dunlap Subject: Re: [RFC] [PATCH 4/4] Add soft limit documentation Message-Id: <20080213075929.52a3ae05.randy.dunlap@oracle.com> In-Reply-To: <20080213151256.7529.59791.sendpatchset@localhost.localdomain> References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151256.7529.59791.sendpatchset@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Balbir Singh Cc: linux-mm@kvack.org, Hugh Dickins , Paul Menage , YAMAMOTO Takashi , Peter Zijlstra , Lee Schermerhorn , Herbert Poetzl , David Rientjes , Andrew Morton , Pavel Emelianov , Nick Piggin , Rik Van Riel , "Eric W. Biederman" , KAMEZAWA Hiroyuki List-ID: On Wed, 13 Feb 2008 20:42:56 +0530 Balbir Singh wrote: > > Add documentation for the soft limit feature. > > Signed-off-by: Balbir Singh > --- > > Documentation/controllers/memory.txt | 16 ++++++++++++++++ > 1 file changed, 16 insertions(+) > > diff -puN Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation Documentation/controllers/memory.txt > --- linux-2.6.24/Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation 2008-02-13 18:45:40.000000000 +0530 > +++ linux-2.6.24-balbir/Documentation/controllers/memory.txt 2008-02-13 18:49:58.000000000 +0530 > @@ -201,6 +201,22 @@ The memory.force_empty gives an interfac > > will drop all charges in cgroup. Currently, this is maintained for test. > > +The file memory.soft_limit_in_bytes allows users to set soft limits. A soft > +limit is set in a manner similar to limit. The limit feature described > +earlier is a hard limit, a group can never exceed it's hard limit. A soft ; [or: ". A group ..."] and s/it's/its/ > +limit on the other hand can be exceeded. 
A group will be shrunk back > +to it's soft limit, when there is memory pressure/contention. its [it's == it is] > + > +Ideally the soft limit should always be set to a value smaller than the > +hard limit. However, the code does not force the user to do so. The soft > +limit can be greater than the hard limit; then the soft limit has > +no meaning in that setup, since the group will alwasy be restrained to its always > +hard limit. > + > +Example setting of soft limit > + > +# echo -n 100M > memory.soft_limit_in_bytes > + > 4. Testing --- ~Randy -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d28relay02.in.ibm.com (d28relay02.in.ibm.com [9.184.220.59]) by e28smtp01.in.ibm.com (8.13.1/8.13.1) with ESMTP id m1DGBiHr002673 for ; Wed, 13 Feb 2008 21:41:44 +0530 Received: from d28av05.in.ibm.com (d28av05.in.ibm.com [9.184.220.67]) by d28relay02.in.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1DGBiwI839738 for ; Wed, 13 Feb 2008 21:41:44 +0530 Received: from d28av05.in.ibm.com (loopback [127.0.0.1]) by d28av05.in.ibm.com (8.13.1/8.13.3) with ESMTP id m1DGBhxw028415 for ; Wed, 13 Feb 2008 16:11:44 GMT Message-ID: <47B3161A.9090504@linux.vnet.ibm.com> Date: Wed, 13 Feb 2008 21:38:58 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 4/4] Add soft limit documentation References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151256.7529.59791.sendpatchset@localhost.localdomain> <20080213075929.52a3ae05.randy.dunlap@oracle.com> In-Reply-To: <20080213075929.52a3ae05.randy.dunlap@oracle.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Randy Dunlap Cc: linux-mm@kvack.org, Hugh Dickins , Paul Menage , YAMAMOTO Takashi , Peter 
Zijlstra , Lee Schermerhorn , Herbert Poetzl , David Rientjes , Andrew Morton , Pavel Emelianov , Nick Piggin , Rik Van Riel , "Eric W. Biederman" , KAMEZAWA Hiroyuki List-ID: Randy Dunlap wrote: > On Wed, 13 Feb 2008 20:42:56 +0530 Balbir Singh wrote: > >> Add documentation for the soft limit feature. >> >> Signed-off-by: Balbir Singh >> --- >> >> Documentation/controllers/memory.txt | 16 ++++++++++++++++ >> 1 file changed, 16 insertions(+) >> >> diff -puN Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation Documentation/controllers/memory.txt >> --- linux-2.6.24/Documentation/controllers/memory.txt~memory-controller-add-soft-limit-documentation 2008-02-13 18:45:40.000000000 +0530 >> +++ linux-2.6.24-balbir/Documentation/controllers/memory.txt 2008-02-13 18:49:58.000000000 +0530 >> @@ -201,6 +201,22 @@ The memory.force_empty gives an interfac >> >> will drop all charges in cgroup. Currently, this is maintained for test. >> >> +The file memory.soft_limit_in_bytes allows users to set soft limits. A soft >> +limit is set in a manner similar to limit. The limit feature described >> +earlier is a hard limit, a group can never exceed it's hard limit. A soft > > ; [or: ". A group ..."] Will do > and s/it's/its/ > Thanks, I seem to use it's instead of its at times. I'll double check next time >> +limit on the other hand can be exceeded. A group will be shrunk back >> +to it's soft limit, when there is memory pressure/contention. > > its [it's == it is] > >> + >> +Ideally the soft limit should always be set to a value smaller than the >> +hard limit. However, the code does not force the user to do so. The soft >> +limit can be greater than the hard limit; then the soft limit has >> +no meaning in that setup, since the group will alwasy be restrained to its > > always > Will fix >> +hard limit. >> + >> +Example setting of soft limit >> + >> +# echo -n 100M > memory.soft_limit_in_bytes >> + >> 4. 
Testing > Thanks for helping us keep the documentation readable. -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <47B324F4.1050102@openvz.org> Date: Wed, 13 Feb 2008 20:12:20 +0300 From: Pavel Emelyanov MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151214.7529.3954.sendpatchset@localhost.localdomain> In-Reply-To: <20080213151214.7529.3954.sendpatchset@localhost.localdomain> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Balbir Singh Cc: linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Nick Piggin , "Eric W. Biederman" , David Rientjes , Andrew Morton , Rik Van Riel , Herbert Poetzl , KAMEZAWA Hiroyuki List-ID: Balbir Singh wrote: > The resource counter member limit is split into soft and hard limits. > The same locking rule apply for both limits. 
> > Signed-off-by: Balbir Singh > --- > > include/linux/res_counter.h | 34 ++++++++++++++++++++++++++-------- > kernel/res_counter.c | 11 +++++++---- > mm/memcontrol.c | 10 +++++----- > 3 files changed, 38 insertions(+), 17 deletions(-) > > diff -puN mm/vmscan.c~memory-controller-res_counters-soft-limit-setup mm/vmscan.c > diff -puN mm/memcontrol.c~memory-controller-res_counters-soft-limit-setup mm/memcontrol.c > --- linux-2.6.24/mm/memcontrol.c~memory-controller-res_counters-soft-limit-setup 2008-02-13 19:50:24.000000000 +0530 > +++ linux-2.6.24-balbir/mm/memcontrol.c 2008-02-13 19:50:24.000000000 +0530 > @@ -568,7 +568,7 @@ unsigned long mem_cgroup_isolate_pages(u > * Charge the memory controller for page usage. > * Return > * 0 if the charge was successful > - * < 0 if the cgroup is over its limit > + * < 0 if the cgroup is over its hard limit > */ > static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm, > gfp_t gfp_mask, enum charge_type ctype) > @@ -632,7 +632,7 @@ retry: > > /* > * If we created the page_cgroup, we should free it on exceeding > - * the cgroup limit. > + * the cgroup hard limit. > */ > while (res_counter_charge(&mem->res, PAGE_SIZE)) { > if (!(gfp_mask & __GFP_WAIT)) > @@ -645,10 +645,10 @@ retry: > * try_to_free_mem_cgroup_pages() might not give us a full > * picture of reclaim. Some pages are reclaimed and might be > * moved to swap cache or just unmapped from the cgroup. 
> - * Check the limit again to see if the reclaim reduced the > + * Check the hard limit again to see if the reclaim reduced the > * current usage of the cgroup before giving up > */ > - if (res_counter_check_under_limit(&mem->res)) > + if (res_counter_check_under_limit(&mem->res, RES_HARD_LIMIT)) > continue; > > if (!nr_retries--) { > @@ -1028,7 +1028,7 @@ static struct cftype mem_cgroup_files[] > }, > { > .name = "limit_in_bytes", > - .private = RES_LIMIT, > + .private = RES_HARD_LIMIT, > .write = mem_cgroup_write, > .read = mem_cgroup_read, > }, > diff -puN kernel/res_counter.c~memory-controller-res_counters-soft-limit-setup kernel/res_counter.c > --- linux-2.6.24/kernel/res_counter.c~memory-controller-res_counters-soft-limit-setup 2008-02-13 19:50:24.000000000 +0530 > +++ linux-2.6.24-balbir/kernel/res_counter.c 2008-02-13 19:50:24.000000000 +0530 > @@ -16,12 +16,13 @@ > void res_counter_init(struct res_counter *counter) > { > spin_lock_init(&counter->lock); > - counter->limit = (unsigned long long)LLONG_MAX; > + counter->soft_limit = (unsigned long long)LLONG_MAX; > + counter->hard_limit = (unsigned long long)LLONG_MAX; > } > > int res_counter_charge_locked(struct res_counter *counter, unsigned long val) > { > - if (counter->usage + val > counter->limit) { > + if (counter->usage + val > counter->hard_limit) { > counter->failcnt++; > return -ENOMEM; > } > @@ -65,8 +66,10 @@ res_counter_member(struct res_counter *c > switch (member) { > case RES_USAGE: > return &counter->usage; > - case RES_LIMIT: > - return &counter->limit; > + case RES_SOFT_LIMIT: > + return &counter->soft_limit; > + case RES_HARD_LIMIT: > + return &counter->hard_limit; > case RES_FAILCNT: > return &counter->failcnt; > }; > diff -puN include/linux/res_counter.h~memory-controller-res_counters-soft-limit-setup include/linux/res_counter.h > --- linux-2.6.24/include/linux/res_counter.h~memory-controller-res_counters-soft-limit-setup 2008-02-13 19:50:24.000000000 +0530 > +++ 
linux-2.6.24-balbir/include/linux/res_counter.h 2008-02-13 19:50:24.000000000 +0530 > @@ -27,7 +27,13 @@ struct res_counter { > /* > * the limit that usage cannot exceed > */ > - unsigned long long limit; > + unsigned long long hard_limit; > + /* > + * the limit that usage can exceed, but under memory > + * pressure, we will reclaim back memory above the > + * soft limit mark > + */ Resource counter accounts for arbitrary resource. Memory pressure and memory reclamation both only make sense in case we're dealing with memory controller. Please, remove this comment or move it to memcontrol.c. > + unsigned long long soft_limit; > /* > * the number of unsuccessful attempts to consume the resource > */ > @@ -64,7 +70,8 @@ ssize_t res_counter_write(struct res_cou > > enum { > RES_USAGE, > - RES_LIMIT, > + RES_SOFT_LIMIT, > + RES_HARD_LIMIT, > RES_FAILCNT, > }; > > @@ -101,11 +108,21 @@ int res_counter_charge(struct res_counte > void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val); > void res_counter_uncharge(struct res_counter *counter, unsigned long val); > > -static inline bool res_counter_limit_check_locked(struct res_counter *cnt) > +static inline bool res_counter_limit_check_locked(struct res_counter *cnt, > + int member) > { > - if (cnt->usage < cnt->limit) > - return true; > - > + switch (member) { > + case RES_HARD_LIMIT: > + if (cnt->usage < cnt->hard_limit) > + return true; > + break; > + case RES_SOFT_LIMIT: > + if (cnt->usage < cnt->soft_limit) > + return true; > + break; > + default: > + BUG_ON(1); > + } Does the compiler optimize this when the member is a built in const? > return false; > } > > @@ -113,13 +130,14 @@ static inline bool res_counter_limit_che > * Helper function to detect if the cgroup is within it's limit or > * not. 
It's currently called from cgroup_rss_prepare() > */ > -static inline bool res_counter_check_under_limit(struct res_counter *cnt) > +static inline bool res_counter_check_under_limit(struct res_counter *cnt, > + int member) > { > bool ret; > unsigned long flags; > > spin_lock_irqsave(&cnt->lock, flags); > - ret = res_counter_limit_check_locked(cnt); > + ret = res_counter_limit_check_locked(cnt, member); > spin_unlock_irqrestore(&cnt->lock, flags); > return ret; > } > diff -puN include/linux/memcontrol.h~memory-controller-res_counters-soft-limit-setup include/linux/memcontrol.h > _ > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d28relay04.in.ibm.com (d28relay04.in.ibm.com [9.184.220.61]) by e28esmtp04.in.ibm.com (8.13.1/8.13.1) with ESMTP id m1DHMgcX004048 for ; Wed, 13 Feb 2008 22:52:42 +0530 Received: from d28av03.in.ibm.com (d28av03.in.ibm.com [9.184.220.65]) by d28relay04.in.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1DHMgQG712836 for ; Wed, 13 Feb 2008 22:52:42 +0530 Received: from d28av03.in.ibm.com (loopback [127.0.0.1]) by d28av03.in.ibm.com (8.13.1/8.13.3) with ESMTP id m1DHMf58021330 for ; Wed, 13 Feb 2008 17:22:42 GMT Message-ID: <47B326BA.7040000@linux.vnet.ibm.com> Date: Wed, 13 Feb 2008 22:49:54 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151214.7529.3954.sendpatchset@localhost.localdomain> <47B324F4.1050102@openvz.org> In-Reply-To: <47B324F4.1050102@openvz.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Pavel Emelyanov Cc: linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , 
YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Nick Piggin , "Eric W. Biederman" , David Rientjes , Andrew Morton , Rik Van Riel , Herbert Poetzl , KAMEZAWA Hiroyuki List-ID: Pavel Emelyanov wrote: > Balbir Singh wrote: > Resource counter accounts for arbitrary resource. Memory pressure > and memory reclamation both only make sense in case we're dealing > with memory controller. Please, remove this comment or move it to > memcontrol.c. > Yes, they always have. The concept of soft limits, hard limits, guarantees applies to all resources. Why do you say they apply only to memory controller? I can change the comment to make the definition generic for all resources. -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <47B32B27.4090406@openvz.org> Date: Wed, 13 Feb 2008 20:38:47 +0300 From: Pavel Emelyanov MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151214.7529.3954.sendpatchset@localhost.localdomain> <47B324F4.1050102@openvz.org> <47B326BA.7040000@linux.vnet.ibm.com> In-Reply-To: <47B326BA.7040000@linux.vnet.ibm.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: balbir@linux.vnet.ibm.com Cc: linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Nick Piggin , "Eric W. Biederman" , David Rientjes , Andrew Morton , Rik Van Riel , Herbert Poetzl , KAMEZAWA Hiroyuki List-ID: Balbir Singh wrote: > Pavel Emelyanov wrote: >> Balbir Singh wrote: > >> Resource counter accounts for arbitrary resource. 
Memory pressure >> and memory reclamation both only make sense in case we're dealing >> with memory controller. Please, remove this comment or move it to >> memcontrol.c. >> > > Yes, they always have. The concept of soft limits, hard limits, guarantees > applies to all resources. Why do you say they apply only to memory controller? I I said that *memory pressure and memory reclamation*, not the soft limits in general, applies to memory controller only :) > can change the comment to make the definition generic for all resources. > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d28relay04.in.ibm.com (d28relay04.in.ibm.com [9.184.220.61]) by e28esmtp07.in.ibm.com (8.13.1/8.13.1) with ESMTP id m1DHvHer011615 for ; Wed, 13 Feb 2008 23:27:17 +0530 Received: from d28av01.in.ibm.com (d28av01.in.ibm.com [9.184.220.63]) by d28relay04.in.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1DHvHgB905380 for ; Wed, 13 Feb 2008 23:27:17 +0530 Received: from d28av01.in.ibm.com (loopback [127.0.0.1]) by d28av01.in.ibm.com (8.13.1/8.13.3) with ESMTP id m1DHvLnc019455 for ; Wed, 13 Feb 2008 17:57:22 GMT Message-ID: <47B32ED5.8030209@linux.vnet.ibm.com> Date: Wed, 13 Feb 2008 23:24:29 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 1/4] Modify resource counters to add soft limit support References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151214.7529.3954.sendpatchset@localhost.localdomain> <47B324F4.1050102@openvz.org> <47B326BA.7040000@linux.vnet.ibm.com> <47B32B27.4090406@openvz.org> In-Reply-To: <47B32B27.4090406@openvz.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Pavel Emelyanov Cc: linux-mm@kvack.org, Hugh Dickins , 
Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Nick Piggin , "Eric W. Biederman" , David Rientjes , Andrew Morton , Rik Van Riel , Herbert Poetzl , KAMEZAWA Hiroyuki
List-ID:

Pavel Emelyanov wrote:
> Balbir Singh wrote:
>> Pavel Emelyanov wrote:
>>> Balbir Singh wrote:
>>> Resource counter accounts for arbitrary resource. Memory pressure
>>> and memory reclamation both only make sense in case we're dealing
>>> with memory controller. Please, remove this comment or move it to
>>> memcontrol.c.
>>>
>> Yes, they always have. The concept of soft limits, hard limits, guarantees
>> applies to all resources. Why do you say they apply only to memory controller? I
>
> I said that *memory pressure and memory reclamation*, not the soft limits
> in general, applies to memory controller only :)
>

I suspected that; that's why I asked if I should change the comment to make it
generic :) I'll make that change in the next revision of the patches.

>> can change the comment to make the definition generic for all resources.
>>

--
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
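
[Editor's sketch, not part of the thread] The two-limit counter discussed in
this sub-thread can be modelled in userspace roughly as follows. This is a
minimal sketch, assuming the field and enum names from the quoted patch; the
spinlock is left out, and the soft-limit comment is worded generically, as
requested in the review. It is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace model only; names follow the patch under review. */
enum res_member { RES_USAGE, RES_SOFT_LIMIT, RES_HARD_LIMIT, RES_FAILCNT };

struct res_counter {
	unsigned long long usage;
	/* the limit that usage cannot exceed */
	unsigned long long hard_limit;
	/*
	 * the limit that usage may exceed; how to react when it is
	 * exceeded is left to the controller that owns the counter
	 */
	unsigned long long soft_limit;
	/* the number of unsuccessful attempts to consume the resource */
	unsigned long long failcnt;
};

static void res_counter_init(struct res_counter *c)
{
	c->usage = 0;
	c->failcnt = 0;
	c->soft_limit = (unsigned long long)LLONG_MAX;
	c->hard_limit = (unsigned long long)LLONG_MAX;
}

/* Only the hard limit can refuse a charge. */
static int res_counter_charge(struct res_counter *c, unsigned long long val)
{
	if (c->usage + val > c->hard_limit) {
		c->failcnt++;
		return -ENOMEM;
	}
	c->usage += val;
	return 0;
}

/* Returns true if usage is strictly below the selected limit. */
static bool res_counter_check_under_limit(const struct res_counter *c,
					  enum res_member member)
{
	switch (member) {
	case RES_HARD_LIMIT:
		return c->usage < c->hard_limit;
	case RES_SOFT_LIMIT:
		return c->usage < c->soft_limit;
	default:
		assert(!"not a limit member");	/* stands in for BUG_ON(1) */
		return false;
	}
}
```

With a soft limit of 100 and a hard limit of 200, a charge of 150 succeeds and
leaves the counter over its soft limit but under its hard limit; a further
charge of 100 is refused and bumps failcnt.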
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 14 Feb 2008 16:30:54 +0900
From: KAMEZAWA Hiroyuki
Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
Message-Id: <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20080213151242.7529.79924.sendpatchset@localhost.localdomain>
References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Balbir Singh
Cc: linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton
List-ID:

On Wed, 13 Feb 2008 20:42:42 +0530
Balbir Singh wrote:

>
> +	read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
> +	while (!list_empty(&mem_cgroup_sl_exceeded_list)) {
> +		mem = list_first_entry(&mem_cgroup_sl_exceeded_list,
> +				struct mem_cgroup, sl_exceeded_list);
> +		list_move(&mem->sl_exceeded_list, &reclaimed_groups);
> +		read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
> +
> +		nr_bytes_over_sl = res_counter_sl_excess(&mem->res);
> +		if (nr_bytes_over_sl <= 0)
> +			goto next;
> +		nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT);
> +		ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages,
> +							zones);
> +next:
> +		read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);

Hmm...
This is triggered by page allocation failure (fast path) in alloc_pages()
after try_to_free_pages(). Then, which pages should be reclaimed depends
on zones[]. Because nr_bytes_over_sl is counted globally, the cgroup's
pages may not be included in zones[].

And I think it's a big workload to reclaim all excess pages at once.

How about just reclaiming a small # of pages ? like
==
	if (nr_bytes_over_sl <= 0)
		goto next;
	nr_pages = SWAP_CLUSTER_MAX;
==

Regards,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ . Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from sd0109e.au.ibm.com (d23rh905.au.ibm.com [202.81.18.225]) by e23smtp06.au.ibm.com (8.13.1/8.13.1) with ESMTP id m1E7hNPi022590 for ; Thu, 14 Feb 2008 18:43:23 +1100
Received: from d23av01.au.ibm.com (d23av01.au.ibm.com [9.190.234.96]) by sd0109e.au.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1E7lB8r242296 for ; Thu, 14 Feb 2008 18:47:11 +1100
Received: from d23av01.au.ibm.com (loopback [127.0.0.1]) by d23av01.au.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1E7hW80030214 for ; Thu, 14 Feb 2008 18:43:33 +1100
Message-ID: <47B3F073.1070804@linux.vnet.ibm.com>
Date: Thu, 14 Feb 2008 13:10:35 +0530
From: Balbir Singh
Reply-To: balbir@linux.vnet.ibm.com
MIME-Version: 1.0
Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Herbert Poetzl , "Eric W.
Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton
List-ID:

KAMEZAWA Hiroyuki wrote:
> On Wed, 13 Feb 2008 20:42:42 +0530
> Balbir Singh wrote:
>
>> +	read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
>> +	while (!list_empty(&mem_cgroup_sl_exceeded_list)) {
>> +		mem = list_first_entry(&mem_cgroup_sl_exceeded_list,
>> +				struct mem_cgroup, sl_exceeded_list);
>> +		list_move(&mem->sl_exceeded_list, &reclaimed_groups);
>> +		read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags);
>> +
>> +		nr_bytes_over_sl = res_counter_sl_excess(&mem->res);
>> +		if (nr_bytes_over_sl <= 0)
>> +			goto next;
>> +		nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT);
>> +		ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages,
>> +							zones);
>> +next:
>> +		read_lock_irqsave(&mem_cgroup_sl_list_lock, flags);
>
> Hmm...
> This is triggered by page allocation failure (fast path) in alloc_pages()
> after try_to_free_pages().

We trigger it prior to try_to_free_pages() in __alloc_pages().

> Then, what pages should be reclaimed is
> depends on zones[]. Because nr-bytes_over_sl is counted globally, cgroup's
> pages may not be included in zones[].
>

True, that is quite possible.

> And I think it's big workload to relclaim all excessed pages at once.
>
> How about just reclaiming small # of pages ? like
> ==
> 	if (nr_bytes_over_sl <= 0)
> 		goto next;
> 	nr_pages = SWAP_CLUSTER_MAX;

I thought about this, but wanted to push all groups that are over their soft
limit back to it quickly. I'll experiment with your suggestion and see
how the system behaves when we push back pages slowly. Thanks for the suggestion.

> ==
>
> Regards,
> -Kame

--
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 14 Feb 2008 17:42:36 +0900
From: KAMEZAWA Hiroyuki
Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
Message-Id: <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <47B3F073.1070804@linux.vnet.ibm.com>
References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: balbir@linux.vnet.ibm.com
Cc: linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton
List-ID:

On Thu, 14 Feb 2008 13:10:35 +0530
Balbir Singh wrote:

> > And I think it's big workload to relclaim all excessed pages at once.
> >
> > How about just reclaiming small # of pages ? like
> > ==
> > 	if (nr_bytes_over_sl <= 0)
> > 		goto next;
> > 	nr_pages = SWAP_CLUSTER_MAX;
>
> I thought about this, but wanted to push back all groups over their soft limit
> back to their soft limit quickly. I'll experiment with your suggestion and see
> how the system behaves when we push back pages slowly. Thanks for the suggestion.

My point is that an unlucky process may have to reclaim tons of pages even if
all it wants is just 1 page. That's not good, IMO.

Probably the background-reclaim patch will be able to help with this soft-limit
situation, if a daemon can know whether it should reclaim or not.

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
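
[Editor's sketch, not part of the thread] The two sizing policies debated above
can be compared with a small userspace model. It only computes how many pages a
single invocation would ask reclaim for; the PAGE_SHIFT and SWAP_CLUSTER_MAX
values are assumptions for illustration (4 KiB pages, a batch of 32), and the
function names are hypothetical.

```c
#define PAGE_SHIFT 12			/* assumed: 4 KiB pages */
#define SWAP_CLUSTER_MAX 32UL		/* assumed batch size for illustration */

/* Policy in the posted patch: ask for the whole excess in one go. */
static unsigned long pages_full_pushback(long long nr_bytes_over_sl)
{
	if (nr_bytes_over_sl <= 0)
		return 0;
	return (unsigned long)(nr_bytes_over_sl >> PAGE_SHIFT);
}

/* Policy suggested in review: ask for a fixed small batch per invocation. */
static unsigned long pages_batched_pushback(long long nr_bytes_over_sl)
{
	if (nr_bytes_over_sl <= 0)
		return 0;
	return SWAP_CLUSTER_MAX;
}
```

Under these assumptions, a group that is 64 MiB over its soft limit costs the
allocating task a request for 16384 pages with the first policy and 32 pages
with the second, which is the "unlucky process" concern in concrete numbers.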
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d28relay04.in.ibm.com (d28relay04.in.ibm.com [9.184.220.61]) by e28esmtp05.in.ibm.com (8.13.1/8.13.1) with ESMTP id m1E9JHp6028921 for ; Thu, 14 Feb 2008 14:49:17 +0530 Received: from d28av04.in.ibm.com (d28av04.in.ibm.com [9.184.220.66]) by d28relay04.in.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1E9JG6v1020020 for ; Thu, 14 Feb 2008 14:49:16 +0530 Received: from d28av04.in.ibm.com (loopback [127.0.0.1]) by d28av04.in.ibm.com (8.13.1/8.13.3) with ESMTP id m1E9JGww013081 for ; Thu, 14 Feb 2008 09:19:16 GMT Message-ID: <47B406E4.9060109@linux.vnet.ibm.com> Date: Thu, 14 Feb 2008 14:46:20 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> In-Reply-To: <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: KAMEZAWA Hiroyuki Cc: linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Paul Menage , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: KAMEZAWA Hiroyuki wrote: > On Thu, 14 Feb 2008 13:10:35 +0530 > Balbir Singh wrote: > >>> And I think it's big workload to relclaim all excessed pages at once. >>> >>> How about just reclaiming small # of pages ? 
like >>> == >>> if (nr_bytes_over_sl <= 0) >>> goto next; >>> nr_pages = SWAP_CLUSTER_MAX; >> I thought about this, but wanted to push back all groups over their soft limit >> back to their soft limit quickly. I'll experiment with your suggestion and see >> how the system behaves when we push back pages slowly. Thanks for the suggestion. > > My point is an unlucky process may have to reclaim tons of pages even if > what he wants is just 1 page. It's not good, IMO. > Yes, that makes sense. > Probably backgound-reclaim patch will be able to help this soft-limit situation, > if a daemon can know it should reclaim or not. > Yes, I agree. I might just need to schedule the daemon under memory pressure. > Thanks, > -Kame -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure In-Reply-To: Your message of "Wed, 13 Feb 2008 20:42:42 +0530" <20080213151242.7529.79924.sendpatchset@localhost.localdomain> References: <20080213151242.7529.79924.sendpatchset@localhost.localdomain> Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Message-Id: <20080214102758.D2CD91E3C58@siro.lan> Date: Thu, 14 Feb 2008 19:27:58 +0900 (JST) From: yamamoto@valinux.co.jp (YAMAMOTO Takashi) Sender: owner-linux-mm@kvack.org Return-Path: To: balbir@linux.vnet.ibm.com Cc: linux-mm@kvack.org, hugh@veritas.com, a.p.zijlstra@chello.nl, menage@google.com, Lee.Schermerhorn@hp.com, herbert@13thfloor.at, ebiederm@xmission.com, rientjes@google.com, xemul@openvz.org, nickpiggin@yahoo.com.au, riel@redhat.com, akpm@linux-foundation.org, kamezawa.hiroyu@jp.fujitsu.com List-ID: > +/* > + * Free all control groups, which are over their soft limit > + */ > +unsigned long 
mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones, > + gfp_t gfp_mask) > +{ > + struct mem_cgroup *mem; > + unsigned long nr_pages; > + long long nr_bytes_over_sl; > + unsigned long ret = 0; > + unsigned long flags; > + struct list_head reclaimed_groups; > > + INIT_LIST_HEAD(&reclaimed_groups); > + read_lock_irqsave(&mem_cgroup_sl_list_lock, flags); > + while (!list_empty(&mem_cgroup_sl_exceeded_list)) { > + mem = list_first_entry(&mem_cgroup_sl_exceeded_list, > + struct mem_cgroup, sl_exceeded_list); > + list_move(&mem->sl_exceeded_list, &reclaimed_groups); > + read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags); > + > + nr_bytes_over_sl = res_counter_sl_excess(&mem->res); > + if (nr_bytes_over_sl <= 0) > + goto next; > + nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT); > + ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages, > + zones); > +next: > + read_lock_irqsave(&mem_cgroup_sl_list_lock, flags); > + } what prevents the cgroup 'mem' from disappearing while we are dropping mem_cgroup_sl_list_lock? YAMAMOTO Takashi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from sd0109e.au.ibm.com (d23rh905.au.ibm.com [202.81.18.225]) by e23smtp06.au.ibm.com (8.13.1/8.13.1) with ESMTP id m1F3M4ai021295 for ; Fri, 15 Feb 2008 14:22:04 +1100 Received: from d23av02.au.ibm.com (d23av02.au.ibm.com [9.190.235.138]) by sd0109e.au.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1F3PqqB188924 for ; Fri, 15 Feb 2008 14:25:52 +1100 Received: from d23av02.au.ibm.com (loopback [127.0.0.1]) by d23av02.au.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1F3ME2B008430 for ; Fri, 15 Feb 2008 14:22:14 +1100 Message-ID: <47B504AF.90001@linux.vnet.ibm.com> Date: Fri, 15 Feb 2008 08:49:11 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure References: <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214102758.D2CD91E3C58@siro.lan> In-Reply-To: <20080214102758.D2CD91E3C58@siro.lan> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: YAMAMOTO Takashi Cc: linux-mm@kvack.org, hugh@veritas.com, a.p.zijlstra@chello.nl, menage@google.com, Lee.Schermerhorn@hp.com, herbert@13thfloor.at, ebiederm@xmission.com, rientjes@google.com, xemul@openvz.org, nickpiggin@yahoo.com.au, riel@redhat.com, akpm@linux-foundation.org, kamezawa.hiroyu@jp.fujitsu.com List-ID: YAMAMOTO Takashi wrote: >> +/* >> + * Free all control groups, which are over their soft limit >> + */ >> +unsigned long mem_cgroup_pushback_groups_over_soft_limit(struct zone **zones, >> + gfp_t gfp_mask) >> +{ >> + struct mem_cgroup *mem; >> + unsigned long nr_pages; >> + long long nr_bytes_over_sl; >> + unsigned long ret = 0; >> + unsigned long flags; >> + struct list_head reclaimed_groups; >> >> + INIT_LIST_HEAD(&reclaimed_groups); >> + read_lock_irqsave(&mem_cgroup_sl_list_lock, flags); >> + while 
(!list_empty(&mem_cgroup_sl_exceeded_list)) { >> + mem = list_first_entry(&mem_cgroup_sl_exceeded_list, >> + struct mem_cgroup, sl_exceeded_list); >> + list_move(&mem->sl_exceeded_list, &reclaimed_groups); >> + read_unlock_irqrestore(&mem_cgroup_sl_list_lock, flags); >> + >> + nr_bytes_over_sl = res_counter_sl_excess(&mem->res); >> + if (nr_bytes_over_sl <= 0) >> + goto next; >> + nr_pages = (nr_bytes_over_sl >> PAGE_SHIFT); >> + ret += try_to_free_mem_cgroup_pages(mem, gfp_mask, nr_pages, >> + zones); >> +next: >> + read_lock_irqsave(&mem_cgroup_sl_list_lock, flags); >> + } > > what prevents the cgroup 'mem' from disappearing while we are dropping > mem_cgroup_sl_list_lock? > I thought I had a css_get/put around it, but I don't. Thanks for catching the problem. > YAMAMOTO Takashi -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
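
[Editor's sketch, not part of the thread] The lifetime problem raised above is
usually handled by pinning the object before the list lock is dropped. The
following is a minimal single-threaded userspace model of that pattern, not the
kernel code: group_get()/group_put() stand in for css_get()/css_put(), an
integer flag stands in for mem_cgroup_sl_list_lock, the list is singly linked,
and reclaim_from() is a hypothetical helper that frees the whole excess.

```c
#include <assert.h>
#include <stddef.h>

struct group {
	int refcnt;		/* stands in for the css reference count */
	long excess_pages;	/* pages above the soft limit */
	struct group *next;	/* stand-in for sl_exceeded_list */
};

static int list_locked;		/* stand-in for mem_cgroup_sl_list_lock */

static void list_lock(void)   { assert(!list_locked); list_locked = 1; }
static void list_unlock(void) { assert(list_locked);  list_locked = 0; }

static void group_get(struct group *g) { g->refcnt++; }
static void group_put(struct group *g) { assert(g->refcnt > 0); g->refcnt--; }

/* Hypothetical reclaim step: frees everything above the soft limit. */
static long reclaim_from(struct group *g)
{
	long freed = g->excess_pages;

	assert(!list_locked);	/* reclaim may sleep, so the lock is dropped */
	assert(g->refcnt > 0);	/* the group must be pinned while unlocked */
	g->excess_pages = 0;
	return freed;
}

static long pushback_all(struct group **head)
{
	long total = 0;
	struct group *g;

	list_lock();
	while ((g = *head) != NULL) {
		*head = g->next;	/* detach from the "exceeded" list */
		g->next = NULL;
		group_get(g);		/* pin before dropping the lock */
		list_unlock();

		total += reclaim_from(g);

		group_put(g);		/* release the pin once reclaim is done */
		list_lock();
	}
	list_unlock();
	return total;
}
```

The point of the ordering is that the reference is taken while the lock still
protects the list entry, so the group cannot go away in the window where the
lock is not held.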
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from zps18.corp.google.com (zps18.corp.google.com [172.25.146.18]) by smtp-out.google.com with ESMTP id m1F4HRaD032410 for ; Fri, 15 Feb 2008 04:17:27 GMT
Received: from py-out-1112.google.com (pyha78.prod.google.com [10.34.228.78]) by zps18.corp.google.com with ESMTP id m1F4HQcV004879 for ; Thu, 14 Feb 2008 20:17:26 -0800
Received: by py-out-1112.google.com with SMTP id a78so687822pyh.32 for ; Thu, 14 Feb 2008 20:17:26 -0800 (PST)
Message-ID: <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com>
Date: Thu, 14 Feb 2008 20:17:25 -0800
From: "Paul Menage"
Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure
In-Reply-To: <47B406E4.9060109@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com>
Sender: owner-linux-mm@kvack.org
Return-Path:
To: balbir@linux.vnet.ibm.com
Cc: KAMEZAWA Hiroyuki , linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton
List-ID:

On Thu, Feb 14, 2008 at 1:16 AM, Balbir Singh wrote:
> > Probably backgound-reclaim patch will be able to help this soft-limit situation,
> > if a daemon can know it should reclaim or not.
> >
>
> Yes, I agree. I might just need to schedule the daemon under memory pressure.
>

Can we also have a way to trigger a one-off reclaim (of a configurable
magnitude) from userspace?
Having a background daemon doing it may be fine as a default, but there will be cases when a userspace machine manager knows better than the kernel how frequently/hard to try to reclaim on a given cgroup. Paul -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d23relay03.au.ibm.com (d23relay03.au.ibm.com [202.81.18.234]) by e23smtp05.au.ibm.com (8.13.1/8.13.1) with ESMTP id m1F4SDq8032078 for ; Fri, 15 Feb 2008 15:28:13 +1100 Received: from d23av02.au.ibm.com (d23av02.au.ibm.com [9.190.235.138]) by d23relay03.au.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1F4SQFw3584116 for ; Fri, 15 Feb 2008 15:28:27 +1100 Received: from d23av02.au.ibm.com (loopback [127.0.0.1]) by d23av02.au.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1F4SQT2030873 for ; Fri, 15 Feb 2008 15:28:26 +1100 Message-ID: <47B51430.4090009@linux.vnet.ibm.com> Date: Fri, 15 Feb 2008 09:55:20 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com> <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com> In-Reply-To: <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Paul Menage Cc: KAMEZAWA Hiroyuki , linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. 
Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: Paul Menage wrote: > On Thu, Feb 14, 2008 at 1:16 AM, Balbir Singh wrote: >> > Probably backgound-reclaim patch will be able to help this soft-limit situation, >> > if a daemon can know it should reclaim or not. >> > >> >> Yes, I agree. I might just need to schedule the daemon under memory pressure. >> > > Can we also have a way to trigger a one-off reclaim (of a configurable > magnitude) from userspace? Having a background daemon doing it may be > fine as a default, but there will be cases when a userspace machine > manager knows better than the kernel how frequently/hard to try to > reclaim on a given cgroup. > > Paul We have that capability, but we cannot specify how much to reclaim. There is a force_empty file that when written to, tries to reclaim all pages from the cgroup. Depending on the need, it can be extended so that the number of pages to be reclaimed can be specified. -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Fri, 15 Feb 2008 14:07:32 +0900 From: KAMEZAWA Hiroyuki Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Message-Id: <20080215140732.8b2dc04e.kamezawa.hiroyu@jp.fujitsu.com> In-Reply-To: <47B51430.4090009@linux.vnet.ibm.com> References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com> <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com> <47B51430.4090009@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: balbir@linux.vnet.ibm.com Cc: Paul Menage , linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: On Fri, 15 Feb 2008 09:55:20 +0530 Balbir Singh wrote: > Paul Menage wrote: > > On Thu, Feb 14, 2008 at 1:16 AM, Balbir Singh wrote: > >> > Probably backgound-reclaim patch will be able to help this soft-limit situation, > >> > if a daemon can know it should reclaim or not. > >> > > >> > >> Yes, I agree. I might just need to schedule the daemon under memory pressure. > >> > > > > Can we also have a way to trigger a one-off reclaim (of a configurable > > magnitude) from userspace? Having a background daemon doing it may be > > fine as a default, but there will be cases when a userspace machine > > manager knows better than the kernel how frequently/hard to try to > > reclaim on a given cgroup. > > > > Paul > > We have that capability, but we cannot specify how much to reclaim. 
> There is a force_empty file that when written to, tries to reclaim all pages > from the cgroup. Depending on the need, it can be extended so that the number of > pages to be reclaimed can be specified. > Note: at present, force_empty doesn't try to free memory; it just drops charges. We can free memory simply by setting memory.limit to a smaller number. (This may cause OOM. If we added high-low watermarks, lowering memory.high could work well for freeing memory to some extent.) Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from zps75.corp.google.com (zps75.corp.google.com [172.25.146.75]) by smtp-out.google.com with ESMTP id m1F5GnKP003123 for ; Fri, 15 Feb 2008 05:16:50 GMT Received: from py-out-1112.google.com (pybp76.prod.google.com [10.34.92.76]) by zps75.corp.google.com with ESMTP id m1F5GH9F009536 for ; Thu, 14 Feb 2008 21:16:49 -0800 Received: by py-out-1112.google.com with SMTP id p76so917303pyb.2 for ; Thu, 14 Feb 2008 21:16:49 -0800 (PST) Message-ID: <6599ad830802142116r1c942d78y7002d90c2690a498@mail.gmail.com> Date: Thu, 14 Feb 2008 21:16:48 -0800 From: "Paul Menage" Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure In-Reply-To: <20080215140732.8b2dc04e.kamezawa.hiroyu@jp.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com> <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com>
<47B51430.4090009@linux.vnet.ibm.com> <20080215140732.8b2dc04e.kamezawa.hiroyu@jp.fujitsu.com> Sender: owner-linux-mm@kvack.org Return-Path: To: KAMEZAWA Hiroyuki Cc: balbir@linux.vnet.ibm.com, linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki wrote: > We can free memory by just making memory.limit to smaller number. > (This may cause OOM. If we added high-low watermark, making memory.high smaller > can works well for memory freeing to some extent.) > What about if we want to apply memory pressure to a cgroup to push out unused memory, but not push out memory that it's actively using? Paul -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d28relay02.in.ibm.com (d28relay02.in.ibm.com [9.184.220.59]) by e28esmtp02.in.ibm.com (8.13.1/8.13.1) with ESMTP id m1F5Lkue014694 for ; Fri, 15 Feb 2008 10:51:46 +0530 Received: from d28av01.in.ibm.com (d28av01.in.ibm.com [9.184.220.63]) by d28relay02.in.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1F5Lk6r1073262 for ; Fri, 15 Feb 2008 10:51:46 +0530 Received: from d28av01.in.ibm.com (loopback [127.0.0.1]) by d28av01.in.ibm.com (8.13.1/8.13.3) with ESMTP id m1F5LokE032224 for ; Fri, 15 Feb 2008 05:21:50 GMT Message-ID: <47B520B6.2020101@linux.vnet.ibm.com> Date: Fri, 15 Feb 2008 10:48:46 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> 
<20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com> <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com> <47B51430.4090009@linux.vnet.ibm.com> <20080215140732.8b2dc04e.kamezawa.hiroyu@jp.fujitsu.com> <6599ad830802142116r1c942d78y7002d90c2690a498@mail.gmail.com> In-Reply-To: <6599ad830802142116r1c942d78y7002d90c2690a498@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Paul Menage Cc: KAMEZAWA Hiroyuki , linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: Paul Menage wrote: > On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki > wrote: >> We can free memory by just making memory.limit to smaller number. >> (This may cause OOM. If we added high-low watermark, making memory.high smaller >> can works well for memory freeing to some extent.) >> > > What about if we want to apply memory pressure to a cgroup to push out > unused memory, but not push out memory that it's actively using? Both watermarks and reducing the limit will reclaim from the inactive list first. The reclaim logic is the same as that of the per zone LRU. It would be right to assume that both would push out unused memory first. Am I missing something? -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Fri, 15 Feb 2008 14:29:58 +0900 From: KAMEZAWA Hiroyuki Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Message-Id: <20080215142958.511a2732.kamezawa.hiroyu@jp.fujitsu.com> In-Reply-To: <6599ad830802142116r1c942d78y7002d90c2690a498@mail.gmail.com> References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com> <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com> <47B51430.4090009@linux.vnet.ibm.com> <20080215140732.8b2dc04e.kamezawa.hiroyu@jp.fujitsu.com> <6599ad830802142116r1c942d78y7002d90c2690a498@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: Paul Menage Cc: balbir@linux.vnet.ibm.com, linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: On Thu, 14 Feb 2008 21:16:48 -0800 "Paul Menage" wrote: > On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki > wrote: > > We can free memory by just making memory.limit to smaller number. > > (This may cause OOM. If we added high-low watermark, making memory.high smaller > > can works well for memory freeing to some extent.) > > > > What about if we want to apply memory pressure to a cgroup to push out > unused memory, but not push out memory that it's actively using? > Generally, the only way to avoid pageout is mlock(), because "actively used" is determined just by the reference bit, and heavy pressure can cause too much page scanning.
I hope that RvR's LRU improvements will make things better. Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Fri, 15 Feb 2008 14:33:09 +0900 From: KAMEZAWA Hiroyuki Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure Message-Id: <20080215143309.d8375918.kamezawa.hiroyu@jp.fujitsu.com> In-Reply-To: <47B520B6.2020101@linux.vnet.ibm.com> References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com> <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com> <47B51430.4090009@linux.vnet.ibm.com> <20080215140732.8b2dc04e.kamezawa.hiroyu@jp.fujitsu.com> <6599ad830802142116r1c942d78y7002d90c2690a498@mail.gmail.com> <47B520B6.2020101@linux.vnet.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: balbir@linux.vnet.ibm.com Cc: Paul Menage , linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: On Fri, 15 Feb 2008 10:48:46 +0530 Balbir Singh wrote: > Paul Menage wrote: > > On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki > > wrote: > >> We can free memory by just making memory.limit to smaller number. > >> (This may cause OOM. If we added high-low watermark, making memory.high smaller > >> can works well for memory freeing to some extent.)
> >> > > > > What about if we want to apply memory pressure to a cgroup to push out > > unused memory, but not push out memory that it's actively using? > > Both watermarks and reducing the limit will reclaim from the inactive list > first. The reclaim logic is the same as that of the per zone LRU. It would be > right to assume that both would push out unused memory first. Am I missing > something? > You are right to some extent. But if memory.limit is very small and there is heavy memory pressure, we have no chance. (For example, some program text used by a shell script can be paged out easily because it is not always mapped.) Thanks, -Kame -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from zps76.corp.google.com (zps76.corp.google.com [172.25.146.76]) by smtp-out.google.com with ESMTP id m1F5UMAn014598 for ; Fri, 15 Feb 2008 05:30:23 GMT Received: from py-out-1112.google.com (pyhb50.prod.google.com [10.34.229.50]) by zps76.corp.google.com with ESMTP id m1F5UK11023100 for ; Thu, 14 Feb 2008 21:30:21 -0800 Received: by py-out-1112.google.com with SMTP id b50so678450pyh.30 for ; Thu, 14 Feb 2008 21:30:20 -0800 (PST) Message-ID: <6599ad830802142130h529ecd59w8f9e4e761d4fe20c@mail.gmail.com> Date: Thu, 14 Feb 2008 21:30:19 -0800 From: "Paul Menage" Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure In-Reply-To: <47B520B6.2020101@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com>
<6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com> <47B51430.4090009@linux.vnet.ibm.com> <20080215140732.8b2dc04e.kamezawa.hiroyu@jp.fujitsu.com> <6599ad830802142116r1c942d78y7002d90c2690a498@mail.gmail.com> <47B520B6.2020101@linux.vnet.ibm.com> Sender: owner-linux-mm@kvack.org Return-Path: To: balbir@linux.vnet.ibm.com Cc: KAMEZAWA Hiroyuki , linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: On Thu, Feb 14, 2008 at 9:18 PM, Balbir Singh wrote: > > Paul Menage wrote: > > On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki > > wrote: > >> We can free memory by just making memory.limit to smaller number. > >> (This may cause OOM. If we added high-low watermark, making memory.high smaller > >> can works well for memory freeing to some extent.) > >> > > > > What about if we want to apply memory pressure to a cgroup to push out > > unused memory, but not push out memory that it's actively using? > > Both watermarks and reducing the limit will reclaim from the inactive list > first. The reclaim logic is the same as that of the per zone LRU. It would be > right to assume that both would push out unused memory first. Am I missing > something? > Doesn't the per-zone LRU logic try to keep the inactive list at a certain percentage of memory? In which case you can't really tell from the active/inactive stats for a cgroup how much of that memory it's really using. Paul -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from d23relay03.au.ibm.com (d23relay03.au.ibm.com [202.81.18.234]) by e23smtp05.au.ibm.com (8.13.1/8.13.1) with ESMTP id m1F6dNuI004878 for ; Fri, 15 Feb 2008 17:39:23 +1100 Received: from d23av03.au.ibm.com (d23av03.au.ibm.com [9.190.234.97]) by d23relay03.au.ibm.com (8.13.8/8.13.8/NCO v8.7) with ESMTP id m1F6dbuK2150478 for ; Fri, 15 Feb 2008 17:39:37 +1100 Received: from d23av03.au.ibm.com (loopback [127.0.0.1]) by d23av03.au.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m1F6ddSV018230 for ; Fri, 15 Feb 2008 17:39:40 +1100 Message-ID: <47B532F2.8010902@linux.vnet.ibm.com> Date: Fri, 15 Feb 2008 12:06:34 +0530 From: Balbir Singh Reply-To: balbir@linux.vnet.ibm.com MIME-Version: 1.0 Subject: Re: [RFC] [PATCH 3/4] Reclaim from groups over their soft limit under memory pressure References: <20080213151201.7529.53642.sendpatchset@localhost.localdomain> <20080213151242.7529.79924.sendpatchset@localhost.localdomain> <20080214163054.81deaf27.kamezawa.hiroyu@jp.fujitsu.com> <47B3F073.1070804@linux.vnet.ibm.com> <20080214174236.aa2aae9b.kamezawa.hiroyu@jp.fujitsu.com> <47B406E4.9060109@linux.vnet.ibm.com> <6599ad830802142017g7cdb1b9cid8bbc8cb97e2df68@mail.gmail.com> <47B51430.4090009@linux.vnet.ibm.com> <20080215140732.8b2dc04e.kamezawa.hiroyu@jp.fujitsu.com> <6599ad830802142116r1c942d78y7002d90c2690a498@mail.gmail.com> <20080215142958.511a2732.kamezawa.hiroyu@jp.fujitsu.com> In-Reply-To: <20080215142958.511a2732.kamezawa.hiroyu@jp.fujitsu.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org Return-Path: To: KAMEZAWA Hiroyuki Cc: Paul Menage , linux-mm@kvack.org, Hugh Dickins , Peter Zijlstra , YAMAMOTO Takashi , Lee Schermerhorn , Herbert Poetzl , "Eric W. 
Biederman" , David Rientjes , Pavel Emelianov , Nick Piggin , Rik Van Riel , Andrew Morton List-ID: KAMEZAWA Hiroyuki wrote: > On Thu, 14 Feb 2008 21:16:48 -0800 > "Paul Menage" wrote: > >> On Thu, Feb 14, 2008 at 9:07 PM, KAMEZAWA Hiroyuki >> wrote: >>> We can free memory by just making memory.limit to smaller number. >>> (This may cause OOM. If we added high-low watermark, making memory.high smaller >>> can works well for memory freeing to some extent.) >>> >> What about if we want to apply memory pressure to a cgroup to push out >> unused memory, but not push out memory that it's actively using? >> > Generally, only way to avoid pageout is mlock() because actively-used is just > determeined by reference-bit and heavy pressure can do page-scanning too much. > I hope that RvR's LRU improvement may change things better. There are two other controllers I plan to work on soon: the mlock() controller and the virtual memory limit controller. Hopefully they will fix the mlock() problem to some extent. -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org
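
Editor's sketch (not part of any message in this thread): the "make memory.limit smaller to free memory" idea discussed above could be driven from a userspace manager roughly as follows. This is only an illustration under stated assumptions. The control-file paths are passed in by the caller and are not taken from the patches; the helper names (read_int, write_int, squeeze, demo) are hypothetical; and whether the kernel reclaims at the moment the limit is written, on the next charge, or rejects the write depends on the kernel version. As noted in the thread, lowering the limit too far may cause OOM, hence the floor_bytes guard.

```python
import os
import tempfile


def read_int(path):
    """Read a single integer from a control file."""
    with open(path) as f:
        return int(f.read().strip())


def write_int(path, value):
    """Write a single integer to a control file."""
    with open(path, "w") as f:
        f.write(str(value))


def squeeze(limit_path, usage_path, reclaim_bytes, floor_bytes=0):
    """Temporarily lower the hard limit below current usage, then restore it.

    ASSUMPTION: both files hold one integer each (bytes); the paths are
    whatever the running kernel exposes for the cgroup's limit and usage.
    Returns (old_limit, temporary_limit).
    """
    old_limit = read_int(limit_path)
    usage = read_int(usage_path)
    # Never go below floor_bytes, to reduce the risk of triggering OOM.
    target = max(usage - reclaim_bytes, floor_bytes)
    if target >= old_limit:
        return old_limit, old_limit  # limit is already tight enough
    try:
        write_int(limit_path, target)
    finally:
        # Always put the original limit back, even if the write failed.
        write_int(limit_path, old_limit)
    return old_limit, target


def demo():
    """Exercise squeeze() on ordinary temp files standing in for cgroup files."""
    with tempfile.TemporaryDirectory() as d:
        limit = os.path.join(d, "limit")
        usage = os.path.join(d, "usage")
        write_int(limit, 1000)
        write_int(usage, 800)
        result = squeeze(limit, usage, reclaim_bytes=300)
        return result, read_int(limit)
```

Note that this applies pressure through the normal reclaim path, so it does not by itself answer Paul's question about sparing actively used memory; it only bounds how far the limit is pushed down.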