From mboxrd@z Thu Jan 1 00:00:00 1970
From: Balbir Singh
Subject: Re: [RFC][-mm] Memory controller hierarchy support (v1)
Date: Sat, 19 Apr 2008 14:04:00 +0530
Message-ID: <4809AE78.9030000@linux.vnet.ibm.com>
References: <20080419053551.10501.44302.sendpatchset@localhost.localdomain> <20080419065624.9837E5A15@siro.lan>
Reply-To: balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20080419065624.9837E5A15-Pcsii4f/SVk@public.gmane.org>
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: YAMAMOTO Takashi
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org
List-Id: containers.vger.kernel.org

YAMAMOTO Takashi wrote:
>> -int res_counter_charge(struct res_counter *counter, unsigned long val)
>> +int res_counter_charge(struct res_counter *counter, unsigned long val,
>> +				struct res_counter **limit_exceeded_at)
>>  {
>>  	int ret;
>>  	unsigned long flags;
>> +	struct res_counter *c, *unroll_c;
>>
>> -	spin_lock_irqsave(&counter->lock, flags);
>> -	ret = res_counter_charge_locked(counter, val);
>> -	spin_unlock_irqrestore(&counter->lock, flags);
>> +	*limit_exceeded_at = NULL;
>> +	local_irq_save(flags);
>> +	for (c = counter; c != NULL; c = c->parent) {
>> +		spin_lock(&c->lock);
>> +		ret = res_counter_charge_locked(c, val);
>> +		spin_unlock(&c->lock);
>> +		if (ret < 0) {
>> +			*limit_exceeded_at = c;
>> +			goto unroll;
>> +		}
>> +	}
>> +	local_irq_restore(flags);
>> +	return 0;
>> +
>> +unroll:
>> +	for (unroll_c = counter; unroll_c != c; unroll_c = unroll_c->parent) {
>> +		spin_lock(&unroll_c->lock);
>> +		res_counter_uncharge_locked(unroll_c, val);
>> +		spin_unlock(&unroll_c->lock);
>> +	}
>> +	local_irq_restore(flags);
>>  	return ret;
>>  }
>
> i wonder how much performance impacts this involves.
>
> it increases the number of atomic ops per charge/uncharge and
> makes the common case (success) of every charge/uncharge in a system
> touch a global (ie. root cgroup's) cachelines.
>

Yes, it does. I'll run some tests to see what the overhead looks like. The
multi-hierarchy feature is very useful though, and one of the TODOs is to make
the feature user selectable (possibly at run-time).

>> +	/*
>> +	 * Ideally we need to hold cgroup_mutex here
>> +	 */
>> +	list_for_each_entry_safe_from(cgroup, cgrp,
>> +			&curr_cgroup->children, sibling) {
>> +		struct mem_cgroup *mem_child;
>> +
>> +		mem_child = mem_cgroup_from_cont(cgroup);
>> +		ret = try_to_free_mem_cgroup_pages(mem_child,
>> +				gfp_mask);
>> +		mem->last_scanned_child = mem_child;
>> +		if (ret == 0)
>> +			break;
>> +	}
>
> if i read it correctly, it makes us hit the last child again and again.
>

Hmm.. it should probably be set at the beginning of the loop. I'll retest.

> i think you want to reclaim from all cgroups under the curr_cgroup
> including eg. children's children.
>

Yes, good point. I should break out the function, so that we can work around
the recursion problem. Charging can cause further recursion, since we check for
last_counter.

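For reference, one way to reach "children's children" without recursing is an iterative pre-order walk over the subtree. The following is only a userspace sketch under stated assumptions: `struct group`, its `first_child`/`next_sibling`/`reclaimable` fields, `next_descendant()` and `reclaim_subtree()` are all hypothetical names made up for illustration, not the cgroup or memory controller API, and no locking (cgroup_mutex or otherwise) is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup node; not the kernel's struct cgroup. */
struct group {
	struct group *parent;
	struct group *first_child;
	struct group *next_sibling;
	unsigned long reclaimable;	/* made-up count of pages we could free */
};

/*
 * Return the pre-order successor of pos inside root's subtree, or NULL
 * once every descendant has been visited. No recursion, no extra stack.
 */
struct group *next_descendant(struct group *root, struct group *pos)
{
	if (pos->first_child)
		return pos->first_child;
	/* Climb until a sibling exists, never leaving root's subtree. */
	while (pos != root) {
		if (pos->next_sibling)
			return pos->next_sibling;
		pos = pos->parent;
	}
	return NULL;
}

/*
 * Try to take up to `want` pages from root's descendants (children,
 * grandchildren, ...). Returns the number of pages actually taken.
 */
unsigned long reclaim_subtree(struct group *root, unsigned long want)
{
	unsigned long got = 0;
	struct group *g;

	for (g = next_descendant(root, root); g != NULL && got < want;
	     g = next_descendant(root, g)) {
		unsigned long take = g->reclaimable;

		if (take > want - got)
			take = want - got;
		g->reclaimable -= take;
		got += take;
	}
	return got;
}
```

A real implementation would also have to remember where the previous scan stopped (the role last_scanned_child plays in the patch) and cope with groups being removed during the walk; the sketch leaves both out.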
> YAMAMOTO Takashi

-- 
	Warm Regards,
	Balbir Singh
	Linux Technology Center
	IBM, ISTL
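The charge-and-unroll pattern in the quoted res_counter_charge() hunk can be tried outside the kernel. What follows is a minimal userspace sketch, assuming a simplified `struct counter` with only `usage`, `limit` and `parent`; the names `struct counter` and `counter_charge()` are hypothetical, locking and IRQ handling are omitted, and -1 stands in for whatever error code the kernel helper returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, lock-free stand-in for the kernel's struct res_counter. */
struct counter {
	unsigned long usage;
	unsigned long limit;
	struct counter *parent;		/* NULL at the root */
};

/*
 * Charge val to c and to every ancestor up to the root. If some level
 * would exceed its limit, report that level through limit_exceeded_at,
 * undo the charges already made below it, and return -1.
 */
int counter_charge(struct counter *c, unsigned long val,
		   struct counter **limit_exceeded_at)
{
	struct counter *cur, *undo;

	*limit_exceeded_at = NULL;
	for (cur = c; cur != NULL; cur = cur->parent) {
		/* usage never exceeds limit, so the subtraction is safe */
		if (cur->limit - cur->usage < val) {
			*limit_exceeded_at = cur;
			goto unroll;
		}
		cur->usage += val;
	}
	return 0;

unroll:
	/* Only the levels strictly below the failing one were charged. */
	for (undo = c; undo != cur; undo = undo->parent) {
		assert(undo->usage >= val);
		undo->usage -= val;
	}
	return -1;
}
```

Every successful charge walks all the way to the root, so with one shared root each charge in the system touches the root's counter. That is the cacheline cost raised in the review comment above.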