From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mail143.messagelabs.com (mail143.messagelabs.com [216.82.254.35])
	by kanga.kvack.org (Postfix) with ESMTP id F0CAC6B0087
	for ; Tue, 17 Feb 2009 00:39:13 -0500 (EST)
Received: from d23relay02.au.ibm.com (d23relay02.au.ibm.com [202.81.31.244])
	by e23smtp08.au.ibm.com (8.13.1/8.13.1) with ESMTP id n1H5d6fq007579
	for ; Tue, 17 Feb 2009 16:39:06 +1100
Received: from d23av04.au.ibm.com (d23av04.au.ibm.com [9.190.235.139])
	by d23relay02.au.ibm.com (8.13.8/8.13.8/NCO v9.1) with ESMTP id n1H5d6EH975092
	for ; Tue, 17 Feb 2009 16:39:06 +1100
Received: from d23av04.au.ibm.com (loopback [127.0.0.1])
	by d23av04.au.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id n1H5d5bg002575
	for ; Tue, 17 Feb 2009 16:39:06 +1100
Date: Tue, 17 Feb 2009 11:09:03 +0530
From: Balbir Singh
Subject: Re: [RFC][PATCH 0/4] Memory controller soft limit patches (v2)
Message-ID: <20090217053903.GA3513@balbir.in.ibm.com>
Reply-To: balbir@linux.vnet.ibm.com
References: <20090216110844.29795.17804.sendpatchset@localhost.localdomain>
	<20090217090523.975bbec2.kamezawa.hiroyu@jp.fujitsu.com>
	<20090217030526.GA20958@balbir.in.ibm.com>
	<20090217130352.4ba7f91c.kamezawa.hiroyu@jp.fujitsu.com>
	<20090217044110.GD20958@balbir.in.ibm.com>
	<20090217141039.440e5463.kamezawa.hiroyu@jp.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <20090217141039.440e5463.kamezawa.hiroyu@jp.fujitsu.com>
Sender: owner-linux-mm@kvack.org
To: KAMEZAWA Hiroyuki
Cc: linux-mm@kvack.org, Sudhir Kumar, YAMAMOTO Takashi, Bharata B Rao,
	Paul Menage, lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
	KOSAKI Motohiro, David Rientjes, Pavel Emelianov, Dhaval Giani,
	Rik van Riel, Andrew Morton
List-ID:

* KAMEZAWA Hiroyuki [2009-02-17 14:10:39]:

> On Tue, 17 Feb 2009 10:11:10 +0530
> Balbir Singh wrote:
>
> > * KAMEZAWA Hiroyuki [2009-02-17 13:03:52]:
> >
> > > On Tue, 17 Feb 2009 08:35:26 +0530
> > > Balbir Singh wrote:
> > > I don't want to add any new big burden to kernel hackers of memory
> > > management, they work hard to improve memory reclaim. This patch will
> > > change the behavior.
> > >
> >
> > I don't think I agree; this approach suggests that before doing global
> > reclaim, there are several groups that are using more than their
> > share of memory, so it makes sense to reclaim from them first.
> >
> > >
> > > BTW, in the typical bad case, several threads on cpus go into memory
> > > reclaim at once and all threads will visit this memcg's soft-limit tree
> > > at once and soft-limit will not work as desired anyway.
> > > You can't avoid this problem at alloc_page() hot-path.
> >
> > Even if all threads go into soft-reclaim at once, the tree will become
> > empty after a point and we will just return saying there are no more
> > memcg's to reclaim from (we remove the memcg from the tree when
> > reclaiming), then those threads will go into regular reclaim if there
> > is still memory pressure.
>
> Yes. The largest-excess group will be removed. So, it seems that it doesn't
> work as designed. Is the rbtree considered as just a hint? If so, the rbtree
> seems to be overkill.
>
> Just a question:
> Assume memcg under hierarchy.
>    ../group_A/          usage=1G,   soft_limit=900M  hierarchy=1
>               01/       usage=200M, soft_limit=100M
>               02/       usage=300M, soft_limit=200M
>               03/       usage=500M, soft_limit=300M  <==== 200M over.
>                  004/   usage=200M, soft_limit=100M
>                  005/   usage=300M, soft_limit=200M
>
> At memory shortage, group 03's memory will be reclaimed
>    - reclaim memory from 03, 03/004, 03/005
>
> When 100M of group 03's memory is reclaimed, group_A's memory is reclaimed
> at the same time, implicitly. Doesn't this break your rb-tree?
>
> I recommend that soft-limit be applied only to the node which is at the top
> of the hierarchy.

Yes, that can be done, but the reason for putting both was to target
the right memcg early.

> > >
> > > > > 3. After this patch, res_counter is no longer for general purpose
> > > > >    res_counter...
> > > > >    It seems to have too many unnecessary accessories for general
> > > > >    purpose.
> > > >
> > > > Why not? Soft limits are a feature of any controller. The return of
> > > > highest ancestor might be the only policy we impose right now. But as
> > > > new controllers start using res_counter, we can clearly add a policy
> > > > callback.
> > > >
> > > I think you forget that memory cgroups is the only controller in which
> > > the kernel can reduce the usage of a resource without any harm to users.
> > > soft-limit is nonsense for general resources, I think.
> > >
> >
> > Really? Even for CPUs? soft-limit is a form of shares (please don't
> > confuse with cpu.shares). Soft limits are used as a way of implementing
> > work conserving controllers.
> >
>
> I don't think cpu needs this. It works under share and no hardlimit.
>

Forget CPUs for now. The concept of soft-limits is applicable to all
resource controllers. Look at check_thread_timers, you'll see CPU soft
limits for rlimit. Soft limits allow overcommit as long as there is no
contention on the resource.

-- 
	Balbir

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org