From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Paul Menage <menage@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>,
YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
containers@lists.osdl.org,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [RFC][-mm] Memory controller hierarchy support (v1)
Date: Sun, 20 Apr 2008 13:46:37 +0530
Message-ID: <480AFBE5.1070702@linux.vnet.ibm.com>
In-Reply-To: <6599ad830804190849u31f13191m4dcca4e471493c2b@mail.gmail.com>

Paul Menage wrote:
> On Fri, Apr 18, 2008 at 10:35 PM, Balbir Singh
> <balbir@linux.vnet.ibm.com> wrote:
>> 1. We need to hold cgroup_mutex while walking through the children
>> in reclaim. We need to figure out the best way to do so. Should
>> cgroups provide a helper function/macro for it?
>
> There's already a function, cgroup_lock(). But it would be nice to
> avoid such a heavy locking here, particularly since memory allocations
> can occur with cgroup_mutex held, which could lead to a nasty deadlock
> if the allocation triggered reclaim.
>
Hmm.. probably..
> One of the things that I've been considering was to put the
> parent/child/sibling hierarchy explicitly in cgroup_subsys_state. This
> would give subsystems their own copy to refer to, and could use their
> own internal locking to synchronize with callbacks from cgroups that
> might change the hierarchy. Cpusets could make use of this too, since
> it has to traverse hierarchies sometimes.
>
Very cool! I look forward to that infrastructure. I'll also look at the cpuset
code and see how to traverse the hierarchy.
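
A rough userspace sketch of the idea Paul describes above: each subsystem keeps its own copy of the parent/child/sibling links and guards them with a private lock, so a traversal does not need cgroup_mutex. Everything here is an assumption for illustration: css_node, css_tree and the css_tree_* helpers are hypothetical names, not existing cgroup API, and a pthread mutex stands in for whatever lock a subsystem would actually choose.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-subsystem copy of the hierarchy links. */
struct css_node {
	struct css_node *parent;
	struct css_node *first_child;
	struct css_node *next_sibling;
};

struct css_tree {
	pthread_mutex_t lock;	/* subsystem-private, not cgroup_mutex */
	struct css_node root;
};

static void css_tree_init(struct css_tree *t)
{
	pthread_mutex_init(&t->lock, NULL);
	t->root.parent = NULL;
	t->root.first_child = NULL;
	t->root.next_sibling = NULL;
}

/* Would be driven by the (assumed) create callback from cgroups. */
static void css_tree_add(struct css_tree *t, struct css_node *parent,
			 struct css_node *child)
{
	assert(parent && child);
	pthread_mutex_lock(&t->lock);
	child->parent = parent;
	child->first_child = NULL;
	child->next_sibling = parent->first_child;
	parent->first_child = child;
	pthread_mutex_unlock(&t->lock);
}

/*
 * Would be driven by the (assumed) destroy callback.
 * Returns 0 on success, -1 if the node is the root, still has
 * children, or is not linked under its parent.
 */
static int css_tree_remove(struct css_tree *t, struct css_node *node)
{
	struct css_node **pp;
	int ret = -1;

	pthread_mutex_lock(&t->lock);
	if (node->parent && !node->first_child) {
		for (pp = &node->parent->first_child; *pp;
		     pp = &(*pp)->next_sibling) {
			if (*pp == node) {
				*pp = node->next_sibling;
				node->parent = NULL;
				node->next_sibling = NULL;
				ret = 0;
				break;
			}
		}
	}
	pthread_mutex_unlock(&t->lock);
	return ret;
}

/*
 * Walk the direct children of parent under the private lock.
 * fn may be NULL; the return value is the number of children seen.
 */
static int css_tree_for_each_child(struct css_tree *t,
				   struct css_node *parent,
				   void (*fn)(struct css_node *, void *),
				   void *arg)
{
	struct css_node *c;
	int n = 0;

	pthread_mutex_lock(&t->lock);
	for (c = parent->first_child; c; c = c->next_sibling) {
		if (fn)
			fn(c, arg);
		n++;
	}
	pthread_mutex_unlock(&t->lock);
	return n;
}
```

In this sketch the callback passed to css_tree_for_each_child() runs with the private lock held, so it must not call back into css_tree_add() or css_tree_remove(). A real reclaim walk would also need a rule for what may be allocated under that lock.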
>> 2. Do not allow children to have a limit greater than their parents.
>> 3. Allow the user to select if hierarchical support is required
>
> My thoughts on this would be:
>
> 1) Never attach a first-level child's counter to its parent. As
> Yamamoto points out, otherwise we end up with extra global operations
> whenever any cgroup allocates or frees memory. Limiting the total
> system memory used by all user processes doesn't seem to be something
> that people are going to generally want to do, and if they really do
> want to they can just create a non-root child and move the whole
> system into that.
>
> The one big advantage that you currently get from having all
> first-level children be attached to the root is that the reclaim logic
> automatically scans other groups when it reaches the top-level - but I
> think that can be provided as a special-case in the reclaim traversal,
> avoiding the overhead of hitting the root cgroup that we have in this
> patch.
>
I've been thinking along these lines; I'll think more about this.
> 2) Always attach other children's counters to their parents - if the
> user didn't want a hierarchy, they could create a flat grouping rather
> than nested groupings.
>
Yes, that's a TODO
> Paul
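
The counter rules discussed above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct counter and the counter_* helpers are hypothetical and are not the kernel's res_counter interface. It shows first-level children staying detached from the root, charges propagating to every attached ancestor with rollback on failure, and a child limit being rejected when it exceeds the parent's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a per-group memory counter. */
struct counter {
	unsigned long usage;
	unsigned long limit;
	struct counter *parent;	/* NULL when not attached to a parent */
};

/*
 * Attach child below parent, except when parent is the root:
 * first-level children are never attached, so charging them
 * does not touch a global counter.
 */
static void counter_attach(struct counter *child, struct counter *parent,
			   const struct counter *root)
{
	child->parent = (parent == root) ? NULL : parent;
}

/*
 * Charge n units to c and to every attached ancestor.
 * On failure, undo the partial charges and return false.
 */
static bool counter_charge(struct counter *c, unsigned long n)
{
	struct counter *p, *q;

	for (p = c; p; p = p->parent) {
		if (p->usage > p->limit || n > p->limit - p->usage)
			goto undo;
		p->usage += n;
	}
	return true;
undo:
	for (q = c; q != p; q = q->parent)
		q->usage -= n;
	return false;
}

/* Uncharge n units from c and from every attached ancestor. */
static void counter_uncharge(struct counter *c, unsigned long n)
{
	for (; c; c = c->parent) {
		assert(c->usage >= n);
		c->usage -= n;
	}
}

/* Refuse a limit greater than the attached parent's limit. */
static bool counter_set_limit(struct counter *c, unsigned long limit)
{
	if (c->parent && limit > c->parent->limit)
		return false;
	c->limit = limit;
	return true;
}
```

Because a first-level child has no attached parent in this model, counter_set_limit() places no upper bound on it. Scanning the other groups when reclaim reaches the top level would have to be handled separately in the reclaim traversal, as Paul suggests.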
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL