From: Frederic Weisbecker <fweisbec@gmail.com>
To: Paul Menage <paul@paulmenage.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>,
	Li Zefan <lizf@cn.fujitsu.com>, Tim Hockin <thockin@hockin.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Aditya Kali <adityakali@google.com>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: [RFD] Task counter: cgroup core feature or cgroup subsystem? (was Re: [PATCH 0/8 v3] cgroups: Task counter subsystem)
Date: Sat, 27 Aug 2011 15:40:40 +0200
Message-ID: <20110827134038.GH3298@somewhere>
In-Reply-To: <CALdu-PAiRgjFaERETAZVH=+Ky-0ekjBWn6aDK5Hzam1AQ7sC4Q@mail.gmail.com>

On Fri, Aug 26, 2011 at 08:16:32AM -0700, Paul Menage wrote:
> On Wed, Aug 24, 2011 at 10:54 AM, Frederic Weisbecker
> <fweisbec@gmail.com> wrote:
> >
> > It seems your patch doesn't handle the ->fork() and ->exit() calls.
> > We probably need a quick access to states of multi-subsystems from
> > the task, some lists available from task->cgroups, I don't know yet.
> >
> 
> That state is available, but currently only while holding cgroup_mutex
> - at least, that's what task_cgroup_from_root() requires.
> 
> It might be the case that we could achieve the same effect by just
> locking the task, so the pre-condition for task_cgroup_from_root()
> would be either that cgroup_mutex is held or the task lock is held.
> 
> We could extend the signature of cgroup_subsys.fork to include a
> reference to the cgroup; for the singly-bindable subsystems this would
> be trivially available via task->cgroups; for the multi-bindable
> subsystems then for each hierarchy that the subsystem is mounted on
> we'd call task_cgroup_from_root() to get the cgroup for that
> hierarchy. So multi-bindable subsystems with fork/exit callbacks would
> get called once for each mounted instance of the subsystem.
> 
> This would still make the task counter subsystem a bit painful - it
> would read_lock a global rwlock (css_set_lock) on every fork/exit in
> order to find the cgroup to charge/uncharge. I'm not sure how painful
> that would be on a big system. If that were a noticeable performance
> problem, we could have a variable-length extension on the end of
> css_set that contains a list of hierarchy_index/cgroup pairs for any
> hierarchies that had multi-bindable subsystems (or maybe for all
> hierarchies, for simplicity). This would make creating a css_set a
> little bit more complicated, but overall shouldn't be too painful, and
> would make the problem of finding a cgroup for a given hierarchy
> trivial.
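
To check that I read the css_set extension correctly, here is a rough
userspace sketch of it. All names (css_set_sketch, hier_cgroup,
css_set_sketch_cgroup) are made up for illustration and are not the
kernel's definitions; struct cgroup is reduced to a stand-in:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in, not the kernel's struct cgroup. */
struct cgroup {
	int id;
};

/* One hierarchy_index/cgroup pair (hypothetical name). */
struct hier_cgroup {
	int hierarchy_index;
	struct cgroup *cgrp;
};

/* Sketch of a css_set with a variable-length tail of pairs. */
struct css_set_sketch {
	int nr_hier;
	struct hier_cgroup hier[];	/* flexible array member */
};

/* Allocate a set with room for nr_hier pairs, zero-initialized. */
static struct css_set_sketch *css_set_sketch_alloc(int nr_hier)
{
	struct css_set_sketch *cg;

	assert(nr_hier >= 0);
	cg = calloc(1, sizeof(*cg) + (size_t)nr_hier * sizeof(cg->hier[0]));
	if (cg)
		cg->nr_hier = nr_hier;
	return cg;
}

/* Find the cgroup for one hierarchy by scanning only this set's pairs. */
static struct cgroup *css_set_sketch_cgroup(const struct css_set_sketch *cg,
					    int hierarchy_index)
{
	int i;

	for (i = 0; i < cg->nr_hier; i++)
		if (cg->hier[i].hierarchy_index == hierarchy_index)
			return cg->hier[i].cgrp;
	return NULL;
}
```

The lookup touches only the task's own set, so in this sketch finding
the cgroup to charge does not need to walk any global list.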

Oh, you're right. My first idea was to reference multi-bindable
subsystem states in cgroup_subsys_state, as is currently done
for singleton subsystems. But this indeed requires cgroup_mutex
or task_lock, and only the latter looks sensible in the fork/exit
path. If that becomes a scalability problem, we can still have a
dedicated lock for cgroup attach/detach on tasks.

Whatever we do, we need that lock. So we can either pick your
solution, which references in css_set the cgroups that belong to
multi-bindable subsystems for a given task, or we can make
tsk->cgroups->subsys[] a variable-size array that references
1 * singleton and N * multi-bindable subsystems, N being the number
of hierarchies that use a given subsystem.
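
A minimal userspace sketch of that second option, assuming a single
multi-bindable subsystem for brevity. NR_SINGLETONS, css_sketch,
css_set_flat and task_counter_fork_sketch are hypothetical names, not
existing kernel symbols, and locking is left out entirely:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_SINGLETONS 2	/* hypothetical count of singleton subsystems */

/* Stand-in for a per-cgroup subsystem state. */
struct css_sketch {
	int hierarchy_index;
	long usage;
};

/*
 * Flat layout: slots [0, NR_SINGLETONS) hold singleton states, the
 * next nr_multi slots hold one state per hierarchy that the
 * multi-bindable subsystem is mounted on.
 */
struct css_set_flat {
	int nr_multi;
	struct css_sketch *subsys[];	/* flexible array member */
};

static struct css_set_flat *css_set_flat_alloc(int nr_multi)
{
	struct css_set_flat *cg;
	size_t slots;

	assert(nr_multi >= 0);
	slots = (size_t)NR_SINGLETONS + (size_t)nr_multi;
	cg = calloc(1, sizeof(*cg) + slots * sizeof(cg->subsys[0]));
	if (cg)
		cg->nr_multi = nr_multi;
	return cg;
}

/*
 * Fork-time hook: charge one task to every mounted instance.
 * Returns the number of instances charged.
 */
static int task_counter_fork_sketch(struct css_set_flat *cg)
{
	int i, charged = 0;

	for (i = 0; i < cg->nr_multi; i++) {
		struct css_sketch *css = cg->subsys[NR_SINGLETONS + i];

		if (!css)
			continue;	/* slot not populated */
		css->usage++;
		charged++;
	}
	return charged;
}
```

With this layout the fork/exit callbacks iterate over the N slots
directly, which matches the "called once for each mounted instance"
behaviour you described.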

What do you think?
