From: Frederic Weisbecker <fweisbec@gmail.com>
To: Li Zefan <lizf@cn.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Ben Blum <bblum@andrew.cmu.edu>,
Frederic Weisbecker <fweisbec@redhat.com>,
Paul Menage <paul@paulmenage.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] cgroups: don't cache common ancestor in task counter subsys
Date: Mon, 17 Oct 2011 19:25:41 +0200 [thread overview]
Message-ID: <20111017172537.GH2829@somewhere> (raw)
In-Reply-To: <4E9BDA43.6060406@cn.fujitsu.com>
On Mon, Oct 17, 2011 at 03:33:23PM +0800, Li Zefan wrote:
> To reproduce the bug:
>
> # mount -t cgroup -o tasks xxx /cgroup
> # mkdir /cgroup/tmp
> # echo PID > /cgroup/tmp/cgroups.proc (PID has child threads)
> # echo PID > /cgroup/tasks
> # ecoh PID > /cgroup/tmp/cgroups.proc
> (oops!)
>
> the call graph is:
>
> for_each_thread()
> can_attach_task();
>
> for_each_thead()
> attach_task();
>
> but not:
>
> for_each_thread() {
> can_attach_task();
> attach_task();
> }
>
> So common_ancestor used in attach_task() is always the value calculated
> in the last can_attach_task() call.
Ah! good catch!
Just a comment below:
>
> To fix it, instead of caching, we always calculate the value every time
> we want to use it.
>
> Reported-by: Ben Blum <bblum@andrew.cmu.edu>
> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> ---
> Documentation/cgroups/cgroups.txt | 3 ++-
> include/linux/cgroup.h | 2 +-
> kernel/cgroup.c | 7 ++++---
> kernel/cgroup_task_counter.c | 24 +++++++++++++++++-------
> 4 files changed, 24 insertions(+), 12 deletions(-)
>
> diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
> index 3bc1dc9..f9e51a8 100644
> --- a/Documentation/cgroups/cgroups.txt
> +++ b/Documentation/cgroups/cgroups.txt
> @@ -623,7 +623,8 @@ function, so that the subsystem can implement a rollback. If not, not necessary.
> This will be called only about subsystems whose can_attach() operation have
> succeeded.
>
> -void cancel_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
> +void cancel_attach_task(struct cgroup *cgrp, struct cgroup *oldcgrp,
> + struct task_struct *tsk)
> (cgroup_mutex held by caller)
>
> As cancel_attach, but for operations that must be cancelled once per
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 9c8151e..e4659c4 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -475,7 +475,7 @@ struct cgroup_subsys {
> struct task_struct *tsk);
> void (*cancel_attach)(struct cgroup_subsys *ss, struct cgroup *cgrp,
> struct task_struct *tsk);
> - void (*cancel_attach_task)(struct cgroup *cgrp,
> + void (*cancel_attach_task)(struct cgroup *cgrp, struct cgroup *oldcgrp,
> struct task_struct *tsk);
> void (*pre_attach)(struct cgroup *cgrp);
> void (*attach_task)(struct cgroup *cgrp, struct cgroup *old_cgrp,
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 018df9d..91abc9b 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1885,7 +1885,7 @@ out:
> break;
>
> if (ss->cancel_attach_task)
> - ss->cancel_attach_task(cgrp, tsk);
> + ss->cancel_attach_task(cgrp, oldcgrp, tsk);
> if (ss->cancel_attach)
> ss->cancel_attach(ss, cgrp, tsk);
> }
> @@ -2151,7 +2151,7 @@ int cgroup_attach_proc(struct cgroup *cgrp, struct task_struct *leader)
> }
> } else if (retval == -ESRCH) {
> if (ss->cancel_attach_task)
> - ss->cancel_attach_task(cgrp, tsk);
> + ss->cancel_attach_task(cgrp, oldcgrp, tsk);
> } else {
> BUG_ON(1);
> }
> @@ -2191,7 +2191,8 @@ out_cancel_attach:
> tsk = flex_array_get_ptr(group, i);
> if (tsk == failed_task)
> break;
> - ss->cancel_attach_task(cgrp, tsk);
> + ss->cancel_attach_task(cgrp, oldcgrp,
> + tsk);
When we rollback there, we are dealing with oldcgrp of the last thread
we have treated. All threads in the rollback list don't necessary belonged
to that old_cgroup.
And we can't try to retrieve these old_cgroup through task_cgroup_from_root() because
the threads might have exited and thus could be assigned to the init cgroup.
I believe we need to cache these old cgroups in the flex array.
next prev parent reply other threads:[~2011-10-17 17:25 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-10-17 7:33 [PATCH] cgroups: don't cache common ancestor in task counter subsys Li Zefan
2011-10-17 17:25 ` Frederic Weisbecker [this message]
2011-10-17 17:27 ` Ben Blum
2011-10-27 15:12 ` Frederic Weisbecker
2011-10-31 3:14 ` Li Zefan
2011-10-18 3:17 ` Li Zefan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111017172537.GH2829@somewhere \
--to=fweisbec@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=bblum@andrew.cmu.edu \
--cc=fweisbec@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lizf@cn.fujitsu.com \
--cc=paul@paulmenage.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.