From: Tejun Heo <tj@kernel.org>
To: paul@paulmenage.org, rjw@sisk.pl, lizf@cn.fujitsu.com
Cc: linux-pm@lists.linux-foundation.org,
linux-kernel@vger.kernel.org,
containers@lists.linux-foundation.org, fweisbec@gmail.com,
matthltc@us.ibm.com, akpm@linux-foundation.org, oleg@redhat.com,
kamezawa.hiroyu@Jp.fujitsu.com, Tejun Heo <tj@kernel.org>
Subject: [PATCH 03/10] threadgroup: extend threadgroup_lock() to cover exit and exec
Date: Tue, 1 Nov 2011 16:46:26 -0700 [thread overview]
Message-ID: <1320191193-8110-4-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1320191193-8110-1-git-send-email-tj@kernel.org>
threadgroup_lock() protected only protected against new addition to
the threadgroup, which was inherently somewhat incomplete and
problematic for its only user cgroup. On-going migration could race
against exec and exit leading to interesting problems - the symmetry
between various attach methods, task exiting during method execution,
->exit() racing against attach methods, migrating task switching basic
properties during exec and so on.
This patch extends threadgroup_lock() such that it protects against
all three threadgroup altering operations - fork, exit and exec. For
exit, threadgroup_change_begin/end() calls are added to exit path.
For exec, threadgroup_[un]lock() are updated to also grab and release
cred_guard_mutex.
With this change, threadgroup_lock() guarantees that the target
threadgroup will remain stable - no new task will be added, no new
PF_EXITING will be set and exec won't happen.
The next patch will update cgroup so that it can take full advantage
of this change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
---
include/linux/sched.h | 36 ++++++++++++++++++++++++++++++------
kernel/exit.c | 17 +++++++++++++----
2 files changed, 43 insertions(+), 10 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index aa47d0f..6fb546e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -635,11 +635,12 @@ struct signal_struct {
#endif
#ifdef CONFIG_CGROUPS
/*
- * The group_rwsem prevents threads from forking with
- * CLONE_THREAD while held for writing. Use this for fork-sensitive
- * threadgroup-wide operations. It's taken for reading in fork.c in
- * copy_process().
- * Currently only needed write-side by cgroups.
+ * group_rwsem prevents new tasks from entering the threadgroup and
+ * member tasks from exiting. fork and exit paths are protected
+ * with this rwsem using threadgroup_change_begin/end(). Users
+ * which require threadgroup to remain stable should use
+ * threadgroup_[un]lock() which also takes care of exec path.
+ * Currently, cgroup is the only user.
*/
struct rw_semaphore group_rwsem;
#endif
@@ -2367,7 +2368,6 @@ static inline void unlock_task_sighand(struct task_struct *tsk,
spin_unlock_irqrestore(&tsk->sighand->siglock, *flags);
}
-/* See the declaration of group_rwsem in signal_struct. */
#ifdef CONFIG_CGROUPS
static inline void threadgroup_change_begin(struct task_struct *tsk)
{
@@ -2377,13 +2377,37 @@ static inline void threadgroup_change_done(struct task_struct *tsk)
{
up_read(&tsk->signal->group_rwsem);
}
+
+/**
+ * threadgroup_lock - lock threadgroup
+ * @tsk: member task of the threadgroup to lock
+ *
+ * Lock the threadgroup @tsk belongs to. No new task is allowed to enter
+ * and member tasks aren't allowed to exit (as indicated by PF_EXITING) or
+ * perform exec. This is useful for cases where the threadgroup needs to
+ * stay stable across blockable operations.
+ *
+ * fork and exit explicitly call threadgroup_change_{begin|end}() for
+ * synchronization. exec is already synchronized with cread_guard_mutex
+ * which we grab here.
+ */
static inline void threadgroup_lock(struct task_struct *tsk)
{
+ /* exec uses exit for de-threading, grab cred_guard_mutex first */
+ mutex_lock(&tsk->signal->cred_guard_mutex);
down_write(&tsk->signal->group_rwsem);
}
+
+/**
+ * threadgroup_unlock - unlock threadgroup
+ * @tsk: member task of the threadgroup to unlock
+ *
+ * Reverse threadgroup_lock().
+ */
static inline void threadgroup_unlock(struct task_struct *tsk)
{
up_write(&tsk->signal->group_rwsem);
+ mutex_unlock(&tsk->signal->cred_guard_mutex);
}
#else
static inline void threadgroup_change_begin(struct task_struct *tsk) {}
diff --git a/kernel/exit.c b/kernel/exit.c
index 2913b35..3565f00 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -938,6 +938,12 @@ NORET_TYPE void do_exit(long code)
schedule();
}
+ /*
+ * @tsk's threadgroup is going through changes - lock out users
+ * which expect stable threadgroup.
+ */
+ threadgroup_change_begin(tsk);
+
exit_irq_thread();
exit_signals(tsk); /* sets PF_EXITING */
@@ -1020,10 +1026,6 @@ NORET_TYPE void do_exit(long code)
kfree(current->pi_state_cache);
#endif
/*
- * Make sure we are holding no locks:
- */
- debug_check_no_locks_held(tsk);
- /*
* We can do this unlocked here. The futex code uses this flag
* just to verify whether the pi state cleanup has been done
* or not. In the worst case it loops once more.
@@ -1040,6 +1042,13 @@ NORET_TYPE void do_exit(long code)
preempt_disable();
exit_rcu();
+
+ /*
+ * Release threadgroup and make sure we are holding no locks.
+ */
+ threadgroup_change_done(tsk);
+ debug_check_no_locks_held(tsk);
+
/* causes final put_task_struct in finish_task_switch(). */
tsk->state = TASK_DEAD;
schedule();
--
1.7.3.1
next prev parent reply other threads:[~2011-11-01 23:46 UTC|newest]
Thread overview: 167+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-11-01 23:46 [PATCHSET] cgroup: stable threadgroup during attach & subsys methods consolidation Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-01 23:46 ` [PATCH 01/10] cgroup: add cgroup_root_mutex Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-04 8:38 ` KAMEZAWA Hiroyuki
2011-11-04 8:38 ` KAMEZAWA Hiroyuki
[not found] ` <1320191193-8110-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04 8:38 ` KAMEZAWA Hiroyuki
2011-11-01 23:46 ` [PATCH 02/10] threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsem Tejun Heo
2011-11-01 23:46 ` Tejun Heo [this message]
2011-11-04 8:45 ` [PATCH 03/10] threadgroup: extend threadgroup_lock() to cover exit and exec KAMEZAWA Hiroyuki
2011-11-04 8:45 ` KAMEZAWA Hiroyuki
2011-11-13 16:44 ` Frederic Weisbecker
2011-11-13 18:20 ` Frederic Weisbecker
2011-11-13 18:20 ` Frederic Weisbecker
2011-11-24 22:50 ` [PATCH UPDATED " Tejun Heo
2011-11-24 22:50 ` Tejun Heo
2011-11-25 4:02 ` Linus Torvalds
2011-11-25 4:02 ` Linus Torvalds
[not found] ` <CA+55aFxq1wztMzYhKaY5RHazLBDz4pSXUgiGzTj2wA6EJcDbAw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-27 19:21 ` Tejun Heo
2011-11-27 19:21 ` Tejun Heo
[not found] ` <20111127192155.GB4266-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-27 21:25 ` Tejun Heo
2011-11-27 21:25 ` Tejun Heo
[not found] ` <20111127212558.GE4266-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-01 19:29 ` Tejun Heo
2011-12-01 19:29 ` Tejun Heo
2011-12-01 19:29 ` Tejun Heo
2011-11-27 21:25 ` Tejun Heo
2011-11-27 19:21 ` Tejun Heo
[not found] ` <20111124225054.GA14828-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-25 4:02 ` Linus Torvalds
2011-11-25 14:01 ` Frederic Weisbecker
2011-11-25 14:01 ` Frederic Weisbecker
2011-11-25 14:01 ` Frederic Weisbecker
[not found] ` <20111125140136.GC23307-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-11-27 19:30 ` Tejun Heo
2011-11-27 19:30 ` Tejun Heo
[not found] ` <20111127193001.GC4266-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-02 16:28 ` Frederic Weisbecker
2011-12-02 16:28 ` Frederic Weisbecker
2011-12-02 16:28 ` Frederic Weisbecker
2011-12-05 18:43 ` Tejun Heo
[not found] ` <20111202162753.GA19752-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-12-05 18:43 ` Tejun Heo
2011-12-05 18:43 ` Tejun Heo
2011-12-07 15:30 ` Frederic Weisbecker
2011-12-07 15:30 ` Frederic Weisbecker
[not found] ` <20111207153046.GC13252-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-12-07 18:22 ` Tejun Heo
2011-12-07 18:22 ` Tejun Heo
2011-12-07 18:22 ` Tejun Heo
[not found] ` <20111207182214.GA7610-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-08 20:50 ` [PATCH UPDATED AGAIN " Tejun Heo
2011-12-08 20:50 ` Tejun Heo
2011-12-08 20:50 ` Tejun Heo
2011-12-09 23:42 ` Frederic Weisbecker
[not found] ` <20111208205055.GB12108-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-09 23:42 ` Frederic Weisbecker
2011-12-09 23:42 ` Frederic Weisbecker
2011-12-13 1:33 ` Tejun Heo
2011-12-13 1:33 ` Tejun Heo
2011-12-13 2:17 ` Tejun Heo
[not found] ` <20111213013334.GC25802-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-13 2:17 ` Tejun Heo
2011-12-13 2:17 ` Tejun Heo
2011-12-13 1:33 ` Tejun Heo
[not found] ` <20111205184315.GJ627-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-07 15:30 ` [PATCH UPDATED " Frederic Weisbecker
2011-11-27 19:30 ` Tejun Heo
[not found] ` <1320191193-8110-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04 8:45 ` [PATCH " KAMEZAWA Hiroyuki
2011-11-13 16:44 ` Frederic Weisbecker
2011-11-13 16:44 ` Frederic Weisbecker
2011-11-14 13:54 ` Frederic Weisbecker
2011-11-14 13:54 ` Frederic Weisbecker
2011-11-14 13:54 ` Frederic Weisbecker
2011-11-21 22:03 ` Tejun Heo
2011-11-21 22:03 ` Tejun Heo
2011-11-21 22:03 ` Tejun Heo
2011-11-23 14:34 ` Frederic Weisbecker
[not found] ` <20111121220326.GM25776-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-23 14:34 ` Frederic Weisbecker
2011-11-23 14:34 ` Frederic Weisbecker
2011-11-21 21:58 ` Tejun Heo
2011-11-21 21:58 ` Tejun Heo
2011-11-21 21:58 ` Tejun Heo
2011-11-23 14:02 ` Frederic Weisbecker
[not found] ` <20111121215839.GL25776-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-23 14:02 ` Frederic Weisbecker
2011-11-23 14:02 ` Frederic Weisbecker
[not found] ` <20111123140139.GB10669-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-11-24 21:22 ` Tejun Heo
2011-11-24 21:22 ` Tejun Heo
2011-11-24 21:22 ` Tejun Heo
2011-11-13 18:20 ` Frederic Weisbecker
2011-11-24 22:50 ` [PATCH UPDATED " Tejun Heo
2011-11-01 23:46 ` [PATCH " Tejun Heo
2011-11-01 23:46 ` [PATCH 04/10] cgroup: always lock threadgroup during migration Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-04 8:54 ` KAMEZAWA Hiroyuki
[not found] ` <1320191193-8110-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04 8:54 ` KAMEZAWA Hiroyuki
2011-11-14 18:46 ` Frederic Weisbecker
2011-11-14 18:46 ` Frederic Weisbecker
2011-11-14 18:52 ` Frederic Weisbecker
2011-11-14 18:52 ` Frederic Weisbecker
2011-11-14 18:52 ` Frederic Weisbecker
2011-11-21 22:05 ` Tejun Heo
2011-11-21 22:05 ` Tejun Heo
2011-11-21 22:05 ` Tejun Heo
2011-11-04 8:54 ` KAMEZAWA Hiroyuki
[not found] ` <20111104175413.30afaf8e.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2011-11-04 15:21 ` Tejun Heo
2011-11-04 15:21 ` Tejun Heo
2011-11-04 15:21 ` Tejun Heo
2011-11-14 18:46 ` Frederic Weisbecker
[not found] ` <1320191193-8110-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-01 23:46 ` [PATCH 01/10] cgroup: add cgroup_root_mutex Tejun Heo
2011-11-01 23:46 ` [PATCH 02/10] threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsem Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-04 8:40 ` KAMEZAWA Hiroyuki
2011-11-04 8:40 ` KAMEZAWA Hiroyuki
2011-11-04 15:16 ` Tejun Heo
2011-11-04 15:16 ` Tejun Heo
[not found] ` <20111104174032.e0c4fc11.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2011-11-04 15:16 ` Tejun Heo
[not found] ` <1320191193-8110-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04 8:40 ` KAMEZAWA Hiroyuki
2011-11-01 23:46 ` [PATCH 03/10] threadgroup: extend threadgroup_lock() to cover exit and exec Tejun Heo
2011-11-01 23:46 ` [PATCH 04/10] cgroup: always lock threadgroup during migration Tejun Heo
2011-11-01 23:46 ` [PATCH 05/10] cgroup: subsys->attach_task() should be called after migration Tejun Heo
2011-11-01 23:46 ` [PATCH 06/10] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
2011-11-01 23:46 ` Tejun Heo
[not found] ` <1320191193-8110-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-14 20:37 ` Frederic Weisbecker
2011-11-14 20:37 ` Frederic Weisbecker
2011-11-14 20:37 ` Frederic Weisbecker
2011-11-01 23:46 ` [PATCH 07/10] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-14 21:16 ` Frederic Weisbecker
[not found] ` <1320191193-8110-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-14 21:16 ` Frederic Weisbecker
2011-11-14 21:16 ` Frederic Weisbecker
2011-11-01 23:46 ` [PATCH 08/10] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
2011-11-01 23:46 ` [PATCH 09/10] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-15 0:51 ` Frederic Weisbecker
[not found] ` <1320191193-8110-10-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-15 0:51 ` Frederic Weisbecker
2011-11-15 0:51 ` Frederic Weisbecker
2011-11-01 23:46 ` [PATCH 10/10] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task() Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-04 9:10 ` KAMEZAWA Hiroyuki
2011-11-04 9:10 ` KAMEZAWA Hiroyuki
2011-11-15 0:54 ` Frederic Weisbecker
[not found] ` <1320191193-8110-11-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04 9:10 ` KAMEZAWA Hiroyuki
2011-11-15 0:54 ` Frederic Weisbecker
2011-11-15 0:54 ` Frederic Weisbecker
2011-11-21 22:07 ` [PATCHSET] cgroup: stable threadgroup during attach & subsys methods consolidation Tejun Heo
2011-11-21 22:07 ` Tejun Heo
2011-11-22 2:27 ` Li Zefan
[not found] ` <20111121220719.GP25776-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-22 2:27 ` Li Zefan
2011-11-22 2:27 ` Li Zefan
2011-11-22 16:20 ` Tejun Heo
[not found] ` <4ECB089C.3080208-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2011-11-22 16:20 ` Tejun Heo
2011-11-22 16:20 ` Tejun Heo
2011-11-24 22:51 ` Tejun Heo
2011-11-01 23:46 ` [PATCH 05/10] cgroup: subsys->attach_task() should be called after migration Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-14 20:06 ` Frederic Weisbecker
2011-11-14 20:06 ` Frederic Weisbecker
2011-11-21 22:04 ` Tejun Heo
2011-11-21 22:04 ` Tejun Heo
2011-11-21 22:04 ` Tejun Heo
[not found] ` <1320191193-8110-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-14 20:06 ` Frederic Weisbecker
2011-11-01 23:46 ` [PATCH 06/10] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
2011-11-01 23:46 ` [PATCH 07/10] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
2011-11-01 23:46 ` [PATCH 08/10] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-04 9:08 ` KAMEZAWA Hiroyuki
2011-11-04 9:08 ` KAMEZAWA Hiroyuki
[not found] ` <1320191193-8110-9-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04 9:08 ` KAMEZAWA Hiroyuki
2011-11-14 23:54 ` Frederic Weisbecker
2011-11-14 23:54 ` Frederic Weisbecker
2011-11-14 23:54 ` Frederic Weisbecker
2011-11-01 23:46 ` [PATCH 09/10] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
2011-11-01 23:46 ` [PATCH 10/10] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task() Tejun Heo
2011-11-21 22:07 ` [PATCHSET] cgroup: stable threadgroup during attach & subsys methods consolidation Tejun Heo
2011-11-24 22:51 ` Tejun Heo
2011-11-24 22:51 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1320191193-8110-4-git-send-email-tj@kernel.org \
--to=tj@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=containers@lists.linux-foundation.org \
--cc=fweisbec@gmail.com \
--cc=kamezawa.hiroyu@Jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@lists.linux-foundation.org \
--cc=lizf@cn.fujitsu.com \
--cc=matthltc@us.ibm.com \
--cc=oleg@redhat.com \
--cc=paul@paulmenage.org \
--cc=rjw@sisk.pl \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.