From: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Dave Jones <davej-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Linux Kernel
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Subject: [PATCH] do not abuse ->cred_guard_mutex in threadgroup_lock()
Date: Thu, 21 Mar 2013 17:21:38 +0100 [thread overview]
Message-ID: <20130321162138.GA21859@redhat.com> (raw)
In-Reply-To: <20130309200046.GA8149-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
threadgroup_lock() takes signal->cred_guard_mutex to ensure that
thread_group_leader() is stable. This doesn't look nice, the scope
of this lock in do_execve() is huge.
And as Dave pointed out this can lead to deadlock, we have the
following dependencies:
do_execve: cred_guard_mutex -> i_mutex
cgroup_mount: i_mutex -> cgroup_mutex
attach_task_by_pid: cgroup_mutex -> cred_guard_mutex
Change de_thread() to take threadgroup_change_begin() around the
switch-the-leader code and change threadgroup_lock() to avoid
->cred_guard_mutex.
Note that de_thread() can't sleep with ->group_rwsem held, this
can obviously deadlock with the exiting leader if the writer is
active, so it does threadgroup_change_end() before schedule().
Reported-by: Dave Jones <davej-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Signed-off-by: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
fs/exec.c | 3 +++
include/linux/sched.h | 18 ++++--------------
2 files changed, 7 insertions(+), 14 deletions(-)
diff --git a/fs/exec.c b/fs/exec.c
index 20df02c..bea2f7d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -898,11 +898,13 @@ static int de_thread(struct task_struct *tsk)
sig->notify_count = -1; /* for exit_notify() */
for (;;) {
+ threadgroup_change_begin(tsk);
write_lock_irq(&tasklist_lock);
if (likely(leader->exit_state))
break;
__set_current_state(TASK_KILLABLE);
write_unlock_irq(&tasklist_lock);
+ threadgroup_change_end(tsk);
schedule();
if (unlikely(__fatal_signal_pending(tsk)))
goto killed;
@@ -960,6 +962,7 @@ static int de_thread(struct task_struct *tsk)
if (unlikely(leader->ptrace))
__wake_up_parent(leader, leader->parent);
write_unlock_irq(&tasklist_lock);
+ threadgroup_change_end(tsk);
release_task(leader);
}
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 932a90c..67cfdb5 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2486,27 +2486,18 @@ static inline void threadgroup_change_end(struct task_struct *tsk)
*
* Lock the threadgroup @tsk belongs to. No new task is allowed to enter
* and member tasks aren't allowed to exit (as indicated by PF_EXITING) or
- * perform exec. This is useful for cases where the threadgroup needs to
- * stay stable across blockable operations.
+ * change ->group_leader/pid. This is useful for cases where the threadgroup
+ * needs to stay stable across blockable operations.
*
* fork and exit paths explicitly call threadgroup_change_{begin|end}() for
* synchronization. While held, no new task will be added to threadgroup
* and no existing live task will have its PF_EXITING set.
*
- * During exec, a task goes and puts its thread group through unusual
- * changes. After de-threading, exclusive access is assumed to resources
- * which are usually shared by tasks in the same group - e.g. sighand may
- * be replaced with a new one. Also, the exec'ing task takes over group
- * leader role including its pid. Exclude these changes while locked by
- * grabbing cred_guard_mutex which is used to synchronize exec path.
+ * de_thread() does threadgroup_change_{begin|end}() when a non-leader
+ * sub-thread becomes a new leader.
*/
static inline void threadgroup_lock(struct task_struct *tsk)
{
- /*
- * exec uses exit for de-threading nesting group_rwsem inside
- * cred_guard_mutex. Grab cred_guard_mutex first.
- */
- mutex_lock(&tsk->signal->cred_guard_mutex);
down_write(&tsk->signal->group_rwsem);
}
@@ -2519,7 +2510,6 @@ static inline void threadgroup_lock(struct task_struct *tsk)
static inline void threadgroup_unlock(struct task_struct *tsk)
{
up_write(&tsk->signal->group_rwsem);
- mutex_unlock(&tsk->signal->cred_guard_mutex);
}
#else
static inline void threadgroup_change_begin(struct task_struct *tsk) {}
--
1.5.5.1
next prev parent reply other threads:[~2013-03-21 16:21 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20130306223657.GA7392@redhat.com>
[not found] ` <20130306223657.GA7392-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-07 17:25 ` lockdep trace from prepare_bprm_creds Oleg Nesterov
2013-03-07 18:01 ` Tejun Heo
[not found] ` <20130307180139.GD29601-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-03-07 18:03 ` Tejun Heo
[not found] ` <20130307180332.GE29601-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-03-07 19:12 ` Oleg Nesterov
[not found] ` <20130307191242.GA18265-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-07 19:38 ` Tejun Heo
[not found] ` <20130307193820.GB3209-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-03-09 2:11 ` Li Zefan
[not found] ` <513A9A67.60909-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-03-09 3:29 ` Tejun Heo
[not found] ` <20130309032936.GT14556-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2013-03-09 7:47 ` Li Zefan
[not found] ` <513AE918.7020704-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-03-09 20:00 ` [PATCH 0/1] do not abuse ->cred_guard_mutex in threadgroup_lock() Oleg Nesterov
2013-03-09 20:01 ` [PATCH 1/1] " Oleg Nesterov
[not found] ` <20130309200106.GB8149-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-09 20:15 ` Tejun Heo
2013-03-11 1:50 ` Li Zefan
[not found] ` <20130309200046.GA8149-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-21 16:21 ` Oleg Nesterov [this message]
[not found] ` <20130321162138.GA21859-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-21 22:06 ` [PATCH] " Andrew Morton
[not found] ` <20130321150626.a7934d989fb80d835fa92255-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2013-03-22 13:20 ` Oleg Nesterov
2013-03-19 22:02 ` [PATCH cgroup/for-3.10] cgroup: make cgroup_mutex outer to threadgroup_lock Tejun Heo
[not found] ` <20130319220246.GR3042-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-03-20 0:58 ` Li Zefan
2013-03-20 15:03 ` Tejun Heo
[not found] ` <20130320150351.GW3042-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-03-20 18:35 ` Oleg Nesterov
[not found] ` <20130320183523.GA29365-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-20 18:42 ` Tejun Heo
[not found] ` <CAOS58YPxGXt+iq1GZ4hryqm1Z_p+r7eRRC7ruUDDd=LQrWtAxg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-03-21 16:17 ` Oleg Nesterov
2013-03-07 18:21 ` lockdep trace from prepare_bprm_creds Tejun Heo
[not found] ` <20130307182140.GF29601-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-03-07 18:32 ` Oleg Nesterov
[not found] ` <20130307183213.GA18022-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-07 19:33 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130321162138.GA21859@redhat.com \
--to=oleg-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
--cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
--cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=davej-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org \
--cc=tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).