* [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
@ 2009-09-03 16:05 Oleg Nesterov
2009-09-03 20:09 ` Roland McGrath
2009-09-04 8:39 ` [PATCH 1/1] " David Howells
0 siblings, 2 replies; 16+ messages in thread
From: Oleg Nesterov @ 2009-09-03 16:05 UTC (permalink / raw)
To: Andrew Morton, Linus Torvalds
Cc: David Howells, James Morris, Roland McGrath, Tom Horsley,
linux-kernel, stable
Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since
"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
04b836cbf19e885f8366bccb2e4b0474346c02d
commit in 2.6.31.
But I strongly believe we should blame another patch
"CRED: Make execve() take advantage of copy-on-write credentials"
a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
The tracee must not sleep in TASK_TRACED holding this mutex (it was named
cred_exec_mutex). Even if we remove ->cred_guard_mutex from mm_for_maps()
and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not
hang until it is killed or the tracee resumes.
With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:
- introduce prepare_bprm_creds() helper, it locks the mutex
and calls prepare_exec_creds() to initialize bprm->cred.
- install_exec_creds() drops the mutex after commit_creds(),
and thus before tracehook_report_exec()->ptrace_stop().
or, if exec fails,
free_bprm() drops this mutex when bprm->cred != NULL which
indicates install_exec_creds() was not called.
Reported-by: Tom Horsley <tom.horsley@att.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/exec.c | 49 +++++++++++++++++++++++++++---------------------
fs/compat.c | 17 +++-------------
include/linux/binfmts.h | 3 +-
3 files changed, 34 insertions(+), 35 deletions(-)
--- WAIT/fs/exec.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/exec.c 2009-09-03 17:37:54.000000000 +0200
@@ -1017,6 +1017,29 @@ out:
EXPORT_SYMBOL(flush_old_exec);
+int prepare_bprm_creds(struct linux_binprm *bprm)
+{
+ if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ return -ERESTARTNOINTR;
+
+ bprm->cred = prepare_exec_creds();
+ if (likely(bprm->cred))
+ return 0;
+
+ mutex_unlock(¤t->cred_guard_mutex);
+ return -ENOMEM;
+}
+
+void free_bprm(struct linux_binprm *bprm)
+{
+ free_arg_pages(bprm);
+ if (bprm->cred) {
+ mutex_unlock(¤t->cred_guard_mutex);
+ abort_creds(bprm->cred);
+ }
+ kfree(bprm);
+}
+
/*
* install the new credentials for this executable
*/
@@ -1032,6 +1055,7 @@ void install_exec_creds(struct linux_bin
* credentials; any time after this it may be unlocked */
security_bprm_committed_creds(bprm);
+ mutex_unlock(¤t->cred_guard_mutex);
}
EXPORT_SYMBOL(install_exec_creds);
@@ -1248,14 +1272,6 @@ int search_binary_handler(struct linux_b
EXPORT_SYMBOL(search_binary_handler);
-void free_bprm(struct linux_binprm *bprm)
-{
- free_arg_pages(bprm);
- if (bprm->cred)
- abort_creds(bprm->cred);
- kfree(bprm);
-}
-
/*
* sys_execve() executes a new program.
*/
@@ -1279,20 +1295,15 @@ int do_execve(char * filename,
if (!bprm)
goto out_files;
- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;
retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;
file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1342,7 +1353,6 @@ int do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1362,10 +1372,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
out_free:
free_bprm(bprm);
--- WAIT/fs/compat.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/compat.c 2009-09-03 16:42:17.000000000 +0200
@@ -1486,20 +1486,15 @@ int compat_do_execve(char * filename,
if (!bprm)
goto out_files;
- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;
retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;
file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1548,7 +1543,6 @@ int compat_do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1568,10 +1562,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
out_free:
free_bprm(bprm);
--- WAIT/include/linux/binfmts.h~CRED_MUTEX_EEEVENT_EXEC 2009-07-24 19:02:19.000000000 +0200
+++ WAIT/include/linux/binfmts.h 2009-09-03 17:37:04.000000000 +0200
@@ -117,9 +117,10 @@ extern int setup_arg_pages(struct linux_
int executable_stack);
extern int bprm_mm_init(struct linux_binprm *bprm);
extern int copy_strings_kernel(int argc,char ** argv,struct linux_binprm *bprm);
-extern void install_exec_creds(struct linux_binprm *bprm);
extern void do_coredump(long signr, int exit_code, struct pt_regs *regs);
extern void set_binfmt(struct linux_binfmt *new);
+extern int prepare_bprm_creds(struct linux_binprm *bprm);
+extern void install_exec_creds(struct linux_binprm *bprm);
extern void free_bprm(struct linux_binprm *);
#endif /* __KERNEL__ */
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-03 16:05 [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex Oleg Nesterov
@ 2009-09-03 20:09 ` Roland McGrath
2009-09-04 8:43 ` David Howells
2009-09-04 13:39 ` Oleg Nesterov
2009-09-04 8:39 ` [PATCH 1/1] " David Howells
1 sibling, 2 replies; 16+ messages in thread
From: Roland McGrath @ 2009-09-03 20:09 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Linus Torvalds, David Howells, James Morris,
Tom Horsley, linux-kernel, stable
I certainly think it's right to hold the mutex only as long as necessary.
Clearly holding it when we stop is wrong.
I'm a bit concerned about holding it for arbitrary periods while we block
in the filesystem code. e.g., consider the scenario with a hangs-forever
NFS server or suchlike. But I'm not sure there is a reasonable way around
that one.
The paired calls that leave the mutex locked in between should have some
clear comments calling attention to their pairing. Aside from that making
sure that subtlety is clear, I don't see any problems in the patch off hand.
But I haven't scoured the code path lately to have full confidence.
I'd like to hear David's reactions.
Thanks,
Roland
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-03 16:05 [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex Oleg Nesterov
2009-09-03 20:09 ` Roland McGrath
@ 2009-09-04 8:39 ` David Howells
2009-09-04 9:24 ` Roland McGrath
2009-09-04 12:46 ` Oleg Nesterov
1 sibling, 2 replies; 16+ messages in thread
From: David Howells @ 2009-09-04 8:39 UTC (permalink / raw)
To: Oleg Nesterov
Cc: dhowells, Andrew Morton, Linus Torvalds, James Morris,
Roland McGrath, Tom Horsley, linux-kernel, stable
Oleg Nesterov <oleg@redhat.com> wrote:
> But I strongly believe we should blame another patch
>
> "CRED: Make execve() take advantage of copy-on-write credentials"
> a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
>
> The tracee must not sleep in TASK_TRACED holding this mutex (it was named
> cred_exec_mutex). Even if we remove ->cred_guard_mutex from mm_for_maps()
> and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not
> hang until it is killed or the tracee resumes.
Ummm... I don't see how this is relevant.
Yes, a task must not sleep in TASK_TRACED if it is holding this mutex, but how
does that apply to do_exec(), mm_for_maps() or proc_pid_attr_write()? A
process can't be in any of those three if it is in the TASK_TRACED state.
A process can only be in the TASK_TRACED state in one of two ways: its parent
moved it there from the TASK_STOPPED state, or it put itself in that state -
neither of which should apply here.
Apart from that, I've no objection to dropping the guard semaphore earlier.
I do think though, the problem is elsewhere. Are we failing to unlock the
semaphore somewhere? Or double locking it, I wonder? Has Tom tried lockdep?
Btw, should mm_for_maps() use mutex_lock_interruptible()? There doesn't seem
any point making it non-interruptible (except for kill signals) - unless that
would muck up seqfile handling.
> +int prepare_bprm_creds(struct linux_binprm *bprm)
> +{
__acquires()
> +void free_bprm(struct linux_binprm *bprm)
> +{
__releases()
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-03 20:09 ` Roland McGrath
@ 2009-09-04 8:43 ` David Howells
2009-09-04 13:39 ` Oleg Nesterov
1 sibling, 0 replies; 16+ messages in thread
From: David Howells @ 2009-09-04 8:43 UTC (permalink / raw)
To: Roland McGrath
Cc: dhowells, Oleg Nesterov, Andrew Morton, Linus Torvalds,
James Morris, Tom Horsley, linux-kernel, stable
Roland McGrath <roland@redhat.com> wrote:
> I certainly think it's right to hold the mutex only as long as necessary.
> Clearly holding it when we stop is wrong.
As far as I can tell, we don't hold it when we stop for debugging, or stop on
signal.
> I'm a bit concerned about holding it for arbitrary periods while we block
> in the filesystem code. e.g., consider the scenario with a hangs-forever
> NFS server or suchlike. But I'm not sure there is a reasonable way around
> that one.
If you drop the sem before committing the creds, you have to recalculate the
new credentials.
> The paired calls that leave the mutex locked in between should have some
> clear comments calling attention to their pairing. Aside from that making
> sure that subtlety is clear, I don't see any problems in the patch off hand.
> But I haven't scoured the code path lately to have full confidence.
> I'd like to hear David's reactions.
Looking at the patch description, I don't see how the patch it relevant to the
problem. There must be something else, either a call that's now being
skipped, or it's a matter of timing.
David
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 8:39 ` [PATCH 1/1] " David Howells
@ 2009-09-04 9:24 ` Roland McGrath
2009-09-04 12:46 ` Oleg Nesterov
1 sibling, 0 replies; 16+ messages in thread
From: Roland McGrath @ 2009-09-04 9:24 UTC (permalink / raw)
To: David Howells
Cc: Oleg Nesterov, Andrew Morton, Linus Torvalds, James Morris,
Tom Horsley, linux-kernel, stable
> > +int prepare_bprm_creds(struct linux_binprm *bprm)
> > +{
>
> __acquires()
>
> > +void free_bprm(struct linux_binprm *bprm)
> > +{
>
> __releases()
No, they're conditional.
Thanks,
Roland
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 8:39 ` [PATCH 1/1] " David Howells
2009-09-04 9:24 ` Roland McGrath
@ 2009-09-04 12:46 ` Oleg Nesterov
2009-09-04 13:39 ` David Howells
1 sibling, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2009-09-04 12:46 UTC (permalink / raw)
To: David Howells
Cc: Andrew Morton, Linus Torvalds, James Morris, Roland McGrath,
Tom Horsley, linux-kernel, stable
On 09/04, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > But I strongly believe we should blame another patch
> >
> > "CRED: Make execve() take advantage of copy-on-write credentials"
> > a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
> >
> > The tracee must not sleep in TASK_TRACED holding this mutex (it was named
> > cred_exec_mutex). Even if we remove ->cred_guard_mutex from mm_for_maps()
> > and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not
> > hang until it is killed or the tracee resumes.
(Argh. Sorry David, the changelog should have mentioned tracehook_report_exec()
more explicitely).
So, David, do you agree with this patch? Do you think it can go to 2.6.31 ?
> Btw, should mm_for_maps() use mutex_lock_interruptible()? There doesn't seem
> any point making it non-interruptible (except for kill signals) - unless that
> would muck up seqfile handling.
Perhaps, but we should change m_start() first, it should check IS_ERR() instead
of mm != NULL. Afaics, vfs_read()->seq_read() path will return ERESTART...
correctly.
I am not sure would be right though, short reads can confuse user space. And
this can't solve the problem, this only helps to react to signals.
Oleg.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 12:46 ` Oleg Nesterov
@ 2009-09-04 13:39 ` David Howells
2009-09-04 13:55 ` Oleg Nesterov
0 siblings, 1 reply; 16+ messages in thread
From: David Howells @ 2009-09-04 13:39 UTC (permalink / raw)
To: Oleg Nesterov
Cc: dhowells, Andrew Morton, Linus Torvalds, James Morris,
Roland McGrath, Tom Horsley, linux-kernel, stable
Oleg Nesterov <oleg@redhat.com> wrote:
> (Argh. Sorry David, the changelog should have mentioned
> tracehook_report_exec() more explicitely).
>
> So, David, do you agree with this patch? Do you think it can go to 2.6.31 ?
Okay, add my Acked-by. Can I recommend you mention tracehook_report_exec() in
the changelog and then push it to Linus again?
David
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-03 20:09 ` Roland McGrath
2009-09-04 8:43 ` David Howells
@ 2009-09-04 13:39 ` Oleg Nesterov
2009-09-04 14:47 ` David Howells
1 sibling, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2009-09-04 13:39 UTC (permalink / raw)
To: Roland McGrath
Cc: Andrew Morton, Linus Torvalds, David Howells, James Morris,
Tom Horsley, linux-kernel, stable
On 09/03, Roland McGrath wrote:
>
> The paired calls that leave the mutex locked in between should have some
> clear comments calling attention to their pairing.
Agreed. Please see the same patch + some comments below.
------------------------------------------------------------------------------
[PATCH] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since
"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
04b836cbf19e885f8366bccb2e4b0474346c02d
commit in 2.6.31.
But I strongly believe we should blame another patch
"CRED: Make execve() take advantage of copy-on-write credentials"
a6f76f23d297f70e2a6b3ec607f7aeeea9e37e8d
The tracee must not sleep in TASK_TRACED holding this mutex (it was named
cred_exec_mutex). Even if we remove ->cred_guard_mutex from mm_for_maps()
and proc_pid_attr_write(), another task doing PTRACE_ATTACH should not
hang until it is killed or the tracee resumes.
With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:
- introduce prepare_bprm_creds() helper, it locks the mutex
and calls prepare_exec_creds() to initialize bprm->cred.
- install_exec_creds() drops the mutex after commit_creds(),
and thus before tracehook_report_exec()->ptrace_stop().
or, if exec fails,
free_bprm() drops this mutex when bprm->cred != NULL which
indicates install_exec_creds() was not called.
Reported-by: Tom Horsley <tom.horsley@att.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
--- WAIT/include/linux/binfmts.h~CRED_MUTEX_EEEVENT_EXEC 2009-07-24 19:02:19.000000000 +0200
+++ WAIT/include/linux/binfmts.h 2009-09-03 17:37:04.000000000 +0200
@@ -117,9 +117,10 @@ extern int setup_arg_pages(struct linux_
int executable_stack);
extern int bprm_mm_init(struct linux_binprm *bprm);
extern int copy_strings_kernel(int argc,char ** argv,struct linux_binprm *bprm);
-extern void install_exec_creds(struct linux_binprm *bprm);
extern void do_coredump(long signr, int exit_code, struct pt_regs *regs);
extern void set_binfmt(struct linux_binfmt *new);
+extern int prepare_bprm_creds(struct linux_binprm *bprm);
+extern void install_exec_creds(struct linux_binprm *bprm);
extern void free_bprm(struct linux_binprm *);
#endif /* __KERNEL__ */
--- WAIT/fs/exec.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/exec.c 2009-09-04 15:35:52.000000000 +0200
@@ -1018,6 +1018,35 @@ out:
EXPORT_SYMBOL(flush_old_exec);
/*
+ * Prepare credentials and lock ->cred_guard_mutex.
+ * install_exec_creds() commits the new creds and drops the lock.
+ * Or, if exec fails before, free_bprm() should release ->cred and
+ * and unlock.
+ */
+int prepare_bprm_creds(struct linux_binprm *bprm)
+{
+ if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ return -ERESTARTNOINTR;
+
+ bprm->cred = prepare_exec_creds();
+ if (likely(bprm->cred))
+ return 0;
+
+ mutex_unlock(¤t->cred_guard_mutex);
+ return -ENOMEM;
+}
+
+void free_bprm(struct linux_binprm *bprm)
+{
+ free_arg_pages(bprm);
+ if (bprm->cred) {
+ mutex_unlock(¤t->cred_guard_mutex);
+ abort_creds(bprm->cred);
+ }
+ kfree(bprm);
+}
+
+/*
* install the new credentials for this executable
*/
void install_exec_creds(struct linux_binprm *bprm)
@@ -1026,12 +1055,9 @@ void install_exec_creds(struct linux_bin
commit_creds(bprm->cred);
bprm->cred = NULL;
-
- /* cred_guard_mutex must be held at least to this point to prevent
- * ptrace_attach() from altering our determination of the task's
- * credentials; any time after this it may be unlocked */
-
security_bprm_committed_creds(bprm);
+
+ mutex_unlock(¤t->cred_guard_mutex);
}
EXPORT_SYMBOL(install_exec_creds);
@@ -1248,14 +1274,6 @@ int search_binary_handler(struct linux_b
EXPORT_SYMBOL(search_binary_handler);
-void free_bprm(struct linux_binprm *bprm)
-{
- free_arg_pages(bprm);
- if (bprm->cred)
- abort_creds(bprm->cred);
- kfree(bprm);
-}
-
/*
* sys_execve() executes a new program.
*/
@@ -1279,20 +1297,15 @@ int do_execve(char * filename,
if (!bprm)
goto out_files;
- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;
retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;
file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1342,7 +1355,6 @@ int do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1362,10 +1374,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
out_free:
free_bprm(bprm);
--- WAIT/fs/compat.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/compat.c 2009-09-03 16:42:17.000000000 +0200
@@ -1486,20 +1486,15 @@ int compat_do_execve(char * filename,
if (!bprm)
goto out_files;
- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;
retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;
file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1548,7 +1543,6 @@ int compat_do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1568,10 +1562,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
out_free:
free_bprm(bprm);
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 13:39 ` David Howells
@ 2009-09-04 13:55 ` Oleg Nesterov
0 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2009-09-04 13:55 UTC (permalink / raw)
To: David Howells
Cc: Andrew Morton, Linus Torvalds, James Morris, Roland McGrath,
Tom Horsley, linux-kernel, stable
On 09/04, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > (Argh. Sorry David, the changelog should have mentioned
> > tracehook_report_exec() more explicitely).
> >
> > So, David, do you agree with this patch? Do you think it can go to 2.6.31 ?
>
> Okay, add my Acked-by.
Great.
> Can I recommend you mention tracehook_report_exec() in
> the changelog and then push it to Linus again?
OK, will do.
I just sent v2 (the same patch + a bit of comments). Please tell me
if you disagree with the comments.
Oleg.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 13:39 ` Oleg Nesterov
@ 2009-09-04 14:47 ` David Howells
2009-09-04 15:49 ` Oleg Nesterov
0 siblings, 1 reply; 16+ messages in thread
From: David Howells @ 2009-09-04 14:47 UTC (permalink / raw)
To: Oleg Nesterov
Cc: dhowells, Roland McGrath, Andrew Morton, Linus Torvalds,
James Morris, Tom Horsley, linux-kernel, stable
Oleg Nesterov <oleg@redhat.com> wrote:
> - install_exec_creds() drops the mutex after commit_creds(),
> and thus before tracehook_report_exec()->ptrace_stop().
Thanks!
> -
> - /* cred_guard_mutex must be held at least to this point to prevent
> - * ptrace_attach() from altering our determination of the task's
> - * credentials; any time after this it may be unlocked */
> -
Why are you removing this comment, btw? In what way does your patch
invalidate it?
David
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 14:47 ` David Howells
@ 2009-09-04 15:49 ` Oleg Nesterov
2009-09-04 17:26 ` [PATCH v3] " Oleg Nesterov
0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2009-09-04 15:49 UTC (permalink / raw)
To: David Howells
Cc: Roland McGrath, Andrew Morton, Linus Torvalds, James Morris,
Tom Horsley, linux-kernel, stable
On 09/04, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > -
> > - /* cred_guard_mutex must be held at least to this point to prevent
> > - * ptrace_attach() from altering our determination of the task's
> > - * credentials; any time after this it may be unlocked */
> > -
>
> Why are you removing this comment, btw? In what way does your patch
> invalidate it?
No, the comment is fine. Except, now we have more users of this mutex,
but it only mentions ptrace_attach(). Tried to improve it, but failed
to make the concise and understandable comment,
I tried to collect all comments aboout cred_guard_mutex in one place,
above prepare_bprm_creds()... OK, agreed. Will revert this bit and
resend.
Oleg.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 15:49 ` Oleg Nesterov
@ 2009-09-04 17:26 ` Oleg Nesterov
2009-09-04 19:42 ` Andrew Morton
0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2009-09-04 17:26 UTC (permalink / raw)
To: Andrew Morton, Linus Torvalds
Cc: David Howells, Roland McGrath, James Morris, Tom Horsley,
linux-kernel, stable
Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since
"mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
04b836cbf19e885f8366bccb2e4b0474346c02d
commit in 2.6.31.
But the root of the problem lies in the fact that do_execve() path calls
tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.
The tracee must not sleep in TASK_TRACED holding this mutex. Even if we
remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
another task doing PTRACE_ATTACH should not hang until it is killed or
the tracee resumes.
With this patch do_execve() does not use ->cred_guard_mutex directly and
we do not hold it throughout, instead:
- introduce prepare_bprm_creds() helper, it locks the mutex
and calls prepare_exec_creds() to initialize bprm->cred.
- install_exec_creds() drops the mutex after commit_creds(),
and thus before tracehook_report_exec()->ptrace_stop().
or, if exec fails,
free_bprm() drops this mutex when bprm->cred != NULL which
indicates install_exec_creds() was not called.
Reported-by: Tom Horsley <tom.horsley@att.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
---
include/linux/binfmts.h | 3 +-
fs/exec.c | 63 ++++++++++++++++++++++++++++--------------------
fs/compat.c | 17 +++---------
3 files changed, 44 insertions(+), 39 deletions(-)
--- WAIT/include/linux/binfmts.h~CRED_MUTEX_EEEVENT_EXEC 2009-07-24 19:02:19.000000000 +0200
+++ WAIT/include/linux/binfmts.h 2009-09-03 17:37:04.000000000 +0200
@@ -117,9 +117,10 @@ extern int setup_arg_pages(struct linux_
int executable_stack);
extern int bprm_mm_init(struct linux_binprm *bprm);
extern int copy_strings_kernel(int argc,char ** argv,struct linux_binprm *bprm);
-extern void install_exec_creds(struct linux_binprm *bprm);
extern void do_coredump(long signr, int exit_code, struct pt_regs *regs);
extern void set_binfmt(struct linux_binfmt *new);
+extern int prepare_bprm_creds(struct linux_binprm *bprm);
+extern void install_exec_creds(struct linux_binprm *bprm);
extern void free_bprm(struct linux_binprm *);
#endif /* __KERNEL__ */
--- WAIT/fs/exec.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/exec.c 2009-09-04 19:09:14.000000000 +0200
@@ -1018,6 +1018,35 @@ out:
EXPORT_SYMBOL(flush_old_exec);
/*
+ * Prepare credentials and lock ->cred_guard_mutex.
+ * install_exec_creds() commits the new creds and drops the lock.
+ * Or, if exec fails before, free_bprm() should release ->cred and
+ * and unlock.
+ */
+int prepare_bprm_creds(struct linux_binprm *bprm)
+{
+ if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ return -ERESTARTNOINTR;
+
+ bprm->cred = prepare_exec_creds();
+ if (likely(bprm->cred))
+ return 0;
+
+ mutex_unlock(¤t->cred_guard_mutex);
+ return -ENOMEM;
+}
+
+void free_bprm(struct linux_binprm *bprm)
+{
+ free_arg_pages(bprm);
+ if (bprm->cred) {
+ mutex_unlock(¤t->cred_guard_mutex);
+ abort_creds(bprm->cred);
+ }
+ kfree(bprm);
+}
+
+/*
* install the new credentials for this executable
*/
void install_exec_creds(struct linux_binprm *bprm)
@@ -1026,12 +1055,13 @@ void install_exec_creds(struct linux_bin
commit_creds(bprm->cred);
bprm->cred = NULL;
-
- /* cred_guard_mutex must be held at least to this point to prevent
+ /*
+ * cred_guard_mutex must be held at least to this point to prevent
* ptrace_attach() from altering our determination of the task's
- * credentials; any time after this it may be unlocked */
-
+ * credentials; any time after this it may be unlocked.
+ */
security_bprm_committed_creds(bprm);
+ mutex_unlock(¤t->cred_guard_mutex);
}
EXPORT_SYMBOL(install_exec_creds);
@@ -1248,14 +1278,6 @@ int search_binary_handler(struct linux_b
EXPORT_SYMBOL(search_binary_handler);
-void free_bprm(struct linux_binprm *bprm)
-{
- free_arg_pages(bprm);
- if (bprm->cred)
- abort_creds(bprm->cred);
- kfree(bprm);
-}
-
/*
* sys_execve() executes a new program.
*/
@@ -1279,20 +1301,15 @@ int do_execve(char * filename,
if (!bprm)
goto out_files;
- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;
retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;
file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1342,7 +1359,6 @@ int do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1362,10 +1378,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
out_free:
free_bprm(bprm);
--- WAIT/fs/compat.c~CRED_MUTEX_EEEVENT_EXEC 2009-09-03 15:31:08.000000000 +0200
+++ WAIT/fs/compat.c 2009-09-03 16:42:17.000000000 +0200
@@ -1486,20 +1486,15 @@ int compat_do_execve(char * filename,
if (!bprm)
goto out_files;
- retval = -ERESTARTNOINTR;
- if (mutex_lock_interruptible(¤t->cred_guard_mutex))
+ retval = prepare_bprm_creds(bprm);
+ if (retval)
goto out_free;
- current->in_execve = 1;
-
- retval = -ENOMEM;
- bprm->cred = prepare_exec_creds();
- if (!bprm->cred)
- goto out_unlock;
retval = check_unsafe_exec(bprm);
if (retval < 0)
- goto out_unlock;
+ goto out_free;
clear_in_exec = retval;
+ current->in_execve = 1;
file = open_exec(filename);
retval = PTR_ERR(file);
@@ -1548,7 +1543,6 @@ int compat_do_execve(char * filename,
/* execve succeeded */
current->fs->in_exec = 0;
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
acct_update_integrals(current);
free_bprm(bprm);
if (displaced)
@@ -1568,10 +1562,7 @@ out_file:
out_unmark:
if (clear_in_exec)
current->fs->in_exec = 0;
-
-out_unlock:
current->in_execve = 0;
- mutex_unlock(¤t->cred_guard_mutex);
out_free:
free_bprm(bprm);
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 17:26 ` [PATCH v3] " Oleg Nesterov
@ 2009-09-04 19:42 ` Andrew Morton
2009-09-04 21:33 ` Oleg Nesterov
0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2009-09-04 19:42 UTC (permalink / raw)
To: Oleg Nesterov
Cc: torvalds, dhowells, roland, jmorris, tom.horsley, linux-kernel,
stable
On Fri, 4 Sep 2009 19:26:48 +0200
Oleg Nesterov <oleg@redhat.com> wrote:
> Tom Horsley reports that his debugger hangs when it tries to read
> /proc/pid_of_tracee/maps, this happens since
>
> "mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
> 04b836cbf19e885f8366bccb2e4b0474346c02d
>
> commit in 2.6.31.
>
> But the root of the problem lies in the fact that do_execve() path calls
> tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.
>
> The tracee must not sleep in TASK_TRACED holding this mutex. Even if we
> remove ->cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
> another task doing PTRACE_ATTACH should not hang until it is killed or
> the tracee resumes.
>
> With this patch do_execve() does not use ->cred_guard_mutex directly and
> we do not hold it throughout, instead:
>
> - introduce prepare_bprm_creds() helper, it locks the mutex
> and calls prepare_exec_creds() to initialize bprm->cred.
>
> - install_exec_creds() drops the mutex after commit_creds(),
> and thus before tracehook_report_exec()->ptrace_stop().
>
> or, if exec fails,
>
> free_bprm() drops this mutex when bprm->cred != NULL which
> indicates install_exec_creds() was not called.
>
> Reported-by: Tom Horsley <tom.horsley@att.net>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Acked-by: David Howells <dhowells@redhat.com>
I get a reject in binfmts.h because your kernel has `extern void
set_binfmt' and mine has `extern int set_binfmt'.
Hopefully this patch works OK in mainline as well as in whatever kernel
you tested against!
I see a Cc:stable in the mail headers, but not in the changelog. I
don't think the patch is applicable to -stable unless we miss 2.6.31.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 19:42 ` Andrew Morton
@ 2009-09-04 21:33 ` Oleg Nesterov
2009-09-09 21:57 ` Chuck Ebbert
0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2009-09-04 21:33 UTC (permalink / raw)
To: Andrew Morton
Cc: torvalds, dhowells, roland, jmorris, tom.horsley, linux-kernel,
stable
On 09/04, Andrew Morton wrote:
>
> On Fri, 4 Sep 2009 19:26:48 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > Tom Horsley reports that his debugger hangs when it tries to read
> > /proc/pid_of_tracee/maps, this happens since
> >
> > "mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
> > 04b836cbf19e885f8366bccb2e4b0474346c02d
> >
> > commit in 2.6.31.
>
> I get a reject in binfmts.h because your kernel has `extern void
> set_binfmt' and mine has `extern int set_binfmt'.
Ah, I have exec-fix-set_binfmt-vs-sys_delete_module-race.patch from -mm
applied...
> Hopefully this patch works OK in mainline as well as in whatever kernel
> you tested against!
Yes I tried to test it ;)
> I see a Cc:stable in the mail headers, but not in the changelog. I
> don't think the patch is applicable to -stable unless we miss 2.6.31.
-stable has the same problems but I agree, it can live without this fix.
Oleg.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-04 21:33 ` Oleg Nesterov
@ 2009-09-09 21:57 ` Chuck Ebbert
2009-09-09 22:58 ` Oleg Nesterov
0 siblings, 1 reply; 16+ messages in thread
From: Chuck Ebbert @ 2009-09-09 21:57 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, torvalds, dhowells, roland, jmorris, tom.horsley,
linux-kernel, stable
On Fri, 4 Sep 2009 23:33:37 +0200
Oleg Nesterov <oleg@redhat.com> wrote:
> On 09/04, Andrew Morton wrote:
> >
> > On Fri, 4 Sep 2009 19:26:48 +0200
> > Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > Tom Horsley reports that his debugger hangs when it tries to read
> > > /proc/pid_of_tracee/maps, this happens since
> > >
> > > "mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
> > > 04b836cbf19e885f8366bccb2e4b0474346c02d
> > >
> > > commit in 2.6.31.
>
> > I see a Cc:stable in the mail headers, but not in the changelog. I
> > don't think the patch is applicable to -stable unless we miss 2.6.31.
>
> -stable has the same problems but I agree, it can live without this fix.
>
That patch was backported in 2.6.30.5 -- are you sure -stable doesn't need
this fix?
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex
2009-09-09 21:57 ` Chuck Ebbert
@ 2009-09-09 22:58 ` Oleg Nesterov
0 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2009-09-09 22:58 UTC (permalink / raw)
To: Chuck Ebbert
Cc: Andrew Morton, torvalds, dhowells, roland, jmorris, tom.horsley,
linux-kernel, stable
On 09/09, Chuck Ebbert wrote:
>
> On Fri, 4 Sep 2009 23:33:37 +0200
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > On 09/04, Andrew Morton wrote:
> > >
> > > On Fri, 4 Sep 2009 19:26:48 +0200
> > > Oleg Nesterov <oleg@redhat.com> wrote:
> > >
> > > > Tom Horsley reports that his debugger hangs when it tries to read
> > > > /proc/pid_of_tracee/maps, this happens since
> > > >
> > > > "mm_for_maps: take ->cred_guard_mutex to fix the race with exec"
> > > > 04b836cbf19e885f8366bccb2e4b0474346c02d
> > > >
> > > > commit in 2.6.31.
>
> >
> > > I see a Cc:stable in the mail headers, but not in the changelog. I
> > > don't think the patch is applicable to -stable unless we miss 2.6.31.
> >
> > -stable has the same problems but I agree, it can live without this fix.
> >
>
> That patch was backported in 2.6.30.5 -- are you sure -stable doesn't need
> this fix?
No, I am not sure, that is why I cc'ed -stable.
But otoh the problem is minor, both tracer and tracee can be killed.
I dunno. I don't know if gdb or any other "important" application
read /proc/pid/maps after PTRACE_EVENT_EXEC...
Oleg.
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2009-09-09 23:04 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-09-03 16:05 [PATCH 1/1] exec: do not sleep in TASK_TRACED under ->cred_guard_mutex Oleg Nesterov
2009-09-03 20:09 ` Roland McGrath
2009-09-04 8:43 ` David Howells
2009-09-04 13:39 ` Oleg Nesterov
2009-09-04 14:47 ` David Howells
2009-09-04 15:49 ` Oleg Nesterov
2009-09-04 17:26 ` [PATCH v3] " Oleg Nesterov
2009-09-04 19:42 ` Andrew Morton
2009-09-04 21:33 ` Oleg Nesterov
2009-09-09 21:57 ` Chuck Ebbert
2009-09-09 22:58 ` Oleg Nesterov
2009-09-04 8:39 ` [PATCH 1/1] " David Howells
2009-09-04 9:24 ` Roland McGrath
2009-09-04 12:46 ` Oleg Nesterov
2009-09-04 13:39 ` David Howells
2009-09-04 13:55 ` Oleg Nesterov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox