public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] in_exec/etc cleanups
@ 2013-11-22 17:54 Oleg Nesterov
  2013-11-22 17:54 ` [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread() Oleg Nesterov
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-22 17:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Kees Cook, KOSAKI Motohiro, Michal Hocko, Sameer Nanda,
	Sergey Dyasly, linux-kernel

Hello.

Initially I was going to make a single patch (1/4) as a preparation
for the next changes, but imho this code needs cleanups. The patches
do not depend on each other, they can be applied individually.

Oleg.

 fs/exec.c             |   53 +++++++++++++-----------------------------------
 include/linux/sched.h |    1 -
 kernel/fork.c         |    1 -
 kernel/sys.c          |    5 +--
 4 files changed, 17 insertions(+), 43 deletions(-)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread()
  2013-11-22 17:54 [PATCH 0/4] in_exec/etc cleanups Oleg Nesterov
@ 2013-11-22 17:54 ` Oleg Nesterov
  2013-11-22 19:42   ` KOSAKI Motohiro
  2013-11-22 17:54 ` [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic Oleg Nesterov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-22 17:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Kees Cook, KOSAKI Motohiro, Michal Hocko, Sameer Nanda,
	Sergey Dyasly, linux-kernel

next_thread() should be avoided, change check_unsafe_exec()
to use while_each_thread(). This also saves 32 bytes.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/exec.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 1dee8ef..0cd9c25 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1245,10 +1245,11 @@ static int check_unsafe_exec(struct linux_binprm *bprm)
 	if (current->no_new_privs)
 		bprm->unsafe |= LSM_UNSAFE_NO_NEW_PRIVS;
 
+	t = p;
 	n_fs = 1;
 	spin_lock(&p->fs->lock);
 	rcu_read_lock();
-	for (t = next_thread(p); t != p; t = next_thread(t)) {
+	while_each_thread(p, t) {
 		if (t->fs == p->fs)
 			n_fs++;
 	}
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic
  2013-11-22 17:54 [PATCH 0/4] in_exec/etc cleanups Oleg Nesterov
  2013-11-22 17:54 ` [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread() Oleg Nesterov
@ 2013-11-22 17:54 ` Oleg Nesterov
  2013-11-22 20:27   ` KOSAKI Motohiro
  2013-11-22 17:54 ` [PATCH 3/4] exec: move the final allow_write_access/fput into free_bprm() Oleg Nesterov
  2013-11-22 17:54 ` [PATCH 4/4] kill task_struct->did_exec Oleg Nesterov
  3 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-22 17:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Kees Cook, KOSAKI Motohiro, Michal Hocko, Sameer Nanda,
	Sergey Dyasly, linux-kernel

fs_struct->in_exec == T means that this ->fs is used by a single
process (thread group), and one of the treads does do_execve().

To avoid the mt-exec races this code has the following complications:

	1. check_unsafe_exec() returns -EBUSY if ->in_exec was
	   already set by another thread.

	2. do_execve_common() records "clear_in_exec" to ensure
	   that the error path can only clear ->in_exec if it was
	   set by current.

However, after 9b1bf12d5d51 "signals: move cred_guard_mutex from
task_struct to signal_struct" we do not need these complications:

	1. We can't race with our sub-thread, this is called under
	   per-process ->cred_guard_mutex. And we can't race with
	   another CLONE_FS task, we already checked that this fs
	   is not shared.

	   We can remove the  dead -EAGAIN logic.

	2. "out_unmark:" in do_execve_common() is either called
	   under ->cred_guard_mutex, or after de_thread() which
	   kills other threads, so we can't race with sub-thread
	   which could set ->in_exec. And if ->fs is shared with
	   another process ->in_exec should be false anyway.

	   We can clear in_exec unconditionally.

This also means that check_unsafe_exec() can be void.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/exec.c |   29 ++++++++---------------------
 1 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 0cd9c25..60eb5c5 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1225,11 +1225,10 @@ EXPORT_SYMBOL(install_exec_creds);
  * - the caller must hold ->cred_guard_mutex to protect against
  *   PTRACE_ATTACH
  */
-static int check_unsafe_exec(struct linux_binprm *bprm)
+static void check_unsafe_exec(struct linux_binprm *bprm)
 {
 	struct task_struct *p = current, *t;
 	unsigned n_fs;
-	int res = 0;
 
 	if (p->ptrace) {
 		if (p->ptrace & PT_PTRACE_CAP)
@@ -1255,22 +1254,15 @@ static int check_unsafe_exec(struct linux_binprm *bprm)
 	}
 	rcu_read_unlock();
 
-	if (p->fs->users > n_fs) {
+	if (p->fs->users > n_fs)
 		bprm->unsafe |= LSM_UNSAFE_SHARE;
-	} else {
-		res = -EAGAIN;
-		if (!p->fs->in_exec) {
-			p->fs->in_exec = 1;
-			res = 1;
-		}
-	}
+	else
+		p->fs->in_exec = 1;
 	spin_unlock(&p->fs->lock);
-
-	return res;
 }
 
-/* 
- * Fill the binprm structure from the inode. 
+/*
+ * Fill the binprm structure from the inode.
  * Check permissions, then read the first 128 (BINPRM_BUF_SIZE) bytes
  *
  * This may be called multiple times for binary chains (scripts for example).
@@ -1461,7 +1453,6 @@ static int do_execve_common(const char *filename,
 	struct linux_binprm *bprm;
 	struct file *file;
 	struct files_struct *displaced;
-	bool clear_in_exec;
 	int retval;
 
 	/*
@@ -1493,10 +1484,7 @@ static int do_execve_common(const char *filename,
 	if (retval)
 		goto out_free;
 
-	retval = check_unsafe_exec(bprm);
-	if (retval < 0)
-		goto out_free;
-	clear_in_exec = retval;
+	check_unsafe_exec(bprm);
 	current->in_execve = 1;
 
 	file = open_exec(filename);
@@ -1565,8 +1553,7 @@ out_file:
 	}
 
 out_unmark:
-	if (clear_in_exec)
-		current->fs->in_exec = 0;
+	current->fs->in_exec = 0;
 	current->in_execve = 0;
 
 out_free:
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] exec: move the final allow_write_access/fput into free_bprm()
  2013-11-22 17:54 [PATCH 0/4] in_exec/etc cleanups Oleg Nesterov
  2013-11-22 17:54 ` [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread() Oleg Nesterov
  2013-11-22 17:54 ` [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic Oleg Nesterov
@ 2013-11-22 17:54 ` Oleg Nesterov
  2013-11-22 20:29   ` KOSAKI Motohiro
  2013-11-23 19:22   ` Kees Cook
  2013-11-22 17:54 ` [PATCH 4/4] kill task_struct->did_exec Oleg Nesterov
  3 siblings, 2 replies; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-22 17:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Kees Cook, KOSAKI Motohiro, Michal Hocko, Sameer Nanda,
	Sergey Dyasly, linux-kernel

Both success/failure paths cleanup bprm->file, we can move this
code into free_bprm() to simlify and cleanup this logic.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/exec.c |   20 +++++---------------
 1 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 60eb5c5..9944bbf 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1140,9 +1140,7 @@ void setup_new_exec(struct linux_binprm * bprm)
 
 	/* An exec changes our domain. We are no longer part of the thread
 	   group */
-
 	current->self_exec_id++;
-			
 	flush_signal_handlers(current, 0);
 	do_close_on_exec(current->files);
 }
@@ -1174,6 +1172,10 @@ void free_bprm(struct linux_binprm *bprm)
 		mutex_unlock(&current->signal->cred_guard_mutex);
 		abort_creds(bprm->cred);
 	}
+	if (bprm->file) {
+		allow_write_access(bprm->file);
+		fput(bprm->file);
+	}
 	/* If a binfmt changed the interp, free it. */
 	if (bprm->interp != bprm->filename)
 		kfree(bprm->interp);
@@ -1432,12 +1434,6 @@ static int exec_binprm(struct linux_binprm *bprm)
 		ptrace_event(PTRACE_EVENT_EXEC, old_vpid);
 		current->did_exec = 1;
 		proc_exec_connector(current);
-
-		if (bprm->file) {
-			allow_write_access(bprm->file);
-			fput(bprm->file);
-			bprm->file = NULL; /* to catch use-after-free */
-		}
 	}
 
 	return ret;
@@ -1500,7 +1496,7 @@ static int do_execve_common(const char *filename,
 
 	retval = bprm_mm_init(bprm);
 	if (retval)
-		goto out_file;
+		goto out_unmark;
 
 	bprm->argc = count(argv, MAX_ARG_STRINGS);
 	if ((retval = bprm->argc) < 0)
@@ -1546,12 +1542,6 @@ out:
 		mmput(bprm->mm);
 	}
 
-out_file:
-	if (bprm->file) {
-		allow_write_access(bprm->file);
-		fput(bprm->file);
-	}
-
 out_unmark:
 	current->fs->in_exec = 0;
 	current->in_execve = 0;
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] kill task_struct->did_exec
  2013-11-22 17:54 [PATCH 0/4] in_exec/etc cleanups Oleg Nesterov
                   ` (2 preceding siblings ...)
  2013-11-22 17:54 ` [PATCH 3/4] exec: move the final allow_write_access/fput into free_bprm() Oleg Nesterov
@ 2013-11-22 17:54 ` Oleg Nesterov
  2013-11-22 19:46   ` KOSAKI Motohiro
  3 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-22 17:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Al Viro, Kees Cook, KOSAKI Motohiro, Michal Hocko, Sameer Nanda,
	Sergey Dyasly, linux-kernel

We can kill either task->did_exec or PF_FORKNOEXEC, they are
mutually exclusive. The patch kill ->did_exec because it has
a single user.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/exec.c             |    1 -
 include/linux/sched.h |    1 -
 kernel/fork.c         |    1 -
 kernel/sys.c          |    5 ++---
 4 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 9944bbf..380640f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1432,7 +1432,6 @@ static int exec_binprm(struct linux_binprm *bprm)
 	if (ret >= 0) {
 		trace_sched_process_exec(current, old_pid, bprm);
 		ptrace_event(PTRACE_EVENT_EXEC, old_vpid);
-		current->did_exec = 1;
 		proc_exec_connector(current);
 	}
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 34c1903..4f1958e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1114,7 +1114,6 @@ struct task_struct {
 	/* Used for emulating ABI behavior of previous Linux versions */
 	unsigned int personality;
 
-	unsigned did_exec:1;
 	unsigned in_execve:1;	/* Tell the LSMs that the process is doing an
 				 * execve */
 	unsigned in_iowait:1;
diff --git a/kernel/fork.c b/kernel/fork.c
index 2cb6024..7dbf504 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1224,7 +1224,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	if (!try_module_get(task_thread_info(p)->exec_domain->module))
 		goto bad_fork_cleanup_count;
 
-	p->did_exec = 0;
 	delayacct_tsk_init(p);	/* Must remain after dup_task_struct() */
 	copy_flags(clone_flags, p);
 	INIT_LIST_HEAD(&p->children);
diff --git a/kernel/sys.c b/kernel/sys.c
index c18ecca..1d9a218 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -896,8 +896,7 @@ SYSCALL_DEFINE1(times, struct tms __user *, tbuf)
  * only important on a multi-user system anyway, to make sure one user
  * can't send a signal to a process owned by another.  -TYT, 12/12/91
  *
- * Auch. Had to add the 'did_exec' flag to conform completely to POSIX.
- * LBT 04.03.94
+ * !PF_FORKNOEXEC check to conform completely to POSIX. LBT 04.03.94.
  */
 SYSCALL_DEFINE2(setpgid, pid_t, pid, pid_t, pgid)
 {
@@ -933,7 +932,7 @@ SYSCALL_DEFINE2(setpgid, pid_t, pid, pid_t, pgid)
 		if (task_session(p) != task_session(group_leader))
 			goto out;
 		err = -EACCES;
-		if (p->did_exec)
+		if (!(p->flags & PF_FORKNOEXEC))
 			goto out;
 	} else {
 		err = -ESRCH;
-- 
1.5.5.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread()
  2013-11-22 17:54 ` [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread() Oleg Nesterov
@ 2013-11-22 19:42   ` KOSAKI Motohiro
  2013-11-22 20:24     ` Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: KOSAKI Motohiro @ 2013-11-22 19:42 UTC (permalink / raw)
  To: oleg, akpm
  Cc: kosaki.motohiro, viro, keescook, mhocko, snanda, dserrg,
	linux-kernel

(11/22/2013 12:54 PM), Oleg Nesterov wrote:
> next_thread() should be avoided, change check_unsafe_exec()
> to use while_each_thread(). This also saves 32 bytes.

Just curious.
Why it should be avoided? Just for cleaner code? Or is there
serious issue?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] kill task_struct->did_exec
  2013-11-22 17:54 ` [PATCH 4/4] kill task_struct->did_exec Oleg Nesterov
@ 2013-11-22 19:46   ` KOSAKI Motohiro
  2013-11-22 20:33     ` [PATCH v2 " Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: KOSAKI Motohiro @ 2013-11-22 19:46 UTC (permalink / raw)
  To: oleg, akpm
  Cc: kosaki.motohiro, viro, keescook, mhocko, snanda, dserrg,
	linux-kernel

(11/22/2013 12:54 PM), Oleg Nesterov wrote:
> We can kill either task->did_exec or PF_FORKNOEXEC, they are
> mutually exclusive. The patch kill ->did_exec because it has
> a single user.

It's ok.

but,

> - * Auch. Had to add the 'did_exec' flag to conform completely to POSIX.
> - * LBT 04.03.94
> + * !PF_FORKNOEXEC check to conform completely to POSIX. LBT 04.03.94.

I guess LBT is his name and !PF_FORKNOEXEC is not his opinion. Please just
remove "LBT 04.03.94" too. git repo still keep his achievement and can avoid
confusion.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread()
  2013-11-22 19:42   ` KOSAKI Motohiro
@ 2013-11-22 20:24     ` Oleg Nesterov
  2013-11-22 20:32       ` KOSAKI Motohiro
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-22 20:24 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: akpm, viro, keescook, mhocko, snanda, dserrg, linux-kernel

On 11/22, KOSAKI Motohiro wrote:
>
> (11/22/2013 12:54 PM), Oleg Nesterov wrote:
> > next_thread() should be avoided, change check_unsafe_exec()
> > to use while_each_thread(). This also saves 32 bytes.
>
> Just curious.
> Why it should be avoided? Just for cleaner code?

Nobody except signal->curr_target actually need next_thread-like
code, and

> Or is there
> serious issue?

We need to change (fix) this interface. This particular code is
fine, p == current. But in general the code like this can loop
forever if p exits and next_thread(t) can't reach the unhashed
thread.

Oleg.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic
  2013-11-22 17:54 ` [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic Oleg Nesterov
@ 2013-11-22 20:27   ` KOSAKI Motohiro
  2013-11-22 20:49     ` Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: KOSAKI Motohiro @ 2013-11-22 20:27 UTC (permalink / raw)
  To: oleg, akpm
  Cc: kosaki.motohiro, viro, keescook, mhocko, snanda, dserrg,
	linux-kernel

(11/22/2013 12:54 PM), Oleg Nesterov wrote:
> fs_struct->in_exec == T means that this ->fs is used by a single
> process (thread group), and one of the treads does do_execve().
> 
> To avoid the mt-exec races this code has the following complications:
> 
> 	1. check_unsafe_exec() returns -EBUSY if ->in_exec was
> 	   already set by another thread.
> 
> 	2. do_execve_common() records "clear_in_exec" to ensure
> 	   that the error path can only clear ->in_exec if it was
> 	   set by current.
> 
> However, after 9b1bf12d5d51 "signals: move cred_guard_mutex from
> task_struct to signal_struct" we do not need these complications:
> 
> 	1. We can't race with our sub-thread, this is called under
> 	   per-process ->cred_guard_mutex. And we can't race with
> 	   another CLONE_FS task, we already checked that this fs
> 	   is not shared.
> 
> 	   We can remove the  dead -EAGAIN logic.
> 
> 	2. "out_unmark:" in do_execve_common() is either called
> 	   under ->cred_guard_mutex, or after de_thread() which
> 	   kills other threads, so we can't race with sub-thread
> 	   which could set ->in_exec. And if ->fs is shared with
> 	   another process ->in_exec should be false anyway.
> 
> 	   We can clear in_exec unconditionally.
> 
> This also means that check_unsafe_exec() can be void.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

I have found no problem in this patch. However, I have a very basic question.
Why do we need to keep fs->in_exec? If I understand correctly, it is needed for
retrieving fork() and exec() race in the same process. If it is correct,
can't we move it it to signal->in_exec? It seems to match signal->cred_guard_mutex
and _I_ can easily understand what the code want.

In the other words, currently we have no protection against making new thread when
p->fs is shared w/ another process and I have no idea why multi process sharing influence
multi thread behavior.

I am not expert in this area and I may overlook something. Please correct me if I am silly.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] exec: move the final allow_write_access/fput into free_bprm()
  2013-11-22 17:54 ` [PATCH 3/4] exec: move the final allow_write_access/fput into free_bprm() Oleg Nesterov
@ 2013-11-22 20:29   ` KOSAKI Motohiro
  2013-11-23 19:22   ` Kees Cook
  1 sibling, 0 replies; 17+ messages in thread
From: KOSAKI Motohiro @ 2013-11-22 20:29 UTC (permalink / raw)
  To: oleg, akpm
  Cc: kosaki.motohiro, viro, keescook, mhocko, snanda, dserrg,
	linux-kernel

(11/22/2013 12:54 PM), Oleg Nesterov wrote:
> Both success/failure paths cleanup bprm->file, we can move this
> code into free_bprm() to simlify and cleanup this logic.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread()
  2013-11-22 20:24     ` Oleg Nesterov
@ 2013-11-22 20:32       ` KOSAKI Motohiro
  0 siblings, 0 replies; 17+ messages in thread
From: KOSAKI Motohiro @ 2013-11-22 20:32 UTC (permalink / raw)
  To: oleg
  Cc: kosaki.motohiro, akpm, viro, keescook, mhocko, snanda, dserrg,
	linux-kernel

(11/22/2013 3:24 PM), Oleg Nesterov wrote:
> On 11/22, KOSAKI Motohiro wrote:
>>
>> (11/22/2013 12:54 PM), Oleg Nesterov wrote:
>>> next_thread() should be avoided, change check_unsafe_exec()
>>> to use while_each_thread(). This also saves 32 bytes.
>>
>> Just curious.
>> Why it should be avoided? Just for cleaner code?
> 
> Nobody except signal->curr_target actually need next_thread-like
> code, and
> 
>> Or is there
>> serious issue?
> 
> We need to change (fix) this interface. This particular code is
> fine, p == current. But in general the code like this can loop
> forever if p exits and next_thread(t) can't reach the unhashed
> thread.

That's enough and good reason.

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/4] kill task_struct->did_exec
  2013-11-22 20:33     ` [PATCH v2 " Oleg Nesterov
@ 2013-11-22 20:33       ` KOSAKI Motohiro
  0 siblings, 0 replies; 17+ messages in thread
From: KOSAKI Motohiro @ 2013-11-22 20:33 UTC (permalink / raw)
  To: oleg
  Cc: kosaki.motohiro, akpm, viro, keescook, mhocko, snanda, dserrg,
	linux-kernel

(11/22/2013 3:33 PM), Oleg Nesterov wrote:
> On 11/22, KOSAKI Motohiro wrote:
>>
>> (11/22/2013 12:54 PM), Oleg Nesterov wrote:
>>> We can kill either task->did_exec or PF_FORKNOEXEC, they are
>>> mutually exclusive. The patch kill ->did_exec because it has
>>> a single user.
>>
>> It's ok.
>>
>> but,
>>
>>> - * Auch. Had to add the 'did_exec' flag to conform completely to POSIX.
>>> - * LBT 04.03.94
>>> + * !PF_FORKNOEXEC check to conform completely to POSIX. LBT 04.03.94.
>>
>> I guess LBT is his name and !PF_FORKNOEXEC is not his opinion. Please just
>> remove "LBT 04.03.94" too. git repo still keep his achievement and can avoid
>> confusion.
> 
> OK, please see v2.
> 
> 
> Subject: [PATCH] kill task_struct->did_exec
> From: Oleg Nesterov <oleg@redhat.com>
> Date: Fri, 22 Nov 2013 18:43:40 +0100
> 
> We can kill either task->did_exec or PF_FORKNOEXEC, they are
> mutually exclusive. The patch kills ->did_exec because it has
> a single user.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>




^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 4/4] kill task_struct->did_exec
  2013-11-22 19:46   ` KOSAKI Motohiro
@ 2013-11-22 20:33     ` Oleg Nesterov
  2013-11-22 20:33       ` KOSAKI Motohiro
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-22 20:33 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: akpm, viro, keescook, mhocko, snanda, dserrg, linux-kernel

On 11/22, KOSAKI Motohiro wrote:
>
> (11/22/2013 12:54 PM), Oleg Nesterov wrote:
> > We can kill either task->did_exec or PF_FORKNOEXEC, they are
> > mutually exclusive. The patch kill ->did_exec because it has
> > a single user.
>
> It's ok.
>
> but,
>
> > - * Auch. Had to add the 'did_exec' flag to conform completely to POSIX.
> > - * LBT 04.03.94
> > + * !PF_FORKNOEXEC check to conform completely to POSIX. LBT 04.03.94.
>
> I guess LBT is his name and !PF_FORKNOEXEC is not his opinion. Please just
> remove "LBT 04.03.94" too. git repo still keep his achievement and can avoid
> confusion.

OK, please see v2.


Subject: [PATCH] kill task_struct->did_exec
From: Oleg Nesterov <oleg@redhat.com>
Date: Fri, 22 Nov 2013 18:43:40 +0100

We can kill either task->did_exec or PF_FORKNOEXEC, they are
mutually exclusive. The patch kills ->did_exec because it has
a single user.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/exec.c             |    1 -
 include/linux/sched.h |    1 -
 kernel/fork.c         |    1 -
 kernel/sys.c          |    5 ++---
 4 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 9944bbf..380640f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1432,7 +1432,6 @@ static int exec_binprm(struct linux_binprm *bprm)
 	if (ret >= 0) {
 		trace_sched_process_exec(current, old_pid, bprm);
 		ptrace_event(PTRACE_EVENT_EXEC, old_vpid);
-		current->did_exec = 1;
 		proc_exec_connector(current);
 	}
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 34c1903..4f1958e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1114,7 +1114,6 @@ struct task_struct {
 	/* Used for emulating ABI behavior of previous Linux versions */
 	unsigned int personality;
 
-	unsigned did_exec:1;
 	unsigned in_execve:1;	/* Tell the LSMs that the process is doing an
 				 * execve */
 	unsigned in_iowait:1;
diff --git a/kernel/fork.c b/kernel/fork.c
index 2cb6024..7dbf504 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1224,7 +1224,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	if (!try_module_get(task_thread_info(p)->exec_domain->module))
 		goto bad_fork_cleanup_count;
 
-	p->did_exec = 0;
 	delayacct_tsk_init(p);	/* Must remain after dup_task_struct() */
 	copy_flags(clone_flags, p);
 	INIT_LIST_HEAD(&p->children);
diff --git a/kernel/sys.c b/kernel/sys.c
index c18ecca..e54ef1c 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -896,8 +896,7 @@ SYSCALL_DEFINE1(times, struct tms __user *, tbuf)
  * only important on a multi-user system anyway, to make sure one user
  * can't send a signal to a process owned by another.  -TYT, 12/12/91
  *
- * Auch. Had to add the 'did_exec' flag to conform completely to POSIX.
- * LBT 04.03.94
+ * !PF_FORKNOEXEC check to conform completely to POSIX.
  */
 SYSCALL_DEFINE2(setpgid, pid_t, pid, pid_t, pgid)
 {
@@ -933,7 +932,7 @@ SYSCALL_DEFINE2(setpgid, pid_t, pid, pid_t, pgid)
 		if (task_session(p) != task_session(group_leader))
 			goto out;
 		err = -EACCES;
-		if (p->did_exec)
+		if (!(p->flags & PF_FORKNOEXEC))
 			goto out;
 	} else {
 		err = -ESRCH;
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic
  2013-11-22 20:27   ` KOSAKI Motohiro
@ 2013-11-22 20:49     ` Oleg Nesterov
  2013-11-22 21:00       ` KOSAKI Motohiro
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-22 20:49 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: akpm, viro, keescook, mhocko, snanda, dserrg, linux-kernel

On 11/22, KOSAKI Motohiro wrote:
>
> (11/22/2013 12:54 PM), Oleg Nesterov wrote:
> > fs_struct->in_exec == T means that this ->fs is used by a single
> > process (thread group), and one of the treads does do_execve().
> >
> > To avoid the mt-exec races this code has the following complications:
> >
> > 	1. check_unsafe_exec() returns -EBUSY if ->in_exec was
> > 	   already set by another thread.
> >
> > 	2. do_execve_common() records "clear_in_exec" to ensure
> > 	   that the error path can only clear ->in_exec if it was
> > 	   set by current.
> >
> > However, after 9b1bf12d5d51 "signals: move cred_guard_mutex from
> > task_struct to signal_struct" we do not need these complications:
> >
> > 	1. We can't race with our sub-thread, this is called under
> > 	   per-process ->cred_guard_mutex. And we can't race with
> > 	   another CLONE_FS task, we already checked that this fs
> > 	   is not shared.
> >
> > 	   We can remove the  dead -EAGAIN logic.
> >
> > 	2. "out_unmark:" in do_execve_common() is either called
> > 	   under ->cred_guard_mutex, or after de_thread() which
> > 	   kills other threads, so we can't race with sub-thread
> > 	   which could set ->in_exec. And if ->fs is shared with
> > 	   another process ->in_exec should be false anyway.
> >
> > 	   We can clear in_exec unconditionally.
> >
> > This also means that check_unsafe_exec() can be void.
> >
> > Signed-off-by: Oleg Nesterov <oleg@redhat.com>
>
> I have found no problem in this patch. However, I have a very basic question.
> Why do we need to keep fs->in_exec?

To ensure that a sub-thread can't create a new process with the same
->fs while we are doing exec without LSM_UNSAFE_SHARE, I guess. This
is only for security/ code.

> If it is correct,
> can't we move it it to signal->in_exec?

Yes, perhaps, I am thinking about more cleanups too. But not that this
will add the subtle change. CLONE_THREAD doesn't require CLONE_FS, so
copy_fs() can fail even it the caller doesn't share ->fs with the execing
thread. And we still need fs->lock to set signal->in_exec, this looks
a bit strange.

> I am not expert in this area and I may overlook something.

Neither me ;) So this patch tries to not change the current logic.

I feel that perhaps we can do more cleanups, but I am not really sure
and this needs a separate change.

Oleg.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic
  2013-11-22 20:49     ` Oleg Nesterov
@ 2013-11-22 21:00       ` KOSAKI Motohiro
  2013-11-23 15:32         ` Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: KOSAKI Motohiro @ 2013-11-22 21:00 UTC (permalink / raw)
  To: oleg
  Cc: kosaki.motohiro, akpm, viro, keescook, mhocko, snanda, dserrg,
	linux-kernel

>> I have found no problem in this patch. However, I have a very basic question.
>> Why do we need to keep fs->in_exec?
> 
> To ensure that a sub-thread can't create a new process with the same
> ->fs while we are doing exec without LSM_UNSAFE_SHARE, I guess. This
> is only for security/ code.

But in LSM_UNSAFE_SHARE case, we have no check, right? I'm amazing why
we don't need anything.


> 
>> If it is correct,
>> can't we move it it to signal->in_exec?
> 
> Yes, perhaps, I am thinking about more cleanups too. But not that this
> will add the subtle change. CLONE_THREAD doesn't require CLONE_FS, so
> copy_fs() can fail even it the caller doesn't share ->fs with the execing
> thread. And we still need fs->lock to set signal->in_exec, this looks
> a bit strange.

Oops. Yes, this is totally odd. Sorry, we need to stop off topic discussion.
Anyway

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

> 
>> I am not expert in this area and I may overlook something.
> 
> Neither me ;) So this patch tries to not change the current logic.
> 
> I feel that perhaps we can do more cleanups, but I am not really sure
> and this needs a separate change.




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic
  2013-11-22 21:00       ` KOSAKI Motohiro
@ 2013-11-23 15:32         ` Oleg Nesterov
  0 siblings, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2013-11-23 15:32 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: akpm, viro, keescook, mhocko, snanda, dserrg, linux-kernel

On 11/22, KOSAKI Motohiro wrote:
>
> >> I have found no problem in this patch. However, I have a very basic question.
> >> Why do we need to keep fs->in_exec?
> >
> > To ensure that a sub-thread can't create a new process with the same
> > ->fs while we are doing exec without LSM_UNSAFE_SHARE, I guess. This
> > is only for security/ code.
>
> But in LSM_UNSAFE_SHARE case, we have no check, right? I'm amazing why
> we don't need anything.

Yes. We rely on security/ code in this case, it can nack this exec if it
looks unsafe.

IOW. If LSM_UNSAFE_SHARE is not set, we promise that this fs has a single
user: the execing thread (it will kill other subthreads which can have the
same fs). That is why we need to cancel any attempt to create another
CLONE_FS process in between.

But let me repeat this is only my speculations, I know nothing about selinux
and selinux_bprm_set_creds() in particular. Although it looks obvious that
potentially exec with the shared ->fs has the additional security problems.

Kosaki, thank you for review!

Oleg.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] exec: move the final allow_write_access/fput into free_bprm()
  2013-11-22 17:54 ` [PATCH 3/4] exec: move the final allow_write_access/fput into free_bprm() Oleg Nesterov
  2013-11-22 20:29   ` KOSAKI Motohiro
@ 2013-11-23 19:22   ` Kees Cook
  1 sibling, 0 replies; 17+ messages in thread
From: Kees Cook @ 2013-11-23 19:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Al Viro, KOSAKI Motohiro, Michal Hocko,
	Sameer Nanda, Sergey Dyasly, LKML

On Fri, Nov 22, 2013 at 9:54 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> Both success/failure paths cleanup bprm->file, we can move this
> code into free_bprm() to simlify and cleanup this logic.

Nice, yeah, this looks right.

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2013-11-23 19:22 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-22 17:54 [PATCH 0/4] in_exec/etc cleanups Oleg Nesterov
2013-11-22 17:54 ` [PATCH 1/4] check_unsafe_exec: use while_each_thread() rather than next_thread() Oleg Nesterov
2013-11-22 19:42   ` KOSAKI Motohiro
2013-11-22 20:24     ` Oleg Nesterov
2013-11-22 20:32       ` KOSAKI Motohiro
2013-11-22 17:54 ` [PATCH 2/4] check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic Oleg Nesterov
2013-11-22 20:27   ` KOSAKI Motohiro
2013-11-22 20:49     ` Oleg Nesterov
2013-11-22 21:00       ` KOSAKI Motohiro
2013-11-23 15:32         ` Oleg Nesterov
2013-11-22 17:54 ` [PATCH 3/4] exec: move the final allow_write_access/fput into free_bprm() Oleg Nesterov
2013-11-22 20:29   ` KOSAKI Motohiro
2013-11-23 19:22   ` Kees Cook
2013-11-22 17:54 ` [PATCH 4/4] kill task_struct->did_exec Oleg Nesterov
2013-11-22 19:46   ` KOSAKI Motohiro
2013-11-22 20:33     ` [PATCH v2 " Oleg Nesterov
2013-11-22 20:33       ` KOSAKI Motohiro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox