All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	Eric Dumazet <edumazet@google.com>,
	Jan Kratochvil <jan.kratochvil@redhat.com>,
	Julien Tinnes <jln@google.com>, Kees Cook <keescook@google.com>,
	Kostya Serebryany <kcc@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
	Pedro Alves <palves@redhat.com>,
	Robert Swiecki <swiecki@google.com>,
	Roland McGrath <roland@hack.frob.com>,
	syzkaller@googlegroups.com, linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] wait/ptrace: always assume __WALL if the child is traced
Date: Tue, 20 Oct 2015 19:17:54 +0200	[thread overview]
Message-ID: <20151020171754.GA29304@redhat.com> (raw)
In-Reply-To: <20151020171740.GA29290@redhat.com>

The following program (simplified version of generated by syzkaller)

	#include <pthread.h>
	#include <unistd.h>
	#include <sys/ptrace.h>
	#include <stdio.h>
	#include <signal.h>

	void *thread_func(void *arg)
	{
		ptrace(PTRACE_TRACEME, 0,0,0);
		return 0;
	}

	int main(void)
	{
		pthread_t thread;

		if (fork())
			return 0;

		while (getppid() != 1)
			;

		pthread_create(&thread, NULL, thread_func, NULL);
		pthread_join(thread, NULL);
		return 0;
	}

creates the unreapable zombie if /sbin/init doesn't use __WALL.

This is not a kernel bug, at least in a sense that everything works as
expected: debugger should reap a traced sub-thread before it can reap
the leader, but without __WALL/__WCLONE do_wait() ignores sub-threads.

Unfortunately, it seems that /sbin/init in most (all?) distributions
doesn't use it and we have to change the kernel to avoid the problem.

This patch just adds the "ptrace" check into eligible_child(). To some
degree this matches the "tsk->ptrace" in exit_notify(), ->exit_signal
is mostly ignored when the tracee reports to debugger.

This obviously means the user-visible change: __WCLONE and __WALL no
longer have any meaning for debugger. And I can only hope that this
won't break something.

We could make a more conservative change. Say, we can take __WCLONE
into account, or !thread_group_leader(). But it would be nice to not
complicate these historical/confusing checks.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org>
---
 kernel/exit.c |   29 ++++++++++++++++++++---------
 1 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 22fcc05..819f51e 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -914,17 +914,28 @@ static int eligible_pid(struct wait_opts *wo, struct task_struct *p)
 		task_pid_type(p, wo->wo_type) == wo->wo_pid;
 }
 
-static int eligible_child(struct wait_opts *wo, struct task_struct *p)
+static int
+eligible_child(struct wait_opts *wo, bool ptrace, struct task_struct *p)
 {
 	if (!eligible_pid(wo, p))
 		return 0;
-	/* Wait for all children (clone and not) if __WALL is set;
-	 * otherwise, wait for clone children *only* if __WCLONE is
-	 * set; otherwise, wait for non-clone children *only*.  (Note:
-	 * A "clone" child here is one that reports to its parent
-	 * using a signal other than SIGCHLD.) */
-	if (((p->exit_signal != SIGCHLD) ^ !!(wo->wo_flags & __WCLONE))
-	    && !(wo->wo_flags & __WALL))
+
+	/*
+	 * Wait for all children (clone and not) if __WALL is set or
+	 * if it is traced by us.
+	 */
+	if (ptrace || (wo->wo_flags & __WALL))
+		return 1;
+
+	/*
+	 * Otherwise, wait for clone children *only* if __WCLONE is set;
+	 * otherwise, wait for non-clone children *only*.
+	 *
+	 * Note: a "clone" child here is one that reports to its parent
+	 * using a signal other than SIGCHLD, or a non-leader thread which
+	 * we can only see if it is traced by us.
+	 */
+	if ((p->exit_signal != SIGCHLD) ^ !!(wo->wo_flags & __WCLONE))
 		return 0;
 
 	return 1;
@@ -1297,7 +1308,7 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
 	if (unlikely(exit_state == EXIT_DEAD))
 		return 0;
 
-	ret = eligible_child(wo, p);
+	ret = eligible_child(wo, ptrace, p);
 	if (!ret)
 		return ret;
 
-- 
1.5.5.1


  reply	other threads:[~2015-10-20 17:21 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-20 17:17 [PATCH 0/2] wait/ptrace: always assume __WALL if the child is traced Oleg Nesterov
2015-10-20 17:17 ` Oleg Nesterov [this message]
2015-10-20 22:31   ` [PATCH 1/2] " Andrew Morton
2015-10-21  3:27     ` Vasily Averin
2015-10-21 17:41     ` Oleg Nesterov
2015-10-21 19:47       ` Andrew Morton
2015-10-21 20:44         ` Oleg Nesterov
2015-10-21 19:59     ` Denys Vlasenko
2015-10-21 20:31       ` Denys Vlasenko
2015-10-21 21:47         ` Oleg Nesterov
2015-10-21 23:27           ` Denys Vlasenko
2015-10-25 15:54             ` Oleg Nesterov
2015-10-26 12:08               ` Pedro Alves
2015-10-28 16:11                 ` Oleg Nesterov
2015-10-28 15:43                   ` Pedro Alves
2015-10-28 19:02                     ` Oleg Nesterov
2015-10-22 13:51   ` Denys Vlasenko
2015-10-20 17:17 ` [PATCH 2/2] wait: allow sys_waitid() to use __WNOTHREAD/__WCLONE/__WALL Oleg Nesterov
2015-10-20 17:36 ` [PATCH 0/2] wait/ptrace: always assume __WALL if the child is traced Oleg Nesterov
2015-10-22 14:40 ` Pedro Alves
2015-10-25 15:42   ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151020171754.GA29304@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dvlasenk@redhat.com \
    --cc=dvyukov@google.com \
    --cc=edumazet@google.com \
    --cc=glider@google.com \
    --cc=jan.kratochvil@redhat.com \
    --cc=jln@google.com \
    --cc=kcc@google.com \
    --cc=keescook@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtk.manpages@gmail.com \
    --cc=palves@redhat.com \
    --cc=roland@hack.frob.com \
    --cc=swiecki@google.com \
    --cc=syzkaller@googlegroups.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.