From: Tejun Heo <tj@kernel.org>
To: oleg@redhat.com, jan.kratochvil@redhat.com, vda.linux@googlemail.com
Cc: linux-kernel@vger.kernel.org, torvalds@linux-foundation.org,
akpm@linux-foundation.org, indan@nul.nu, roland@hack.frob.com,
Tejun Heo <tj@kernel.org>
Subject: [PATCH 15/20] job control: Fix ptracer wait(2) hang and explain notask_error clearing
Date: Wed, 23 Mar 2011 11:06:01 +0100 [thread overview]
Message-ID: <1300874766-12941-16-git-send-email-tj@kernel.org> (raw)
In-Reply-To: <1300874766-12941-1-git-send-email-tj@kernel.org>
wait(2) and friends allow access to stopped/continued states through
zombies, which is required as the states are process-wide and should
be accessible whether the leader task is alive or undead.
wait_consider_task() implements this by always clearing notask_error
and going through wait_task_stopped/continued() for unreaped zombies.
However, while ptraced, the stopped state is per-task and as such if
the ptracee became a zombie, there's no further stopped event to
listen to and wait(2) and friends should return -ECHILD on the tracee.
Fix it by clearing notask_error only if WCONTINUED | WEXITED is set
for ptraced zombies. While at it, document why clearing notask_error
is safe for each case.
Test case follows.
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <time.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
static void *nooper(void *arg)
{
pause();
return NULL;
}
int main(void)
{
const struct timespec ts1s = { .tv_sec = 1 };
pid_t tracee, tracer;
siginfo_t si;
tracee = fork();
if (tracee == 0) {
pthread_t thr;
pthread_create(&thr, NULL, nooper, NULL);
nanosleep(&ts1s, NULL);
printf("tracee exiting\n");
pthread_exit(NULL); /* let subthread run */
}
tracer = fork();
if (tracer == 0) {
ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
while (1) {
if (waitid(P_PID, tracee, &si, WSTOPPED) < 0) {
perror("waitid");
break;
}
ptrace(PTRACE_CONT, tracee, NULL,
(void *)(long)si.si_status);
}
return 0;
}
waitid(P_PID, tracer, &si, WEXITED);
kill(tracee, SIGKILL);
return 0;
}
Before the patch, after the tracee becomes a zombie, the tracer's
waitid(WSTOPPED) never returns and the program doesn't terminate.
tracee exiting
^C
After the patch, tracee exiting triggers waitid() to fail.
tracee exiting
waitid: No child processes
-v2: Oleg pointed out that exited in addition to continued can happen
for ptraced dead group leader. Clear notask_error for ptraced
child on WEXITED too.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/exit.c | 44 ++++++++++++++++++++++++++++++++++----------
1 files changed, 34 insertions(+), 10 deletions(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index b4a935c..84d13d6 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1550,17 +1550,41 @@ static int wait_consider_task(struct wait_opts *wo, int ptrace,
return 0;
}
- /*
- * We don't reap group leaders with subthreads.
- */
- if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p))
- return wait_task_zombie(wo, p);
+ /* slay zombie? */
+ if (p->exit_state == EXIT_ZOMBIE) {
+ /* we don't reap group leaders with subthreads */
+ if (!delay_group_leader(p))
+ return wait_task_zombie(wo, p);
- /*
- * It's stopped or running now, so it might
- * later continue, exit, or stop again.
- */
- wo->notask_error = 0;
+ /*
+ * Allow access to stopped/continued state via zombie by
+ * falling through. Clearing of notask_error is complex.
+ *
+ * When !@ptrace:
+ *
+ * If WEXITED is set, notask_error should naturally be
+ * cleared. If not, subset of WSTOPPED|WCONTINUED is set,
+ * so, if there are live subthreads, there are events to
+ * wait for. If all subthreads are dead, it's still safe
+ * to clear - this function will be called again in finite
+ * amount time once all the subthreads are released and
+ * will then return without clearing.
+ *
+ * When @ptrace:
+ *
+ * Stopped state is per-task and thus can't change once the
+ * target task dies. Only continued and exited can happen.
+ * Clear notask_error if WCONTINUED | WEXITED.
+ */
+ if (likely(!ptrace) || (wo->wo_flags & (WCONTINUED | WEXITED)))
+ wo->notask_error = 0;
+ } else {
+ /*
+ * @p is alive and it's gonna stop, continue or exit, so
+ * there always is something to wait for.
+ */
+ wo->notask_error = 0;
+ }
if (task_stopped_code(p, ptrace))
return wait_task_stopped(wo, ptrace, p);
--
1.7.1
next prev parent reply other threads:[~2011-03-23 10:06 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-23 10:05 [PATCHSET] ptrace,signal: Improve ptrace and job control interaction Tejun Heo
2011-03-23 10:05 ` [PATCH 01/20] signal: Fix SIGCONT notification code Tejun Heo
2011-03-23 10:05 ` [PATCH 02/20] ptrace: Remove the extra wake_up_state() from ptrace_detach() Tejun Heo
2011-03-23 10:05 ` [PATCH 03/20] signal: Remove superflous try_to_freeze() loop in do_signal_stop() Tejun Heo
2011-03-23 10:05 ` [PATCH 04/20] ptrace: Kill tracehook_notify_jctl() Tejun Heo
2011-03-23 10:05 ` [PATCH 05/20] ptrace: Add @why to ptrace_stop() Tejun Heo
2011-03-23 10:05 ` [PATCH 06/20] signal: Fix premature completion of group stop when interfered by ptrace Tejun Heo
2011-03-23 10:05 ` [PATCH 07/20] signal: Use GROUP_STOP_PENDING to stop once for a single group stop Tejun Heo
2011-03-23 10:05 ` [PATCH 08/20] ptrace: Participate in group stop from ptrace_stop() iff the task is trapping for " Tejun Heo
2011-03-23 10:05 ` [PATCH 09/20] ptrace: Make do_signal_stop() use ptrace_stop() if the task is being ptraced Tejun Heo
2011-03-23 10:05 ` [PATCH 10/20] ptrace: Clean transitions between TASK_STOPPED and TRACED Tejun Heo
2011-03-23 10:05 ` [PATCH 11/20] ptrace: Collapse ptrace_untrace() into __ptrace_unlink() Tejun Heo
2011-03-23 10:05 ` [PATCH 12/20] ptrace: Always put ptracee into appropriate execution state Tejun Heo
2011-03-23 10:05 ` [PATCH 13/20] job control: Don't set group_stop exit_code if re-entering job control stop Tejun Heo
2011-03-23 10:06 ` [PATCH 14/20] job control: Small reorganization of wait_consider_task() Tejun Heo
2011-03-23 10:06 ` Tejun Heo [this message]
2011-03-23 10:06 ` [PATCH 16/20] job control: Allow access to job control events through ptracees Tejun Heo
2011-03-23 10:06 ` [PATCH 17/20] job control: Add @for_ptrace to do_notify_parent_cldstop() Tejun Heo
2011-03-23 10:06 ` [PATCH 18/20] job control: Job control stop notifications should always go to the real parent Tejun Heo
2011-03-23 10:06 ` [PATCH 19/20] job control: Notify the real parent of job control events regardless of ptrace Tejun Heo
2011-03-23 10:06 ` [PATCH 20/20] job control: Don't send duplicate job control stop notification while ptraced Tejun Heo
2011-03-23 18:38 ` [PATCHSET] ptrace,signal: Improve ptrace and job control interaction Oleg Nesterov
2011-03-25 14:26 ` Tejun Heo
2011-03-26 18:25 ` Oleg Nesterov
2011-03-28 8:58 ` Tejun Heo
2011-03-28 12:14 ` Oleg Nesterov
2011-03-28 15:21 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1300874766-12941-16-git-send-email-tj@kernel.org \
--to=tj@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=indan@nul.nu \
--cc=jan.kratochvil@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=roland@hack.frob.com \
--cc=torvalds@linux-foundation.org \
--cc=vda.linux@googlemail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).