From: Daniel Jacobowitz <dan@debian.org>
To: linux-kernel@vger.kernel.org
Subject: More waitpid issues with CLONE_DETACHED/CLONE_THREAD
Date: Sat, 31 Jan 2004 22:25:25 -0500
Message-ID: <20040201032525.GA10254@nevyn.them.org>
This may be related to the Python bug reported today...
I've spent all day working on gdbserver support for NPTL threading. Right
now it works, except that when I say "kill" in the GDB client, gdbserver
hangs. The problem is that we kill the child and wait for it, but waitpid
never returns it.
write(2, "Killing inferior\n", 17) = 17
ptrace(PTRACE_CONT, 8454, 0, SIG_0) = 0
tkill(8454, SIGKILL) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(8454, 0xbfffec04, WNOHANG) = 0
waitpid(8454, 0xbfffec04, WNOHANG|__WCLONE) = -1 ECHILD (No child processes)
nanosleep({0, 1000000}, 0) = ? ERESTART_RESTARTBLOCK (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
setup() = 0
waitpid(8454, 0xbfffec04, WNOHANG) = 0
waitpid(8454, 0xbfffec04, WNOHANG|__WCLONE) = -1 ECHILD (No child processes)
and so on (looping on waitpid). At this point, process 8454 is marked as a
zombie, and nothing can reap it. After gdbserver is killed, the zombie is
reparented to init:
Name: linux-dp
State: Z (zombie)
SleepAVG: 91%
Tgid: 8454
Pid: 8454
PPid: 1
TracerPid: 0
but init can't reap it either.
8454 was the original (i.e. non-CLONE_DETACHED) thread. I see the same
behavior if I use PTRACE_KILL instead of tkill.
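For reference, the polling pattern visible in the trace above can be sketched roughly as follows. This is only an illustration of the two waitpid calls and the 1 ms sleep; poll_reap and its max_tries limit are hypothetical names, not gdbserver's actual code, and the sketch does not reproduce the zombie problem by itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#ifndef __WCLONE
#define __WCLONE 0x80000000 /* Linux value, in case the headers hide it */
#endif

/* Hypothetical helper: poll for `pid` first as an ordinary child, then as
 * a clone child, sleeping 1 ms between rounds as in the trace.
 * Returns pid once reaped, 0 if max_tries rounds passed without reaping,
 * and -1 on any waitpid error other than ECHILD. */
pid_t poll_reap(pid_t pid, int *status, int max_tries)
{
    assert(status != NULL);

    for (int i = 0; i < max_tries; i++) {
        /* waitpid(pid, ..., WNOHANG) */
        pid_t r = waitpid(pid, status, WNOHANG);
        if (r == pid)
            return r;
        if (r == -1 && errno != ECHILD)
            return -1;

        /* waitpid(pid, ..., WNOHANG|__WCLONE) */
        r = waitpid(pid, status, WNOHANG | __WCLONE);
        if (r == pid)
            return r;
        if (r == -1 && errno != ECHILD)
            return -1;

        /* nanosleep({0, 1000000}, 0); a SIGCHLD may cut it short */
        struct timespec ts = { 0, 1000000 };
        nanosleep(&ts, NULL);
    }
    return 0;
}
```

In the failing case, the first call keeps returning 0 and the second keeps returning ECHILD, so a loop like this never sees the pid come back.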
GDB doesn't suffer from the same problem. A little time with strace and I
found out why: GDB does PTRACE_KILL on the detached threads, then PTRACE_KILL
on the parent thread, then waitpid on the detached threads, and finally
waitpid on the parent thread. This isn't by design; it's just a different
order of items on the linked list.
If I change gdbserver to do "kill thread; wait for thread; kill next thread;
wait for next thread; kill parent last; wait for parent last" then it
terminates and I don't get an unkillable zombie.
ptrace(PTRACE_KILL, 18348, 0, 0) = 0
waitpid(18348, 0xbfffec04, WNOHANG) = -1 ECHILD (No child processes)
--- SIGCHLD (Child exited) @ 0 (0) ---
waitpid(18348, [WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL], WNOHANG|__WCLONE) = 18348
ptrace(PTRACE_KILL, 18349, 0, 0) = 0
waitpid(18349, 0xbfffec04, WNOHANG) = -1 ECHILD (No child processes)
waitpid(18349, [WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL], WNOHANG|__WCLONE) = 18349
--- SIGCHLD (Child exited) @ 0 (0) ---
ptrace(PTRACE_KILL, 18350, 0, 0) = 0
waitpid(18350, 0xbfffec04, WNOHANG) = -1 ECHILD (No child processes)
waitpid(18350, [WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL], WNOHANG|__WCLONE) = 18350
ptrace(PTRACE_KILL, 18351, 0, 0) = 0
waitpid(18351, 0xbfffec04, WNOHANG) = -1 ECHILD (No child processes)
waitpid(18351, [WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL], WNOHANG|__WCLONE) = 18351
ptrace(PTRACE_KILL, 18352, 0, 0) = 0
waitpid(18352, 0xbfffec04, WNOHANG) = -1 ECHILD (No child processes)
waitpid(18352, [WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL], WNOHANG|__WCLONE) = 18352
--- SIGCHLD (Child exited) @ 0 (0) ---
ptrace(PTRACE_KILL, 18329, 0, 0) = 0
waitpid(18329, [WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL], WNOHANG) = 18329
exit_group(0) = ?
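The ordering used in the trace above can be sketched roughly like this. It is a minimal illustration, assuming the caller already has an array of thread IDs and knows which one is the original thread; leader_last, kill_and_wait and kill_all are hypothetical names, not gdbserver functions, and the blocking waits do not retry on EINTR.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef __WCLONE
#define __WCLONE 0x80000000 /* Linux value, in case the headers hide it */
#endif

/* Hypothetical helper: move `leader` to the end of tids[], keeping the
 * relative order of the other entries. */
void leader_last(pid_t *tids, size_t n, pid_t leader)
{
    assert(tids != NULL);

    for (size_t i = 0; i + 1 < n; i++) {
        if (tids[i] == leader) {
            memmove(&tids[i], &tids[i + 1], (n - 1 - i) * sizeof tids[0]);
            tids[n - 1] = leader;
            return;
        }
    }
}

/* Kill one traced thread and wait for it: first as an ordinary child,
 * then, on ECHILD, as a clone child. Returns 0 if it was reaped. */
int kill_and_wait(pid_t tid, int *status)
{
    if (ptrace(PTRACE_KILL, tid, NULL, NULL) == -1)
        return -1;

    pid_t r = waitpid(tid, status, 0);
    if (r == -1 && errno == ECHILD)
        r = waitpid(tid, status, __WCLONE);
    return r == tid ? 0 : -1;
}

/* Kill and reap every thread one at a time, original thread last. */
int kill_all(pid_t *tids, size_t n, pid_t leader)
{
    int rc = 0;

    leader_last(tids, n, leader);
    for (size_t i = 0; i < n; i++) {
        int status;
        if (kill_and_wait(tids[i], &status) != 0)
            rc = -1;
    }
    return rc;
}
```

With the IDs from the trace, leader_last turns {18329, 18348, ..., 18352} into {18348, ..., 18352, 18329}, which matches the order of the PTRACE_KILL calls shown.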
So it looks like something gets very confused if the parent is SIGKILLed
before the children. What should happen?
--
Daniel Jacobowitz
MontaVista Software Debian GNU/Linux Developer