public inbox for linux-kernel@vger.kernel.org
* BUG: NTPL: waitpid() doesn't return?
@ 2004-01-31 10:46 Matthias Urlichs
  2004-01-31 15:37 ` bert hubert
  2004-01-31 19:02 ` Linus Torvalds
  0 siblings, 2 replies; 16+ messages in thread
From: Matthias Urlichs @ 2004-01-31 10:46 UTC
  To: linux-kernel

This partial trace is from Debian's mini-dinstall, which is a
multithreaded Python script.

What happens here is that it spawns a bunch of threads; some of
these then fork and execve() external programs, which they waitpid() for.

Unfortunately, some of these waitpid() calls don't return even though 
the waited-for process clearly has exited.
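
For illustration only, here is a minimal sketch of the pattern described
above. It is not mini-dinstall's actual code: the names run_and_wait and
spawn_all are hypothetical, and the child command is a stand-in for the
external programs. Each thread forks, execs a program in the child, and
then blocks in waitpid() for that specific pid.

```python
import os
import sys
import threading


def run_and_wait(argv, results, index):
    """Fork, exec argv in the child, and waitpid() for it from this thread."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image. If exec fails, leave
        # immediately without running any Python-level cleanup.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)
    # Parent side, still in the worker thread: wait for exactly this child.
    _, status = os.waitpid(pid, 0)
    results[index] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None


def spawn_all(argv, n_threads=4):
    """Start n_threads workers that each fork+exec argv; return exit codes."""
    results = [None] * n_threads
    threads = [
        threading.Thread(target=run_and_wait, args=(argv, results, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Stand-in child: a Python interpreter that exits right away.
    print(spawn_all([sys.executable, "-c", "pass"]))
```

On a system where waitpid() behaves as expected, every worker returns and
spawn_all() yields one exit code per thread.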

This is kernel 2.6.2-rc2, unmodified (except for modularized IDE).

I've kept all the intervening clone() calls in the trace; hopefully
somebody can shed some light on what might be going on here.

(Imagine random other things happening between all of the following lines.)

31338 execve("/usr/bin/mini-dinstall", ["mini-dinstall"], [/* 12 vars */]) = 0
31338 rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
31338 execve("/usr/bin/mini-dinstall", ["mini-dinstall"], [/* 12 vars */]) = 0
31338 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x4018e0c8) = 31339
31339 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x4018e0c8) = 31340
31340 clone(child_stack=0x42edbb48, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, parent_tidptr=0x42edbc18, {entry_number:6, base_addr:0x42edbbd0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, child_tidptr=0x42edbc18) = 31345
31342 clone( <unfinished ...>
31342 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x416dbc18) = 31346
31346 execve("/usr/bin/apt-ftparchive", ["apt-ftparchive", "packages", "testing/all"], [/* 12 vars */] <unfinished ...>
31346 <... execve resumed> )            = 0
31346 exit_group(0)                     = ?
31340 --- SIGCHLD (Child exited) @ 0 (0) ---
31342 waitpid(31346,  <unfinished ...>

This last call never returns.

Any ideas?

NB: When not using strace, the waitpid() call does return;
unfortunately, it does so with "[Errno 10] No child processes".
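
As a rough sketch of how that error surfaces in Python (the helper name
wait_for is hypothetical and this is not taken from mini-dinstall):
errno 10 is ECHILD, which os.waitpid() raises as ChildProcessError when
the kernel does not consider the given pid a waitable child of the caller.

```python
import errno
import os


def wait_for(pid):
    """Return the raw wait status for pid, or None if waitpid() gives ECHILD."""
    try:
        _, status = os.waitpid(pid, 0)
    except ChildProcessError as exc:
        # ECHILD: the kernel found no matching child to wait for.
        assert exc.errno == errno.ECHILD
        return None
    return status


if __name__ == "__main__":
    # A process is never its own child, so this takes the ECHILD path.
    print(wait_for(os.getpid()))
```

Catching the error like this only makes the symptom visible; it does not
recover the exit status of the child that was actually forked.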

-- 
Matthias Urlichs     |     noris network AG     |     http://smurf.noris.de/


Thread overview: 16+ messages
2004-01-31 10:46 BUG: NTPL: waitpid() doesn't return? Matthias Urlichs
2004-01-31 15:37 ` bert hubert
2004-01-31 15:51   ` Matthias Urlichs
2004-01-31 16:18     ` bert hubert
2004-01-31 18:15       ` Matthias Urlichs
2004-01-31 19:19         ` bert hubert
2004-01-31 20:49           ` Matthias Urlichs
2004-01-31 21:18             ` bert hubert
2004-01-31 21:41               ` Matthias Urlichs
2004-01-31 21:52             ` Roland McGrath
2004-01-31 19:02 ` Linus Torvalds
2004-01-31 20:00   ` Matthias Urlichs
2004-01-31 20:58     ` Linus Torvalds
2004-01-31 21:11       ` Matthias Urlichs
2004-01-31 22:52         ` Linus Torvalds
2004-01-31 22:29       ` Matthias Urlichs
