* wait() and strace -f
@ 2001-12-18 1:14 Kurt Roeckx
2001-12-18 15:32 ` OGAWA Hirofumi
[not found] ` <877krlc60x.fsf@devron.myhome.or.jp>
0 siblings, 2 replies; 4+ messages in thread
From: Kurt Roeckx @ 2001-12-18 1:14 UTC
To: linux-kernel
I've run into a weird problem here. I have a process that creates 2
children; the first one dies very quickly, before the parent can call
wait(). When I run it under strace -f, this wait() doesn't clean up
the zombie as it should.

Note that this problem only happens when I have 2 children, use
strace -f, and call wait() after the first child has died. Plain
strace, no strace at all, only 1 child, or calling wait() before the
child dies doesn't seem to cause the problem.

Btw, this is with 2.4.16.

Simple program to demonstrate it:
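#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
	int i;

	if (!fork())
	{
		/* Child 1: exits immediately and becomes a zombie. */
		return 0;
	}

	if (!fork())
	{
		/* Child 2: stays alive for 10 seconds. */
		sleep(10);
		return 0;
	}

	/* Parent: give child 1 time to die, then reap one child. */
	sleep(1);
	wait(&i);
	return 0;
}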
Without strace -f, this program stops after 1 second and the
second child lives on for another 9 seconds. With strace -f, the
program only stops after 10 seconds, once the second child has died.

I think it's related to strace being the "real" parent of the
child. But that doesn't really explain why I need 2 children.
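
To make the timing easier to observe, the same test can be wrapped in a
function that reports which child the first wait() returned and how long
that took. This is only a sketch: first_reaped(), struct reap_result and
the three delay parameters are names made up for illustration and are
not part of the program above. The children call _exit() instead of
returning, so they do not flush stdio buffers inherited from the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical result type, introduced only for this sketch. */
struct reap_result {
	int which;     /* 1 or 2: child returned by the first wait(); -1 on error */
	long elapsed;  /* whole seconds from start until wait() returned (coarse) */
};

/* Hypothetical helper: all delays are in seconds. */
struct reap_result first_reaped(unsigned child1_sleep, unsigned parent_delay,
				unsigned child2_sleep)
{
	struct reap_result r = { -1, -1 };
	time_t start = time(NULL);
	pid_t c1, c2, got;
	int status;

	c1 = fork();
	if (c1 < 0)
		return r;
	if (c1 == 0) {
		/* Child 1. */
		sleep(child1_sleep);
		_exit(0);
	}

	c2 = fork();
	if (c2 < 0) {
		waitpid(c1, &status, 0);	/* do not leave child 1 behind */
		return r;
	}
	if (c2 == 0) {
		/* Child 2. */
		sleep(child2_sleep);
		_exit(0);
	}

	/* Parent: optionally let child 1 die first, then reap one child. */
	sleep(parent_delay);
	got = wait(&status);
	r.elapsed = (long)(time(NULL) - start);
	r.which = (got == c1) ? 1 : (got == c2) ? 2 : -1;

	/* Reap the remaining child so the caller keeps no stray children. */
	if (got != c1)
		waitpid(c1, &status, 0);
	if (got != c2)
		waitpid(c2, &status, 0);
	return r;
}
```

Calling first_reaped(0, 1, 10) uses the same delays as the program
above. Run normally, it should report child 1 after roughly 1 second;
under strace -f on the affected setup the elapsed time would presumably
be closer to 10 seconds.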
Kurt
* Re: wait() and strace -f
From: OGAWA Hirofumi @ 2001-12-18 15:32 UTC
To: Kurt Roeckx; +Cc: linux-kernel
Kurt Roeckx <Q@ping.be> writes:
> int main()
> {
> 	int i;
>
> 	if (!fork())
> 	{
> 		/* Child 1. */
> 		return 0;
> 	}
>
> 	if (!fork())
> 	{
> 		/* Child 2. */
> 		sleep(10);
> 		return 0;
> 	}
>
> 	/* Parent. */
> 	sleep(1);
> 	wait(&i);
> 	return 0;
> }
>
> Without strace -f, this program stops after 1 second and the
> second child still lives for 9 seconds. With strace -f this
> program stops after 10 seconds, after the second child died.
>
> I think it's related to strace being the "real" parent of the
> child. But that doesn't really explain why I need 2 children.
This is probably a feature (or bug) of strace. If the traced process
has a child, tracing of the child is continued before the parent's
wait(). Then the exit() of the child process lets the parent's wait()
continue.
> 	if (!fork())
> 	{
> 		/* Child 1. */
		sleep(2);
> 		return 0;
> 	}

With the above change, the parent continues after 2 seconds.
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
* Re: wait() and strace -f
From: Kurt Roeckx @ 2001-12-18 20:18 UTC
To: OGAWA Hirofumi; +Cc: linux-kernel
On Tue, Dec 18, 2001 at 04:59:58PM +0900, OGAWA Hirofumi wrote:
> Kurt Roeckx <Q@ping.be> writes:
>
> > I think it's related to strace being the "real" parent of the
> > child. But that doesn't really explain why I need 2 children.
>
> Probably, it's feature (or bug) of strace. I'm seems, if strace has
> child, trace of a child is started before wait() of parent. Then,
> exit() of child continue wait() of parent.
If I understand what you're saying, sleep(1) in child 1 and
sleep(2) in the parent should fix the problem, which it doesn't.

And it still doesn't explain why it only happens with 2 children.

Maybe I should have mentioned this before: the wait() cleans up
the first child at the time the second child dies, or at least
that's what wait() returns.
> > 	if (!fork())
> > 	{
> > 		/* Child 1. */
> 		sleep(2);
> > 		return 0;
> > 	}
>
> With the above change, the parent continues after 2 seconds.
I know that too, as I said, only when child 1 dies before the
parent calls wait().
Kurt
* Re: wait() and strace -f
From: OGAWA Hirofumi @ 2001-12-19 15:26 UTC
To: linux-kernel
Kurt Roeckx <Q@ping.be> writes:
> > Probably, it's feature (or bug) of strace. I'm seems, if strace has
                                                             ^^^^^^
Sorry, s/strace/the trace process/
> > child, trace of a child is started before wait() of parent. Then,
> > exit() of child continue wait() of parent.
>
> If I understand what you're saying, sleep(1) in child1, and
> sleep(2) in the parent should fix the problem, which it doesn't.
>
> And it still doesn't explain why it only happens with 2 children.
As far as I can tell from the source, strace does not seem to take the
zombie into account, and it waits for the exit() of child2 before
restarting the wait() of the parent.
  strace            parent           child1     child2
                                     zombie
                    sleep(1)                    sleep(10)
                    before wait()
  trap wait()
                                                before exit()
  trap exit()
  restart child2
                                                run exit()
  restart parent
                    run wait()
> Maybe I should have mentioned this before: the wait will clean up
> the first child at the time the second child dies, or at least
> that's what wait() returns.
>
> > > 	if (!fork())
> > > 	{
> > > 		/* Child 1. */
> > 		sleep(2);
> > > 		return 0;
> > > 	}
> >
> > With the above change, the parent continues after 2 seconds.
>
> I know that too, as I said, only when child 1 dies before the
> parent calls wait().
The patch below is against strace-4.4/process.c from strace_4.4-1.tar.gz:
diff -u /tmp/t/strace-4.4/process.c.orig /tmp/t/strace-4.4/process.c
--- /tmp/t/strace-4.4/process.c.orig Fri Aug 3 20:51:28 2001
+++ /tmp/t/strace-4.4/process.c Wed Dec 19 08:20:05 2001
@@ -1349,7 +1349,7 @@
 		/* WTA: fix bug with hanging children */
 		if (!(tcp->u_arg[2] & WNOHANG) && tcp->nchildren > 0) {
 			/* There are traced children */
-			tcp->flags |= TCB_SUSPENDED;
+			/* tcp->flags |= TCB_SUSPENDED; */
 			tcp->waitpid = tcp->u_arg[0];
 		}
 	}
Try the above patch. With it, the parent's wait() is restarted
immediately. However, it will probably break something else. ;)
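
Going only by the condition in the hunk above, TCB_SUSPENDED is set
only when the WNOHANG bit is absent from the wait options, so a test
program that polls with WNOHANG should presumably never be suspended by
strace. The sketch below shows such a polling loop; poll_for_child()
and max_polls are hypothetical names, and this has not been checked
against strace 4.4.

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: poll once per second for any exited child
 * without ever blocking inside waitpid().
 *
 * Returns the pid of the reaped child, 0 if no child exited within
 * max_polls attempts, or -1 on error (for example, no children left).
 */
pid_t poll_for_child(int *status, unsigned max_polls)
{
	unsigned i;

	for (i = 0; i < max_polls; i++) {
		pid_t got = waitpid(-1, status, WNOHANG);

		if (got != 0)
			return got;	/* reaped a child, or waitpid() failed */
		sleep(1);		/* nothing has exited yet; try again */
	}
	return 0;
}
```

In the test program, the parent's wait(&i) could be replaced by
poll_for_child(&i, 15) to see whether child 1 is then reaped right away
under strace -f.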
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>