* Unkillable Zombie process under 2.6.3 and 2.6.4
@ 2004-03-11 16:01 David Fort
[not found] ` <20040311151729.57e3d936.akpm@osdl.org>
0 siblings, 1 reply; 5+ messages in thread
From: David Fort @ 2004-03-11 16:01 UTC (permalink / raw)
To: linux-kernel
Hi list,
I am having trouble with a totally unkillable zombie process.
Here is how I get unkillable zombies: I debug a multi-threaded program
using gdb, and during execution my program popens a command. Sometimes
I get the following gdb message:
waiting for new child: No child processes.
(gdb)
and gdb gives me back the prompt. I have the impression that the child
process has in fact been launched.
If I ask gdb to continue, the process goes on, but the offending thread
looks frozen. In this state I can still contact the other threads, but
gdb is stuck (Ctrl+C doesn't work).
Sending kill -9 to my program has no effect. Sending kill -9 to gdb
does kill gdb, but not my program (which is a child of gdb). Shouldn't
the kernel finish the job with zombie processes when their parent dies?
(There is nobody left to catch signals or return codes.)
My big problem is that the faulty program keeps its bound sockets open,
so I can't launch anything on those ports.
--
Fort David, Projet IDsA
IRISA-INRIA, Campus de Beaulieu, 35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 71 33
* Re: Unkillable Zombie process under 2.6.3 and 2.6.4
[not found] ` <20040311151729.57e3d936.akpm@osdl.org>
@ 2004-03-12 13:54 ` David Fort
2004-03-12 14:06 ` Christian Borntraeger
0 siblings, 1 reply; 5+ messages in thread
From: David Fort @ 2004-03-12 13:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1206 bytes --]
Andrew Morton wrote:
>Would you have time to prepare a little test app to demonstrate this?
>
>Thanks.
>
I've written this little app that does nothing complicated: it just
launches a thread that calls popen in its body. This program hangs gdb
completely; I don't know whether gdb or the kernel is to blame. The fact
is that there's something really strange here.
I'm trying to build a test app that can trigger the case where processes
run under gdb become unkillable zombies (I have some still running on my
box).
I've explored several ideas I had:
- related to TLS -> playing around with the tlsData var didn't show
  anything
- SIGCHLD intercepted by the program and not caught by gdb -> even
  without the signal handler I get the bug
I'm going to modify the app to check that it doesn't become unkillable
when it has a socket in WAIT_STATE.
Attached is a tarball that contains the program, the makefile, a fairly
long report of the behaviour I'm seeing while debugging, and my .config.
gcc version 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)
glibc-2.3.3-10mdk
2.6.4 kernel
--
Fort David, Projet IDsA
IRISA-INRIA, Campus de Beaulieu, 35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 71 33
[-- Attachment #2: kbug1.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 9899 bytes --]
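For readers without the attachment, the kind of test app described above (a thread whose body calls popen) might look roughly like the sketch below. This is not the code from kbug1.tar.bz2, which is not reproduced in the thread; the names popen_once, worker and run_popen_in_thread are hypothetical, and /bin/true is taken from the later description of kbug.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for popen()/pclose() under strict -std modes */
#endif
#include <pthread.h>
#include <stdio.h>

/* Hypothetical helper: run a command through popen(), drain its output,
 * and return pclose()'s status, or -1 if popen() itself failed. */
static int popen_once(const char *cmd)
{
    char buf[256];
    FILE *fp = popen(cmd, "r");

    if (fp == NULL)
        return -1;
    while (fgets(buf, sizeof buf, fp) != NULL)
        ;   /* discard whatever the child prints */
    return pclose(fp);
}

/* Thread body: store the popen status where the creator can read it. */
static void *worker(void *arg)
{
    int *status = arg;

    *status = popen_once("/bin/true");
    return NULL;
}

/* Start one thread that calls popen(), wait for it, and return the
 * status it recorded (-1 if the thread could not be created or joined). */
int run_popen_in_thread(void)
{
    pthread_t tid;
    int status = -1;

    if (pthread_create(&tid, NULL, worker, &status) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return status;
}
```

Calling run_popen_in_thread() from main and running the result under gdb would be one way to approximate the scenario; whether this minimal form triggers the hang is not established by the thread.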
* Re: Unkillable Zombie process under 2.6.3 and 2.6.4
2004-03-12 13:54 ` David Fort
@ 2004-03-12 14:06 ` Christian Borntraeger
2004-03-12 14:30 ` David Fort
[not found] ` <4051C8C5.5090204@irisa.fr>
0 siblings, 2 replies; 5+ messages in thread
From: Christian Borntraeger @ 2004-03-12 14:06 UTC (permalink / raw)
To: linux-kernel; +Cc: David Fort, Andrew Morton
David Fort wrote:
> I'm trying to build a test app that can trigger the case where GDBed
> process become unkillable zombies
Does it help to send a SIGCONT to all processes in T state?
cheers
Christian
* Re: Unkillable Zombie process under 2.6.3 and 2.6.4
2004-03-12 14:06 ` Christian Borntraeger
@ 2004-03-12 14:30 ` David Fort
[not found] ` <4051C8C5.5090204@irisa.fr>
1 sibling, 0 replies; 5+ messages in thread
From: David Fort @ 2004-03-12 14:30 UTC (permalink / raw)
Cc: linux-kernel
Christian Borntraeger wrote:
>David Fort wrote:
>
>>I'm trying to build a test app that can trigger the case where GDBed
>>process become unkillable zombies
>
>Does it help to send a SIGCONT to all processes in T state?
>
No, it doesn't: gdb loses its context when I do this (it triggers an
internal gdb error):
lin-lwp.c:653: internal-error: stop_wait_callback: Assertion `pid ==
GET_LWP (lp->ptid)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Cheers
--
Fort David, Projet IDsA
IRISA-INRIA, Campus de Beaulieu, 35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 71 33
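The suggestion discussed above, sending SIGCONT to processes in the T state, could be tried from C along the lines of the sketch below. It assumes the usual Linux /proc/<pid>/stat layout of "pid (comm) state ..."; the function names parse_proc_state and cont_if_stopped are hypothetical and do not come from the thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for kill() under strict -std modes */
#endif
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Extract the state character from the text of /proc/<pid>/stat.
 * The command name may itself contain spaces or parentheses, so the
 * state is located relative to the LAST ')' on the line.
 * Returns 0 if the line does not look like a stat line. */
char parse_proc_state(const char *stat_line)
{
    const char *close = strrchr(stat_line, ')');

    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return 0;
    return close[2];
}

/* Send SIGCONT to pid only if /proc reports it as stopped ('T').
 * Returns 1 if the signal was sent, 0 if the process is not in T
 * state, and -1 on any error. */
int cont_if_stopped(pid_t pid)
{
    char path[64];
    char line[512];
    char *got;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    got = fgets(line, sizeof line, fp);
    fclose(fp);
    if (got == NULL)
        return -1;
    if (parse_proc_state(line) != 'T')
        return 0;
    return kill(pid, SIGCONT) == 0 ? 1 : -1;
}
```

As the reply above reports, resuming threads behind gdb's back this way made gdb hit an internal assertion, so this is only a diagnostic experiment, not a fix.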
[parent not found: <4051C8C5.5090204@irisa.fr>]
* Re: Unkillable Zombie process under 2.6.3 and 2.6.4
[not found] ` <4051C8C5.5090204@irisa.fr>
@ 2004-03-12 16:37 ` David Fort
0 siblings, 0 replies; 5+ messages in thread
From: David Fort @ 2004-03-12 16:37 UTC (permalink / raw)
To: linux-kernel; +Cc: Andrew Morton
[-- Attachment #1: Type: text/plain, Size: 2965 bytes --]
David Fort wrote:
[...]
>>Does it help to send a SIGCONT to all processes in T state?
>>
> No it doesn't, gdb loose its context when doing this(this triggers an
> internal gdb error):
>
> lin-lwp.c:653: internal-error: stop_wait_callback: Assertion `pid ==
> GET_LWP (lp->ptid)' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
>
I was finally able to reproduce the unkillable zombie process.
How to reproduce it:
- launch gdb with the program in the attached tarball.
- once the program is running, you have 5 seconds to telnet to localhost
  on port 7899 and start typing.
- after 5 seconds the first popen occurs in one of the threads of kbug,
  which causes gdb or the kernel to misbehave (waiting for new child: No
  child processes.)
[dfo@chiffre kbug]$ ~/cvs/gdb/gdb/gdb ./kbug
GNU gdb 2004-03-12-cvs
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".
(gdb) r 5 6
Starting program: /home/dfo/test/kbug/kbug 5 6
[Thread debugging using libthread_db enabled]
[New Thread 1073961248 (LWP 1882)]
[New Thread 1083698096 (LWP 1885)]
[New Thread 1092090800 (LWP 1886)]
1882(1083698096)do_task1: starting
1882(1073961248)main: starting
1882(1092090800)do_task1: starting
[New Thread 1100487600 (LWP 1888)]
1882(1100487600)remoteworker_thread_run: starting
remoteworker_thread_run(1100487600): readed 2 bytes [
]
[.. that's when i'm sending things via the telnet......]
]
remoteworker_thread_run(1100487600): readed 2 bytes [
]
waiting for new child: No child processes.
(gdb)
At this point, ask gdb to quit; gdb may hang:
(gdb) q
The program is running. Exit anyway? (y or n) y
Once this is done, just kill gdb (killall gdb). Here I get the following:
ps xwaf | grep kbug
 5255 pts18 S 0:00 \_ grep kbug
 1882 pts16 Z 0:00 [kbug] <defunct>
The kbug process has become an unkillable zombie.
The trace above is with gdb from CVS, but I get the same behaviour with
the older gdb-6.0-2mdk.
kbug:
kbug has a POPENFUNC, which is just a call to popen(/bin/true).
Threads of kbug:
- the main thread creates things (the other threads) and then calls
  POPENFUNC
- the task1 thread binds port 7899, accepts the first connection and
  launches a "remoteworker" thread to handle the incoming connection;
  once this is done it calls POPENFUNC
- remoteworker does a "select" on the incoming connection and prints
  what is sent
- task2 only calls POPENFUNC
I can provide any information that is needed.
--
Fort David, Projet IDsA
IRISA-INRIA, Campus de Beaulieu, 35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 71 33
[-- Attachment #2: kbug2.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 10565 bytes --]
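Since the original complaint is that the leftover process keeps its listening port busy, one quick way to confirm that after killing gdb is to try binding the port again. The sketch below is only an illustration under that assumption; port_is_free is a hypothetical name, and 7899 is the port mentioned in the reproduction steps.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try to bind a fresh TCP socket to 127.0.0.1:port.
 * Returns 1 if the bind succeeds (nothing is holding the port),
 * 0 if bind() fails (for example with EADDRINUSE),
 * and -1 if the probe socket itself could not be created.
 * SO_REUSEADDR is deliberately not set, so an existing bound socket
 * makes the probe fail. */
int port_is_free(unsigned short port)
{
    struct sockaddr_in addr;
    int fd, ok;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    ok = bind(fd, (struct sockaddr *)&addr, sizeof addr) == 0;
    close(fd);   /* release the probe socket either way */
    return ok;
}
```

With the zombie from the ps listing still present, port_is_free(7899) would be expected to return 0 if the process is indeed still holding the socket; a return of 1 would suggest the port has been released.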