public inbox for linux-kernel@vger.kernel.org
* Unkillable Zombie process under 2.6.3 and 2.6.4
@ 2004-03-11 16:01 David Fort
       [not found] ` <20040311151729.57e3d936.akpm@osdl.org>
  0 siblings, 1 reply; 5+ messages in thread
From: David Fort @ 2004-03-11 16:01 UTC (permalink / raw)
  To: linux-kernel

Hi list,
I am having trouble with totally unkillable zombie processes.
Here is how I get them: I debug a multi-threaded program under gdb, and
during execution my program popens a command. Sometimes I then get the
following gdb message:

waiting for new child: No child processes.
(gdb)

And gdb gives me back the prompt. I have the impression that the child
process has actually been launched.
If I ask gdb to continue, the process goes on but the offending thread
looks frozen. In this state I can still contact the other threads, but
gdb is stuck (Ctrl+C doesn't work).

Sending kill -9 to my program has no effect. Sending kill -9 to gdb does
kill gdb, but not my program (which is a child of gdb). Shouldn't the
kernel finish the job with zombie processes when their parent dies?
(There is nobody left to catch signals or return codes.)
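
[Editorial sketch, not part of the original report.] One way to see what
state the kernel reports for such a process is to read the state field of
/proc/<pid>/stat. The helper below is a minimal sketch under that
assumption; the name proc_state is hypothetical and is not taken from the
reporter's program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the one-letter state from /proc/<pid>/stat
 * ('R', 'S', 'D', 'T', 'Z', ...), or 0 if the file cannot be read. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];
    FILE *f;
    char *p;
    int n;

    n = snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    assert(n > 0 && (size_t)n < sizeof(path));

    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return 0;
    }
    fclose(f);

    /* The line looks like "pid (comm) S ...". comm may itself contain
     * ')' or spaces, so search for the last ')' rather than the first. */
    p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return 0;
    return p[2];
}
```

Calling proc_state(getpid()) reports the caller's own state. The files
under /proc/<pid>/task/<tid>/stat use the same layout, so the same parsing
should apply to the individual threads of a multi-threaded program.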

My big problem is that the faulty program keeps its bound sockets open,
so I can't launch anything on those ports.

-- 
Fort David, Projet IDsA
IRISA-INRIA, Campus de Beaulieu, 35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 71 33




* Re: Unkillable Zombie process under 2.6.3 and 2.6.4
       [not found] ` <20040311151729.57e3d936.akpm@osdl.org>
@ 2004-03-12 13:54   ` David Fort
  2004-03-12 14:06     ` Christian Borntraeger
  0 siblings, 1 reply; 5+ messages in thread
From: David Fort @ 2004-03-12 13:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1206 bytes --]

Andrew Morton wrote:

>Would you have time to prepare a little test app to demonstrate this?
>
>Thanks.
>
>  
>
I've written this little app that does nothing complicated: it just
launches a thread that does a popen in its body.
This program hangs gdb completely; I don't know whether gdb or the kernel
is to blame.
The fact is that there's something really strange here.
I'm trying to build a test app that can trigger the case where processes
run under gdb become unkillable zombies
(I still have some running on my box).

I've explored several ideas I had:
    - related to TLS -> playing around with the tlsData var didn't show
      anything
    - SIGCHLD intercepted by the program and not caught by gdb -> I get
      the bug even without the signal handler

I'm going to modify the app to test whether it becomes unkillable when it
has a socket in WAIT_STATE.

Attached is a tarball that contains the program, the makefile, a quite
long report of the behaviour I'm seeing while debugging, and my .config.

gcc version 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)
glibc-2.3.3-10mdk
2.6.4 kernel

-- 
Fort David, Projet IDsA
IRISA-INRIA, Campus de Beaulieu, 35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 71 33



[-- Attachment #2: kbug1.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 9899 bytes --]


* Re: Unkillable Zombie process under 2.6.3 and 2.6.4
  2004-03-12 13:54   ` David Fort
@ 2004-03-12 14:06     ` Christian Borntraeger
  2004-03-12 14:30       ` David Fort
       [not found]       ` <4051C8C5.5090204@irisa.fr>
  0 siblings, 2 replies; 5+ messages in thread
From: Christian Borntraeger @ 2004-03-12 14:06 UTC (permalink / raw)
  To: linux-kernel; +Cc: David Fort, Andrew Morton

David Fort wrote:
> I'm trying to build a test app that can trigger the case where GDBed
> process become unkillable zombies

Does it help to send a SIGCONT to all processes in T state?

cheers

Christian
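
[Editorial sketch, not part of the original message.] The suggestion
above could be tried along these lines: read each process state from
/proc and send SIGCONT to those reported as 'T'. All function names here
are hypothetical. Note that newer kernels report ptrace stops as 't'
rather than 'T', so this sketch only matches plain stopped processes
there.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One-letter process state from /proc/<pid>/stat, or 0 on error. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];
    FILE *f;
    char *p;
    int n;

    n = snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    assert(n > 0 && (size_t)n < sizeof(path));
    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return 0;
    }
    fclose(f);
    /* comm may contain ')', so the state follows the last ')'. */
    p = strrchr(buf, ')');
    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return 0;
    return p[2];
}

/* Send SIGCONT to pid if it is in state 'T'.
 * Returns 1 if signalled, 0 if not stopped, -1 if kill() failed. */
int cont_if_stopped(pid_t pid)
{
    if (proc_state(pid) != 'T')
        return 0;
    return kill(pid, SIGCONT) == 0 ? 1 : -1;
}

/* Walk /proc and try every numeric entry. Returns the number of
 * processes signalled, or -1 if /proc cannot be opened. */
int cont_all_stopped(void)
{
    DIR *d = opendir("/proc");
    struct dirent *e;
    int count = 0;

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        if (!isdigit((unsigned char)e->d_name[0]))
            continue;
        if (cont_if_stopped((pid_t)atol(e->d_name)) == 1)
            count++;
    }
    closedir(d);
    return count;
}
```

cont_all_stopped() touches every stopped process the caller is allowed to
signal, so on a shared machine it is probably safer to call
cont_if_stopped() on the specific PIDs of interest.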


* Re: Unkillable Zombie process under 2.6.3 and 2.6.4
  2004-03-12 14:06     ` Christian Borntraeger
@ 2004-03-12 14:30       ` David Fort
       [not found]       ` <4051C8C5.5090204@irisa.fr>
  1 sibling, 0 replies; 5+ messages in thread
From: David Fort @ 2004-03-12 14:30 UTC (permalink / raw)
  Cc: linux-kernel

Christian Borntraeger wrote:

>David Fort wrote:
>
>>I'm trying to build a test app that can trigger the case where GDBed
>>process become unkillable zombies
>
>Does it help to send a SIGCONT to all processes in T state?
>
No, it doesn't; gdb loses its context when I do this (it triggers an
internal gdb error):

lin-lwp.c:653: internal-error: stop_wait_callback: Assertion `pid == GET_LWP (lp->ptid)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.


Cheers

-- 
Fort David, Projet IDsA
IRISA-INRIA, Campus de Beaulieu, 35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 71 33




* Re: Unkillable Zombie process under 2.6.3 and 2.6.4
       [not found]       ` <4051C8C5.5090204@irisa.fr>
@ 2004-03-12 16:37         ` David Fort
  0 siblings, 0 replies; 5+ messages in thread
From: David Fort @ 2004-03-12 16:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton

[-- Attachment #1: Type: text/plain, Size: 2965 bytes --]

David Fort wrote:
[...]

>>Does it help to send a SIGCONT to all processes in T state?
>>
>>  
>>
> No it doesn't, gdb loose its context when doing this(this triggers an 
> internal gdb error):
>
> lin-lwp.c:653: internal-error: stop_wait_callback: Assertion `pid == 
> GET_LWP (lp->ptid)' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
>
I was finally able to reproduce the unkillable zombie process. How to
reproduce it:
    - launch gdb with the program in the attached tarball.
    - once the program is running, you have 5 seconds to telnet to
      localhost on port 7899 and sit on your keyboard (so that input
      keeps arriving).
    - after 5 seconds the first popen occurs in one of the threads of
      kbug, which causes gdb or the kernel to misbehave (waiting for new
      child: No child processes.)

[dfo@chiffre kbug]$ ~/cvs/gdb/gdb/gdb ./kbug
GNU gdb 2004-03-12-cvs
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db 
library "/lib/tls/libthread_db.so.1".

(gdb) r 5 6
Starting program: /home/dfo/test/kbug/kbug 5 6
[Thread debugging using libthread_db enabled]
[New Thread 1073961248 (LWP 1882)]
[New Thread 1083698096 (LWP 1885)]
[New Thread 1092090800 (LWP 1886)]
1882(1083698096)do_task1: starting
1882(1073961248)main: starting
1882(1092090800)do_task1: starting
[New Thread 1100487600 (LWP 1888)]
1882(1100487600)remoteworker_thread_run: starting
remoteworker_thread_run(1100487600): readed 2 bytes [
]
[... this is where I'm sending things via telnet ...]
]
remoteworker_thread_run(1100487600): readed 2 bytes [
]
waiting for new child: No child processes.
(gdb)



At this point, ask gdb to quit; gdb may hang:
(gdb) q
The program is running.  Exit anyway? (y or n) y

Once this is done, just kill gdb (killall gdb); here I get the following:
 ps xwaf | grep kbug
 5255 pts18    S      0:00              \_ grep kbug
 1882 pts16    Z      0:00 [kbug] <defunct>

The kbug process has become an unkillable zombie.
The trace above is from the CVS version of gdb, but I get the same
behaviour with the legacy gdb-6.0-2mdk.

kbug:
kbug has a POPENFUNC, which is just a call to popen() on /bin/true.
Threads of kbug:
    - the main thread creates things (the other threads) and then calls
      POPENFUNC
    - the task1 thread binds port 7899, accepts the first connection and
      launches a "remoteworker" thread to handle the incoming connection;
      once this is done it calls POPENFUNC
    - remoteworker "select"s on the incoming connection and prints what
      is sent
    - task2 only calls POPENFUNC
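
[Editorial sketch, not part of the original message.] The real kbug
source is in the attached tarball and is not reproduced here. The
following is only an independent, minimal sketch of the "POPENFUNC called
from a thread" part described above, leaving out the socket handling. The
names popen_true, popen_thread and run_popen_in_thread are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical stand-in for POPENFUNC: popen /bin/true and wait for it.
 * Returns the command's exit status, or -1 on any failure. */
int popen_true(void)
{
    FILE *p = popen("/bin/true", "r");
    int status;

    if (p == NULL)
        return -1;
    status = pclose(p);   /* reaps the child created by popen() */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Thread body: run popen_true() and store the result for the creator. */
static void *popen_thread(void *arg)
{
    int *result = arg;

    assert(result != NULL);
    *result = popen_true();
    return NULL;
}

/* Run popen_true() in a freshly created thread and return its result. */
int run_popen_in_thread(void)
{
    pthread_t tid;
    int result = -1;

    if (pthread_create(&tid, NULL, popen_thread, &result) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return result;
}
```

Built with something like "gcc -pthread" and run under gdb, a main() that
calls run_popen_in_thread() exercises the same fork-from-a-thread path;
whether it triggers the hang will depend on the kernel and gdb versions.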
   
I can provide any information that is needed.

-- 
Fort David, Projet IDsA
IRISA-INRIA, Campus de Beaulieu, 35042 Rennes cedex, France
Tél: +33 (0) 2 99 84 71 33



[-- Attachment #2: kbug2.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 10565 bytes --]

