* is killing zombies possible w/o a reboot?
From: Gene Heskett @ 2004-11-03 12:51 UTC
To: linux-kernel

Greetings;

I thought I'd get caught up on -bkx kernels and made a -bk8 just now.

But I'd tried to run gnomeradio earlier to listen to the elections, and it failed to run, as did tvtime, claiming it couldn't get a lock on /dev/video0. gnomeradio apparently left a lock on alsasound that prevented the normal graceful shutdown by locking up the shutdown on the "stopping alsasound" line. So I had to use the hardware reset.

I'd tried to kill the zombie earlier but couldn't.
Isn't there some way to clean up a &^$#^#@)_ zombie?

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author)
99.28% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved.
* Re: is killing zombies possible w/o a reboot? 2004-11-03 12:51 is killing zombies possible w/o a reboot? Gene Heskett @ 2004-11-03 14:33 ` bert hubert 2004-11-03 14:49 ` Måns Rullgård 2004-11-03 16:24 ` Gene Heskett 2004-11-03 20:48 ` Tom Felker 1 sibling, 2 replies; 99+ messages in thread From: bert hubert @ 2004-11-03 14:33 UTC (permalink / raw) To: Gene Heskett; +Cc: linux-kernel On Wed, Nov 03, 2004 at 07:51:39AM -0500, Gene Heskett wrote: > But I'd tried to run gnomeradio earlier to listen to the elections, Depressing enough. > I'd tried to kill the zombie earlier but couldn't. > Isn't there some way to clean up a &^$#^#@)_ zombie? Kill the parent, is the only (portable) way. -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 14:33 ` bert hubert @ 2004-11-03 14:49 ` Måns Rullgård 2004-11-03 15:25 ` DervishD 2004-11-03 16:38 ` Gene Heskett 2004-11-03 16:24 ` Gene Heskett 1 sibling, 2 replies; 99+ messages in thread From: Måns Rullgård @ 2004-11-03 14:49 UTC (permalink / raw) To: linux-kernel bert hubert <ahu@ds9a.nl> writes: > On Wed, Nov 03, 2004 at 07:51:39AM -0500, Gene Heskett wrote: > >> But I'd tried to run gnomeradio earlier to listen to the elections, > > Depressing enough. > >> I'd tried to kill the zombie earlier but couldn't. >> Isn't there some way to clean up a &^$#^#@)_ zombie? > > Kill the parent, is the only (portable) way. Perhaps not as portable, but another possible, though slightly complicated, way is to ptrace the parent and force it to wait(). -- Måns Rullgård mru@inprovide.com ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 14:49 ` Måns Rullgård @ 2004-11-03 15:25 ` DervishD 2004-11-03 15:25 ` Måns Rullgård ` (3 more replies) 2004-11-03 16:38 ` Gene Heskett 1 sibling, 4 replies; 99+ messages in thread From: DervishD @ 2004-11-03 15:25 UTC (permalink / raw) To: Måns Rullgård; +Cc: linux-kernel Hi all :) * Måns Rullgård <mru@inprovide.com> dixit: > >> I'd tried to kill the zombie earlier but couldn't. > >> Isn't there some way to clean up a &^$#^#@)_ zombie? > > Kill the parent, is the only (portable) way. > Perhaps not as portable, but another possible, though slightly > complicated, way is to ptrace the parent and force it to wait(). Or write a little program that just 'wait()'s for the specified PID's. That is perfectly portable IMHO. But I must admit that the preferred way should be killing the parent. 'init' will reap the children after that. Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 http://www.dervishd.net & http://www.pleyades.net/ ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 15:25 ` DervishD @ 2004-11-03 15:25 ` Måns Rullgård 2004-11-03 17:49 ` DervishD 2004-11-03 16:47 ` Gene Heskett ` (2 subsequent siblings) 3 siblings, 1 reply; 99+ messages in thread From: Måns Rullgård @ 2004-11-03 15:25 UTC (permalink / raw) To: linux-kernel DervishD <lkml@dervishd.net> writes: > Hi all :) > > * Måns Rullgård <mru@inprovide.com> dixit: >> >> I'd tried to kill the zombie earlier but couldn't. >> >> Isn't there some way to clean up a &^$#^#@)_ zombie? >> > Kill the parent, is the only (portable) way. >> Perhaps not as portable, but another possible, though slightly >> complicated, way is to ptrace the parent and force it to wait(). > > Or write a little program that just 'wait()'s for the specified > PID's. That is perfectly portable IMHO. But I must admit that the > preferred way should be killing the parent. 'init' will reap the > children after that. You can only wait() for your own children. -- Måns Rullgård mru@inprovide.com ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot?
From: DervishD @ 2004-11-03 17:49 UTC
To: Måns Rullgård; +Cc: linux-kernel

Hi Måns :)

* Måns Rullgård <mru@inprovide.com> dixit:
> >> >> I'd tried to kill the zombie earlier but couldn't.
> >> >> Isn't there some way to clean up a &^$#^#@)_ zombie?
> >> > Kill the parent, is the only (portable) way.
> >> Perhaps not as portable, but another possible, though slightly
> >> complicated, way is to ptrace the parent and force it to wait().
> > Or write a little program that just 'wait()'s for the specified
> > PID's. That is perfectly portable IMHO. But I must admit that the
> > preferred way should be killing the parent. 'init' will reap the
> > children after that.
> You can only wait() for your own children.

Yes, you will receive 'ECHILD', I didn't remember that, sorry. Anyway, you shouldn't need to do that, since those zombies should have been reparented to 'init'. But, since SUSv3 doesn't specify which PID should be the parent when doing the reparenting, PID 0 could be used when reparenting as a way of telling the kernel "hey, reap those processes".

Anyway, since the kernel does the reparenting, the kernel could get rid of zombies. I don't really know why 'init' (PID 1) is responsible for this.

Raúl Núñez de Arenas Coronado
-- 
Linux Registered User 88736
http://www.dervishd.net & http://www.pleyades.net/
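The ECHILD behaviour mentioned above is easy to observe from user space. What follows is only a minimal sketch, assuming a Unix system with Python 3; the helper name `try_wait_for` is made up for illustration. It calls waitpid() on a process that is certainly not our child (our own parent) and reports the error the kernel returns.

```python
import errno
import os


def try_wait_for(pid):
    """Try to reap an arbitrary PID; return the errno name on failure.

    Hypothetical helper, for illustration only.
    """
    try:
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError as exc:
        # The kernel refuses: only a process's own parent may wait() for it.
        return errno.errorcode[exc.errno]
    return None


if __name__ == "__main__":
    # Our parent is by definition not one of our children.
    print(try_wait_for(os.getppid()))
```

This is why a stand-alone "zombie reaper" utility cannot work: the wait() has to be issued by the parent itself, or by whatever process inherits the zombie once the parent is gone.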
* Re: is killing zombies possible w/o a reboot?
From: Gene Heskett @ 2004-11-03 16:47 UTC
To: linux-kernel; +Cc: DervishD, Måns Rullgård

On Wednesday 03 November 2004 10:25, DervishD wrote:
> Hi all :)
>
> * Måns Rullgård <mru@inprovide.com> dixit:
>> >> I'd tried to kill the zombie earlier but couldn't.
>> >> Isn't there some way to clean up a &^$#^#@)_ zombie?
>> >
>> > Kill the parent, is the only (portable) way.
>>
>> Perhaps not as portable, but another possible, though slightly
>> complicated, way is to ptrace the parent and force it to wait().
>
> Or write a little program that just 'wait()'s for the specified
>PID's. That is perfectly portable IMHO. But I must admit that the
>preferred way should be killing the parent. 'init' will reap the
>children after that.

But what if there is no parent, since the system has already disposed of it? There was no parent visible to kpm. Unforch kpm also doesn't specifically mark zombies as such either, so it's a bit clueless in that regard. Finding them is usually an exercise in stretching the top window out till it's about 20 screens high, as the zombie is always going to be at the bottom of the list.

If init can indeed do the cleanup, then how hard is it to have a "kill --total procnumber" pass that info into init and let it do its thing? Or better yet, when X asks me if I want it gone because it's not responding to the close button, have X do it all in one swell foop.

> Raúl Núñez de Arenas Coronado

-- 
Cheers, Gene
* Re: is killing zombies possible w/o a reboot? 2004-11-03 16:47 ` Gene Heskett @ 2004-11-03 17:44 ` DervishD 2004-11-03 18:53 ` Gene Heskett 2004-11-04 16:01 ` kernel 1 sibling, 1 reply; 99+ messages in thread From: DervishD @ 2004-11-03 17:44 UTC (permalink / raw) To: Gene Heskett; +Cc: linux-kernel, Måns Rullgård Hi Gene :) * Gene Heskett <gene.heskett@verizon.net> dixit: > > Or write a little program that just 'wait()'s for the specified > >PID's. That is perfectly portable IMHO. But I must admit that the > >preferred way should be killing the parent. 'init' will reap the > >children after that. > But what if there is no parent, since the system has already disposed > of it? Then the children are reparented to 'init' and 'init' gets rid of them. That's the way UNIX behaves. Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 http://www.dervishd.net & http://www.pleyades.net/ ^ permalink raw reply [flat|nested] 99+ messages in thread
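The reparenting described here can be watched directly. The sketch below is an illustration under assumptions, not anything posted in the thread: it assumes a Unix system with Python 3, and the function name `observe_reparenting` is invented. An intermediate process forks a grandchild and exits without waiting; the grandchild then reports who its parent has become. On current Linux systems the adopting process may be a "child subreaper" (for example a session manager) rather than PID 1, so the code only reports the new parent and does not assume it is 1.

```python
import os
import time


def observe_reparenting(timeout=5.0):
    """Return (pid of the dead intermediate parent, grandchild's new parent)."""
    read_fd, write_fd = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Intermediate process: start a grandchild, then exit without waiting.
        try:
            os.close(read_fd)
            my_pid = os.getpid()
            if os.fork() == 0:
                # Grandchild: poll until the kernel has given us a new parent.
                deadline = time.monotonic() + timeout
                while os.getppid() == my_pid and time.monotonic() < deadline:
                    time.sleep(0.01)
                os.write(write_fd, str(os.getppid()).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    os.waitpid(middle, 0)              # reap the intermediate process
    with os.fdopen(read_fd) as pipe:
        new_parent = int(pipe.read())  # read() returns once all writers exit
    return middle, new_parent


if __name__ == "__main__":
    old, new = observe_reparenting()
    print(f"grandchild's parent changed from {old} to {new}")
```

When the grandchild itself exits, it is the adopting process that has to wait() for it, which is the job the thread attributes to init.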
* Re: is killing zombies possible w/o a reboot?
From: Gene Heskett @ 2004-11-03 18:53 UTC
To: linux-kernel; +Cc: DervishD, Måns Rullgård

On Wednesday 03 November 2004 12:44, DervishD wrote:
> Hi Gene :)
>
> * Gene Heskett <gene.heskett@verizon.net> dixit:
>> > Or write a little program that just 'wait()'s for the
>> > specified PID's. That is perfectly portable IMHO. But I must
>> > admit that the preferred way should be killing the parent.
>> > 'init' will reap the children after that.
>>
>> But what if there is no parent, since the system has already
>> disposed of it?
>
> Then the children are reparented to 'init' and 'init' gets rid
> of them. That's the way UNIX behaves.

Unforch, I've *never* had it work that way. Any dead process I've ever had while running linux has only been disposable by a reboot.

> Raúl Núñez de Arenas Coronado

-- 
Cheers, Gene
* Re: is killing zombies possible w/o a reboot? 2004-11-03 18:53 ` Gene Heskett @ 2004-11-03 19:01 ` Doug McNaught 2004-11-03 19:03 ` Måns Rullgård ` (2 subsequent siblings) 3 siblings, 0 replies; 99+ messages in thread From: Doug McNaught @ 2004-11-03 19:01 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, DervishD, Måns Rullgård Gene Heskett <gene.heskett@verizon.net> writes: > On Wednesday 03 November 2004 12:44, DervishD wrote: >> Then the children are reparented to 'init' and 'init' gets rid >> of them. That's the way UNIX behaves. > > Unforch, I've *never* had it work that way. Any dead process I've > ever had while running linux has only been disposable by a reboot. Then it's either (a) not actually a zombie (perhaps stuck in D state), or (b) its parent is still alive. A zombie process is just an entry in the process table where the exit status etc are stored until the parent reaps it--all other resources (memory, FDs etc) have been released. So if your "zombie" process is actually taking up resources (which I think you said in an earlier post), there's something else at work. -Doug ^ permalink raw reply [flat|nested] 99+ messages in thread
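Doug's description of a zombie as nothing more than a process table entry holding the exit status can be checked with a few system calls. This is a hedged sketch, assuming Linux and Python 3; `zombie_demo` is a hypothetical name. It creates a child that exits at once, uses waitid() with WNOWAIT to wait for the exit without reaping, shows that SIGKILL changes nothing, and then reaps the child with an ordinary waitpid().

```python
import os
import signal


def zombie_demo(exit_code=7):
    """Return (pid still exists while zombie, exit status seen, pid gone after wait)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)           # child terminates immediately

    # Block until the child has exited, but leave it unreaped (a zombie).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    # The signal is accepted but changes nothing: no code is left to kill.
    os.kill(pid, signal.SIGKILL)
    still_listed = True
    try:
        os.kill(pid, 0)               # signal 0 only checks that the PID exists
    except ProcessLookupError:
        still_listed = False

    # Only the parent's wait() releases the process table entry.
    _, status = os.waitpid(pid, 0)
    gone = False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        gone = True
    return still_listed, os.WEXITSTATUS(status), gone


if __name__ == "__main__":
    print(zombie_demo())
```

The stored exit status survives the SIGKILL untouched, which matches the point made above: a true zombie holds no memory or file descriptors, so a "zombie" that keeps a device locked is probably something else.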
* Re: is killing zombies possible w/o a reboot? 2004-11-03 18:53 ` Gene Heskett 2004-11-03 19:01 ` Doug McNaught @ 2004-11-03 19:03 ` Måns Rullgård 2004-11-03 19:24 ` Gene Heskett 2004-11-03 19:06 ` Valdis.Kletnieks 2004-11-03 19:26 ` DervishD 3 siblings, 1 reply; 99+ messages in thread From: Måns Rullgård @ 2004-11-03 19:03 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, DervishD Gene Heskett <gene.heskett@verizon.net> writes: > On Wednesday 03 November 2004 12:44, DervishD wrote: >> Hi Gene :) >> >> * Gene Heskett <gene.heskett@verizon.net> dixit: >>> > Or write a little program that just 'wait()'s for the >>> > specified PID's. That is perfectly portable IMHO. But I must >>> > admit that the preferred way should be killing the parent. >>> > 'init' will reap the children after that. >>> >>> But what if there is no parent, since the system has already >>> disposed of it? >> >> Then the children are reparented to 'init' and 'init' gets rid >> of them. That's the way UNIX behaves. > > Unforch, I've *never* had it work that way. Any dead process I've > ever had while running linux has only been disposable by a reboot. That's because its parent was still sitting around refusing to wait() for them. -- Måns Rullgård mru@inprovide.com ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:03 ` Måns Rullgård @ 2004-11-03 19:24 ` Gene Heskett 2004-11-03 19:33 ` Doug McNaught 2004-11-03 19:34 ` Måns Rullgård 0 siblings, 2 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-03 19:24 UTC (permalink / raw) To: linux-kernel; +Cc: Måns Rullgård, DervishD On Wednesday 03 November 2004 14:03, Måns Rullgård wrote: >Gene Heskett <gene.heskett@verizon.net> writes: >> On Wednesday 03 November 2004 12:44, DervishD wrote: >>> Hi Gene :) >>> >>> * Gene Heskett <gene.heskett@verizon.net> dixit: >>>> > Or write a little program that just 'wait()'s for the >>>> > specified PID's. That is perfectly portable IMHO. But I must >>>> > admit that the preferred way should be killing the parent. >>>> > 'init' will reap the children after that. >>>> >>>> But what if there is no parent, since the system has already >>>> disposed of it? >>> >>> Then the children are reparented to 'init' and 'init' gets rid >>> of them. That's the way UNIX behaves. >> >> Unforch, I've *never* had it work that way. Any dead process I've >> ever had while running linux has only been disposable by a reboot. > >That's because its parent was still sitting around refusing to > wait() for them. Define 'parent' when it was a click on the apps icon on the xwindow screen that started it, please. -- Cheers, gene gheskett at wdtv dot com 99.28% setiathome rank, not too bad for a WV hillbilly ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:24 ` Gene Heskett @ 2004-11-03 19:33 ` Doug McNaught 2004-11-03 19:34 ` Måns Rullgård 1 sibling, 0 replies; 99+ messages in thread From: Doug McNaught @ 2004-11-03 19:33 UTC (permalink / raw) To: gheskett; +Cc: linux-kernel, Måns Rullgård, DervishD Gene Heskett <gheskett@wdtv.com> writes: > On Wednesday 03 November 2004 14:03, Måns Rullgård wrote: >> >>That's because its parent was still sitting around refusing to >> wait() for them. > > Define 'parent' when it was a click on the apps icon on the xwindow > screen that started it, please. Whichever process called fork() to create the app process is the parent. Sounds like it's some component of the desktop environment. -Doug ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:24 ` Gene Heskett 2004-11-03 19:33 ` Doug McNaught @ 2004-11-03 19:34 ` Måns Rullgård 1 sibling, 0 replies; 99+ messages in thread From: Måns Rullgård @ 2004-11-03 19:34 UTC (permalink / raw) To: gheskett; +Cc: linux-kernel, DervishD Gene Heskett <gheskett@wdtv.com> writes: > On Wednesday 03 November 2004 14:03, Måns Rullgård wrote: >>Gene Heskett <gene.heskett@verizon.net> writes: >>> On Wednesday 03 November 2004 12:44, DervishD wrote: >>>> Hi Gene :) >>>> >>>> * Gene Heskett <gene.heskett@verizon.net> dixit: >>>>> > Or write a little program that just 'wait()'s for the >>>>> > specified PID's. That is perfectly portable IMHO. But I must >>>>> > admit that the preferred way should be killing the parent. >>>>> > 'init' will reap the children after that. >>>>> >>>>> But what if there is no parent, since the system has already >>>>> disposed of it? >>>> >>>> Then the children are reparented to 'init' and 'init' gets rid >>>> of them. That's the way UNIX behaves. >>> >>> Unforch, I've *never* had it work that way. Any dead process I've >>> ever had while running linux has only been disposable by a reboot. >> >>That's because its parent was still sitting around refusing to >> wait() for them. > > Define 'parent' when it was a click on the apps icon on the xwindow > screen that started it, please. Run "ps axf". -- Måns Rullgård mru@inprovide.com ^ permalink raw reply [flat|nested] 99+ messages in thread
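For readers who want the same information as "ps axf" in a form that is easy to filter, the following sketch lists every zombie together with its parent. It is an illustration only: it assumes Linux with /proc mounted and Python 3, and the names `read_stat` and `find_zombies` are made up here.

```python
import os


def read_stat(pid):
    """Return (comm, state, ppid) parsed from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat", errors="replace") as f:
        data = f.read()
    # comm is wrapped in parentheses and may itself contain spaces or ')'.
    left, right = data.index("("), data.rindex(")")
    fields = data[right + 2:].split()
    return data[left + 1:right], fields[0], int(fields[1])


def find_zombies():
    """List (pid, comm, ppid, parent_comm) for every process in state Z."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            comm, state, ppid = read_stat(entry)
            if state != "Z":
                continue
            parent_comm = read_stat(ppid)[0] if ppid else "?"
        except OSError:
            continue  # the process went away while we were scanning
        found.append((int(entry), comm, ppid, parent_comm))
    return found


if __name__ == "__main__":
    for pid, comm, ppid, parent in find_zombies():
        print(f"zombie {pid} ({comm}) is waiting on parent {ppid} ({parent})")
```

The parent named in the last column is the process that would have to call wait(), or be killed, for the zombie to disappear.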
* Re: is killing zombies possible w/o a reboot?
From: Valdis.Kletnieks @ 2004-11-03 19:06 UTC
To: gene.heskett; +Cc: linux-kernel, DervishD, Måns Rullgård

On Wed, 03 Nov 2004 13:53:39 EST, Gene Heskett said:
> On Wednesday 03 November 2004 12:44, DervishD wrote:
> > Then the children are reparented to 'init' and 'init' gets rid
> > of them. That's the way UNIX behaves.
>
> Unforch, I've *never* had it work that way. Any dead process I've
> ever had while running linux has only been disposable by a reboot.

The problem likely isn't the true "zombie" - the only thing that *those* processes have left is a process table entry to save the exit code for a wait() syscall that might not happen anytime soon. And unless you have hundreds of them sitting around causing pressure on the 32K process limit, they're probably not a big problem.

More likely, what you're looking at is some process that has gone down into the kernel on some syscall or other and gotten blocked. Since signals aren't delivered until it returns, it ends up "unkillable".

Traditionally, a common cause for such wedging was a lost/misplaced interrupt from an I/O operation, so a read()/write()/ioctl() call wouldn't return because the device hadn't reported it completed. (tape drives were notorious for this). Often, power-cycling the I/O device would cause an unsolicited interrupt to be generated, which would clear the "waiting for interrupt" issue and allow the process to return....
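Telling a true zombie apart from a process blocked inside the kernel comes down to the state letter: Z for a zombie, D for uninterruptible sleep. A rough sketch of that check follows, assuming Linux with /proc and Python 3. The function names are invented for illustration, and /proc/<pid>/wchan is treated as best effort because it may be absent or unreadable depending on kernel configuration and permissions.

```python
import os

STATE_MEANING = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (usually waiting on I/O)",
    "T": "stopped",
    "Z": "zombie (exited, waiting to be reaped)",
}


def process_state(pid):
    """Return the one-letter state from the 'State:' line of /proc/<pid>/status."""
    with open(f"/proc/{pid}/status", errors="replace") as f:
        for line in f:
            if line.startswith("State:"):
                return line.split()[1]
    raise RuntimeError("no State: line found")


def kernel_wait_channel(pid):
    """Best effort: name of the kernel function the task sleeps in, or None."""
    try:
        with open(f"/proc/{pid}/wchan") as f:
            name = f.read().strip()
    except OSError:
        return None
    return name if name not in ("", "0") else None


def diagnose(pid):
    """Return (state letter, human-readable description) for a PID."""
    state = process_state(pid)
    text = STATE_MEANING.get(state, "other")
    if state == "D":
        text += f", blocked in {kernel_wait_channel(pid) or 'unknown'}"
    return state, text


if __name__ == "__main__":
    print(diagnose(os.getpid()))
```

A process that stays in state D and shows a wait channel inside a driver would fit the "lost interrupt" explanation above; killing its parent will not help in that case.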
* Re: is killing zombies possible w/o a reboot?
From: Gene Heskett @ 2004-11-03 19:26 UTC
To: linux-kernel; +Cc: Valdis.Kletnieks, DervishD, Måns Rullgård

On Wednesday 03 November 2004 14:06, Valdis.Kletnieks@vt.edu wrote:
>On Wed, 03 Nov 2004 13:53:39 EST, Gene Heskett said:
>> On Wednesday 03 November 2004 12:44, DervishD wrote:
>> > Then the children are reparented to 'init' and 'init' gets
>> > rid of them. That's the way UNIX behaves.
>>
>> Unforch, I've *never* had it work that way. Any dead process I've
>> ever had while running linux has only been disposable by a reboot.
>
>The problem likely isn't the true "zombie" - the only thing that
> *those* processes have left is a process table entry to save the
> exit code for a wait() syscall that might not happen anytime soon.
> And unless you have hundreds of them sitting around causing
> pressure on the 32K process limit, they're probably not a big
> problem.
>
>More likely, what you're looking at is some process that has gone
> down into the kernel on some syscall or other and gotten blocked.
> Since signals aren't delivered until it returns, it ends up
> "unkillable".
>
>Traditionally, a common cause for such wedging was a lost/misplaced
> interrupt from an I/O operation, so a read()/write()/ioctl() call
> wouldn't return because the device hadn't reported it completed.
> (tape drives were notorious for this). Often, power-cycling the I/O
> device would cause an unsolicited interrupt to be generated, which
> would clear the "waiting for interrupt" issue and allow the process
> to return....

Well, since the "device", a bt878-based Hauppauge TV card, is sitting in a PCI socket, that's even more drastic than a reboot.

-- 
Cheers, gene
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:26 ` Gene Heskett @ 2004-11-03 19:33 ` Valdis.Kletnieks 2004-11-03 20:09 ` Gene Heskett 2004-11-03 19:42 ` DervishD 1 sibling, 1 reply; 99+ messages in thread From: Valdis.Kletnieks @ 2004-11-03 19:33 UTC (permalink / raw) To: gheskett; +Cc: linux-kernel, DervishD, Måns Rullgård [-- Attachment #1: Type: text/plain, Size: 382 bytes --] On Wed, 03 Nov 2004 14:26:23 EST, Gene Heskett said: > Well, since the "device", a bt878 based Haupagge tv card is sitting in > a pci socket, thats even more drastic than a reboot. Not if you have a good hot-swap PCI cage. ;) Anyhow, that points even more at a driver issue for the bt878 - if you can get Sysrq-T output, where does it say the hung process is inside the kernel? [-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --] ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:33 ` Valdis.Kletnieks @ 2004-11-03 20:09 ` Gene Heskett 2004-11-04 19:24 ` Bill Davidsen 0 siblings, 1 reply; 99+ messages in thread From: Gene Heskett @ 2004-11-03 20:09 UTC (permalink / raw) To: linux-kernel; +Cc: Valdis.Kletnieks, DervishD, Måns Rullgård On Wednesday 03 November 2004 14:33, Valdis.Kletnieks@vt.edu wrote: >On Wed, 03 Nov 2004 14:26:23 EST, Gene Heskett said: >> Well, since the "device", a bt878 based Haupagge tv card is >> sitting in a pci socket, thats even more drastic than a reboot. > >Not if you have a good hot-swap PCI cage. ;) > >Anyhow, that points even more at a driver issue for the bt878 - >if you can get Sysrq-T output, where does it say the hung process is >inside the kernel? Thats another thing I've had compiled in since forever, but it so seldom actually *works*, I've tended to forget about it. -- Cheers, gene gheskett at wdtv dot com 99.28% setiathome rank, not too bad for a WV hillbilly ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 20:09 ` Gene Heskett @ 2004-11-04 19:24 ` Bill Davidsen 0 siblings, 0 replies; 99+ messages in thread From: Bill Davidsen @ 2004-11-04 19:24 UTC (permalink / raw) To: gheskett Cc: linux-kernel, Valdis.Kletnieks, DervishD, Måns Rullgård Gene Heskett wrote: > On Wednesday 03 November 2004 14:33, Valdis.Kletnieks@vt.edu wrote: > >>On Wed, 03 Nov 2004 14:26:23 EST, Gene Heskett said: >> >>>Well, since the "device", a bt878 based Haupagge tv card is >>>sitting in a pci socket, thats even more drastic than a reboot. >> >>Not if you have a good hot-swap PCI cage. ;) >> >>Anyhow, that points even more at a driver issue for the bt878 - >>if you can get Sysrq-T output, where does it say the hung process is >>inside the kernel? > > > Thats another thing I've had compiled in since forever, but it so > seldom actually *works*, I've tended to forget about it. > You have it enabled as well as compiled in, I'm sure. -- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
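Whether SysRq is merely compiled in or also enabled can be read from /proc/sys/kernel/sysrq. The sketch below is an assumption-laden illustration in Python 3: it interprets the value the way current kernel documentation describes it (0 = disabled, 1 = all functions, larger values = bitmask, with bit 0x8 covering debugging dumps such as the task list from SysRq-T). Kernels of the vintage discussed in this thread may have treated the setting as a plain on/off switch, so take the bitmask part as applying to newer systems. The function names are hypothetical.

```python
SYSRQ_ENABLE_DUMP = 0x0008  # documented as "debugging dumps of processes etc."


def sysrq_allows_task_dump(value):
    """Interpret a kernel.sysrq value: 0 = off, 1 = everything, >1 = bitmask."""
    return value == 1 or bool(value & SYSRQ_ENABLE_DUMP)


def current_sysrq_setting(path="/proc/sys/kernel/sysrq"):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = current_sysrq_setting()
    if value is None:
        print("kernel.sysrq not readable (SysRq support may not be built in)")
    else:
        allowed = sysrq_allows_task_dump(value)
        print(f"kernel.sysrq = {value}; keyboard SysRq-T allowed: {allowed}")
```

According to the kernel's SysRq documentation, this setting only restricts the keyboard combination; root can still request the same task dump by writing the letter t to /proc/sysrq-trigger and reading the result from the kernel log.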
* Re: is killing zombies possible w/o a reboot?
From: DervishD @ 2004-11-03 19:42 UTC
To: Gene Heskett; +Cc: linux-kernel, Valdis.Kletnieks, Måns Rullgård

Hi Gene :)

* Gene Heskett <gheskett@wdtv.com> dixit:
> >Traditionally, a common cause for such wedging was a lost/misplaced
> > interrupt from an I/O operation, so a read()/write()/ioctl() call
> > wouldn't return because the device hadn't reported it completed.
> > (tape drives were notorious for this). Often, power-cycling the I/O
> > device would cause an unsolicited interrupt to be generated, which
> > would clear the "waiting for interrupt" issue and allow the process
> > to return....
> Well, since the "device", a bt878 based Haupagge tv card is sitting in
> a pci socket, thats even more drastic than a reboot.

Do you mean your Hauppauge got stuck in disk-sleep state? Wow, that sounds *weird*...

I think that the parent (which is whatever process did the fork when you clicked your mouse) is still alive and forgetting to do the 'wait()' for its children.

Raúl Núñez de Arenas Coronado
-- 
Linux Registered User 88736
http://www.dervishd.net & http://www.pleyades.net/
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:42 ` DervishD @ 2004-11-03 23:12 ` Bill Davidsen 2004-11-04 10:26 ` DervishD 0 siblings, 1 reply; 99+ messages in thread From: Bill Davidsen @ 2004-11-03 23:12 UTC (permalink / raw) To: linux-kernel, DervishD Cc: Gene Heskett, linux-kernel, Valdis.Kletnieks, Måns Rullgård DervishD wrote: > Hi Gene :) > > * Gene Heskett <gheskett@wdtv.com> dixit: > >>>Traditionally, a common cause for such wedging was a lost/misplaced >>>interrupt from an I/O operation, so a read()/write()/ioctl() call >>>wouldn't return because the device hadn't reported it completed. >>>(tape drives were notorious for this). Often, power-cycling the I/O >>>device would cause an unsolicited interrupt to be generated, which >>>would clear the "waiting for interrupt" issue and allow the process >>>to return.... >> >>Well, since the "device", a bt878 based Haupagge tv card is sitting in >>a pci socket, thats even more drastic than a reboot. > > > Do you mean your Hauppage got stuck in disk-sleep state? Wow, > that's sound *weird*... > > I think that the parent (which is whatever process did the fork > when you clicked your mouse) is still alive and forgetting to do the > 'wait()' for its children. It would be good to know what the PPID is, from ps or similar. Things from X are a pain, the parent is often something you don't want to kill. Sometimes you can reparent from command line, "bash -c foo&" or similar, so the parent can be killed without logging out. I would swear that the parent *is* init in some cases, which is puzzling since they should be reaped. -- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot?
From: DervishD @ 2004-11-04 10:26 UTC
To: Bill Davidsen; +Cc: Gene Heskett, linux-kernel, Valdis.Kletnieks, Måns Rullgård

Hi Bill :)

* Bill Davidsen <davidsen@tmr.com> dixit:
> > I think that the parent (which is whatever process did the fork
> >when you clicked your mouse) is still alive and forgetting to do the
> >'wait()' for its children.
> It would be good to know what the PPID is, from ps or similar. Things
> from X are a pain, the parent is often something you don't want to kill.
> Sometimes you can reparent from command line, "bash -c foo&" or similar,
> so the parent can be killed without logging out.

Just use ps to reveal the family tree. It's not that hard ;)

> I would swear that the parent *is* init in some cases, which is puzzling
> since they should be reaped.

But that's OK :))) When a parent dies without waiting for its children, the zombies are reparented to init. That's correct. Then init will wait for them. The problem is that sometimes the signals don't arrive or the like. Then the zombies lie around for a bit, until a timer in 'init' reaps them. That's correct too: init can only wait for children when it receives SIGCHLD or periodically, using a timer. I've written an init program and that's the way I do it, just in case some signal gets lost.

If init is the parent, all works ok, just wait a bit and all those zombies will really die ;)

Raúl Núñez de Arenas Coronado
-- 
Linux Registered User 88736
http://www.dervishd.net & http://www.pleyades.net/
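The "SIGCHLD plus a timer" scheme described here can be sketched in a few lines. This is not DervishD's init or sysvinit, only a minimal illustration assuming a Unix system and Python 3, with invented names (`reap_all`, `run_reaper`, `reaped`). Standard signals are not queued, so several children exiting close together may produce a single SIGCHLD; that is why the handler loops with WNOHANG, and why a periodic pass is kept as a safety net.

```python
import os
import signal
import time

reaped = {}  # pid -> raw wait status


def reap_all():
    """Collect every child that has already exited, without blocking."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return            # no children at all
        if pid == 0:
            return            # children exist, but none has exited yet
        reaped[pid] = status


def run_reaper(poll_interval=1.0, stop_after=None):
    """Reap on SIGCHLD, and also every poll_interval seconds as a fallback."""
    previous = signal.signal(signal.SIGCHLD, lambda signum, frame: reap_all())
    deadline = None if stop_after is None else time.monotonic() + stop_after
    try:
        while deadline is None or time.monotonic() < deadline:
            time.sleep(poll_interval)  # the SIGCHLD handler runs during this sleep
            reap_all()                 # periodic pass in case a signal was missed
    finally:
        # Restore whatever handler was installed before we started.
        signal.signal(signal.SIGCHLD,
                      previous if previous is not None else signal.SIG_DFL)


if __name__ == "__main__":
    for code in (1, 2, 3):
        if os.fork() == 0:
            os._exit(code)
    run_reaper(poll_interval=0.1, stop_after=1.0)
    print({pid: os.WEXITSTATUS(status) for pid, status in reaped.items()})
```

In this usage the children may well exit before the handler is installed, so it is the periodic pass that collects them, which is the lost-signal case the timer is meant to cover.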
* Re: is killing zombies possible w/o a reboot?
From: Paul Slootman @ 2004-11-04 14:23 UTC
To: linux-kernel

DervishD <lkml@dervishd.net> wrote:
>
> If init is the parent, all works ok, just wait a bit and all
>those zombies will really die ;)

I recently had a system with a serial console where for some reason the serial port was stopped. This meant that init blocked while writing some message (e.g. "respawning too rapidly"), and that meant it stopped reaping those zombie processes. The list of these zombie processes with PPID == 1 was amazing. The only thing that helped was rebooting after replacing the serial console cable.

(Kernel 2.4.25, sysvinit 2.85 in case you're wondering.)

Paul Slootman
* Re: is killing zombies possible w/o a reboot?
From: Gene Heskett @ 2004-11-04 14:56 UTC
To: linux-kernel; +Cc: Paul Slootman

On Thursday 04 November 2004 09:23, Paul Slootman wrote:
>DervishD <lkml@dervishd.net> wrote:
>> If init is the parent, all works ok, just wait a bit and all
>>those zombies will really die ;)
>
>I recently had a system with serial console where some some reason
> the serial port was stopped. This meant that init blocked while
> writing some message (e.g. "respawning too rapidly"), and that
> meant it stopped reaping those zombie processes. The list of these
> zombie processes with PPID == 1 was amazing. The only thing that
> helped was rebooting after replacing the serial console cable.
>
>(Kernel 2.4.25, sysvinit 2.85 in case you're wondering.)

Both serial ports are already in use here Paul, one for heyu and x10 stuff related to my home automation (mostly the outside lights), and the other to my Belkin ups, whose usb interface has never worked, so I'm stuck using serial for the BullDog interface to gkrellm. I'd like to find a cheap pci rocketport as I have another vintage box in the basement that could use this machine as a network gateway then. Right now it's on a PL2303 usb<->serial converter but something's wrong with the handshaking on that end.

>Paul Slootman

-- 
Cheers, Gene
* Re: is killing zombies possible w/o a reboot? 2004-11-04 14:23 ` Paul Slootman 2004-11-04 14:56 ` Gene Heskett @ 2004-11-04 18:24 ` DervishD 1 sibling, 0 replies; 99+ messages in thread From: DervishD @ 2004-11-04 18:24 UTC (permalink / raw) To: Paul Slootman; +Cc: linux-kernel Hi Paul :) * Paul Slootman <paul+nospam@wurtel.net> dixit: > > If init is the parent, all works ok, just wait a bit and all > >those zombies will really die ;) > I recently had a system with serial console where some some reason the > serial port was stopped. This meant that init blocked while writing some > message (e.g. "respawning too rapidly"), and that meant it stopped > reaping those zombie processes. The list of these zombie processes with > PPID == 1 was amazing. The only thing that helped was rebooting after > replacing the serial console cable. It looks like a bug in sysvinit: it shouldn't print anything on the console but use syslog and specify that the console NEVER shall be used to print anything even when there is no syslogd running. I'll make sure that it doesn't happen in my VCinit. Thanks for the information :) Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 http://www.dervishd.net & http://www.pleyades.net/ ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 10:26 ` DervishD 2004-11-04 14:23 ` Paul Slootman @ 2004-11-04 19:22 ` Bill Davidsen 2004-11-04 20:53 ` DervishD 1 sibling, 1 reply; 99+ messages in thread From: Bill Davidsen @ 2004-11-04 19:22 UTC (permalink / raw) To: DervishD Cc: Gene Heskett, linux-kernel, Valdis.Kletnieks, Måns Rullgård DervishD wrote: > Hi Bill :) > > * Bill Davidsen <davidsen@tmr.com> dixit: > >>> I think that the parent (which is whatever process did the fork >>>when you clicked your mouse) is still alive and forgetting to do the >>>'wait()' for its children. >> >>It would be good to know what the PPID is, from ps or similar. Things >>from X are a pain, the parent is often something you don't want to kill. >>Sometimes you can reparent from command line, "bash -c foo&" or similar, >>so the parent can be killed without logging out. > > > Just use ps to reveal the family tree. Is not that hard ;) That's what I just said, the original poster should tell us what the PPID is, which may help someone help the OP. > > >>I would swear that the parent *is* init in some cases, which is puzzling >>since they should be reaped. > > > But that's OK :))) When a parent dies without waiting for its > children, the zombies are reparented to init. That's correct. Then > init will wait for them. The problem is that sometimes the signals > doesn't arrive or the like. Then the zombies are laying around a bit, > until a timer in 'init' reaps them. That's correct too: init can only > wait for children when it receives SIGCHLD or periodically, using a > timer. I've written a init program and that's the way I do it, just > in case some signal gets lost. > > If init is the parent, all works ok, just wait a bit and all > those zombies will really die ;) Actually the ones in i/o probably won't, since the kernel either missed the completion or didn't time out if the hardware missed sending the int. 
And even plain non-i/o zombies, just how long "a bit" are you proposing? Over Thanksgiving weekend I will try to look at the init code and see if a signal could be used to initiate a forced reap without waiting for the timer. By "look at" I mean not only "could I do that" but is it a good thing to do, before someone starts trying to explain that it's going to do something evil not to wait for the timer... -- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 19:22 ` Bill Davidsen @ 2004-11-04 20:53 ` DervishD 0 siblings, 0 replies; 99+ messages in thread From: DervishD @ 2004-11-04 20:53 UTC (permalink / raw) To: Bill Davidsen Cc: Gene Heskett, linux-kernel, Valdis.Kletnieks, Måns Rullgård Hi Bill :) * Bill Davidsen <davidsen@tmr.com> dixit: > > If init is the parent, all works ok, just wait a bit and all > >those zombies will really die ;) > Actually the ones in i/o probably won't, since the kernel either missed > the completion or didn't time out if the hardware missed sending the > int. And even plain non-i/o zombies, just how long "a bit" are you > proposing? A zombie *is already dead*, not stuck in some uninterruptible queue in the kernel, so it will be reaped, sure. My last sentence in the paragraph above may be confusing: when I said 'really die' I meant 'be reaped'. > Over Thanksgiving weekend I will try to look at the init code and see if > a signal could be used to initiate a forced reap without waiting for the > timer. By "look at" I mean not only "could I do that" but is it a good > thing to do, before someone starts trying to explain that it's going to > do something evil not to wait for the timer... Don't look: just send SIGCHLD to init. That will do. Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 http://www.dervishd.net & http://www.pleyades.net/ ^ permalink raw reply [flat|nested] 99+ messages in thread
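[Editorial sketch, not part of the original thread.] The reaping scheme described above (wait for children when SIGCHLD arrives, with a periodic pass in case a signal is lost) can be sketched roughly as follows. This is a minimal illustration in Python, not the code of sysvinit or of any real init; the names reap_all and install_sigchld_reaper are hypothetical.

```python
import os
import signal


def reap_all():
    """Reap every child that has already exited, without blocking.

    Returns a list of (pid, raw_status) tuples, one per reaped zombie.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break            # this process has no children at all
        if pid == 0:
            break            # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def install_sigchld_reaper():
    """Hypothetical helper: run reap_all() whenever SIGCHLD is delivered.

    Signals can coalesce, which is why the handler loops over all exited
    children; a timer calling reap_all() now and then would cover the
    "lost signal" case mentioned in the thread.
    """
    signal.signal(signal.SIGCHLD, lambda signum, frame: reap_all())
```

Looping until waitpid() reports nothing left is the important part: one SIGCHLD may stand for several exited children.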
* Re: is killing zombies possible w/o a reboot? 2004-11-03 18:53 ` Gene Heskett ` (2 preceding siblings ...) 2004-11-03 19:06 ` Valdis.Kletnieks @ 2004-11-03 19:26 ` DervishD 2004-11-03 20:18 ` Gene Heskett ` (2 more replies) 3 siblings, 3 replies; 99+ messages in thread From: DervishD @ 2004-11-03 19:26 UTC (permalink / raw) To: Gene Heskett; +Cc: linux-kernel, Måns Rullgård Hi Gene :) * Gene Heskett <gene.heskett@verizon.net> dixit: > > Then the children are reparented to 'init' and 'init' gets rid > > of them. That's the way UNIX behaves. > Unforch, I've *never* had it work that way. Any dead process I've > ever had while running linux has only been disposable by a reboot. Well, you know, shit happens... Anyway, could you define 'dead'? Because if you're talking about zombies whose parent dies, they're easy to get rid of: just wait until init reaps them (usually less than 5 minutes after they die). If you are talking about zombies whose parent is still alive, then it's a bug in the application, not the kernel. In fact I wouldn't like it if the kernel reaped my children before I do, just in case I want to do something. If you're talking about unkillable processes (those stuck in disk-sleep state), you're right: only rebooting can kill them (although sometimes they get out of D state and die normally). Bad luck for you if every dead process you've ever had while running linux has been of this kind :( Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 http://www.dervishd.net & http://www.pleyades.net/ ^ permalink raw reply [flat|nested] 99+ messages in thread
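[Editorial sketch, not part of the original thread.] To make the zombie case concrete: a zombie has already exited, so signals (SIGKILL included) change nothing, and only a wait call by the parent frees its process-table slot. The Python sketch below assumes a Linux system with os.waitid available; the helper names make_zombie, is_present and reap are made up for illustration.

```python
import os
import signal


def make_zombie(exit_code=7):
    """Fork a child that exits at once; return its PID WITHOUT reaping it."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)      # child: terminate immediately
    # WNOWAIT blocks until the child has exited but leaves it waitable,
    # i.e. it stays in the process table as a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    return pid


def is_present(pid):
    """True if the PID still occupies a process-table slot."""
    try:
        os.kill(pid, 0)          # signal 0: existence check, nothing is sent
        return True
    except ProcessLookupError:
        return False


def reap(pid):
    """Collect the exit status, which is what actually removes the zombie."""
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

In use, one would call make_zombie(), send the returned PID a SIGKILL with os.kill(), observe that is_present() is still true, and only see the entry disappear after reap().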
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:26 ` DervishD @ 2004-11-03 20:18 ` Gene Heskett 2004-11-03 22:15 ` Jim Nelson 2004-11-03 23:07 ` Bill Davidsen 2 siblings, 0 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-03 20:18 UTC (permalink / raw) To: linux-kernel; +Cc: DervishD, Måns Rullgård On Wednesday 03 November 2004 14:26, DervishD wrote: > Hi Gene :) > > * Gene Heskett <gene.heskett@verizon.net> dixit: >> > Then the children are reparented to 'init' and 'init' gets >> > rid of them. That's the way UNIX behaves. >> >> Unforch, I've *never* had it work that way. Any dead process I've >> ever had while running linux has only been disposable by a reboot. > > Well, you know, shit happens... Anyway, could you define 'dead'? >Because if you're talking about zombies whose parent dies, they're >killable easily: just wait until init reaps them (usually in less >than 5 minutes since they dead). If you are talking about zombies > who has their parent alive, then it's a bug in the application, not > the kernel. In fact I wouldn't like if the kernel reaps my children > before I do, just in case I want to do something. > > If you're talking about unkillable processes (those stuck in >disk-sleep state), you're right: only rebooting can kill them >(although sometimes they go out of D state and die normally). Bad >luck for you if any dead process you've ever had while running linux >has been of this kind :( > > Raúl Núñez de Arenas Coronado That seems to be the only kind of dead process I get, and that's not too often. Booted to 2.6.10-rc1-bk11 now, it's all working just fine except for on messydos patch that finally must have made it into the tree. As it appears I do not have a prayer of convincing folks otherwise about this issue, I suggest we let this thread die a well deserved death till it bites me or someone else again. 
I'll summarize: os9/nitros9 handles this situation effortlessly and flawlessly, and I expected a 150x more sophisticated OS to do likewise. My mistake. OTOH, it's one hell of a versatile OS IMNSHO. I'm not going away just because it bites me occasionally. -- Cheers, Gene ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:26 ` DervishD 2004-11-03 20:18 ` Gene Heskett @ 2004-11-03 22:15 ` Jim Nelson 2004-11-03 22:44 ` Russell Miller 2004-11-04 16:30 ` Pedro Venda (SYSADM) 2004-11-03 23:07 ` Bill Davidsen 2 siblings, 2 replies; 99+ messages in thread From: Jim Nelson @ 2004-11-03 22:15 UTC (permalink / raw) To: DervishD; +Cc: Gene Heskett, linux-kernel, Måns Rullgård DervishD wrote: > Hi Gene :) > > * Gene Heskett <gene.heskett@verizon.net> dixit: > >>> Then the children are reparented to 'init' and 'init' gets rid >>>of them. That's the way UNIX behaves. >> >>Unforch, I've *never* had it work that way. Any dead process I've >>ever had while running linux has only been disposable by a reboot. > > > Well, you know, shit happens... Anyway, could you define 'dead'? > Because if you're talking about zombies whose parent dies, they're > killable easily: just wait until init reaps them (usually in less > than 5 minutes since they dead). If you are talking about zombies who > has their parent alive, then it's a bug in the application, not the > kernel. In fact I wouldn't like if the kernel reaps my children > before I do, just in case I want to do something. > > If you're talking about unkillable processes (those stuck in > disk-sleep state), you're right: only rebooting can kill them > (although sometimes they go out of D state and die normally). Bad > luck for you if any dead process you've ever had while running linux > has been of this kind :( > I did this to myself a number of times when I was first learning Samba - even an ls would become unkillable. You couldn't rmmod smb, since it was in use, and you couldn't kill the process, since it was waiting on a syscall. Ergh. > Raúl Núñez de Arenas Coronado > ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 22:15 ` Jim Nelson @ 2004-11-03 22:44 ` Russell Miller 2004-11-03 23:03 ` Doug McNaught ` (2 more replies) 2004-11-04 16:30 ` Pedro Venda (SYSADM) 1 sibling, 3 replies; 99+ messages in thread From: Russell Miller @ 2004-11-03 22:44 UTC (permalink / raw) To: Jim Nelson; +Cc: DervishD, Gene Heskett, linux-kernel, Måns Rullgård On Wednesday 03 November 2004 16:15, Jim Nelson wrote: > I did this to myself a number of times when I was first learning Samba - > even an ls would become unkillable. You couldn't rmmod smb, since it was > in use, and you couldn't kill the process, since it was waiting on a > syscall. Ergh. > I'm not going to pretend to be a kernel expert, or really anything other than a newbie when it comes to kernel internals, so please take this with the merits it deserves - many, or none, depending. Anyway, is there a way to simply signal a syscall that it is to be interrupted and forcibly cause the syscall to end? Kicking the program execution out of kernel space would be sufficient to "unstick" the process - and coupling that with an automatic KILL signal may not be a bad idea. I'm pretty sure that someone will think of a way why this wouldn't work with very little effort. Please enlighten me? --Russell -- Russell Miller - rmiller@duskglow.com - Le Mars, IA Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs. http://www.duskglow.com - 712-546-5886 ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 22:44 ` Russell Miller @ 2004-11-03 23:03 ` Doug McNaught 2004-11-03 23:33 ` Russell Miller 2004-11-03 23:06 ` vlobanov 2004-11-04 10:04 ` Helge Hafting 2 siblings, 1 reply; 99+ messages in thread From: Doug McNaught @ 2004-11-03 23:03 UTC (permalink / raw) To: Russell Miller Cc: Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård Russell Miller <rmiller@duskglow.com> writes: > Anyway, is there a way to simply signal a syscall that it is to be > interrupted and forcibly cause the syscall to end? Kicking the > program execution out of kernel space would be sufficient to > "unstick" the process - and coupling that with an automatic KILL > signal may not be a bad idea. It was already mentioned in this thread that the bookkeeping required to clean up properly from such an abort would add a lot of overhead and slow down the normal, non-buggy case. -Doug ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 23:03 ` Doug McNaught @ 2004-11-03 23:33 ` Russell Miller 2004-11-03 23:47 ` Mathieu Segaud ` (2 more replies) 0 siblings, 3 replies; 99+ messages in thread From: Russell Miller @ 2004-11-03 23:33 UTC (permalink / raw) To: Doug McNaught Cc: Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård On Wednesday 03 November 2004 17:03, Doug McNaught wrote: > It was already mentioned in this thread that the bookkeeping required > to clean up properly from such an abort would add a lot of overhead > and slow down the normal, non-buggy case. > I am going to continue pursuing this at the risk of making a bigger fool of myself than I already am, but I want to make sure that I understand the issues - and I did read the message you are referring to. I think what you are saying is that there is kind of a race condition here. When something is on the wait queue, it has to be followed through to completion. An interrupt could be received at any time, and if it's taken off of the wait queue prematurely, it'll crash the kernel, because the interrupt has no way of telling that. That's fine as far as it goes, I understand that. But I submit that this is a horrible design. I've been bitten by this more than once - usually regarding broken NFS connections. But what I don't understand is why the bookkeeping would be so inefficient. It seems to me that all that would be required is a bitfield of some sort. If that position in the wait queue becomes invalid, when the interrupt is received to process it, the kernel notes that a flag is set invalidating that part of the wait queue, dumps the output to /dev/null, and goes on to the next. This doesn't seem inefficient to me, unless I'm missing something. A little more inefficient, yes, but not nearly the cost that seems to be implied. 
And I also have to ask this question: what is more inefficient, slowing down processing of output waiting on the queue, or having to reboot when a process gets stuck due to faulty drivers? At the very least, a compile option seems like it would be worthwhile for those that would like this behavior. And I probably am. Missing something, that is. --Russell > -Doug -- Russell Miller - rmiller@duskglow.com - Le Mars, IA Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs. http://www.duskglow.com - 712-546-5886 ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 23:33 ` Russell Miller @ 2004-11-03 23:47 ` Mathieu Segaud 2004-11-03 23:56 ` Russell Miller 2004-11-04 6:39 ` Denis Vlasenko 2004-11-04 20:06 ` Bill Davidsen 2 siblings, 1 reply; 99+ messages in thread From: Mathieu Segaud @ 2004-11-03 23:47 UTC (permalink / raw) To: Russell Miller Cc: Doug McNaught, Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård Russell Miller <rmiller@duskglow.com> disait dernièrement que : > I am going to continue pursuing this at the risk of making a bigger fool of > myself than I already am, but I want to make sure that I understand the > issues - and I did read the message you are referring to. > > I think what you are saying is that there is kind of a race condition here. > When something is on the wait queue, it has to be followed through to > completion. An interrupt could be received at any time, and if it's taken > off of the wait queue prematurely, it'll crash the kernel, because the > interrupt has no way of telling that. > > That's fine as it goes, I understand that. But I submit that this is a > horrible design. I've been bitten by this more than once - usually regarding > broken NFS connections. this is because nfs related syscalls are not interruptible by default. you can make them interruptible by mounting your nfs's with the 'intr' option. -- I love people saying 'we' even though they never contributed a single line of code to the project! - Jens Axboe turning a troll down on linux-kernel ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 23:47 ` Mathieu Segaud @ 2004-11-03 23:56 ` Russell Miller 2004-11-04 0:05 ` Mathieu Segaud 0 siblings, 1 reply; 99+ messages in thread From: Russell Miller @ 2004-11-03 23:56 UTC (permalink / raw) To: Mathieu Segaud Cc: Doug McNaught, Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård On Wednesday 03 November 2004 17:47, Mathieu Segaud wrote: > this is because nfs related syscalls are not interruptible by default. > you can make them interruptible by mounting your nfs's with the 'intr' > option. That doesn't appear to work, then. Because we do mount them with the intr option, and the behavior doesn't seem to be any different. --Russell -- Russell Miller - rmiller@duskglow.com - Le Mars, IA Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs. http://www.duskglow.com - 712-546-5886 ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 23:56 ` Russell Miller @ 2004-11-04 0:05 ` Mathieu Segaud 0 siblings, 0 replies; 99+ messages in thread From: Mathieu Segaud @ 2004-11-04 0:05 UTC (permalink / raw) To: Russell Miller Cc: Doug McNaught, Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård Russell Miller <rmiller@duskglow.com> disait dernièrement que : > On Wednesday 03 November 2004 17:47, Mathieu Segaud wrote: > >> this is because nfs related syscalls are not interruptible by default. >> you can make them interruptible by mounting your nfs's with the 'intr' >> option. > > That doesn't appear to work, then. Because we do mount them with the intr > option, and the behavior doesn't seem to be any different. Weird, it works here... I can even umount() lost shares... NFS is quite an unknown beast to me, sorry... But it is clearly a bug if you do mount them with -o intr... -- <ajh> I always viewed HURD development like the Special Olympics of free software. - Is Hurd a opponent to Linux? ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 23:33 ` Russell Miller 2004-11-03 23:47 ` Mathieu Segaud @ 2004-11-04 6:39 ` Denis Vlasenko 2004-11-05 2:38 ` Elladan 2004-11-04 20:06 ` Bill Davidsen 2 siblings, 1 reply; 99+ messages in thread From: Denis Vlasenko @ 2004-11-04 6:39 UTC (permalink / raw) To: Russell Miller, Doug McNaught Cc: Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård On Thursday 04 November 2004 01:33, Russell Miller wrote: > On Wednesday 03 November 2004 17:03, Doug McNaught wrote: > > > It was already mentioned in this thread that the bookkeeping required > > to clean up properly from such an abort would add a lot of overhead > > and slow down the normal, non-buggy case. > > > I am going to continue pursuing this at the risk of making a bigger fool of > myself than I already am, but I want to make sure that I understand the > issues - and I did read the message you are referring to. > > I think what you are saying is that there is kind of a race condition here. > When something is on the wait queue, it has to be followed through to > completion. An interrupt could be received at any time, and if it's taken > off of the wait queue prematurely, it'll crash the kernel, because the > interrupt has no way of telling that. The problem is in locking. You must not kill a process while it is in uninterruptible state, because it is uninterruptible for a reason: it has taken a semaphore, or called get_cpu(), etc. You do want it to do put_cpu(), right? Processes must never get stuck in D; that's a kernel bug. Find out how the process ended up in D state forever, and fix it - that's what I'm trying to do in these cases. -- vda ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 6:39 ` Denis Vlasenko @ 2004-11-05 2:38 ` Elladan 2004-11-05 3:10 ` Tim Connors 0 siblings, 1 reply; 99+ messages in thread From: Elladan @ 2004-11-05 2:38 UTC (permalink / raw) To: Denis Vlasenko Cc: Russell Miller, Doug McNaught, Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård On Thu, Nov 04, 2004 at 08:39:34AM +0200, Denis Vlasenko wrote: > On Thursday 04 November 2004 01:33, Russell Miller wrote: > > On Wednesday 03 November 2004 17:03, Doug McNaught wrote: > > > > > It was already mentioned in this thread that the bookkeeping required > > > to clean up properly from such an abort would add a lot of overhead > > > and slow down the normal, non-buggy case. > > > > > I am going to continue pursuing this at the risk of making a bigger fool of > > myself than I already am, but I want to make sure that I understand the > > issues - and I did read the message you are referring to. > > > > I think what you are saying is that there is kind of a race condition here. > > When something is on the wait queue, it has to be followed through to > > completion. An interrupt could be received at any time, and if it's taken > > off of the wait queue prematurely, it'll crash the kernel, because the > > interrupt has no way of telling that. > > The problem is in locking. You must not kill process while it is > in uninterruptible state because it is uninterruptible > for a reason - has taken semaphore, or get_cpu(), etc. > You do want it to do put_cpu(), right? > > Processes must never get stuck in D, it's a kernel bug. > > Find out how did process ended up in D state forever, > and fix it - that's what I'm trying to do > in these cases. Perhaps it would be useful to add some debugging to the kernel for these cases, somewhat akin to Ingo's preempt trace stuff? 
If a process is in D state and receives a SIGKILL, assume it must exit within a few seconds or it's a bug, and dump as much information about it as is practical...? -J ^ permalink raw reply [flat|nested] 99+ messages in thread
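[Editorial sketch, not part of the original thread.] The suggestion above concerns in-kernel debugging, which is not shown here. As a much weaker userspace approximation, one can at least list which processes are currently in D (uninterruptible sleep) or Z (zombie) state by reading /proc/<pid>/stat on Linux. The function names below are hypothetical, and the sketch assumes the usual stat layout where the state letter follows the parenthesised command name.

```python
import os


def parse_state(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces
    or ')', so the split is done after the LAST ')'.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def pids_in_state(wanted, proc_root="/proc"):
    """List PIDs whose current state letter equals `wanted` ('D', 'Z', ...)."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        path = os.path.join(proc_root, entry, "stat")
        try:
            with open(path, errors="replace") as f:
                if parse_state(f.read()) == wanted:
                    found.append(int(entry))
        except OSError:
            continue         # process vanished between listdir() and open()
    return found
```

Calling pids_in_state("D") twice a few seconds apart and intersecting the results gives a rough list of processes that stay stuck, which is a starting point for the "find out how it got there" advice earlier in the thread.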
* Re: is killing zombies possible w/o a reboot? 2004-11-05 2:38 ` Elladan @ 2004-11-05 3:10 ` Tim Connors 2004-11-05 3:17 ` Russell Miller ` (2 more replies) 0 siblings, 3 replies; 99+ messages in thread From: Tim Connors @ 2004-11-05 3:10 UTC (permalink / raw) To: Elladan Cc: Denis Vlasenko, Russell Miller, Doug McNaught, Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård Elladan <elladan@eskimo.com> said on Thu, 4 Nov 2004 18:38:50 -0800: > If a process is in D state and receives a SIGKILL, assume it must exit > within a few seconds or it's a bug, and dump as much information about > it as is practical...? Of course, it's not necessarily a bug. Someone could have just kicked the ethernet, and so your process is stuck waiting for a read/write. -- TimC -- http://astronomy.swin.edu.au/staff/tconnors/ Theoretically one might have been wearing pants at work. -- Anthony de Boer in Scary Devil Monastry ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-05 3:10 ` Tim Connors @ 2004-11-05 3:17 ` Russell Miller 2004-11-05 4:38 ` Elladan 2004-11-05 5:00 ` Kyle Moffett 2 siblings, 0 replies; 99+ messages in thread From: Russell Miller @ 2004-11-05 3:17 UTC (permalink / raw) To: Tim Connors Cc: Elladan, Denis Vlasenko, Doug McNaught, Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård On Thursday 04 November 2004 21:10, Tim Connors wrote: > Of course, it's not necessarily a bug. Someone could have just kicked > the ethernet, and so your process is stuck waiting for a read/write. But it *is* a process hung in D state after you sent it a kill. It's safe to assume, at least, that something is screwed up somewhere. More information is always a good thing. --Russell -- Russell Miller - rmiller@duskglow.com - Le Mars, IA Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs. http://www.duskglow.com - 712-546-5886 ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-05 3:10 ` Tim Connors 2004-11-05 3:17 ` Russell Miller @ 2004-11-05 4:38 ` Elladan 2004-11-05 5:00 ` Kyle Moffett 2 siblings, 0 replies; 99+ messages in thread From: Elladan @ 2004-11-05 4:38 UTC (permalink / raw) To: Tim Connors Cc: Elladan, Denis Vlasenko, Russell Miller, Doug McNaught, Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård On Fri, Nov 05, 2004 at 02:10:35PM +1100, Tim Connors wrote: > Elladan <elladan@eskimo.com> said on Thu, 4 Nov 2004 18:38:50 -0800: > > If a process is in D state and receives a SIGKILL, assume it must exit > > within a few seconds or it's a bug, and dump as much information about > > it as is practical...? > > Of course, it's not necessarily a bug. Someone could have just kicked > the ethernet, and so your process is stuck waiting for a read/write. Sounds like a bug to me. Kernel resource leak due to network activity? -J ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-05 3:10 ` Tim Connors 2004-11-05 3:17 ` Russell Miller 2004-11-05 4:38 ` Elladan @ 2004-11-05 5:00 ` Kyle Moffett 2 siblings, 0 replies; 99+ messages in thread From: Kyle Moffett @ 2004-11-05 5:00 UTC (permalink / raw) To: Tim Connors Cc: Denis Vlasenko, DervishD, Russell Miller, Elladan, linux-kernel, Jim Nelson, Måns Rullgård, Gene Heskett, Doug McNaught On Nov 04, 2004, at 22:10, Tim Connors wrote: > Elladan <elladan@eskimo.com> said on Thu, 4 Nov 2004 18:38:50 -0800: >> If a process is in D state and receives a SIGKILL, assume it must exit >> within a few seconds or it's a bug, and dump as much information about >> it as is practical...? > > Of course, it's not necessarily a bug. Someone could have just kicked > the ethernet, and so your process is stuck waiting for a read/write. In any case, if a process is sleeping in-kernel, I expect that either it's an interruptible sleep or a guaranteed-short sleep. If it's neither, it's a bug. If I kick out an ethernet and it makes "ping" hang in "D", that's bad. I think that eventually _all_ kernel sleeps on the behalf of user-space processes will become interruptible. Cheers, Kyle Moffett -----BEGIN GEEK CODE BLOCK----- Version: 3.12 GCM/CS/IT/U d- s++: a17 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$ L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+ PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r !y?(-) ------END GEEK CODE BLOCK------ ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 23:33 ` Russell Miller 2004-11-03 23:47 ` Mathieu Segaud 2004-11-04 6:39 ` Denis Vlasenko @ 2004-11-04 20:06 ` Bill Davidsen 2 siblings, 0 replies; 99+ messages in thread From: Bill Davidsen @ 2004-11-04 20:06 UTC (permalink / raw) To: Russell Miller Cc: Doug McNaught, Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård Russell Miller wrote: > On Wednesday 03 November 2004 17:03, Doug McNaught wrote: > > >>It was already mentioned in this thread that the bookkeeping required >>to clean up properly from such an abort would add a lot of overhead >>and slow down the normal, non-buggy case. >> > > I am going to continue pursuing this at the risk of making a bigger fool of > myself than I already am, but I want to make sure that I understand the > issues - and I did read the message you are referring to. > > I think what you are saying is that there is kind of a race condition here. At least in the usual sense, no. There is a condition from which there is no graceful way back, only forward. > When something is on the wait queue, it has to be followed through to > completion. An interrupt could be received at any time, and if it's taken > off of the wait queue prematurely, it'll crash the kernel, because the > interrupt has no way of telling that. That's part of it, but in some cases there's also i/o in progress, the hardware may not have a way to HALT the transfer, so the memory in question can't be used for something else. > > That's fine as it goes, I understand that. But I submit that this is a > horrible design. I've been bitten by this more than once - usually regarding > broken NFS connections. > > But what I don't understand is why the bookkeeping would be so inefficient. > It seems to me that all that would be required is a bitfield of some sort. 
> If that position in the qait queue becomes invalid, when the interrupt is > received to process it, the kernel notes that a flag is set invalidating that > part of the wait queue, dumps the output to dave null, and goes on to the > next. This doesn't seem inefficient to me, unless I'm missing something. > A little more inefficient, yes, but not to near the cost that seems to be > implied. > > And I also have to ask this question: what is more inefficient, slowing down > processing of output waiting on the queue, or having to reboot when a process > gets stuck due to faulty drivers? At the very least, a compile option seems > like it would be worthwhile for those that would like this behavior. > > And I probably am. Missing something, that is. You are asking to program around a problem rather than fix it. These hangs (usually) happen because the hardware behaviour is either undocumented, incorrectly documented, or flat out broken. Second likely cause is a bug in the driver. In the case of a real bug, adding code to bypass the error instead of fixing it is more effort, more complex in most cases, and therefore less reliable. Where the hardware does something unexpected, the driver needs to fit the behaviour rather than the spec. And where the hardware is broken, you fix or replace it. None of those cases suggest "pretend it didn't happen," because in most cases you can't. What I think you are missing: Processes hung in D state are the result of real problems, and ignoring rather than fixing them is like giving a cancer patient a face lift; it doesn't fix the problem, it just gives you a good looking corpse. -- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 22:44 ` Russell Miller 2004-11-03 23:03 ` Doug McNaught @ 2004-11-03 23:06 ` vlobanov 2004-11-04 10:04 ` Helge Hafting 2 siblings, 0 replies; 99+ messages in thread From: vlobanov @ 2004-11-03 23:06 UTC (permalink / raw) To: Russell Miller Cc: Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård

Also a kernel newbie here, so apply an appropriate amount of salt to this
response. :)

One common scenario for why a program is blocked within a syscall is
that it is waiting for data to arrive. Consider, for example, a read()
on a file -- simplifying a lot, the data has to be fetched from disk,
which is slow. So, while the disk is doing its thing, the program is
blocked within the system call. Then, when an interrupt arrives
signalling that the data is ready, it is placed into the user-space
buffer, and the program is kicked out of the syscall so that it can
continue executing.

Consider what happens if the program suddenly dies within the read()
syscall above: when the data from disk comes back, the kernel needs to
figure out where to put it. This would make for a very confused kernel,
since the original requester "vanished" without a trace. Even worse,
another program might have taken the original program's place in the
meantime! Very bad things happen.

This is certainly not an _impossible_ problem to solve (as far as I
know), but solving it in the general case would involve a lot of
expensive and complex book-keeping code, so it's simply not done.

Am I right? Wrong? Please enlighten me as well. :)

-Vadim Lobanov

On Wed, 3 Nov 2004, Russell Miller wrote:

> On Wednesday 03 November 2004 16:15, Jim Nelson wrote:
>
> > I did this to myself a number of times when I was first learning Samba -
> > even an ls would become unkillable. You couldn't rmmod smb, since it was
> > in use, and you couldn't kill the process, since it was waiting on a
> > syscall. Ergh.
> > > > I'm not going to pretend to be a kernel expert, or really anything other than > a newbie when it comes to kernel internals, so please take this with the > merits it deserves - many, or none, depending. > > Anyway, is there a way to simply signal a syscall that it is to be interrupted > and forcibly cause the syscall to end? Kicking the program execution out of > kernel space would be sufficient to "unstick" the process - and coupling that > with an automatic KILL signal may not be a bad idea. > > I'm pretty sure that someone will think of a way why this wouldn't work with > very little effort. Please enlighten me? > > --Russell > > -- > > Russell Miller - rmiller@duskglow.com - Le Mars, IA > Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs. > http://www.duskglow.com - 712-546-5886 > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 22:44 ` Russell Miller 2004-11-03 23:03 ` Doug McNaught 2004-11-03 23:06 ` vlobanov @ 2004-11-04 10:04 ` Helge Hafting 2004-11-04 17:16 ` Alex Bennee 2 siblings, 1 reply; 99+ messages in thread From: Helge Hafting @ 2004-11-04 10:04 UTC (permalink / raw) To: Russell Miller Cc: Jim Nelson, DervishD, Gene Heskett, linux-kernel, Måns Rullgård

Russell Miller wrote:

>On Wednesday 03 November 2004 16:15, Jim Nelson wrote:
>
>>I did this to myself a number of times when I was first learning Samba -
>>even an ls would become unkillable. You couldn't rmmod smb, since it was
>>in use, and you couldn't kill the process, since it was waiting on a
>>syscall. Ergh.
>
>I'm not going to pretend to be a kernel expert, or really anything other than
>a newbie when it comes to kernel internals, so please take this with the
>merits it deserves - many, or none, depending.
>
>Anyway, is there a way to simply signal a syscall that it is to be interrupted
>and forcibly cause the syscall to end?
>
There is a way. Processes going into D state happens all the time
when waiting for disk io or similar. Then the io happens a few ms later,
and the fs or device driver tells the kernel to wake up the process
so it gets a chance at the next scheduling opportunity. So the mechanism to
unstick a process exists, and is used by every device driver that
uses sleeping. Which is most of them.

Breakage happens when something never comes out of D-state.
One could write a trivial syscall (or addition to "kill") that "wakes"
processes waiting for io. It isn't hard to do at all - just copy the
waking code from any device driver. This will allow one to kill and
fully remove any process that hangs around in D-state. This might
also release other stuck resources as the syscall
continues, returns to userspace, and allows the process to die.

Unfortunately, this isn't enough.
In some cases the syscall
expects the io device interrupt handler to have done something
vital - but this hasn't happened when we forcibly wake a process.
We can hope for an io error, but might get a crash instead. This
can be fixed with a lot of work - basically check at every wakeup
whether the process was woken by this new killing mechanism and
act accordingly. It shouldn't be hard, but it is _lots_ of work
inspecting every sleeping point, at least in every device driver.

Another problem exists if the long-waiting io wasn't lost - just
extremely slow. If the io actually comes through after the process
is gone and the memory is used for something else - bang!
Dealing properly with this case is harder - a new generic
mechanism for cancelling outstanding io requests is needed for this.

It might even be impossible in some cases. If a memory address
is handed over to a bus-mastering device such as a scsi adapter, then
the memory must be pinned down until the operation completes. It cannot
be released. The rest of the process can go, but the hw might not
support any way of cancelling the request. A few may have a way, many
won't. Some devices can be reset - but at a considerable cost. A
disk controller might be unavailable for seconds during such a reset -
instant DoS attack if a user keeps starting lots of disk intensive
processes and kills them off while in a D-state that normally lasts
way shorter than a reset. PCI devices can be turned off, but
we might really want to use them again . . .

Fortunately, most cases of long-running D-state are just driver bugs
and can be fixed as such. nfs has a forced umount option. If
samba can hang, then it _can_ be fixed in similar ways. (smbfs
is software only - no quirky hw to deal with.)

Hw drivers that put processes into everlasting D-state usually
do so because of a bug. (Lost request or interrupt because of internal
errors.) Fix that, and the problem never happens.
So the hard problem
of killing stuff stuck in D-state doesn't need a solution - fix
the real bug instead.

Having a way to kill such processes will only mean that
hard-to-trigger bugs won't get fixed because there is a workaround.
This is bad for stability too, as broken hw drivers can hang the
kernel even if a better process killer comes into existence.

> Kicking the program execution out of
>kernel space would be sufficient to "unstick" the process - and coupling that
>with an automatic KILL signal may not be a bad idea.
>
>I'm pretty sure that someone will think of a way why this wouldn't work with
>very little effort. Please enlighten me?
>
It is doable - but not with "very little effort". I have outlined
above the trouble you get if you trivially wake up the sleeping process.

Another trivial alternative is to remove the process while it is in-kernel.
The downside is that it might be holding a lock or semaphore that
won't ever be released this way. And no, locks aren't necessarily accounted
for anywhere. (They are implicitly accounted for by the fact that a process
exists whose future execution path leads to the release of said lock.)
Explicit accounting that allows lock-breaking is deemed too slow, and
what to do about the data structures the lock/semaphore was protecting?
The stuck process is a sign of another bug - better fix that one.

Helge Hafting

^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 10:04 ` Helge Hafting @ 2004-11-04 17:16 ` Alex Bennee 0 siblings, 0 replies; 99+ messages in thread From: Alex Bennee @ 2004-11-04 17:16 UTC (permalink / raw) To: Helge Hafting Cc: Russell Miller, Jim Nelson, DervishD, Gene Heskett, Linux Kernel Mailing List, Måns Rullgård On Thu, 2004-11-04 at 10:04, Helge Hafting wrote: > Russell Miller wrote: > >On Wednesday 03 November 2004 16:15, Jim Nelson wrote: > > > >Anyway, is there a way to simply signal a syscall that it is to be interrupted > >and forcibly cause the syscall to end? > > > There is a way. Processes go into D state happens all the time > when waiting for disk io or similiar. Then the io happens a few ms later, > and the fs or device driver tells the kernel to wake up the process > so it gets a chance at the next scheduling opportunity. So the mechanism to > unstick a prcess exists, and is used by every device driver that > use sleeping. Which is most of them. > > Breakage happens when something never comes out of D-state. > One could write a trivial syscall (or addition to "kill") that "wakes" > processes waiting for io. It itsn't hard to do at all - just copy the > waking code from any device driver. This will allow to kill and > fully remove any process that hangs around in D-state. This might > also release other stuck resources as the syscall > continues, returns to userspace, and allows the process to die. > > Unfortunately, this isn't enough. In some cases the syscall > expects the io device interrupt handler to have done something > vital - but this haven't happened when we forcibly wakes a process. > We can hope for an io error, but might get a crash instead. This > can be fixes with a lot of work - basically check at every wakeup > if the process were woken by this new killing mechanism and > act accordingly. It shouldn't be hard, but _lots_ of work > inspecting every sleeping point, at least every device driver. 

Timeouts and interruptible sleeps are the two ways to solve the problem.
All good drivers should have covering timeouts in case the event they
were hoping for never happens. If the code path that assumes magic has
happened after it wakes up doesn't check, it's not defensive enough.

Also you can make tasks interruptible so signals can get through:

	result = wait_event_interruptible(dev->waitq, dev_irq_event(dev));
	if (result) {
		printk(KERN_ALERT "dev_irq_wait: Interrupted by a signal\n");
		return -ERESTARTSYS;
	}

As you have noted you can't always make things interruptible, but decent
timeouts should always exist. Hardware has bugs too!

--
Alex, Kernel Hacker: http://www.bennee.com/~alex/

In English, every word can be verbed. Would that it were so in our
programming languages.

^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 22:15 ` Jim Nelson 2004-11-03 22:44 ` Russell Miller @ 2004-11-04 16:30 ` Pedro Venda (SYSADM) 2004-11-04 22:28 ` Helge Hafting 1 sibling, 1 reply; 99+ messages in thread From: Pedro Venda (SYSADM) @ 2004-11-04 16:30 UTC (permalink / raw) To: linux-kernel; +Cc: Jim Nelson

Jim Nelson wrote:
> DervishD wrote:
>
>> Hi Gene :)
>>
>> * Gene Heskett <gene.heskett@verizon.net> dixit:
>>
>>>> Then the children are reparented to 'init' and 'init' gets rid
>>>> of them. That's the way UNIX behaves.
>>>
>>> Unforch, I've *never* had it work that way. Any dead process I've
>>> ever had while running linux has only been disposable by a reboot.
>>
>> Well, you know, shit happens... Anyway, could you define 'dead'?
>> Because if you're talking about zombies whose parent dies, they're
>> killable easily: just wait until init reaps them (usually in less
>> than 5 minutes since they dead). If you are talking about zombies who
>> has their parent alive, then it's a bug in the application, not the
>> kernel. In fact I wouldn't like if the kernel reaps my children
>> before I do, just in case I want to do something.
>>
>> If you're talking about unkillable processes (those stuck in
>> disk-sleep state), you're right: only rebooting can kill them
>> (although sometimes they go out of D state and die normally). Bad
>> luck for you if any dead process you've ever had while running linux
>> has been of this kind :(
>>
>
> I did this to myself a number of times when I was first learning Samba -
> even an ls would become unkillable. You couldn't rmmod smb, since it
> was in use, and you couldn't kill the process, since it was waiting on a
> syscall. Ergh.

the exact same happened to me, but my case was with ntfs. zip processes
just got stuck in "D" state because of some unhandled names... i
couldn't kill the processes. i don't think this is an easy thing to do,
though it should be possible to kill -9 these processes and make them exit.

is this feasible?

regards,
pedro venda.

--
Pedro João Lopes Venda
email: pjvenda@rnl.ist.utl.pt
http://maxwell.rnl.ist.utl.pt

Equipa de Administração de Sistemas
Rede das Novas Licenciaturas (RNL)
Instituto Superior Técnico
http://www.rnl.ist.utl.pt
http://mega.ist.utl.pt

^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 16:30 ` Pedro Venda (SYSADM) @ 2004-11-04 22:28 ` Helge Hafting 0 siblings, 0 replies; 99+ messages in thread From: Helge Hafting @ 2004-11-04 22:28 UTC (permalink / raw) To: Pedro Venda (SYSADM); +Cc: linux-kernel, Jim Nelson On Thu, Nov 04, 2004 at 04:30:47PM +0000, Pedro Venda (SYSADM) wrote: > Jim Nelson wrote: > >DervishD wrote: > > the exact same happened to me, but my case was with ntfs. zip processes > just got stuch in "D" state because of some unhandled names... i > couldn't kill the processes. i don't think this is an easy thing to do, > tough it should be possible to kill -9 these processes and make them exit. > > is this feasible? > The correct approach here is to fix ntfs so it doesn't make processes wait forever for anything. There is no need for a workaround. Helge Hafting ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 19:26 ` DervishD 2004-11-03 20:18 ` Gene Heskett 2004-11-03 22:15 ` Jim Nelson @ 2004-11-03 23:07 ` Bill Davidsen 2004-11-04 1:19 ` Michael Clark 2 siblings, 1 reply; 99+ messages in thread From: Bill Davidsen @ 2004-11-03 23:07 UTC (permalink / raw) To: linux-kernel, DervishD Cc: Gene Heskett, linux-kernel, Måns Rullgård DervishD wrote: > Hi Gene :) > > * Gene Heskett <gene.heskett@verizon.net> dixit: > >>> Then the children are reparented to 'init' and 'init' gets rid >>>of them. That's the way UNIX behaves. >> >>Unforch, I've *never* had it work that way. Any dead process I've >>ever had while running linux has only been disposable by a reboot. > > > Well, you know, shit happens... Anyway, could you define 'dead'? > Because if you're talking about zombies whose parent dies, they're > killable easily: just wait until init reaps them (usually in less > than 5 minutes since they dead). If you are talking about zombies who > has their parent alive, then it's a bug in the application, not the > kernel. In fact I wouldn't like if the kernel reaps my children > before I do, just in case I want to do something. > > If you're talking about unkillable processes (those stuck in > disk-sleep state), you're right: only rebooting can kill them > (although sometimes they go out of D state and die normally). Bad > luck for you if any dead process you've ever had while running linux > has been of this kind :( That often seems to be the case, the kernel thinks there's an i/o going on which isn't, and doesn't time it out. It would be nice if there were a way to get the kernel to abort all outstanding i/o on kill -9, but I'm sure if it were easy it would have happened. Timeouts in the application are useful, but in some cases I believe the process dies because it detects a long i/o time but has nothing to do but terminate, which creates the zombie. 
-- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 23:07 ` Bill Davidsen @ 2004-11-04 1:19 ` Michael Clark 0 siblings, 0 replies; 99+ messages in thread From: Michael Clark @ 2004-11-04 1:19 UTC (permalink / raw) To: Bill Davidsen Cc: linux-kernel, DervishD, Gene Heskett, Måns Rullgård On 11/04/04 07:07, Bill Davidsen wrote: > DervishD wrote: > >> Hi Gene :) >> >> * Gene Heskett <gene.heskett@verizon.net> dixit: >> >>>> Then the children are reparented to 'init' and 'init' gets rid >>>> of them. That's the way UNIX behaves. >>> >>> >>> Unforch, I've *never* had it work that way. Any dead process I've >>> ever had while running linux has only been disposable by a reboot. >> >> >> >> Well, you know, shit happens... Anyway, could you define 'dead'? >> Because if you're talking about zombies whose parent dies, they're >> killable easily: just wait until init reaps them (usually in less >> than 5 minutes since they dead). If you are talking about zombies who >> has their parent alive, then it's a bug in the application, not the >> kernel. In fact I wouldn't like if the kernel reaps my children >> before I do, just in case I want to do something. >> >> If you're talking about unkillable processes (those stuck in >> disk-sleep state), you're right: only rebooting can kill them >> (although sometimes they go out of D state and die normally). Bad >> luck for you if any dead process you've ever had while running linux >> has been of this kind :( > > > That often seems to be the case, the kernel thinks there's an i/o going > on which isn't, and doesn't time it out. It would be nice if there were > a way to get the kernel to abort all outstanding i/o on kill -9, but I'm > sure if it were easy it would have happened. Timeouts in the application > are useful, but in some cases I believe the process dies because it > detects a long i/o time but has nothing to do but terminate, which > creates the zombie. 

I believe it could be any driver code that uses uninterruptible sleeps
rather than interruptible sleeps. If a process is doing a read or write to
one of these devices and it stays stuck in kernel code in
TASK_UNINTERRUPTIBLE and never gets its expected wakeup, then the signal
will never be delivered and the process is stuck indefinitely.

The buggy driver code needs to be fixed (either to use interruptible
sleeps and handle the signals, or to implement some sort of timeout).

~mc

^ permalink raw reply [flat|nested] 99+ messages in thread
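[Editor's illustration] The difference between the two kinds of sleep can be observed from user space. What follows is a minimal sketch, not driver code: it blocks in read() on an empty pipe, which on Linux is an interruptible sleep, and a SIGALRM handler installed without SA_RESTART makes the call return with EINTR. The helper name interrupted_read_errno() is invented for this example. A process stuck in an uninterruptible sleep would simply never get as far as the return statement.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Empty handler: its only job is to make SIGALRM a caught signal, so the
 * blocked system call is interrupted instead of the process being killed. */
static void on_alarm(int sig)
{
    (void)sig;
}

/* Hypothetical helper: block in read() on an empty pipe and report how
 * the call ended. Returns 0 if read() succeeded, otherwise errno. */
int interrupted_read_errno(void)
{
    int fds[2];
    struct sigaction sa;
    char byte;
    int err;

    if (pipe(fds) < 0)
        return errno;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART: read() fails with EINTR */
    if (sigaction(SIGALRM, &sa, NULL) < 0) {
        err = errno;
        close(fds[0]);
        close(fds[1]);
        return err;
    }

    alarm(1);                   /* SIGALRM arrives in about one second */

    /* The write end stays open, so read() sleeps (interruptibly) until
     * the signal handler has run. */
    err = (read(fds[0], &byte, 1) < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return err;
}

#ifdef DEMO_MAIN
int main(void)
{
    int err = interrupted_read_errno();
    assert(err == EINTR);       /* the signal got through to the sleeper */
    return 0;
}
#endif
```

Building with -DDEMO_MAIN gives a standalone program. The one-second alarm is an arbitrary choice for the demonstration.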
* Re: is killing zombies possible w/o a reboot? 2004-11-03 16:47 ` Gene Heskett 2004-11-03 17:44 ` DervishD @ 2004-11-04 16:01 ` kernel 2004-11-04 16:18 ` Gene Heskett 1 sibling, 1 reply; 99+ messages in thread From: kernel @ 2004-11-04 16:01 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, DervishD, Måns Rullgård On Wed, 2004-11-03 at 11:47, Gene Heskett wrote: > Finding them is usually an exersize in stretching the > top window out till its about 20 screens high as its always going to > be at the bottom of the list. use 'htop' instead, more flexible in showing and parsing. -fd ^ permalink raw reply [flat|nested] 99+ messages in thread
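[Editor's illustration] The process state that top and htop display can also be read directly from /proc, which avoids scrolling a long list. This is a minimal sketch, assuming a Linux /proc filesystem; proc_state() and make_zombie() are names invented for this example. It returns the state letter from /proc/<pid>/stat, where Z marks a zombie and D an uninterruptible sleep, and creates a short-lived zombie to show the letter in practice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return the one-letter state of a process as shown in /proc/<pid>/stat
 * ('R', 'S', 'D', 'Z', ...), or '?' if the entry cannot be read. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];
    size_t n;
    char *p;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return '?';
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Layout is "pid (comm) state ...". The command name may itself
     * contain spaces or ')', so look for the last ')' in the buffer. */
    p = strrchr(buf, ')');
    if (!p || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}

/* Fork a child that exits at once, then wait until it has exited WITHOUT
 * reaping it (WNOWAIT), so it remains a zombie. Returns its pid or -1. */
pid_t make_zombie(void)
{
    siginfo_t info;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0)
        _exit(0);
    if (waitid(P_PID, (id_t)child, &info, WEXITED | WNOWAIT) < 0)
        return -1;
    return child;
}

#ifdef DEMO_MAIN
int main(void)
{
    pid_t z = make_zombie();

    assert(z > 0);
    printf("child %ld before reaping: %c\n", (long)z, proc_state(z));
    waitpid(z, NULL, 0);        /* the parent reaps it; the zombie is gone */
    printf("child %ld after reaping:  %c\n", (long)z, proc_state(z));
    return 0;
}
#endif
```

A scan for stuck processes would call proc_state() for every numeric directory name under /proc and report the ones returning 'Z' or 'D'.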
* Re: is killing zombies possible w/o a reboot? 2004-11-04 16:01 ` kernel @ 2004-11-04 16:18 ` Gene Heskett 2004-11-04 16:47 ` kernel 0 siblings, 1 reply; 99+ messages in thread From: Gene Heskett @ 2004-11-04 16:18 UTC (permalink / raw) To: linux-kernel, kernel; +Cc: DervishD, Måns Rullgård On Thursday 04 November 2004 11:01, kernel wrote: >On Wed, 2004-11-03 at 11:47, Gene Heskett wrote: >> Finding them is usually an exersize in stretching the >> top window out till its about 20 screens high as its always going >> to be at the bottom of the list. > >use 'htop' instead, more flexible in showing and parsing. > And where is htop, it apparently isn't part of an FC2 install. > >-fd -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 16:18 ` Gene Heskett @ 2004-11-04 16:47 ` kernel 2004-11-04 17:58 ` Gene Heskett 0 siblings, 1 reply; 99+ messages in thread From: kernel @ 2004-11-04 16:47 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, DervishD, Måns Rullgård On Thu, 2004-11-04 at 11:18, Gene Heskett wrote: > And where is htop, it apparently isn't part of an FC2 install. > > http://htop.sourceforge.net/ from site above; Comparison between htop and top * In 'htop' you can scroll the list vertically and horizontally to see all processes and complete command lines. * In 'top' you are subject to a delay for each unassigned key you press (especially annoying when multi-key escape sequences are triggered by accident). * 'htop' starts faster ('top' seems to collect data for a while before displaying anything). * In 'htop' you don't need to type the process number to kill a process, in 'top' you do. * In 'htop' you don't need to type the process number or the priority value to renice a process, in 'top' you do. * 'htop' supports mouse operation, 'top' doesn't * 'top' is older, hence, more used and tested. cheers! -fd ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 16:47 ` kernel @ 2004-11-04 17:58 ` Gene Heskett 0 siblings, 0 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-04 17:58 UTC (permalink / raw) To: linux-kernel, kernel; +Cc: DervishD, Måns Rullgård On Thursday 04 November 2004 11:47, kernel wrote: >On Thu, 2004-11-04 at 11:18, Gene Heskett wrote: >> And where is htop, it apparently isn't part of an FC2 install. > >http://htop.sourceforge.net/ > Thanks, got it. Looks good, more thanks... -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 15:25 ` DervishD 2004-11-03 15:25 ` Måns Rullgård 2004-11-03 16:47 ` Gene Heskett @ 2004-11-03 22:58 ` Bill Davidsen 2004-11-04 10:23 ` DervishD 2004-11-03 23:18 ` Adam Heath 3 siblings, 1 reply; 99+ messages in thread From: Bill Davidsen @ 2004-11-03 22:58 UTC (permalink / raw) To: linux-kernel, DervishD; +Cc: Måns Rullgård, linux-kernel DervishD wrote: > Hi all :) > > * Måns Rullgård <mru@inprovide.com> dixit: > >>>>I'd tried to kill the zombie earlier but couldn't. >>>>Isn't there some way to clean up a &^$#^#@)_ zombie? >>> >>>Kill the parent, is the only (portable) way. >> >>Perhaps not as portable, but another possible, though slightly >>complicated, way is to ptrace the parent and force it to wait(). > > > Or write a little program that just 'wait()'s for the specified > PID's. That is perfectly portable IMHO. But I must admit that the > preferred way should be killing the parent. 'init' will reap the > children after that. You can't wait() for the process, you have to use waitfor(), and the last time I tried that it didn't work, although I don't remember the symptom beyond that. -- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 22:58 ` Bill Davidsen @ 2004-11-04 10:23 ` DervishD 2004-11-04 19:32 ` Bill Davidsen 0 siblings, 1 reply; 99+ messages in thread From: DervishD @ 2004-11-04 10:23 UTC (permalink / raw) To: Bill Davidsen; +Cc: Måns Rullgård, linux-kernel

    Hi Bill :)

 * Bill Davidsen <davidsen@tmr.com> dixit:
> > Or write a little program that just 'wait()'s for the specified
> >PID's. That is perfectly portable IMHO. But I must admit that the
> >preferred way should be killing the parent. 'init' will reap the
> >children after that.
> You can't wait() for the process, you have to use waitfor(), and the
> last time I tried that it didn't work, although I don't remember the
> symptom beyond that.

    You can't wait for others' children. OTOH, if we talk about your
children, you can do wait() or waitpid() (I assume that you referred
to waitpid(), since there isn't waitfor() AFAIK). The only difference
is that wait() suspends the caller until information from any child is
available.

    If you are talking about others' children, then your call to
waitpid() (or wait()) failed with ECHILD: not your child.

    Raúl Núñez de Arenas Coronado

--
Linux Registered User 88736
http://www.dervishd.net & http://www.pleyades.net/

^ permalink raw reply [flat|nested] 99+ messages in thread
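[Editor's illustration] The ECHILD behaviour described in this message is easy to check. This is a minimal sketch, with reap() and spawn_exiting_child() as names invented for this example: waiting for your own child succeeds exactly once, while waiting for a process that is not your child fails with ECHILD. PID 1 is used purely as a convenient process that is never the caller's child.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately. Returns the child's pid in the
 * parent, or -1 if fork() failed. */
pid_t spawn_exiting_child(void)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(0);
    return pid;
}

/* Try to reap `pid`. Returns 0 on success, otherwise the errno value
 * reported by waitpid() (ECHILD when `pid` is not our child). */
int reap(pid_t pid)
{
    pid_t r;

    do {
        r = waitpid(pid, NULL, 0);
    } while (r < 0 && errno == EINTR);   /* retry if a signal interrupted us */

    return (r < 0) ? errno : 0;
}

#ifdef DEMO_MAIN
int main(void)
{
    pid_t child = spawn_exiting_child();
    int first, second, other;

    assert(child > 0);
    first = reap(child);    /* our own child: reaped */
    second = reap(child);   /* already reaped, so no longer our child */
    other = reap(1);        /* somebody else's process */
    assert(first == 0 && second == ECHILD && other == ECHILD);
    return 0;
}
#endif
```

This is why a standalone "reaper" utility cannot clean up another program's zombies: only the parent, or init after the parent has gone, is allowed to collect them.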
* Re: is killing zombies possible w/o a reboot? 2004-11-04 10:23 ` DervishD @ 2004-11-04 19:32 ` Bill Davidsen 2004-11-04 21:11 ` DervishD 0 siblings, 1 reply; 99+ messages in thread From: Bill Davidsen @ 2004-11-04 19:32 UTC (permalink / raw) To: DervishD; +Cc: Måns Rullgård, linux-kernel DervishD wrote: > Hi Bill :) > > * Bill Davidsen <davidsen@tmr.com> dixit: > >>> Or write a little program that just 'wait()'s for the specified >>>PID's. That is perfectly portable IMHO. But I must admit that the >>>preferred way should be killing the parent. 'init' will reap the >>>children after that. >> >>You can't wait() for the process, you have to use waitfor(), and the >>last time I tried that it didn't work, although I don't remember the >>symptom beyond that. > > > You can't wait for other's children. OTOH, if we talk about your > children, you can do wait() or waitpid() (I assume that you referred > to waitpid(), since there isn't waitfor() AFAIK). The only difference > is that wait suspends the process until information from a child is > available. Yes, thank you, I was thinking "wait for the PID" and typed that. > > If you are talking about others' children, then your call to > waitpid() (or wait()) failed with ECHILD: not your child. That's what happened when I tried it a few months ago. I suppose one could try sending a SIGCHLD to the parent and see if it does something helpful. -- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 19:32 ` Bill Davidsen @ 2004-11-04 21:11 ` DervishD 2004-11-09 23:31 ` Bill Davidsen 0 siblings, 1 reply; 99+ messages in thread From: DervishD @ 2004-11-04 21:11 UTC (permalink / raw) To: Bill Davidsen; +Cc: Måns Rullgård, linux-kernel Hi Bill :) * Bill Davidsen <davidsen@tmr.com> dixit: > > If you are talking about others' children, then your call to > >waitpid() (or wait()) failed with ECHILD: not your child. > That's what happened when I tried it a few months ago. I suppose one > could try sending a SIGCHLD to the parent and see if it does something > helpful. Probably it won't do. If the zombies are there due to a signal delivery problem, sending a SIGCHLD to the parent will (probably) solve the problem. But the common case is that the parent is screwed up or simply so badly programmed that the only way of getting rid of the zombies is to kill the parent... Anyway I suppose that sending the SIGCHLD won't do any harm so it may be worth trying. Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 http://www.dervishd.net & http://www.pleyades.net/ ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 21:11 ` DervishD @ 2004-11-09 23:31 ` Bill Davidsen 2004-11-10 9:11 ` DervishD 0 siblings, 1 reply; 99+ messages in thread From: Bill Davidsen @ 2004-11-09 23:31 UTC (permalink / raw) To: DervishD; +Cc: Måns Rullgård, linux-kernel DervishD wrote: > Hi Bill :) > > * Bill Davidsen <davidsen@tmr.com> dixit: > >>> If you are talking about others' children, then your call to >>>waitpid() (or wait()) failed with ECHILD: not your child. >> >>That's what happened when I tried it a few months ago. I suppose one >>could try sending a SIGCHLD to the parent and see if it does something >>helpful. > > > Probably it won't do. If the zombies are there due to a signal > delivery problem, sending a SIGCHLD to the parent will (probably) > solve the problem. But the common case is that the parent is screwed > up or simply so badly programmed that the only way of getting rid of > the zombies is to kill the parent... Wait a minute, in another message you just suggested that a SIGCHLD to init would cause the status to be reaped. > > Anyway I suppose that sending the SIGCHLD won't do any harm so it > may be worth trying. It won't hurt init, but some processes do use the SIGCHLD to trigger a wait(), which might hang the parent. -- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-09 23:31 ` Bill Davidsen @ 2004-11-10 9:11 ` DervishD 0 siblings, 0 replies; 99+ messages in thread From: DervishD @ 2004-11-10 9:11 UTC (permalink / raw) To: Bill Davidsen; +Cc: Måns Rullgård, linux-kernel

    Hi Bill :)

 * Bill Davidsen <davidsen@tmr.com> dixit:
> > Probably it won't do. If the zombies are there due to a signal
> >delivery problem, sending a SIGCHLD to the parent will (probably)
> >solve the problem. But the common case is that the parent is screwed
> >up or simply so badly programmed that the only way of getting rid of
> >the zombies is to kill the parent...
> Wait a minute, in another message you just suggested that a SIGCHLD to
> init would cause the status to be reaped.

    I don't consider init the parent of such processes. It just
'adopts' them when the real parent doesn't care for them. I was
talking, in the paragraph above, about the *real* parent. I don't see
any contradiction, although sending SIGCHLD to a program that has not
waited for its children is risky: if the programmer was so clueless
that children were not waited for in the first place, chances are
that SIGCHLD handling is damaged, too.

> > Anyway I suppose that sending the SIGCHLD won't do any harm so it
> >may be worth trying.
> It won't hurt init, but some processes do use the SIGCHLD to trigger a
> wait(), which might hang the parent.

    If a parent does 'wait()' instead of 'waitpid()', that's lazy
programming. The signal won't hurt anyway: if the parent blocks (bug
in the program), then a 'kill -9' is the correct medication (it's
what I use for buggy programs), the children are reparented to init
and correctly handled (because a good init should, IMHO, use waitpid()
instead of wait()).

    Let's say that sending SIGCHLD is 'mostly harmless' ;))

    Raúl Núñez de Arenas Coronado

--
Linux Registered User 88736
http://www.dervishd.net & http://www.pleyades.net/

^ permalink raw reply [flat|nested] 99+ messages in thread
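[Editor's illustration] A parent that uses waitpid() rather than a bare blocking wait() usually follows the pattern sketched here. This is a sketch under assumptions, not code from any program discussed in this thread; install_reaper(), children_reaped() and spawn_and_wait() are hypothetical names. The handler loops with WNOHANG because several children may exit before a single SIGCHLD is handled, and because it never blocks, an extra SIGCHLD sent by hand cannot hang the parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Number of children collected so far; written only by the handler. */
static volatile sig_atomic_t reaped;

/* Collect every child that has already exited. WNOHANG makes waitpid()
 * return at once when nothing is left, so the handler never blocks. */
static void on_sigchld(int sig)
{
    int saved_errno = errno;    /* do not disturb the interrupted code */

    (void)sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int install_reaper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

int children_reaped(void)
{
    return (int)reaped;
}

/* Fork n children that exit at once, then give the handler up to about
 * five seconds to collect them. Returns how many were reaped, or -1 if
 * fork() failed. */
int spawn_and_wait(int n)
{
    struct timespec nap = { 0, 10 * 1000 * 1000 };   /* 10 ms */
    int start = children_reaped();
    int i;

    for (i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(0);
    }
    for (i = 0; i < 500 && children_reaped() - start < n; i++)
        nanosleep(&nap, NULL);  /* may return early when SIGCHLD arrives */
    return children_reaped() - start;
}

#ifdef DEMO_MAIN
int main(void)
{
    int ok = install_reaper();
    int got = spawn_and_wait(3);

    assert(ok == 0 && got == 3);    /* no zombies left behind */
    return 0;
}
#endif
```

The five-second cap and the 10 ms polling interval are arbitrary values chosen for the demonstration.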
* Re: is killing zombies possible w/o a reboot? 2004-11-03 15:25 ` DervishD ` (2 preceding siblings ...) 2004-11-03 22:58 ` Bill Davidsen @ 2004-11-03 23:18 ` Adam Heath 3 siblings, 0 replies; 99+ messages in thread From: Adam Heath @ 2004-11-03 23:18 UTC (permalink / raw) To: DervishD; +Cc: Måns Rullgård, linux-kernel

On Wed, 3 Nov 2004, DervishD wrote:

> Hi all :)
>
> * Måns Rullgård <mru@inprovide.com> dixit:
> > >> I'd tried to kill the zombie earlier but couldn't.
> > >> Isn't there some way to clean up a &^$#^#@)_ zombie?
> > > Kill the parent, is the only (portable) way.
> > Perhaps not as portable, but another possible, though slightly
> > complicated, way is to ptrace the parent and force it to wait().
>
> Or write a little program that just 'wait()'s for the specified
> PID's. That is perfectly portable IMHO. But I must admit that the
> preferred way should be killing the parent. 'init' will reap the
> children after that.

ptrace the parent, cause it to wait() for its children, then change IP, etc.

^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 14:49 ` Måns Rullgård 2004-11-03 15:25 ` DervishD @ 2004-11-03 16:38 ` Gene Heskett 1 sibling, 0 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-03 16:38 UTC (permalink / raw) To: linux-kernel; +Cc: Måns Rullgård On Wednesday 03 November 2004 09:49, Måns Rullgård wrote: >bert hubert <ahu@ds9a.nl> writes: >> On Wed, Nov 03, 2004 at 07:51:39AM -0500, Gene Heskett wrote: >>> But I'd tried to run gnomeradio earlier to listen to the >>> elections, >> >> Depressing enough. >> >>> I'd tried to kill the zombie earlier but couldn't. >>> Isn't there some way to clean up a &^$#^#@)_ zombie? >> >> Kill the parent, is the only (portable) way. > >Perhaps not as portable, but another possible, though slightly >complicated, way is to ptrace the parent and force it to wait(). No deal. No way. The user needs something to clean up when he clicks on an icon, and things go to hell in a handbasket. He has no advance warning available to him to tell him he had better ptrace this one that I'm aware of. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 14:33 ` bert hubert 2004-11-03 14:49 ` Måns Rullgård @ 2004-11-03 16:24 ` Gene Heskett 2004-11-03 16:46 ` linux-os 2004-11-03 20:13 ` Helge Hafting 1 sibling, 2 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-03 16:24 UTC (permalink / raw) To: linux-kernel; +Cc: bert hubert On Wednesday 03 November 2004 09:33, bert hubert wrote: >On Wed, Nov 03, 2004 at 07:51:39AM -0500, Gene Heskett wrote: >> But I'd tried to run gnomeradio earlier to listen to the >> elections, > >Depressing enough. > >> I'd tried to kill the zombie earlier but couldn't. >> Isn't there some way to clean up a &^$#^#@)_ zombie? > >Kill the parent, is the only (portable) way. The parent would have been the icon. It opened its usual sized small window, but never did anything to it. I clicked on closing the window, but 10 seconds later the system asked me if I wanted to kill it as it wasn't responding. I said yes, the window disappeared, but kpm said gomeradio was still present as process 8162, and that wasn't killable. Funny thing is, on the reboot, it automaticly self restored and ran just fine. I consider this as one of linux's achilles heels. Such a hung and dead process can be properly disposed of by a primitive os called os9 because it keeps track of all resources in tables in the kernel memory space. Issueing a kill procnumber removes the process from the exec queue, reclaims all its memory to the system free memory pool, and removes it from the IRQ service tables if an entry exists there. Near instant, total cleanup, nothing left, in about 250 microseconds max. 1.79 mhz cpu's aren't quite instant :) Lets just say that I think having to reboot because of a zombie that has resources locked up, and have the reboot fubared by it too, aren't exactly friendly actions. 
I fully realise that linux has a much more complex method of allocating resources, but doesn't it *know* exactly what resources have been passed out to each process? And why is there no entry from the kill function into that resource management portion of the kernel so that this could also be done by the linux kernel, say with a "kill --total procnumber"? Seems like a heck of a good question to me since an os written to run on a 64k machine in 1981, and expanded to run on a 128K to 2 megabyte machine in 1986 can do it just fine. Even if that process is still running and spitting out data to its parent window/shell! Or if its crashed and scribbled over all its memory, makes no difference to os9. You (root) wants it gone, fine, its gone. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 16:24 ` Gene Heskett @ 2004-11-03 16:46 ` linux-os 2004-11-03 19:12 ` Gene Heskett 2004-11-03 19:56 ` Måns Rullgård 2004-11-03 20:13 ` Helge Hafting 1 sibling, 2 replies; 99+ messages in thread From: linux-os @ 2004-11-03 16:46 UTC (permalink / raw) To: Gene Heskett; +Cc: linux-kernel, bert hubert On Wed, 3 Nov 2004, Gene Heskett wrote: > On Wednesday 03 November 2004 09:33, bert hubert wrote: >> On Wed, Nov 03, 2004 at 07:51:39AM -0500, Gene Heskett wrote: >>> But I'd tried to run gnomeradio earlier to listen to the >>> elections, >> >> Depressing enough. >> >>> I'd tried to kill the zombie earlier but couldn't. >>> Isn't there some way to clean up a &^$#^#@)_ zombie? >> >> Kill the parent, is the only (portable) way. > > The parent would have been the icon. It opened its usual sized small > window, but never did anything to it. I clicked on closing the > window, but 10 seconds later the system asked me if I wanted to kill > it as it wasn't responding. I said yes, the window disappeared, but > kpm said gomeradio was still present as process 8162, and that wasn't > killable. Funny thing is, on the reboot, it automaticly self > restored and ran just fine. > > I consider this as one of linux's achilles heels. Such a hung and > dead process can be properly disposed of by a primitive os called os9 > because it keeps track of all resources in tables in the kernel > memory space. Issueing a kill procnumber removes the process from > the exec queue, reclaims all its memory to the system free memory > pool, and removes it from the IRQ service tables if an entry exists > there. Near instant, total cleanup, nothing left, in about 250 > microseconds max. 1.79 mhz cpu's aren't quite instant :) > > Lets just say that I think having to reboot because of a zombie that > has resources locked up, and have the reboot fubared by it too, > aren't exactly friendly actions. [SNIPPED....] 
There is no problem killing a task and freeing its resources. The problem is that Linux and other Unix variations need to do this in a specific manner. That manner being that some parent (or ultimately init) needs to receive the terminating status. A task that has been otherwise killed, but is awaiting its status to be obtained is in the 'Z' or zombie state. If the code for either the child task or its parent was improperly written, the death of a parent could allow a child to wait forever (zombie). The fix is to fix the code. Your temporary fix is to use Ctrl-Alt-backspace to kill the X11 server (the parent). If it doesn't restart (it's not a kernel problem, it's a distribution problem), you can log in as root and execute: /etc/X11/prefdm & All these little windows and icons are the 'children' of the X server. The above is a temporary work-around for a non-kernel problem. Cheers, Dick Johnson Penguin : Linux version 2.6.9 on an i686 machine (5537.79 BogoMips). Notice : All mail here is now cached for review by John Ashcroft. 98.36% of all statistics are fiction. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 16:46 ` linux-os @ 2004-11-03 19:12 ` Gene Heskett 2004-11-03 19:56 ` Måns Rullgård 1 sibling, 0 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-03 19:12 UTC (permalink / raw) To: linux-kernel, linux-os; +Cc: bert hubert On Wednesday 03 November 2004 11:46, linux-os wrote: >On Wed, 3 Nov 2004, Gene Heskett wrote: >> On Wednesday 03 November 2004 09:33, bert hubert wrote: >>> On Wed, Nov 03, 2004 at 07:51:39AM -0500, Gene Heskett wrote: >>>> But I'd tried to run gnomeradio earlier to listen to the >>>> elections, >>> >>> Depressing enough. >>> >>>> I'd tried to kill the zombie earlier but couldn't. >>>> Isn't there some way to clean up a &^$#^#@)_ zombie? >>> >>> Kill the parent, is the only (portable) way. >> >> The parent would have been the icon. It opened its usual sized >> small window, but never did anything to it. I clicked on closing >> the window, but 10 seconds later the system asked me if I wanted >> to kill it as it wasn't responding. I said yes, the window >> disappeared, but kpm said gomeradio was still present as process >> 8162, and that wasn't killable. Funny thing is, on the reboot, it >> automaticly self restored and ran just fine. >> >> I consider this as one of linux's achilles heels. Such a hung and >> dead process can be properly disposed of by a primitive os called >> os9 because it keeps track of all resources in tables in the >> kernel memory space. Issueing a kill procnumber removes the >> process from the exec queue, reclaims all its memory to the system >> free memory pool, and removes it from the IRQ service tables if an >> entry exists there. Near instant, total cleanup, nothing left, in >> about 250 microseconds max. 1.79 mhz cpu's aren't quite instant :) >> >> Lets just say that I think having to reboot because of a zombie >> that has resources locked up, and have the reboot fubared by it >> too, aren't exactly friendly actions. > >[SNIPPED....] 
> >There is no problem killing a task and freeing its resources. >The problem is that Linux and other Unix variations need to >do this in a specific manner. That manner being that some >parent (or ultimately init) needs to receive the terminating >status. A task that has been otherwise killed, but is awaiting >its status to be obtained is in the 'Z' or zombie state. If >the code for either the child task or its parent was improperly >written, the death of a parent could allow a child to wait >forever (zombie). > >The fix is to fix the code. In other words, it's gnomeradio that needs fixing then? It's the best 'radio' proggy I've run across that works with my hardware, but I'm not sure it has a support person at this late date. It's probably not been touched in 2 years. Kde doesn't appear to have a similar util that I've run across in the menus so far, and it's 3.3.0 here. All of which seems to be dancing around the real problem though. There seems to be no handy (to the user) path into the kernel to allow such an unconditional kill function. root should have that ability. >Your temporary fix is to use >Ctrl-Alt-backspace to kill the X11 server (the parent). The logout took about 2 minutes because X couldn't clear itself either. >If it doesn't restart (it's not a kernel problem, it's >a distribution problem), you can log in as root and >execute: > > /etc/X11/prefdm & I'll try that next time. >All these little windows and icons are the 'children' of >the X server. The above is a temporary work-around for >a non-kernel problem. But a problem the kernel really should be capable of handling transparently. > >Cheers, >Dick Johnson >Penguin : Linux version 2.6.9 on an i686 machine (5537.79 BogoMips). > Notice : All mail here is now cached for review by John Ashcroft. What on earth for? I don't issue anything he would be interested in except the first part of my sig. And that's been in my sig for a year or so, and will stay there till the so-called Patriot Act is repealed. 
John Ashcroft has done more damage to democracy single-handedly because of his paranoia than any other 20 men in our history. G. Washington certainly wouldn't have tolerated such a person in his 1st term of government. Depending on the mailing list, data here has a lifetime as short as 30 days. Sorry about spilling politics into the list folks. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 16:46 ` linux-os 2004-11-03 19:12 ` Gene Heskett @ 2004-11-03 19:56 ` Måns Rullgård 1 sibling, 0 replies; 99+ messages in thread From: Måns Rullgård @ 2004-11-03 19:56 UTC (permalink / raw) To: linux-kernel linux-os <linux-os@chaos.analogic.com> writes: > The fix is to fix the code. Your temporary fix is to use > Ctrl-Alt-backspace to kill the X11 server (the parent). The X server is not the parent. The desktop manager (or whatever those beasts are called) is more likely to be. > All these little windows and icons are the 'children' of the X > server. The X server manages a set of windows, arranged in a logical tree structure, with all windows ultimately descending from the root windows. The parent-child relationships between windows should under no circumstance be confused, or compared, with that between processes. Any process, on any machine on the network, can, given enough privileges, create subwindows of any window on the X server. Windows and process belong to different worlds, the only connection between which is that processes create windows, simply since anything that happens in the computer is done by a process (or interrupt handler). Am I really reading this on linux-kernel? -- Måns Rullgård mru@inprovide.com ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 16:24 ` Gene Heskett 2004-11-03 16:46 ` linux-os @ 2004-11-03 20:13 ` Helge Hafting 2004-11-03 20:40 ` Gene Heskett 1 sibling, 1 reply; 99+ messages in thread From: Helge Hafting @ 2004-11-03 20:13 UTC (permalink / raw) To: Gene Heskett; +Cc: linux-kernel, bert hubert On Wed, Nov 03, 2004 at 11:24:19AM -0500, Gene Heskett wrote: > On Wednesday 03 November 2004 09:33, bert hubert wrote: > >On Wed, Nov 03, 2004 at 07:51:39AM -0500, Gene Heskett wrote: > >> But I'd tried to run gnomeradio earlier to listen to the > >> elections, > > > >Depressing enough. > > > >> I'd tried to kill the zombie earlier but couldn't. > >> Isn't there some way to clean up a &^$#^#@)_ zombie? > > > >Kill the parent, is the only (portable) way. > > The parent would have been the icon. It opened its usual sized small > window, but never did anything to it. I clicked on closing the > window, but 10 seconds later the system asked me if I wanted to kill > it as it wasn't responding. I said yes, the window disappeared, but > kpm said gomeradio was still present as process 8162, and that wasn't > killable. Funny thing is, on the reboot, it automaticly self > restored and ran just fine. > > I consider this as one of linux's achilles heels. Such a hung and > dead process can be properly disposed of by a primitive os called os9 > because it keeps track of all resources in tables in the kernel > memory space. Issueing a kill procnumber removes the process from > the exec queue, reclaims all its memory to the system free memory > pool, and removes it from the IRQ service tables if an entry exists > there. Near instant, total cleanup, nothing left, in about 250 > microseconds max. 1.79 mhz cpu's aren't quite instant :) > Killing a process in linux with "kill -9 pid" also releases all resources, such as memory and file descriptors. The resource consumption of a "zombie" is measured in bytes, not kilobytes. 
> Lets just say that I think having to reboot because of a zombie that > has resources locked up, and have the reboot fubared by it too, > aren't exactly friendly actions. > Did you try logging out from the graphical user interface, and then logging in again? GUI programs are usually children of the window manager (or some app launcher); all of these quit when you log out. A plain zombie started from the GUI will disappear after that. Only something stuck in a device driver will need the reboot, but that tends to be a bug in the driver. You can try unloading the driver module, but linux has a nasty tendency to answer that with an OOPS or worse. When something goes wrong - it does so properly and thoroughly. :-) > I fully realise that linux has a much more complex method of > allocating resources, but doesn't it *know* exactly what resources > have been passed out to each process? > Yes it does - the problem is that not all resources are managed by processes. Some allocations are managed by drivers, so a driver bug can get the device into an unusable state _and_ tie up the process(es) that were using the driver at the moment. > And why is there no entry from the kill function into that resource > management portion of the kernel so that this could also be done by > the linux kernel, say with a "kill --total procnumber"? > Interesting, but you might need a path from "kill" into every device driver. :-/ And of course it still won't work if there is a bug in the driver. > Seems like a heck of a good question to me since an os written to run > on a 64k machine in 1981, and expanded to run on a 128K to 2 megabyte > machine in 1986 can do it just fine. Even if that process is still > running and spitting out data to its parent window/shell! Or if its > crashed and scribbled over all its memory, makes no difference to > os9. You (root) wants it gone, fine, its gone. 
> Can os9 do this if the process is busy calling into a buggy device driver that simply doesn't return or perhaps believes that some dma operation into process memory is taking forever? Or perhaps os9 doesn't have lots and lots of drivers written by different people with varying competence? Often, the real solution is to fix the driver to deal with "unexpected" conditions. Helge Hafting ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 20:13 ` Helge Hafting @ 2004-11-03 20:40 ` Gene Heskett 2004-11-04 0:43 ` Kurt Wall 2004-11-04 10:07 ` Matthias Andree 0 siblings, 2 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-03 20:40 UTC (permalink / raw) To: linux-kernel; +Cc: Helge Hafting, bert hubert On Wednesday 03 November 2004 15:13, Helge Hafting wrote: >On Wed, Nov 03, 2004 at 11:24:19AM -0500, Gene Heskett wrote: [...] >> Lets just say that I think having to reboot because of a zombie >> that has resources locked up, and have the reboot fubared by it >> too, aren't exactly friendly actions. > >Did you try logging out from the graphical user interface, >and then logging in again? It took around 2 minutes for the logout of X to get back to a VC. So obviously something slowed it down as thats a 4 second operation here normally. And it didn't surprise me when the "reboot" shutdown hung on "stopping alsasound" and I had to use the reset button. [...] >> I fully realise that linux has a much more complex method of >> allocating resources, but doesn't it *know* exactly what resources >> have been passed out to each process? > >Yes it does - the problem is that not all resources are managed >by processes. Some allocations are managed by drivers, so a driver >bug can get the device into a unuseable state _and_ tie up the >process(es) that were using the driver at the moment. This from my viewpoint, is wrong. The kernel, and only the kernel should be ultimately responsible for handing out resources, and reclaiming at its convienience. >> And why is there no entry from the kill function into that >> resource management portion of the kernel so that this could also >> be done by the linux kernel, say with a "kill --total procnumber"? > >Interesting, but you might need a path from "kill" into >every device driver. :-/ And of course it wtill won't work >if there is a bug in the driver. Thats the fault of the design IMO. 
>> Seems like a heck of a good question to me since an os written to >> run on a 64k machine in 1981, and expanded to run on a 128K to 2 >> megabyte machine in 1986 can do it just fine. Even if that >> process is still running and spitting out data to its parent >> window/shell! Or if its crashed and scribbled over all its >> memory, makes no difference to os9. You (root) wants it gone, >> fine, its gone. > >Can os9 do this if the process is busy calling into a buggy >device driver that simply doesn't return or perhaps believes >that some dma operation into process memory is taking forever? >Or perhaps os9 doesn't have lots and lots of drivers written by >different people with varying competence? It did have quite a few authors involved in it over the years including me, I did many of its utilities, and converted the rbf.mn from 6809 code to 6309 code, roughly doubling its speed without fiddling with the clock speed, which is married to the video on that machine. I also did a couple of its clock modules, which are the heart of the multitasking it does. And yes, it could kill, absolutely cleanly, any process you named on the command line at any time. Any drivers involved got their scratch space from the caller's loading of a set of pointers, so if a driver was being accessed by 2 or more processes, each instance had its own stack/process space. When the process disappeared, the recovery included that space in memory. The driver proper had no long term history of that process's actions, even if a disk seek microsleep or similar was in progress when the caller disappeared. >Often, the real solution is to fix the driver to deal with >"unexpected" conditions. > >Helge Hafting As I said earlier, let's let this horse be buried, "its dead Jim", and my beating on it is only wasting bandwidth. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." 
-Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 20:40 ` Gene Heskett @ 2004-11-04 0:43 ` Kurt Wall 2004-11-04 1:01 ` Russell Miller 2004-11-04 10:07 ` Matthias Andree 1 sibling, 1 reply; 99+ messages in thread From: Kurt Wall @ 2004-11-04 0:43 UTC (permalink / raw) To: linux-kernel On Wed, Nov 03, 2004 at 03:40:03PM -0500, Gene Heskett took 89 lines to write: > On Wednesday 03 November 2004 15:13, Helge Hafting wrote: > > > >Yes it does - the problem is that not all resources are managed > >by processes. Some allocations are managed by drivers, so a driver > >bug can get the device into a unuseable state _and_ tie up the > >process(es) that were using the driver at the moment. > > This from my viewpoint, is wrong. The kernel, and only the kernel > should be ultimately responsible for handing out resources, and > reclaiming at its convienience. This might just be semantics, but device drivers are part of the kernel. Kurt -- In 1750 Issac Newton became discouraged when he fell up a flight of stairs. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 0:43 ` Kurt Wall @ 2004-11-04 1:01 ` Russell Miller 2004-11-04 1:38 ` Doug McNaught 0 siblings, 1 reply; 99+ messages in thread From: Russell Miller @ 2004-11-04 1:01 UTC (permalink / raw) To: Kurt Wall; +Cc: linux-kernel On Wednesday 03 November 2004 18:43, Kurt Wall wrote: > This might just be semantics, but device drivers are part of the kernel. > This brings up another question I've had since reading the documentation on later pentium-class chips: why are only rings 0 and 3 used in linux? --Russell > Kurt -- Russell Miller - rmiller@duskglow.com - Le Mars, IA Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs. http://www.duskglow.com - 712-546-5886 ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 1:01 ` Russell Miller @ 2004-11-04 1:38 ` Doug McNaught 2004-11-04 1:45 ` Russell Miller 0 siblings, 1 reply; 99+ messages in thread From: Doug McNaught @ 2004-11-04 1:38 UTC (permalink / raw) To: Russell Miller; +Cc: Kurt Wall, linux-kernel Russell Miller <rmiller@duskglow.com> writes: > This brings up another question I've had since reading the documentation on > later pentium-class chips: > > why are only rings 0 and 3 used in linux? Because the "traditional" Unix privilege model only has two levels, and Linux runs on many architectures, most of which have only two privilege levels (the 68000 called them "user" and "supervisor"). Special-casing x86 is possible but probably wouldn't be worth it. -Doug ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 1:38 ` Doug McNaught @ 2004-11-04 1:45 ` Russell Miller 2004-11-04 1:56 ` Doug McNaught 2004-11-04 1:59 ` Mitchell Blank Jr 0 siblings, 2 replies; 99+ messages in thread From: Russell Miller @ 2004-11-04 1:45 UTC (permalink / raw) To: Doug McNaught; +Cc: Kurt Wall, linux-kernel On Wednesday 03 November 2004 19:38, Doug McNaught wrote: > Russell Miller <rmiller@duskglow.com> writes: > > This brings up another question I've had since reading the documentation > > on later pentium-class chips: > > > > why are only rings 0 and 3 used in linux? > > Because the "traditional" Unix privilege model only has two levels, > and Linux runs on many architectures, most of which have only two > privilege levels (the 68000 called them "user" and "supervisor"). > Special-casing x86 is possible but probably wouldn't be worth it. > Wouldn't it help with device driver problems? Couldn't ring 1 be used to make sure an errant driver doesn't drop the kernel, at least on x86 machines? I remember the 68000 architecture. Quite nice (but I was 10 when I studied it, so..). --Russell > -Doug -- Russell Miller - rmiller@duskglow.com - Le Mars, IA Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs. http://www.duskglow.com - 712-546-5886 ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 1:45 ` Russell Miller @ 2004-11-04 1:56 ` Doug McNaught 2004-11-04 1:59 ` Mitchell Blank Jr 1 sibling, 0 replies; 99+ messages in thread From: Doug McNaught @ 2004-11-04 1:56 UTC (permalink / raw) To: Russell Miller; +Cc: Kurt Wall, linux-kernel Russell Miller <rmiller@duskglow.com> writes: > Wouldn't it help with device driver problems? Couldn't ring 1 be > used to make sure an errant driver doesn't drop the kernel, at least > on x86 machines? As I understand it: 1) Ring transitions aren't free. 2) The API between drivers and kernel is always in flux; drivers expect to be able to access internal kernel data structures. Making drivers run in ring 1 on even one of the N architectures would be a major refactoring and would constrain API changes. Freezing the internal API is something the developers don't want to do. 3) There are probably plenty of ways for a buggy driver to crash the kernel even if it's running in ring 1 (turn off interrupts and leave them off, etc). So the upshot is that it's probably not worth the work and portability hassles. -Doug ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 1:45 ` Russell Miller 2004-11-04 1:56 ` Doug McNaught @ 2004-11-04 1:59 ` Mitchell Blank Jr 2004-11-04 20:10 ` Bill Davidsen 1 sibling, 1 reply; 99+ messages in thread From: Mitchell Blank Jr @ 2004-11-04 1:59 UTC (permalink / raw) To: Russell Miller; +Cc: linux-kernel Russell Miller wrote: > Couldn't ring 1 be used to make > sure an errant driver doesn't drop the kernel, at least on x86 machines? Not really -- drivers could still do things like mis-program their associated hardware making it do DMA writes all over kernel memory (just as one example) Basically it'd add a lot of complexity (and inefficiency) without adding much real safety. -Mitch ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 1:59 ` Mitchell Blank Jr @ 2004-11-04 20:10 ` Bill Davidsen 0 siblings, 0 replies; 99+ messages in thread From: Bill Davidsen @ 2004-11-04 20:10 UTC (permalink / raw) To: Mitchell Blank Jr; +Cc: Russell Miller, linux-kernel Mitchell Blank Jr wrote: > Russell Miller wrote: > >>Couldn't ring 1 be used to make >>sure an errant driver doesn't drop the kernel, at least on x86 machines? > > > Not really -- drivers could still do things like mis-program their associated > hardware making it do DMA writes all over kernel memory (just as one example) > > Basically it'd add a lot of complexity (and inefficiency) without adding > much real safety. It would be nice on x86 to run ring 1 for kernel debugging, getting faults at appropriate points. Sorry, I'm an old MULTICS guy, wish Honeywell would OS it. -- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 20:40 ` Gene Heskett 2004-11-04 0:43 ` Kurt Wall @ 2004-11-04 10:07 ` Matthias Andree 2004-11-04 22:31 ` Peter Chubb 2004-11-04 23:33 ` Benno 1 sibling, 2 replies; 99+ messages in thread From: Matthias Andree @ 2004-11-04 10:07 UTC (permalink / raw) To: Gene Heskett; +Cc: linux-kernel On Wed, 03 Nov 2004, Gene Heskett wrote: > >Yes it does - the problem is that not all resources are managed > >by processes. Some allocations are managed by drivers, so a driver > >bug can get the device into a unuseable state _and_ tie up the > >process(es) that were using the driver at the moment. > > This from my viewpoint, is wrong. The kernel, and only the kernel > should be ultimately responsible for handing out resources, and > reclaiming at its convienience. Linux's driver model is the way it is. If you want the kernel to clean up after a driver has puked, you need something like a microkernel I believe, where only a minimal core kernel is a real kernel and where all the drivers are actually in user-space, but that's no longer Linux then. I'm not reflecting the down- and upsides to of this as I have no experience with microkernels (and have never used OS9 or GNU Hurd either). I know there have been attempts to port Linux to a Microkernel but I don't know what's come out of it. -- Matthias Andree ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 10:07 ` Matthias Andree @ 2004-11-04 22:31 ` Peter Chubb 2004-11-04 23:33 ` Benno 1 sibling, 0 replies; 99+ messages in thread From: Peter Chubb @ 2004-11-04 22:31 UTC (permalink / raw) To: Matthias Andree; +Cc: Gene Heskett, linux-kernel >>>>> "Matthias" == Matthias Andree <matthias.andree@gmx.de> writes: Matthias> On Wed, 03 Nov 2004, Gene Heskett wrote: >> >Yes it does - the problem is that not all resources are managed >> >by processes. Some allocations are managed by drivers, so a >> driver >bug can get the device into a unuseable state _and_ tie up >> the >process(es) that were using the driver at the moment. >> >> This from my viewpoint, is wrong. The kernel, and only the kernel >> should be ultimately responsible for handing out resources, and >> reclaiming at its convienience. Matthias> Linux's driver model is the way it is. If you want the Matthias> kernel to clean up after a driver has puked, you need Matthias> something like a microkernel I believe, where only a minimal Matthias> core kernel is a real kernel and where all the drivers are Matthias> actually in user-space, but that's no longer Linux then. Matthias> I'm not reflecting the down- and upsides to of this as I Matthias> have no experience with microkernels (and have never used Matthias> OS9 or GNU Hurd either). I know there have been attempts to Matthias> port Linux to a Microkernel but I don't know what's come out Matthias> of it. There are actually several ports of Linux onto microkernels, but the only one I know anything about is the Wombat project here at UNSW. Linux running on the L4 microkernel runs at around the same speed as on the bare metal. The home page is at http://www.disy.cse.unsw.edu.au/Software/Wombat/ but there's not much there yet. 
-- Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au The technical we do immediately, the political takes *forever* ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 10:07 ` Matthias Andree 2004-11-04 22:31 ` Peter Chubb @ 2004-11-04 23:33 ` Benno 1 sibling, 0 replies; 99+ messages in thread From: Benno @ 2004-11-04 23:33 UTC (permalink / raw) To: Gene Heskett, linux-kernel On Thu Nov 04, 2004 at 11:07:49 +0100, Matthias Andree wrote: >On Wed, 03 Nov 2004, Gene Heskett wrote: > >> >Yes it does - the problem is that not all resources are managed >> >by processes. Some allocations are managed by drivers, so a driver >> >bug can get the device into a unuseable state _and_ tie up the >> >process(es) that were using the driver at the moment. >> >> This from my viewpoint, is wrong. The kernel, and only the kernel >> should be ultimately responsible for handing out resources, and >> reclaiming at its convienience. > >Linux's driver model is the way it is. If you want the kernel to clean >up after a driver has puked, you need something like a microkernel I >believe, where only a minimal core kernel is a real kernel and where all >the drivers are actually in user-space, but that's no longer Linux then. Of course some drivers are already in user-space on Linux. (E.g: X graphics cards). Work by the Gelato project has added support to the Linux kernel to allow more complicated drivers (e.g: those requiring interrupts) to be run outside the kernel on Linux. http://www.gelato.unsw.edu.au/cgi-bin/viewcvs.cgi/cvs/kernel/usrdrivers/ Cheers, Benno ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 12:51 is killing zombies possible w/o a reboot? Gene Heskett 2004-11-03 14:33 ` bert hubert @ 2004-11-03 20:48 ` Tom Felker 2004-11-03 21:08 ` Gene Heskett 2004-11-05 0:29 ` Gene Heskett 1 sibling, 2 replies; 99+ messages in thread From: Tom Felker @ 2004-11-03 20:48 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel On Wednesday 03 November 2004 06:51 am, Gene Heskett wrote: > Greetings; > > I thought I'd get caught up on -bkx kernels and made a -bk8 just now. > > But I'd tried to run gnomeradio earlier to listen to the elections, > but it failed leaving to run, as did tvtime then too, claiming it > couldn't get a lock on /dev/video0, and gnomeradio apparently left a > lock on alsasound that prevented the normal gracefull shutdown by > locking up the shutdown on the "stopping alsasound" line. So I had > to use the hardware reset. > > I'd tried to kill the zombie earlier but couldn't. > > Isn't there some way to clean up a &^$#^#@)_ zombie? Ok, let me try to explain what probably happened. First, terminology. When one process wants to become two processes, it fork()s. One process is the parent, and one is the child. The child usually exec()s to become a different program. The parent sometimes wants to know when the child ends and whether it succeeded. Thus, the wait() system calls. The parent can either check whether a child died, or go to sleep until one does. When the parent is awakened, it's told which child died and what the child's exit status was (usually 0 for success). But if the child dies before the parent wait()s, the kernel must keep a record of which child died and what its exit status was, and it can't reassign the late child's PID yet. This record is a "zombie," and shows up under top or ps with the 'Z' state. Zombies do _not_ hold open files, memory, or resources of any kind. 
That's the technical definition of a zombie, which I'm telling you because that's probably not your situation: I assume you used "zombie" as an informal term for a process that you can't kill. Your problem is a process in uninterruptible sleep (the "D" state). When a process executing in userspace wants information from a device, like a disk or TV capture card, it calls read(), and context switches into kernel space. Usually, it will take a moment for the data to be available from the device, so the process gets put on a wait queue so other processes can run. Obviously nothing is deallocated, because everyone expects the process will get its data and proceed as normal. When the device has the data, it interrupts the CPU, and the kernel figures out who wanted the data and puts them on the run queue. When a process is on a wait queue waiting for data from a device (the D state), it's impossible to kill. This is because otherwise, when the interrupt did come, the structures associated with the process would have been freed, and the kernel would crash. It would require an incredible amount of inefficient bookkeeping to avoid this, and it's unnecessary because normally, the data request will finish (successfully or not), and the process will be woken up, or if it was sent SIGKILL, it will be killed. Long story short, what happened was, some faulty hardware or some buggy driver, probably associated with the capture card, had a problem and left the process in D state. Thus, it couldn't be killed, and since it had /dev/video open, tvtime couldn't run and failed gracefully, and because it held /dev/dsp open, and couldn't be killed as the init scripts would normally do in that situation, the audio drivers couldn't be unloaded and the shutdown process hung. So give us a bunch of information about what hardware you're using, output of dmesg, and steps to reproduce the driver bug (if it is that). 
-- Tom Felker, <tcfelker@mtco.com> <http://vlevel.sourceforge.net> - Stop fiddling with the volume knob. If you have to design something and control freaks are involved, give them plenty of knobs, but don't connect them to anything important. ^ permalink raw reply [flat|nested] 99+ messages in thread
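For readers who want to see the zombie lifecycle described above, here is a minimal sketch, assuming a Linux system with /proc mounted. The helper names `proc_state` and `make_zombie_then_reap` are illustrative and not from the thread. It forks a child that exits at once, watches it show up in the 'Z' state, and then reaps it with a wait call, which is the only thing that removes the entry.

```python
import os
import time


def proc_state(pid: int) -> str:
    """Return the one-letter state of a process from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def make_zombie_then_reap():
    """Fork a child that exits at once, observe it as a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately, skipping cleanup handlers

    # Parent: poll until the kernel reports the exited child as a zombie.
    state_before = proc_state(pid)
    deadline = time.monotonic() + 5.0
    while state_before != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state_before = proc_state(pid)

    # waitpid() collects the exit status; this is what removes the zombie.
    _, status = os.waitpid(pid, 0)
    still_listed = os.path.exists(f"/proc/{pid}")
    return state_before, os.WEXITSTATUS(status), still_listed
```

Calling `make_zombie_then_reap()` returns the state seen before reaping, the child's exit status, and whether the PID is still listed afterwards. Sending signals to the zombie itself would change nothing, which matches the advice earlier in the thread that the parent is the one that has to act (or be killed).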
* Re: is killing zombies possible w/o a reboot? 2004-11-03 20:48 ` Tom Felker @ 2004-11-03 21:08 ` Gene Heskett 2004-11-04 7:19 ` Jan Knutar 2004-11-05 0:29 ` Gene Heskett 1 sibling, 1 reply; 99+ messages in thread From: Gene Heskett @ 2004-11-03 21:08 UTC (permalink / raw) To: linux-kernel; +Cc: Tom Felker On Wednesday 03 November 2004 15:48, Tom Felker wrote: >On Wednesday 03 November 2004 06:51 am, Gene Heskett wrote: >> Greetings; >> >> I thought I'd get caught up on -bkx kernels and made a -bk8 just >> now. >> >> But I'd tried to run gnomeradio earlier to listen to the >> elections, but it failed leaving to run, as did tvtime then too, >> claiming it couldn't get a lock on /dev/video0, and gnomeradio >> apparently left a lock on alsasound that prevented the normal >> gracefull shutdown by locking up the shutdown on the "stopping >> alsasound" line. So I had to use the hardware reset. >> >> I'd tried to kill the zombie earlier but couldn't. >> >> Isn't there some way to clean up a &^$#^#@)_ zombie? > >Ok, let me try to explain what probably happened. > >First, terminology. When one process wants to be come two > processes, it fork()s. One process is the parent, and one it the > child. The child usually exec()s to become a different program. > The parent sometimes wants to know when the child ends and whether > it succeeded. Thus, the wait() system calls. The parent can either > check whether a child died, or go to sleep until one does. When > the parent is awaken, it's told which child died and what the > child's exit status was (usually 0 for success). But if the child > dies before the parent wait()s, the kernel must keep a record of > which child died and what its exit status was, and it can't > reassign the late child's PID yet. This record is a "zombie," and > shows up under top or ps with the 'Z' state. Zombies do _not_ hold > open files, memory, or resources of any kind. 
> >That's the technical definition of a zombie, which I'm telling you > because that's probably not your situation: I assume you used > "zombie" as an informal term for a process that you can't kill. > Your problem is a process in uninterruptible sleep (the "D" state). > >When a process executing in userspace wants information from a > device, like a disk or TV capture card, it calls read(), and > context switches into kernel space. Usually, it will take a moment > for the data to be available from the device, so the process gets > put on a wait queue so other processes can run. Obviously nothing > is deallocated, because everyone expects the process will get it's > data and proceed as normal. When the device has the data, it > interrupts the CPU, and the kernel figures out who wanted the data > and puts them on the run queue. > >When a process is on a wait queue waiting for data from a device > (the D state), it's impossible to kill. This is because otherwise, > when the interrupt did come, the structures associated with the > process would have been freed, and the kernel would crash. It > would require an incredible amount of innefficient bookkeeping to > avoid this, and it's unnecessary because normally, the data request > will finish (successfully or not), and the process will be woken > up, or if it was sent SIGKILL, it will be killed. > >Long story short, what happened was, some faulty hardware or some > buggy driver, probably associated with the capture card, had a > problem and left the process in D state. Thus, it couldn't be > killed, and since it had /dev/video open, tvtime couldn't run and > failed gracefully, and because it held /dev/dsp open, and couldn't > be killed as the init scripts would normally do in that situation, > the audio drivers couldn't be unloaded and the boot process hung. > >So give us a bunch of information about what hardware you're using, > output of dmesg, and steps to reproduce the driver bug (if it is > that). 
It's a dead horse, Tom; let's bury it. I've rebooted to 4 new kernels since that time as I march toward getting caught up with whatever bk(nn) is out today. Other than that, which took place on bk7's watch, it's all working rather well. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-03 21:08 ` Gene Heskett @ 2004-11-04 7:19 ` Jan Knutar 2004-11-04 11:57 ` Gene Heskett 0 siblings, 1 reply; 99+ messages in thread From: Jan Knutar @ 2004-11-04 7:19 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, Tom Felker On Wednesday 03 November 2004 23:08, Gene Heskett wrote: > Its a dead horse Tom, lets bury it. I've rebooted to 4 new kernels > since that time as I march toward getting caught up with whatever > bk(nn) is out today. Other than that, which took place on bk7's > watch, its all working rather well. Since nobody else seems to have said it, it would be a good idea to enable sysrq and do a sysrq-T the next time (if) this happens, so that there would be at least some information to go on. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 7:19 ` Jan Knutar @ 2004-11-04 11:57 ` Gene Heskett 2004-11-04 12:12 ` Jan Knutar 0 siblings, 1 reply; 99+ messages in thread From: Gene Heskett @ 2004-11-04 11:57 UTC (permalink / raw) To: linux-kernel; +Cc: Jan Knutar, Tom Felker On Thursday 04 November 2004 02:19, Jan Knutar wrote: >On Wednesday 03 November 2004 23:08, Gene Heskett wrote: >> Its a dead horse Tom, lets bury it. I've rebooted to 4 new >> kernels since that time as I march toward getting caught up with >> whatever bk(nn) is out today. Other than that, which took place >> on bk7's watch, its all working rather well. > >Since nobody else seems to have said it, it would be a good idea >to enable sysrq and do a sysrq-T the next time (if) this happens, >so that there would be atleast some information to go on. I've had that turned on since forever, Jan, but usually, when it's hung someplace, it's well and truly hung, and it's hardware reset button time. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 11:57 ` Gene Heskett @ 2004-11-04 12:12 ` Jan Knutar 2004-11-04 12:18 ` Gene Heskett 2004-11-04 12:39 ` Gene Heskett 0 siblings, 2 replies; 99+ messages in thread From: Jan Knutar @ 2004-11-04 12:12 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, Tom Felker On Thursday 04 November 2004 13:57, Gene Heskett wrote: > I'e had that turned on since forever Jan, but usually, when its hung > someplace, its well and truely hung, and hardware reset button time. Are you saying that these zombies (or tasks stuck in state D) also make sysrq-T hang, and not list all tasks? ^ permalink raw reply [flat|nested] 99+ messages in thread
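While the machine still responds, tasks stuck in the D state can also be found from userspace without sysrq. The following is a rough sketch, assuming Linux with /proc mounted; `parse_stat` and `tasks_in_state` are hypothetical helper names, not tools mentioned in the thread. It reads the state field of every /proc/<pid>/stat entry and reports the ones that match.

```python
import os


def parse_stat(stat_text: str):
    """Split one /proc/<pid>/stat line into (pid, command, state)."""
    # The command is parenthesised and may contain ')' itself,
    # so the state is the first field after the LAST ')'.
    left, right = stat_text.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    return int(pid_str), comm, right.split()[0]


def tasks_in_state(wanted: str = "D", proc_root: str = "/proc"):
    """Return (pid, command) for every process currently in the given state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'sys' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # process exited between listdir() and open()
        if state == wanted:
            found.append((pid, comm))
    return found
```

`tasks_in_state("D")` lists uninterruptible sleepers and `tasks_in_state("Z")` lists zombies. This only says which task is stuck; the kernel stack trace that shows where it is stuck still has to come from sysrq-T.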
* Re: is killing zombies possible w/o a reboot? 2004-11-04 12:12 ` Jan Knutar @ 2004-11-04 12:18 ` Gene Heskett 2004-11-04 12:29 ` Jan Knutar 2004-11-04 12:39 ` Gene Heskett 1 sibling, 1 reply; 99+ messages in thread From: Gene Heskett @ 2004-11-04 12:18 UTC (permalink / raw) To: linux-kernel; +Cc: Jan Knutar, Tom Felker On Thursday 04 November 2004 07:12, Jan Knutar wrote: >On Thursday 04 November 2004 13:57, Gene Heskett wrote: >> I'e had that turned on since forever Jan, but usually, when its >> hung someplace, its well and truely hung, and hardware reset >> button time. > >Are you saying that these zombies (or tasks stuck in state D) also > make sysrq-T hang, and not list all tasks? The machine is hung. No ssh, no ping response, the only button that works is the hardware reset on the front of the tower. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 12:18 ` Gene Heskett @ 2004-11-04 12:29 ` Jan Knutar 2004-11-04 13:56 ` Gene Heskett 0 siblings, 1 reply; 99+ messages in thread From: Jan Knutar @ 2004-11-04 12:29 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, Tom Felker On Thursday 04 November 2004 14:18, Gene Heskett wrote: > The machine is hung. No ssh, no ping response, the only button that > works is the hardware reset on the front of the tower. I must've missed where the thread went from zombies into totally hung machine. My apologies for the noise. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 12:29 ` Jan Knutar @ 2004-11-04 13:56 ` Gene Heskett 0 siblings, 0 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-04 13:56 UTC (permalink / raw) To: linux-kernel; +Cc: Jan Knutar, Tom Felker On Thursday 04 November 2004 07:29, Jan Knutar wrote: >On Thursday 04 November 2004 14:18, Gene Heskett wrote: >> The machine is hung. No ssh, no ping response, the only button >> that works is the hardware reset on the front of the tower. > >I must've missed where the thread went from zombies into totally > hung machine. My apologies for the noise. It went from an unkillable process (gnomeradio) that was blocking other programs like tvtime with its locks on /dev/video0, to completely hung at "stopping alsasound" when I tried to reboot. That required the reset button to get going again. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 12:12 ` Jan Knutar 2004-11-04 12:18 ` Gene Heskett @ 2004-11-04 12:39 ` Gene Heskett 2004-11-04 13:01 ` Ian Campbell ` (2 more replies) 1 sibling, 3 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-04 12:39 UTC (permalink / raw) To: linux-kernel; +Cc: Jan Knutar, Tom Felker On Thursday 04 November 2004 07:12, Jan Knutar wrote: >On Thursday 04 November 2004 13:57, Gene Heskett wrote: >> I'e had that turned on since forever Jan, but usually, when its >> hung someplace, its well and truely hung, and hardware reset >> button time. > >Are you saying that these zombies (or tasks stuck in state D) also > make sysrq-T hang, and not list all tasks? I thought I'd test it right now while the system is running normally, but I got only a beep from the console, so I went to Documentation/sysrq.txt to make sure I was doing it right, and it is _not_ working right now. But it is compiled in according to a make xconfig, or a grep of the .config. [root@coyote linux-2.6.10-rc1-bk13]# grep SYSRQ .config CONFIG_MAGIC_SYSRQ=y I get a couple of beeps from the console, but that's the limit of the response, and a tail -f on the log shows nothing. I also logged into VC2, and tried it there, but that attempt didn't even get me a beep, several times. The keyboard is a cheap ($24) M$ with a few extra buttons that don't do anything along the top. And getting a bit creaky in its old age, a lot like me, but I'm about 68 years older than the keyboard :) -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 12:39 ` Gene Heskett @ 2004-11-04 13:01 ` Ian Campbell 2004-11-04 14:07 ` Gene Heskett 2004-11-04 13:10 ` Doug McNaught 2004-11-04 20:18 ` Bill Davidsen 2 siblings, 1 reply; 99+ messages in thread From: Ian Campbell @ 2004-11-04 13:01 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, Jan Knutar, Tom Felker On Thu, 2004-11-04 at 07:39 -0500, Gene Heskett wrote: > On Thursday 04 November 2004 07:12, Jan Knutar wrote: > >On Thursday 04 November 2004 13:57, Gene Heskett wrote: > >> I'e had that turned on since forever Jan, but usually, when its > >> hung someplace, its well and truely hung, and hardware reset > >> button time. > > > >Are you saying that these zombies (or tasks stuck in state D) also > > make sysrq-T hang, and not list all tasks? > > I thought I'd test it right now while the system is runnng normally, > but I got only a beep from the console, so I went to > Documentation/sysrq.txt to make sure I was doing it right, and it is > _not_ working right now. But it is compiled in according to a make > xconfig, or a grep of the .config. It can also be enabled/disabled at runtime, Documentation/sysrq.txt says that the default now is on (but that it used to default to off). Perhaps it is getting turned off somewhere in your boot scripts etc. You can check with $ cat /proc/sys/kernel/sysrq 1 > The keyboard is a cheap ($24) M$ with a few extra buttons that don't > do anything along the top. And getting a bit creaky in its old age, > a lot like me, but I'm about 68 years older than the keyboard :) Documentation/sysrq.txt also says: * How do I use the magic SysRq key? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ On x86 - You press the key combo 'ALT-SysRq-<command key>'. Note - Some keyboards may not have a key labeled 'SysRq'. The 'SysRq' key is also known as the 'Print Screen' key. 
Also some keyboards cannot handle so many keys being pressed at the same time, so you might have better luck with "press Alt", "press SysRq", "release Alt", "press <command key>", release everything. Perhaps your keyboard is one of those that can't cope with all those keys? Ian. -- Ian Campbell, Senior Design Engineer Web: http://www.arcom.com Arcom, Clifton Road, Direct: +44 (0)1223 403 465 Cambridge CB1 7EA, United Kingdom Phone: +44 (0)1223 411 200 ^ permalink raw reply [flat|nested] 99+ messages in thread
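The runtime check above can be scripted as well. This is a small sketch under the assumption, taken from the thread, that a value of 0 in /proc/sys/kernel/sysrq means the key is disabled and any other value means it is at least partly enabled; the function names are illustrative.

```python
def sysrq_enabled(value_text: str) -> bool:
    """Interpret the text read from /proc/sys/kernel/sysrq."""
    # Assumption: 0 disables the magic SysRq key, non-zero enables it.
    return int(value_text.strip()) != 0


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> bool:
    """Read the current setting from the running kernel."""
    with open(path) as f:
        return sysrq_enabled(f.read())
```

A kernel built with CONFIG_MAGIC_SYSRQ=y can still return False here if something turned the setting off after boot, so this check is worth running before blaming the keyboard.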
* Re: is killing zombies possible w/o a reboot? 2004-11-04 13:01 ` Ian Campbell @ 2004-11-04 14:07 ` Gene Heskett 2004-11-04 14:24 ` Ian Campbell 2004-11-04 14:26 ` DervishD 0 siblings, 2 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-04 14:07 UTC (permalink / raw) To: linux-kernel; +Cc: Ian Campbell, Jan Knutar, Tom Felker On Thursday 04 November 2004 08:01, Ian Campbell wrote: >On Thu, 2004-11-04 at 07:39 -0500, Gene Heskett wrote: >> On Thursday 04 November 2004 07:12, Jan Knutar wrote: >> >On Thursday 04 November 2004 13:57, Gene Heskett wrote: >> >> I'e had that turned on since forever Jan, but usually, when its >> >> hung someplace, its well and truely hung, and hardware reset >> >> button time. >> > >> >Are you saying that these zombies (or tasks stuck in state D) >> > also make sysrq-T hang, and not list all tasks? >> >> I thought I'd test it right now while the system is runnng >> normally, but I got only a beep from the console, so I went to >> Documentation/sysrq.txt to make sure I was doing it right, and it >> is _not_ working right now. But it is compiled in according to a >> make xconfig, or a grep of the .config. > >It can also be enabled/disabled at runtime, Documentation/sysrq.txt > says that the default now is on (but that it used to default to > off). Perhaps it is getting turned off somewhere in your boot > scripts etc. > >You can check with > >$ cat /proc/sys/kernel/sysrq >1 > >> The keyboard is a cheap ($24) M$ with a few extra buttons that >> don't do anything along the top. And getting a bit creaky in its >> old age, a lot like me, but I'm about 68 years older than the >> keyboard :) > >Documentation/sysrq.txt also says: > >* How do I use the magic SysRq key? >~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >On x86 - You press the key combo 'ALT-SysRq-<command key>'. Note - > Some keyboards may not have a key labeled 'SysRq'. The 'SysRq' key > is also known as the 'Print Screen' key. 
Also some keyboards cannot > handle so many keys being pressed at the same time, so you might > have better luck with "press Alt", "press SysRq", "release Alt", > "press <command key>", release everything. > >Perhaps your keyboard is one of those that can't cope with all those >keys? > >Ian. Possibly, but OTOH, [root@coyote root]# cat /proc/sys/kernel/sysrq 0 And no, I'm not turning it off anyplace in the boot proceedure. An 'echo 1 >/proc/sys/kernel/sysrq', and repeating the keypresses now gets a boatload of stuff in the logs, but nothing on the console. The logs look something like this: Nov 4 08:59:29 coyote kernel: kdeinit S C0453F08 0 18964 3327 18965 18963 (NOTLB) Nov 4 08:59:29 coyote kernel: c657ae8c 00200082 c6820120 c0453f08 0000202c 00000000 b4d18366 0000202c Nov 4 08:59:29 coyote kernel: 00002ecd b4d1e78f 0000202c c6820600 c682075c 0217d045 c657aea0 fffffff5 Nov 4 08:59:29 coyote kernel: c657aedc c033bca3 c657aea0 0217d045 c657aec4 dfa88ea0 ee3aeea0 0217d045 Nov 4 08:59:29 coyote kernel: Call Trace: Nov 4 08:59:29 coyote kernel: [<c033bca3>] schedule_timeout+0x63/0xc0 Nov 4 08:59:29 coyote kernel: [<c0120150>] process_timeout+0x0/0x10 Nov 4 08:59:29 coyote kernel: [<c012c12f>] futex_wait+0x12f/0x1a0 Nov 4 08:59:29 coyote kernel: [<c0114160>] default_wake_function+0x0/0x20 Nov 4 08:59:29 coyote kernel: [<c0114160>] default_wake_function+0x0/0x20 Nov 4 08:59:29 coyote kernel: [<c012c418>] do_futex+0x48/0xa0 Nov 4 08:59:29 coyote kernel: [<c012c55e>] sys_futex+0xee/0x100 Nov 4 08:59:29 coyote kernel: [<c01040a9>] sysenter_past_esp+0x52/0x71 Nov 4 08:59:29 coyote kernel: kdeinit S C0453A60 0 18965 3327 18966 18964 (NOTLB) Nov 4 08:59:29 coyote kernel: dfa88e8c 00200082 c6820120 c0453a60 dfa88eac 00000000 ed99e990 00000000 Nov 4 08:59:29 coyote kernel: 00006be7 b816258b 0000202c c6820120 c682027c 0217d07c dfa88ea0 fffffff5 Nov 4 08:59:29 coyote kernel: dfa88edc c033bca3 dfa88ea0 0217d07c dfa88ec4 c0459928 c657aea0 0217d07c Nov 4 08:59:29 coyote kernel: Call 
Trace: Nov 4 08:59:29 coyote kernel: [<c033bca3>] schedule_timeout+0x63/0xc0 Nov 4 08:59:29 coyote kernel: [<c0120150>] process_timeout+0x0/0x10 Nov 4 08:59:29 coyote kernel: [<c012c12f>] futex_wait+0x12f/0x1a0 Nov 4 08:59:29 coyote kernel: [<c0114160>] default_wake_function+0x0/0x20 Nov 4 08:59:29 coyote kernel: [<c0114160>] default_wake_function+0x0/0x20 Nov 4 08:59:29 coyote kernel: [<c012c418>] do_futex+0x48/0xa0 Nov 4 08:59:29 coyote kernel: [<c012c55e>] sys_futex+0xee/0x100 Nov 4 08:59:29 coyote kernel: [<c01040a9>] sysenter_past_esp+0x52/0x71 Nov 4 08:59:29 coyote kernel: kdeinit S C0453A60 0 18966 3327 18965 (NOTLB) Nov 4 08:59:29 coyote kernel: ee3aee8c 00200082 e770fb00 c0453a60 ee3aeeac 00000000 ed99e990 00000000 Nov 4 08:59:29 coyote kernel: 00001e29 b4b250fe 0000202c e770fb00 e770fc5c 0217d043 ee3aeea0 fffffff5 Nov 4 08:59:29 coyote kernel: ee3aeedc c033bca3 ee3aeea0 0217d043 666c6573 c657aea0 c039be78 0217d043 Nov 4 08:59:29 coyote kernel: Call Trace: Nov 4 08:59:29 coyote kernel: [<c033bca3>] schedule_timeout+0x63/0xc0 Nov 4 08:59:29 coyote kernel: [<c0120150>] process_timeout+0x0/0x10 Nov 4 08:59:29 coyote kernel: [<c012c12f>] futex_wait+0x12f/0x1a0 Nov 4 08:59:29 coyote kernel: [<c0114160>] default_wake_function+0x0/0x20 Nov 4 08:59:29 coyote kernel: [<c0114160>] default_wake_function+0x0/0x20 Nov 4 08:59:29 coyote kernel: [<c012c418>] do_futex+0x48/0xa0 Nov 4 08:59:29 coyote kernel: [<c012c55e>] sys_futex+0xee/0x100 Nov 4 08:59:29 coyote kernel: [<c01040a9>] sysenter_past_esp+0x52/0x71 There is a lot more of that of that above that snip, several pages. And of course the system seems to be running fine ATM. :-) But I'm learning, and that echo will go into my rc.local as soon as I'm done here. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." 
-Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 14:07 ` Gene Heskett @ 2004-11-04 14:24 ` Ian Campbell 2004-11-04 15:10 ` Gene Heskett 2004-11-04 14:26 ` DervishD 1 sibling, 1 reply; 99+ messages in thread From: Ian Campbell @ 2004-11-04 14:24 UTC (permalink / raw) To: gene.heskett; +Cc: linux-kernel, Jan Knutar, Tom Felker On Thu, 2004-11-04 at 09:07 -0500, Gene Heskett wrote: > [root@coyote root]# cat /proc/sys/kernel/sysrq > 0 Aha :-) > And no, I'm not turning it off anyplace in the boot proceedure. Something must be -- you can see in drivers/char/sysrq.c that sysrq_enabled is set to 1 by default and according to bkbits.net it has been that way since at least 2.4.0. Does the following not come up with any culprits? # grep -r sysrq /etc Ian. -- Ian Campbell, Senior Design Engineer Web: http://www.arcom.com Arcom, Clifton Road, Direct: +44 (0)1223 403 465 Cambridge CB1 7EA, United Kingdom Phone: +44 (0)1223 411 200 ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot? 2004-11-04 14:24 ` Ian Campbell @ 2004-11-04 15:10 ` Gene Heskett 0 siblings, 0 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-04 15:10 UTC (permalink / raw) To: linux-kernel; +Cc: Ian Campbell, Jan Knutar, Tom Felker On Thursday 04 November 2004 09:24, Ian Campbell wrote: >grep -r sysrq /etc Gets me a bunch. The relevant ones would be: /etc/rc.d/rc3.d/K20iscsi: if [ -e /proc/sys/kernel/sysrq ] ; then /etc/rc.d/rc3.d/K20iscsi: echo "1" > /proc/sys/kernel/sysrq and /etc/rc.d/rc.local:# Turn on the magic sysrq keys /etc/rc.d/rc.local:echo 1 >/proc/sys/kernel/sysrq But, what about this: /etc/sysctl.conf:# Disables the magic-sysrq key /etc/sysctl.conf:kernel.sysrq = 0 which I just commented out... And this: /etc/linuxconf/archive/Office/etc/sysctl.conf,v:kernel.sysrq = 0 But everything there is dated early 2001. I think it's filesystem cruft nowadays, subject to being a space patrol target eventually. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
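The culprit found above, an active `kernel.sysrq = 0` line in /etc/sysctl.conf, can be checked for with a few lines of code. This is only a sketch: `find_sysctl_setting` is a hypothetical helper, and it assumes that lines starting with `#` or `;` are comments and that a later assignment overrides an earlier one when the file is applied in order.

```python
def find_sysctl_setting(conf_text: str, key: str = "kernel.sysrq"):
    """Return the last uncommented value assigned to key, or None if unset."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        name, sep, val = line.partition("=")
        if sep and name.strip() == key:
            value = val.strip()  # keep going: a later line overrides this one
    return value


# The two lines reported in the message above.
sample = "# Disables the magic-sysrq key\nkernel.sysrq = 0\n"
print(find_sysctl_setting(sample))
```

Feeding it the real file contents, for example `find_sysctl_setting(open("/etc/sysctl.conf").read())`, shows whether the distribution's defaults are switching the key off at boot.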
* Re: is killing zombies possible w/o a reboot? 2004-11-04 14:07 ` Gene Heskett 2004-11-04 14:24 ` Ian Campbell @ 2004-11-04 14:26 ` DervishD 2004-11-04 15:13 ` Gene Heskett 1 sibling, 1 reply; 99+ messages in thread From: DervishD @ 2004-11-04 14:26 UTC (permalink / raw) To: Gene Heskett; +Cc: linux-kernel, Ian Campbell, Jan Knutar, Tom Felker Hi Gene :) * Gene Heskett <gene.heskett@verizon.net> dixit: > Possibly, but OTOH, > [root@coyote root]# cat /proc/sys/kernel/sysrq > 0 > > And no, I'm not turning it off anyplace in the boot proceedure. An > 'echo 1 >/proc/sys/kernel/sysrq', and repeating the keypresses now > gets a boatload of stuff in the logs, but nothing on the console. Well, the stuff goes to the logs and not the console because of the console log level. You can change that using proc, too. Look in /proc/sys/kernel/printk (well, at least under 2.4.x). You'll see four numbers. The first one is the console loglevel. Any kernel message with a priority higher than this number (that is, a numerically lower level) will be printed on the console. The rest won't. The second number is the default message level. Any message without a priority will get this priority. The third number is the minimum value you can assign to the first number (the console loglevel). The fourth number is the default value for the first number. The interesting number for you is the first one. Set it to a value that suits you (see syslog(2) to see what the numbers mean). Raúl Núñez de Arenas Coronado -- Linux Registered User 88736 http://www.dervishd.net & http://www.pleyades.net/ ^ permalink raw reply [flat|nested] 99+ messages in thread
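The four numbers in /proc/sys/kernel/printk are easier to keep straight with names attached. Below is a sketch that parses them; the field names follow the kernel's sysctl documentation as far as I recall it, and `reaches_console` is an illustrative helper resting on the usual rule that lower numbers are more urgent (0 is emergency, 7 is debug).

```python
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    console_loglevel: int          # messages more urgent than this reach the console
    default_message_loglevel: int  # level given to messages that carry none
    minimum_console_loglevel: int  # lower bound for console_loglevel
    default_console_loglevel: int  # default for console_loglevel


def parse_printk(text: str) -> PrintkLevels:
    """Parse the whitespace-separated contents of /proc/sys/kernel/printk."""
    return PrintkLevels(*(int(field) for field in text.split()))


def reaches_console(message_level: int, levels: PrintkLevels) -> bool:
    """A message is shown only if its level is numerically below console_loglevel."""
    return message_level < levels.console_loglevel
```

With a file reading `4 4 1 7`, a warning-level message (level 4) would not reach the console while an error (level 3) would, which is one plausible reason a sysrq-T dump lands only in the logs.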
* Re: is killing zombies possible w/o a reboot? 2004-11-04 14:26 ` DervishD @ 2004-11-04 15:13 ` Gene Heskett 0 siblings, 0 replies; 99+ messages in thread From: Gene Heskett @ 2004-11-04 15:13 UTC (permalink / raw) To: linux-kernel; +Cc: DervishD, Ian Campbell, Jan Knutar, Tom Felker On Thursday 04 November 2004 09:26, DervishD wrote: > Hi Gene :) > > * Gene Heskett <gene.heskett@verizon.net> dixit: >> Possibly, but OTOH, >> [root@coyote root]# cat /proc/sys/kernel/sysrq >> 0 >> >> And no, I'm not turning it off anyplace in the boot proceedure. >> An 'echo 1 >/proc/sys/kernel/sysrq', and repeating the keypresses >> now gets a boatload of stuff in the logs, but nothing on the >> console. > > Well, the stuff goes to the logs and not the console because of >the console log level. You can change that using proc, too. Look in >/proc/sys/kernel/printk (well, at least under 2.4.x). You'll see >four numbers. The first one is the console loglevel. Any message >directed to syslog with a priority higher than this number will be >printed in the console. Otherwise they won't. > > The second number is the default message level. Any message >without a priority will get this priority. > > The third number is the highest value you can assign to the > first number (the console loglevel). > > The fourth number is the default value for the first number. > > The interesting number for you is the first one. Set it to a >correct value for you (see syslog(2) to see what the numbers mean). > > Raúl Núñez de Arenas Coronado I have it going to the logs as the preferred method, as that's permanent whereas the console output is 100% volatile. That way I can look at the logs when the machine has been made functional again. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." 
-Ed Howdershelt (Author) 99.28% setiathome rank, not too shabby for a WV hillbilly Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2004 by Maurice Eugene Heskett, all rights reserved. ^ permalink raw reply [flat|nested] 99+ messages in thread
* Re: is killing zombies possible w/o a reboot?
From: Doug McNaught @ 2004-11-04 13:10 UTC
To: gene.heskett; +Cc: linux-kernel, Jan Knutar, Tom Felker

Gene Heskett <gene.heskett@verizon.net> writes:
> [root@coyote linux-2.6.10-rc1-bk13]# grep SYSRQ .config
> CONFIG_MAGIC_SYSRQ=y

Did you also enable it in /proc?

-Doug
* Re: is killing zombies possible w/o a reboot?
From: Gene Heskett @ 2004-11-04 14:11 UTC
To: linux-kernel; +Cc: Doug McNaught, Jan Knutar, Tom Felker

On Thursday 04 November 2004 08:10, Doug McNaught wrote:
>Gene Heskett <gene.heskett@verizon.net> writes:
>> [root@coyote linux-2.6.10-rc1-bk13]# grep SYSRQ .config
>> CONFIG_MAGIC_SYSRQ=y
>
>Did you also enable it in /proc?
>
>-Doug

I just now discovered it defaults to 0, so I put an
echo 1 >/proc/sys/kernel/sysrq
in rc.local just now.

Thanks for the heads up.

--
Cheers, Gene
* Re: is killing zombies possible w/o a reboot?
From: tlaurent @ 2004-11-04 14:42 UTC
To: gene.heskett@verizon.net
Cc: linux-kernel@vger.kernel.org, Doug McNaught, Jan Knutar, Tom Felker

Quoting Gene Heskett <gene.heskett@verizon.net>:
> On Thursday 04 November 2004 08:10, Doug McNaught wrote:
> >Gene Heskett <gene.heskett@verizon.net> writes:
> >> [root@coyote linux-2.6.10-rc1-bk13]# grep SYSRQ .config
> >> CONFIG_MAGIC_SYSRQ=y
> >
> >Did you also enable it in /proc?
> >
> >-Doug
>
> I just now discovered it defaults to 0, so I put an
> echo 1 >/proc/sys/kernel/sysrq
> in rc.local just now.

You might also want to have a look at /etc/sysctl.conf. Some distros put a kernel.sysrq=0 in it...

Cheers,
Thibaut

> Thanks for the heads up.
>
> --
> Cheers, Gene
* Re: is killing zombies possible w/o a reboot?
From: Gene Heskett @ 2004-11-04 15:14 UTC
To: linux-kernel
Cc: tlaurent, gene.heskett@verizon.net, linux-kernel@vger.kernel.org, Doug McNaught, Jan Knutar, Tom Felker

On Thursday 04 November 2004 09:42, tlaurent@linagora.com wrote:
>Quoting Gene Heskett <gene.heskett@verizon.net>:
>> On Thursday 04 November 2004 08:10, Doug McNaught wrote:
>> >Gene Heskett <gene.heskett@verizon.net> writes:
>> >> [root@coyote linux-2.6.10-rc1-bk13]# grep SYSRQ .config
>> >> CONFIG_MAGIC_SYSRQ=y
>> >
>> >Did you also enable it in /proc?
>> >
>> >-Doug
>>
>> I just now discovered it defaults to 0, so I put an
>> echo 1 >/proc/sys/kernel/sysrq
>> in rc.local just now.
>
>You might also want to have a look at /etc/sysctl.conf. Some distros
> put a kernel.sysrq=0 in it...

And I just put a comment in front of that puppy!

>Cheers,
>Thibaut
>
>> Thanks for the heads up.
>>
>> --
>> Cheers, Gene

--
Cheers, Gene
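The two places discussed in this sub-thread, a kernel.sysrq line in /etc/sysctl.conf and the live value in /proc/sys/kernel/sysrq, can be compared with a short Python check. This is a rough sketch rather than a replacement for sysctl(8): `find_sysctl_setting` is a hypothetical helper, it only understands simple `key = value` lines with `#` or `;` comments, and it assumes that the last matching line wins.

```python
from typing import Optional


def find_sysctl_setting(conf_text: str, key: str) -> Optional[str]:
    """Return the last uncommented value assigned to `key`, or None.

    Hypothetical helper: handles plain "key = value" lines only.
    """
    value = None
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments ("#" or ";" at the start of a line).
        if not line or line.startswith(("#", ";")):
            continue
        name, sep, setting = line.partition("=")
        if sep and name.strip() == key:
            value = setting.strip()  # later lines override earlier ones
    return value


if __name__ == "__main__":
    for path in ("/etc/sysctl.conf", "/proc/sys/kernel/sysrq"):
        try:
            with open(path) as fh:
                text = fh.read()
        except OSError as exc:
            print(f"{path}: not readable ({exc})")
            continue
        if path.endswith(".conf"):
            print(f"{path}: kernel.sysrq = {find_sysctl_setting(text, 'kernel.sysrq')}")
        else:
            print(f"{path}: {text.strip()}")
```

If the first line reports 0 while rc.local writes 1, whichever runs later at boot decides the final value, which is why commenting out the sysctl.conf entry is the simpler fix.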
* Re: is killing zombies possible w/o a reboot?
From: Bill Davidsen @ 2004-11-04 20:18 UTC
To: gene.heskett; +Cc: linux-kernel, Jan Knutar, Tom Felker

Gene Heskett wrote:
> On Thursday 04 November 2004 07:12, Jan Knutar wrote:
>
>>On Thursday 04 November 2004 13:57, Gene Heskett wrote:
>>
>>>I've had that turned on since forever Jan, but usually, when it's
>>>hung someplace, it's well and truly hung, and hardware reset
>>>button time.
>>
>>Are you saying that these zombies (or tasks stuck in state D) also
>>make sysrq-T hang, and not list all tasks?
>
>
> I thought I'd test it right now while the system is running normally,
> but I got only a beep from the console, so I went to
> Documentation/sysrq.txt to make sure I was doing it right, and it is
> _not_ working right now. But it is compiled in according to a make
> xconfig, or a grep of the .config.
>
> [root@coyote linux-2.6.10-rc1-bk13]# grep SYSRQ .config
> CONFIG_MAGIC_SYSRQ=y
>
> I get a couple of beeps from the console, but that's the limit of the
> response, and a tail -f on the log shows nothing. I also logged into
> VC2, and tried it there, but that attempt didn't even get me a beep,
> several times.
>
> The keyboard is a cheap ($24) M$ with a few extra buttons that don't
> do anything along the top. And getting a bit creaky in its old age,
> a lot like me, but I'm about 68 years older than the keyboard :)
>

Don't need to log in, do need two hands to hit all the keys at once ;-)

It works for me on a VC and unhung system, but I agree, when the system is well and truly hung, reset is the only thing left.

--
-bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the last possible moment - but no longer" -me
* Re: is killing zombies possible w/o a reboot?
From: Gene Heskett @ 2004-11-05 0:29 UTC
To: linux-kernel; +Cc: Tom Felker

On Wednesday 03 November 2004 15:48, Tom Felker wrote:
[...]
>> Isn't there some way to clean up a &^$#^#@)_ zombie?
>
>Ok, let me try to explain what probably happened.
>
>First, terminology. When one process wants to become two
>processes, it fork()s. One process is the parent, and one is the
>child. The child usually exec()s to become a different program.
>The parent sometimes wants to know when the child ends and whether
>it succeeded. Thus, the wait() system calls. The parent can either
>check whether a child died, or go to sleep until one does. When
>the parent is awakened, it's told which child died and what the
>child's exit status was (usually 0 for success). But if the child
>dies before the parent wait()s, the kernel must keep a record of
>which child died and what its exit status was, and it can't
>reassign the late child's PID yet. This record is a "zombie," and
>shows up under top or ps with the 'Z' state. Zombies do _not_ hold
>open files, memory, or resources of any kind.
>
>That's the technical definition of a zombie, which I'm telling you
>because that's probably not your situation: I assume you used
>"zombie" as an informal term for a process that you can't kill.
>Your problem is a process in uninterruptible sleep (the "D" state).
>
>When a process executing in userspace wants information from a
>device, like a disk or TV capture card, it calls read(), and
>context switches into kernel space. Usually, it will take a moment
>for the data to be available from the device, so the process gets
>put on a wait queue so other processes can run. Obviously nothing
>is deallocated, because everyone expects the process will get its
>data and proceed as normal. When the device has the data, it
>interrupts the CPU, and the kernel figures out who wanted the data
>and puts them on the run queue.
>
>When a process is on a wait queue waiting for data from a device
>(the D state), it's impossible to kill. This is because otherwise,
>when the interrupt did come, the structures associated with the
>process would have been freed, and the kernel would crash. It
>would require an incredible amount of inefficient bookkeeping to
>avoid this, and it's unnecessary because normally, the data request
>will finish (successfully or not), and the process will be woken
>up, or if it was sent SIGKILL, it will be killed.
>
>Long story short, what happened was, some faulty hardware or some
>buggy driver, probably associated with the capture card, had a
>problem and left the process in D state. Thus, it couldn't be
>killed, and since it had /dev/video open, tvtime couldn't run and
>failed gracefully, and because it held /dev/dsp open, and couldn't
>be killed as the init scripts would normally do in that situation,
>the audio drivers couldn't be unloaded and the boot process hung.
>
>So give us a bunch of information about what hardware you're using,
>output of dmesg, and steps to reproduce the driver bug (if it is
>that).

I cannot do that, as it apparently was a transient thing. After the reboot to the next kernel in the series, everything has been working as well as can be expected. I've listened to the radio for about 30 seconds, and the TV maybe 6 hours since.

Now that I know how to make the magic sysrq actually work and leave meaningful stuff in the logs, maybe I can report something that might be constructive the next time it happens. Until then, I wait for the other shoe I guess.

--
Cheers, Gene
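The fork()/wait() life cycle Tom describes is easy to reproduce deliberately. The sketch below (Linux only, since it reads /proc/<pid>/stat) forks a child that exits at once, watches the child's state turn to 'Z' while the parent has not yet waited for it, and then reaps it with waitpid(). `process_state` and `make_and_reap_zombie` are hypothetical names chosen for this illustration, and the 5 second polling limit is an arbitrary assumption.

```python
import os
import time


def process_state(pid: int) -> str:
    """Return the one-letter state (R, S, D, Z, ...) from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as fh:
        stat = fh.read()
    # The command name sits in parentheses and may contain spaces,
    # so take the first field after the last ')'.
    return stat.rsplit(")", 1)[1].split()[0]


def make_and_reap_zombie(exit_code: int = 3):
    """Fork a child that exits immediately; return (state_before_wait, exit_status)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: die without running any cleanup handlers

    # Parent: poll until the kernel shows the dead child as a zombie.
    deadline = time.monotonic() + 5.0
    state = process_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = process_state(pid)

    # wait() collects the exit status; only now can the PID be reused.
    _, status = os.waitpid(pid, 0)
    return state, os.WEXITSTATUS(status)


if __name__ == "__main__":
    if os.path.isdir("/proc/self"):
        state, code = make_and_reap_zombie()
        print(f"state before wait: {state}, exit status: {code}")
    else:
        print("this sketch needs a Linux-style /proc")
```

The same `process_state` helper reports 'D' for a task in uninterruptible sleep, which is the case the thread is really about. Unlike the zombie here, such a task cannot be cleared from user space, because no wait() call or signal takes effect until the driver wakes it.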