* Automounter hangs...
@ 2008-08-11 13:11 Ondrej Valousek
  2008-08-20 13:56 ` Jeff Moyer
  0 siblings, 1 reply; 10+ messages in thread
From: Ondrej Valousek @ 2008-08-11 13:11 UTC
  To: autofs

Hi List,

I have a strange error (RHEL-4, autofs 4.1.3-234). I use the automounter
to mount home directories for users. After a while, /home became unusable
and I see the following:

root     17345 20453  0 10:33 ?        00:00:00 [automount] <defunct>
root     17348 20453  0 10:33 ?        00:00:00 [automount] <defunct>
root     20453     1  0 Aug05 ?        00:00:15 /usr/sbin/automount
--timeout=60 --debug /home yp auto.home -rw


I cannot kill any of the processes listed (I even tried kill -9).
The last thing I can see in /var/log/messages is the following:
Aug 11 10:32:39 login01 automount[17100]: expired /home/login
Aug 11 10:33:01 login01 automount[20453]: attempting to mount entry
/home/login
Aug 11 10:33:01 login01 automount[17185]: mount(nfs): mounted
belfast:/vol/users/users/login on /home/login
Aug 11 10:33:09 login01 automount[17346]: expired /home/localmgr
Aug 11 10:33:09 login01 automount[17348]: expired /home/support

Perhaps processes 17346 and 17348 failed to unmount /home/localmgr and
/home/support, and this in turn caused the parent process 20453 to hang.

Has anyone seen anything similar? Is there any solution to this?
Thanks,

Ondrej

* Re: Automounter hangs...
  2008-08-11 13:11 Automounter hangs Ondrej Valousek
@ 2008-08-20 13:56 ` Jeff Moyer
  2008-08-25  6:31   ` Ondrej Valousek
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Moyer @ 2008-08-20 13:56 UTC
  To: Ondrej Valousek; +Cc: autofs

It rings a bell, but I can't put my finger on it.  Can you reproduce
this?  If so, could you send along a debug log?  Instructions for
collecting debug information can be found at:
  http://people.redhat.com/~jmoyer/

Cheers,

Jeff

* Re: Automounter hangs...
  2008-08-20 13:56 ` Jeff Moyer
@ 2008-08-25  6:31   ` Ondrej Valousek
  2008-11-28 15:31     ` Ondrej Valousek
  0 siblings, 1 reply; 10+ messages in thread
From: Ondrej Valousek @ 2008-08-25  6:31 UTC
  To: Jeff Moyer; +Cc: autofs

Hi Jeff,

Yes, I am trying to reproduce this with debug enabled; it will take
some time.
Please stay tuned.

Ondrej

* Re: Automounter hangs...
  2008-08-25  6:31   ` Ondrej Valousek
@ 2008-11-28 15:31     ` Ondrej Valousek
  2008-11-28 19:57       ` webserv
  0 siblings, 1 reply; 10+ messages in thread
From: Ondrej Valousek @ 2008-11-28 15:31 UTC
  To: Jeff Moyer; +Cc: autofs

Hi All,

I hoped this had gone away for good, but unfortunately I was wrong. Here
we go again:
RHEL-4, fully updated, autofs 4; the automounter hangs.
ps -ef | grep auto:
root      3805     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
--timeout=3600 --debug --use-old-ldap-lookup /softappli yp
auto.softappli -rw
root      3880     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
--timeout=3600 --debug --use-old-ldap-lookup /cadappl yp auto.cadappl -rw
root      3947     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
--timeout=3600 --debug --use-old-ldap-lookup /appli yp auto.appli -rw
root      4032     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
--timeout=3600 --debug --use-old-ldap-lookup /proj yp auto.proj -rw
root      4118     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
--timeout=3600 --debug --use-old-ldap-lookup /home yp auto.home -rw
root     18848  4032  0 Nov27 ?        00:00:00 /usr/sbin/automount
--timeout=3600 --debug --use-old-ldap-lookup /proj yp auto.proj -rw
root     18851  4032  0 Nov27 ?        00:00:00 [automount] <defunct>
root     28454 21820  0 15:25 pts/134  00:00:00 grep auto

Debug logs:
Nov 27 13:07:28 login02 automount[4032]: sig 14 switching from 1 to 2
Nov 27 13:07:28 login02 automount[4032]: get_pkt: state 1, next 2
Nov 27 13:07:28 login02 automount[4032]: st_expire(): state = 1
Nov 27 13:07:28 login02 automount[4032]: expire_proc: exp_proc=18848
Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 2
Nov 27 13:07:28 login02 automount[4032]: handle_packet_expire_multi:
token 7150, name towerip
Nov 27 13:07:28 login02 automount[18849]: expiring path /proj/towerip
Nov 27 13:07:28 login02 automount[18849]: umount_multi:
path=/proj/towerip incl=1
Nov 27 13:07:28 login02 automount[18849]: umount_multi: unmounting
dir=/proj/towerip
Nov 27 13:07:28 login02 automount[18849]: expired /proj/towerip
Nov 27 13:07:28 login02 automount[4032]: handle_child: got pid 18849,
sig 0 (0), stat 0
Nov 27 13:07:28 login02 automount[4032]: sig_child: found pending iop
pid 18849: signalled 0 (sig 0), exit status 0
Nov 27 13:07:28 login02 automount[4032]: send_ready: token=7150
Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 2
Nov 27 13:07:28 login02 automount[4032]: handle_packet_expire_multi:
token 7151, name pdld4
Nov 27 13:07:28 login02 automount[18851]: expiring path /proj/pdld4
Nov 27 13:07:28 login02 automount[18851]: umount_multi: path=/proj/pdld4
incl=1
Nov 27 13:07:28 login02 automount[18851]: umount_multi: unmounting
dir=/proj/pdld4
Nov 27 13:07:28 login02 automount[18851]: expired /proj/pdld4
Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 0
Nov 27 13:07:28 login02 automount[4032]: handle_packet_missing: token
7152, name towerip

The automounter daemon handling the /proj map stalled.
Please help.
Thanks,

Ondrej
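
One way to read the debug log above is to pair each expire request with
its acknowledgement. The sketch below is a minimal illustration in
Python, not part of autofs. It assumes, judging only from how token 7150
is handled in this log, that a "send_ready: token=N" line is what
completes a "handle_packet_expire_multi: token N" request. The function
names and the SAMPLE excerpt are made up for the example. It also
rejoins the lines that the mail client wrapped.

```python
import re

# A syslog entry starts with a timestamp such as "Nov 27 13:07:28 ".
STAMP = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")
EXPIRE = re.compile(r"handle_packet_expire_multi:\s+token (\d+), name (\S+)")
READY = re.compile(r"send_ready: token=(\d+)")

# Excerpt copied from the log above, including the mail line wrapping.
SAMPLE = """\
Nov 27 13:07:28 login02 automount[4032]: handle_packet_expire_multi:
token 7150, name towerip
Nov 27 13:07:28 login02 automount[18849]: expired /proj/towerip
Nov 27 13:07:28 login02 automount[4032]: send_ready: token=7150
Nov 27 13:07:28 login02 automount[4032]: handle_packet_expire_multi:
token 7151, name pdld4
Nov 27 13:07:28 login02 automount[18851]: expired /proj/pdld4
"""


def unwrap(lines):
    """Rejoin syslog entries that were wrapped onto a following line."""
    entries = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue
        if STAMP.match(line) or not entries:
            entries.append(line)
        else:
            # No timestamp: treat it as the continuation of the last entry.
            entries[-1] += " " + line.strip()
    return entries


def unanswered_expires(lines):
    """Return {token: name} for expire requests with no later send_ready."""
    open_tokens = {}
    for entry in unwrap(lines):
        m = EXPIRE.search(entry)
        if m:
            open_tokens[int(m.group(1))] = m.group(2)
            continue
        m = READY.search(entry)
        if m:
            open_tokens.pop(int(m.group(1)), None)
    return open_tokens


if __name__ == "__main__":
    print(unanswered_expires(SAMPLE.splitlines()))
```

Run against the excerpt, this leaves token 7151 (pdld4) open: the child
18851 logged "expired /proj/pdld4", but no handle_child or send_ready
entry for it follows in the parent's log.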

* Re: Automounter hangs...
  2008-11-28 15:31     ` Ondrej Valousek
@ 2008-11-28 19:57       ` webserv
  2008-11-29  9:13         ` Ian Kent
  0 siblings, 1 reply; 10+ messages in thread
From: webserv @ 2008-11-28 19:57 UTC
  To: Ondrej Valousek; +Cc: autofs

Process 18851 is the <defunct> one below.
Its last message in the logs is "expired /proj/pdld4".
From the autofs sources I see that exit(0) follows directly after this
message: the forked child exits without calling or touching anything else.
So what the hell is going on here?
Ondrej

The information contained in this e-mail and in any attachments is confidential and is designated solely for the attention of the intended recipient(s). If you are not an intended recipient, you must not use, disclose, copy, distribute or retain this e-mail or any part thereof. If you have received this e-mail in error, please notify the sender by return e-mail and delete all copies of this e-mail from your computer system(s).
Please direct any additional queries to: communications@s3group.com.
Thank You.
Silicon and Software Systems Limited. Registered in Ireland no. 378073.
Registered Office: South County Business Park, Leopardstown, Dublin 18

* Re: Automounter hangs...
  2008-11-28 19:57       ` webserv
@ 2008-11-29  9:13         ` Ian Kent
  2008-11-29 17:22           ` Ondrej Valousek
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Kent @ 2008-11-29  9:13 UTC
  To: webserv; +Cc: autofs

On Fri, 2008-11-28 at 19:57 +0000, webserv@s3group.com wrote:
> Process 18851 - the <defunct> one below.
> Last message from this one in logs is "expired /proj/pdld4".
> From autofs sources I see, that after this message follows directly exit(0).
> Without calling or touching anything else, just exit the fork.
> So what the hell is going on here?

It seems that the expire is completing before the parent's signals are
restored. But I thought a signal that is sent while it is blocked
(SIGCHLD in this case) is delivered once the signal is unblocked, so this
is a bit of a puzzle.
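
The blocked-then-delivered behaviour described above can be checked in
isolation. The following is a minimal sketch in Python for Linux, not
autofs code; the function demo() and its handler are invented for the
illustration. It blocks SIGCHLD, forks children that exit at once (much
like an expire child that logs and calls exit(0)), confirms the signal
is pending, and then restores the mask. One detail worth keeping in
mind: standard signals are not queued, so several children exiting
while SIGCHLD is blocked produce a single delivery, and the handler has
to check every outstanding child. Whether that plays any part in this
hang is not established in this thread.

```python
import os
import signal


def demo(n_children=2):
    """Fork children with SIGCHLD blocked; return (pids, was_pending, reaped)."""
    outstanding = set()
    reaped = []

    def on_sigchld(signum, frame):
        # One delivery may stand for several exited children, so poll
        # every outstanding child instead of assuming exactly one.
        for pid in list(outstanding):
            done, _status = os.waitpid(pid, os.WNOHANG)
            if done == pid:
                outstanding.discard(pid)
                reaped.append(pid)

    old_handler = signal.signal(signal.SIGCHLD, on_sigchld)
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        for _ in range(n_children):
            pid = os.fork()
            if pid == 0:
                os._exit(0)  # child: exit immediately
            outstanding.add(pid)
        pids = sorted(outstanding)
        for pid in pids:
            # Wait until the child has exited but leave it unreaped (WNOWAIT),
            # so it stays a zombie until the handler collects it.
            os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
        was_pending = signal.SIGCHLD in signal.sigpending()
    finally:
        # Restoring the old mask unblocks SIGCHLD; the pending signal is
        # delivered at this point and the handler runs.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    if old_handler is not None:
        signal.signal(signal.SIGCHLD, old_handler)
    return pids, was_pending, sorted(reaped)


if __name__ == "__main__":
    print(demo())
```

With two children, the signal shows up as pending while blocked and both
children are reaped once the mask is restored, even though only one
SIGCHLD is delivered.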

* Re: Automounter hangs...
  2008-11-29  9:13         ` Ian Kent
@ 2008-11-29 17:22           ` Ondrej Valousek
  2008-11-29 17:48             ` Ondrej Valousek
  0 siblings, 1 reply; 10+ messages in thread
From: Ondrej Valousek @ 2008-11-29 17:22 UTC
  To: Ian Kent; +Cc: autofs@linux.kernel.org


> Seems that the expire is completing before the parent signals are
> restored. But I thought a signal that is sent while it is blocked
> (SIGCHLD in this case) is delivered once the signal is unblocked so this
> is a bit of a puzzle.
>
>   
And what role does process 18848 play here? It looks like this is the
first one to hang.

Nov 20 15:02:39 login02 automount[18848]: lookup(yp): looking up .directory
Nov 20 15:02:39 login02 automount[18848]: failed to mount /proj/.directory
Nov 20 15:02:39 login02 automount[18848]: umount_multi:
path=/proj/.directory incl=1
Nov 20 15:02:39 login02 automount[4125]: handle_child: got pid 18848,
sig 0 (0), stat 1
Nov 20 15:02:39 login02 automount[4125]: sig_child: found pending iop
pid 18848: signalled 0 (sig 0), exit status 1
Nov 21 15:07:55 login02 automount[18848]: lookup(yp): looking up .raw_data
Nov 21 15:07:55 login02 automount[18848]: failed to mount /proj/.raw_data
Nov 21 15:07:55 login02 automount[18848]: umount_multi:
path=/proj/.raw_data incl=1
Nov 21 15:07:55 login02 automount[4125]: handle_child: got pid 18848,
sig 0 (0), stat 1
Nov 21 15:07:55 login02 automount[4125]: sig_child: found pending iop
pid 18848: signalled 0 (sig 0), exit status 1

* Re: Automounter hangs...
  2008-11-29 17:22           ` Ondrej Valousek
@ 2008-11-29 17:48             ` Ondrej Valousek
  2008-12-01 12:45               ` Ian Kent
  0 siblings, 1 reply; 10+ messages in thread
From: Ondrej Valousek @ 2008-11-29 17:48 UTC
  To: Ian Kent; +Cc: autofs@linux.kernel.org

To summarize:
process 4032 - D (disk sleep)
process 18848 - S (sleep, but does not react to kill)
process 18851 - Z (zombie)
O.
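
The state letters in this summary are the ones ps reports; on Linux the
same letter is the third field of /proc/<pid>/stat. As a rough sketch
(Python, helper names invented here, Linux only), the state and parent
of each process could be read like this. The parsing splits around the
first "(" and the last ")" because the command name in that file is
parenthesised and may itself contain spaces or parentheses.

```python
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible (disk) sleep",
    "Z": "zombie",
    "T": "stopped",
}


def parse_stat(stat_line):
    """Return (pid, comm, state, ppid) from one /proc/<pid>/stat line."""
    left, _, rest = stat_line.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()
    return int(left), comm, fields[0], int(fields[1])


def describe(pid):
    """Read the current state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as fh:
        _pid, comm, state, ppid = parse_stat(fh.read())
    name = STATE_NAMES.get(state, "other")
    return f"{pid} ({comm}) {state} [{name}] ppid={ppid}"


if __name__ == "__main__":
    import os
    print(describe(os.getpid()))
```

On the affected host one would call describe(4032) and so on. A zombie
has already exited, so kill has nothing left to act on; it disappears
only when its parent waits for it. A parent stuck in D state does not
handle signals until the kernel operation it is blocked in completes,
which would be consistent with the zombie never being reaped here.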

* Re: Automounter hangs...
  2008-11-29 17:48             ` Ondrej Valousek
@ 2008-12-01 12:45               ` Ian Kent
  2008-12-01 13:10                 ` Ondrej Valousek
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Kent @ 2008-12-01 12:45 UTC
  To: Ondrej Valousek; +Cc: autofs@linux.kernel.org

On Sat, 2008-11-29 at 18:48 +0100, Ondrej Valousek wrote:
> To summarize:
> process 4032 - D (disk sleep)
> process 18848 - S (sleep, but does not react to kill)
> process 18851 - Z (zombie)
> O.

For a start, you need to make available the entire debug log of the
session in which this occurs.

> Ondrej Valousek wrote:
> >> Seems that the expire is completing before the parent signals are
> >> restored. But I thought a signal that is sent while it is blocked
> >> (SIGCHLD in this case) is delivered once the signal is unblocked so this
> >> is a bit of a puzzle.
> >>
> >>   
> >>     
> > And which game plays the process 18848 here - this is the first one to
> > hang (looks like)....
> >
> > Nov 20 15:02:39 login02 automount[18848]: lookup(yp): looking up .directory
> > Nov 20 15:02:39 login02 automount[18848]: failed to mount /proj/.directory
> > Nov 20 15:02:39 login02 automount[18848]: umount_multi:
> > path=/proj/.directory incl=1
> > Nov 20 15:02:39 login02 automount[4125]: handle_child: got pid 18848,
> > sig 0 (0), stat 1
> > Nov 20 15:02:39 login02 automount[4125]: sig_child: found pending iop
> > pid 18848: signalled 0 (sig 0), exit status 1
> > Nov 21 15:07:55 login02 automount[18848]: lookup(yp): looking up .raw_data
> > Nov 21 15:07:55 login02 automount[18848]: failed to mount /proj/.raw_data
> > Nov 21 15:07:55 login02 automount[18848]: umount_multi:
> > path=/proj/.raw_data incl=1
> > Nov 21 15:07:55 login02 automount[4125]: handle_child: got pid 18848,
> > sig 0 (0), stat 1
> > Nov 21 15:07:55 login02 automount[4125]: sig_child: found pending iop
> > pid 18848: signalled 0 (sig 0), exit status 1
> >
> >>> Ondrej
> >>>
> >>>> Hi All,
> >>>>
> >>>> I hoped this went away forever, but I was wrong (unfortunately). Here we
> >>>> go again:
> >>>> RHEL-4, full updates, autofs 4, automounter hangs:
> >>>> ps -ef | grep auto:
> >>>> root      3805     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
> >>>> --timeout=3600 --debug --use-old-ldap-lookup /softappli yp
> >>>> auto.softappli -rw
> >>>> root      3880     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
> >>>> --timeout=3600 --debug --use-old-ldap-lookup /cadappl yp auto.cadappl -rw
> >>>> root      3947     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
> >>>> --timeout=3600 --debug --use-old-ldap-lookup /appli yp auto.appli -rw
> >>>> root      4032     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
> >>>> --timeout=3600 --debug --use-old-ldap-lookup /proj yp auto.proj -rw
> >>>> root      4118     1  0 Nov21 ?        00:00:00 /usr/sbin/automount
> >>>> --timeout=3600 --debug --use-old-ldap-lookup /home yp auto.home -rw
> >>>> root     18848  4032  0 Nov27 ?        00:00:00 /usr/sbin/automount
> >>>> --timeout=3600 --debug --use-old-ldap-lookup /proj yp auto.proj -rw
> >>>> root     18851  4032  0 Nov27 ?        00:00:00 [automount] <defunct>
> >>>> root     28454 21820  0 15:25 pts/134  00:00:00 grep auto
> >>>>
> >>>> Debug logs:
> >>>> Nov 27 13:07:28 login02 automount[4032]: sig 14 switching from 1 to 2
> >>>> Nov 27 13:07:28 login02 automount[4032]: get_pkt: state 1, next 2
> >>>> Nov 27 13:07:28 login02 automount[4032]: st_expire(): state = 1
> >>>> Nov 27 13:07:28 login02 automount[4032]: expire_proc: exp_proc=18848
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 2
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet_expire_multi:
> >>>> token 7150, name towerip
> >>>> Nov 27 13:07:28 login02 automount[18849]: expiring path /proj/towerip
> >>>> Nov 27 13:07:28 login02 automount[18849]: umount_multi:
> >>>> path=/proj/towerip incl=1
> >>>> Nov 27 13:07:28 login02 automount[18849]: umount_multi: unmounting
> >>>> dir=/proj/towerip
> >>>> Nov 27 13:07:28 login02 automount[18849]: expired /proj/towerip
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_child: got pid 18849,
> >>>> sig 0 (0), stat 0
> >>>> Nov 27 13:07:28 login02 automount[4032]: sig_child: found pending iop
> >>>> pid 18849: signalled 0 (sig 0), exit status 0
> >>>> Nov 27 13:07:28 login02 automount[4032]: send_ready: token=7150
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 2
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet_expire_multi:
> >>>> token 7151, name pdld4
> >>>> Nov 27 13:07:28 login02 automount[18851]: expiring path /proj/pdld4
> >>>> Nov 27 13:07:28 login02 automount[18851]: umount_multi: path=/proj/pdld4
> >>>> incl=1
> >>>> Nov 27 13:07:28 login02 automount[18851]: umount_multi: unmounting
> >>>> dir=/proj/pdld4
> >>>> Nov 27 13:07:28 login02 automount[18851]: expired /proj/pdld4
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 0
> >>>> Nov 27 13:07:28 login02 automount[4032]: handle_packet_missing: token
> >>>> 7152, name towerip
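
In the excerpt above, the expire child 18849 is followed by a handle_child line from the parent, while 18851 is not, which matches the defunct 18851 in the ps listing. A rough sketch for spotting this in a longer log follows. The regular expressions are guesses based only on the message format seen in this thread, and the function name is hypothetical.

```python
import re

# Matches "automount[PID]: expired PATH" written by an expire child.
EXPIRED = re.compile(r"automount\[(\d+)\]: expired (\S+)")
# Matches the parent's "handle_child: got pid PID" line.
REAPED = re.compile(r"handle_child: got pid (\d+)")


def unreaped_expires(log_text):
    """Return {pid: path} for expire children with no handle_child line."""
    expired = {int(pid): path for pid, path in EXPIRED.findall(log_text)}
    reaped = {int(pid) for pid in REAPED.findall(log_text)}
    return {pid: path for pid, path in expired.items() if pid not in reaped}


# Lines copied from the debug log quoted in this thread.
SAMPLE = """\
Nov 27 13:07:28 login02 automount[18849]: expired /proj/towerip
Nov 27 13:07:28 login02 automount[4032]: handle_child: got pid 18849,
sig 0 (0), stat 0
Nov 27 13:07:28 login02 automount[18851]: expired /proj/pdld4
Nov 27 13:07:28 login02 automount[4032]: handle_packet: type = 0
"""

print(unreaped_expires(SAMPLE))  # {18851: '/proj/pdld4'}
```

Run against the full /var/log/messages, the same function would list every expire child the parent never logged as collected, which may help narrow down where the hang starts.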
> >>>>
> >>>> The automounter daemon handling the /proj map stalled.
> >>>> Please help.
> >>>> Thanks,
> >>>>
> >>>> Ondrej
> >>>>
> >>>> Ondrej Valousek wrote:
> >>>>> Hi Jeff,
> >>>>>
> >>>>> Yes I am trying to reproduce this with the debug enabled - it will take
> >>>>> some time.
> >>>>> Please stay tuned.
> >>>>>
> >>>>> Ondrej
> >>>>>
> >>>>>> It rings a bell, but I can't put my finger on it.  Can you reproduce
> >>>>>> this?  If so, could you send along a debug log?  Instructions for
> >>>>>> collecting debug information can be found at:
> >>>>>>   http://people.redhat.com/~jmoyer/
> >>>>>>
> >>>>>> Cheers,
> >>>>>>
> >>>>>> Jeff
> >>>>>>
> >>>>>>

* Re: Automounter hangs...
  2008-12-01 12:45               ` Ian Kent
@ 2008-12-01 13:10                 ` Ondrej Valousek
  0 siblings, 0 replies; 10+ messages in thread
From: Ondrej Valousek @ 2008-12-01 13:10 UTC (permalink / raw)
  To: Ian Kent; +Cc: autofs@linux.kernel.org

Ian Kent wrote:
>
> For a start you need to make the entire debug log of the session in
> which this occurs available.
>   
I have opened a high-priority service request (#1877284) with Red Hat,
with the sosreport and the automounter debug log attached.
Regards
Ondrej

Thread overview: 10+ messages
2008-08-11 13:11 Automounter hangs Ondrej Valousek
2008-08-20 13:56 ` Jeff Moyer
2008-08-25  6:31   ` Ondrej Valousek
2008-11-28 15:31     ` Ondrej Valousek
2008-11-28 19:57       ` webserv
2008-11-29  9:13         ` Ian Kent
2008-11-29 17:22           ` Ondrej Valousek
2008-11-29 17:48             ` Ondrej Valousek
2008-12-01 12:45               ` Ian Kent
2008-12-01 13:10                 ` Ondrej Valousek
