All of lore.kernel.org
 help / color / mirror / Atom feed
* automounter segfaults when using negative cache?
@ 2008-12-04 11:15 Ondrej Valousek
  2008-12-05 16:00 ` Jeff Moyer
  0 siblings, 1 reply; 6+ messages in thread
From: Ondrej Valousek @ 2008-12-04 11:15 UTC (permalink / raw)
  To: autofs@linux.kernel.org

Interesting thing:
I have 2 identical login01 and login02 machines (Rhel4, full updates).
login01 is running old kernel (2.6.9-42.ELsmp), login02 is running the
latest one (2.6.9-78.0.8.ELsmp). Both have autofs debug enabled and are
running the latest autofs (autofs-4.1.3-234).
Now:
1) login01:
[root@login01 ondrejv]# cat /var/log/debug.log* | grep "Dec  4 09:15:09"
Dec  4 09:15:09 login01 automount[3939]: handle_packet: type = 0
Dec  4 09:15:09 login01 automount[3939]: handle_packet_missing: token
3533, name .raw_data
Dec  4 09:15:09 login01 automount[3939]: attempting to mount entry
/proj/.raw_data
Dec  4 09:15:09 login01 automount[1649]: lookup(yp): looking up .raw_data
Dec  4 09:15:09 login01 automount[3939]: mt->key set to .raw_data
Dec  4 09:15:09 login01 kernel: automount[1649]: segfault at
0000000000000000 rip 0000002a95916be0 rsp 0000007fbfffd5c8 error 4
Dec  4 09:15:09 login01 automount[3939]: handle_child: got pid 1649, sig
1 (11), stat 0
Dec  4 09:15:09 login01 automount[3939]: sig_child: found pending iop
pid 1649: signalled 1 (sig 11), exit status 0
Dec  4 09:15:09 login01 automount[3939]: send_fail: token=3533
Dec  4 09:15:09 login01 automount[3939]: handle_packet: type = 0
Dec  4 09:15:09 login01 automount[3939]: handle_packet_missing: token
3534, name .raw_data
Dec  4 09:15:09 login01 automount[3939]: attempting to mount entry
/proj/.raw_data
Dec  4 09:15:09 login01 automount[1650]: lookup(yp): looking up .raw_data
Dec  4 09:15:09 login01 automount[3939]: mt->key set to .raw_data
Dec  4 09:15:09 login01 kernel: automount[1650]: segfault at
0000000000000000 rip 0000002a95916be0 rsp 0000007fbfffd5c8 error 4

Note that .raw_data map does not exist in the yp map so it should go
into the negative cache - which happens correctly on login02:
ov 28 23:27:43 login01 automount[4201]: attempting to mount entry
/proj/.raw_data
Nov 28 23:27:43 login01 automount[13394]: lookup(yp): looking up .raw_data
Nov 28 23:27:43 login01 automount[4201]: mt->key set to .raw_data
Nov 28 23:27:43 login01 automount[13394]: lookup(yp): key ".raw_data"
not found in map.
Nov 28 23:27:43 login01 automount[13394]: failed to mount /proj/.raw_data
Nov 28 23:27:43 login01 automount[13394]: umount_multi:
path=/proj/.raw_data incl=1
Nov 28 23:27:43 login01 automount[4201]: handle_child: got pid 13394,
sig 0 (0), stat 3
Nov 28 23:27:43 login01 automount[4201]: sig_child: found pending iop
pid 13394: signalled 0 (sig 0), exit status 3
Nov 28 23:27:43 login01 automount[4201]: update_negative_cache: key:
.raw_data
Nov 28 23:27:43 login01 automount[4201]: Adding negative cache entry for
key .raw_data
Nov 28 23:27:43 login01 automount[4201]: Key .raw_data added to negative
cache
Nov 28 23:27:43 login01 automount[4201]: send_fail: token=15625

I thought all the negative caching is done in user space, but apparently
kernel has some influence here....
Any explanation?
Thanks,

Ondrej

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: automounter segfaults when using negative cache?
  2008-12-04 11:15 automounter segfaults when using negative cache? Ondrej Valousek
@ 2008-12-05 16:00 ` Jeff Moyer
  2008-12-05 16:23   ` Ian Kent
  2008-12-08  9:08   ` Ondrej Valousek
  0 siblings, 2 replies; 6+ messages in thread
From: Jeff Moyer @ 2008-12-05 16:00 UTC (permalink / raw)
  To: Ondrej Valousek; +Cc: autofs@linux.kernel.org

Ondrej Valousek <webserv@s3group.cz> writes:

> Interesting thing:
> I have 2 identical login01 and login02 machines (Rhel4, full updates).
> login01 is running old kernel (2.6.9-42.ELsmp), login02 is running the
> latest one (2.6.9-78.0.8.ELsmp). Both have autofs debug enabled and are
> running the latest autofs (autofs-4.1.3-234).
> Now:
> 1) login01:
> [root@login01 ondrejv]# cat /var/log/debug.log* | grep "Dec  4 09:15:09"
> Dec  4 09:15:09 login01 automount[3939]: handle_packet: type = 0
> Dec  4 09:15:09 login01 automount[3939]: handle_packet_missing: token
> 3533, name .raw_data
> Dec  4 09:15:09 login01 automount[3939]: attempting to mount entry
> /proj/.raw_data
> Dec  4 09:15:09 login01 automount[1649]: lookup(yp): looking up .raw_data
> Dec  4 09:15:09 login01 automount[3939]: mt->key set to .raw_data
> Dec  4 09:15:09 login01 kernel: automount[1649]: segfault at
> 0000000000000000 rip 0000002a95916be0 rsp 0000007fbfffd5c8 error 4
> Dec  4 09:15:09 login01 automount[3939]: handle_child: got pid 1649, sig
> 1 (11), stat 0
> Dec  4 09:15:09 login01 automount[3939]: sig_child: found pending iop
> pid 1649: signalled 1 (sig 11), exit status 0
> Dec  4 09:15:09 login01 automount[3939]: send_fail: token=3533
> Dec  4 09:15:09 login01 automount[3939]: handle_packet: type = 0
> Dec  4 09:15:09 login01 automount[3939]: handle_packet_missing: token
> 3534, name .raw_data
> Dec  4 09:15:09 login01 automount[3939]: attempting to mount entry
> /proj/.raw_data
> Dec  4 09:15:09 login01 automount[1650]: lookup(yp): looking up .raw_data
> Dec  4 09:15:09 login01 automount[3939]: mt->key set to .raw_data
> Dec  4 09:15:09 login01 kernel: automount[1650]: segfault at
> 0000000000000000 rip 0000002a95916be0 rsp 0000007fbfffd5c8 error 4

Can you enable core dumps and get us a stack trace of this?

> Note that .raw_data map does not exist in the yp map so it should go
> into the negative cache - which happens correctly on login02:
> ov 28 23:27:43 login01 automount[4201]: attempting to mount entry
> /proj/.raw_data
> Nov 28 23:27:43 login01 automount[13394]: lookup(yp): looking up .raw_data
> Nov 28 23:27:43 login01 automount[4201]: mt->key set to .raw_data
> Nov 28 23:27:43 login01 automount[13394]: lookup(yp): key ".raw_data"
> not found in map.
> Nov 28 23:27:43 login01 automount[13394]: failed to mount /proj/.raw_data
> Nov 28 23:27:43 login01 automount[13394]: umount_multi:
> path=/proj/.raw_data incl=1
> Nov 28 23:27:43 login01 automount[4201]: handle_child: got pid 13394,
> sig 0 (0), stat 3
> Nov 28 23:27:43 login01 automount[4201]: sig_child: found pending iop
> pid 13394: signalled 0 (sig 0), exit status 3
> Nov 28 23:27:43 login01 automount[4201]: update_negative_cache: key:
> .raw_data
> Nov 28 23:27:43 login01 automount[4201]: Adding negative cache entry for
> key .raw_data
> Nov 28 23:27:43 login01 automount[4201]: Key .raw_data added to negative
> cache
> Nov 28 23:27:43 login01 automount[4201]: send_fail: token=15625
>
> I thought all the negative caching is done in user space, but apparently
> kernel has some influence here....
> Any explanation?

The negative caching is done in userspace.  I'm not sure I understand
your question.

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: automounter segfaults when using negative cache?
  2008-12-05 16:00 ` Jeff Moyer
@ 2008-12-05 16:23   ` Ian Kent
  2008-12-05 16:27     ` Jeff Moyer
  2008-12-08  9:08   ` Ondrej Valousek
  1 sibling, 1 reply; 6+ messages in thread
From: Ian Kent @ 2008-12-05 16:23 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: autofs@linux.kernel.org

On Fri, 2008-12-05 at 11:00 -0500, Jeff Moyer wrote:
> Ondrej Valousek <webserv@s3group.cz> writes:
> 
> > Interesting thing:
> > I have 2 identical login01 and login02 machines (Rhel4, full updates).
> > login01 is running old kernel (2.6.9-42.ELsmp), login02 is running the
> > latest one (2.6.9-78.0.8.ELsmp). Both have autofs debug enabled and are
> > running the latest autofs (autofs-4.1.3-234).
> > Now:
> > 1) login01:
> > [root@login01 ondrejv]# cat /var/log/debug.log* | grep "Dec  4 09:15:09"
> > Dec  4 09:15:09 login01 automount[3939]: handle_packet: type = 0
> > Dec  4 09:15:09 login01 automount[3939]: handle_packet_missing: token
> > 3533, name .raw_data
> > Dec  4 09:15:09 login01 automount[3939]: attempting to mount entry
> > /proj/.raw_data
> > Dec  4 09:15:09 login01 automount[1649]: lookup(yp): looking up .raw_data
> > Dec  4 09:15:09 login01 automount[3939]: mt->key set to .raw_data
> > Dec  4 09:15:09 login01 kernel: automount[1649]: segfault at
> > 0000000000000000 rip 0000002a95916be0 rsp 0000007fbfffd5c8 error 4
> > Dec  4 09:15:09 login01 automount[3939]: handle_child: got pid 1649, sig
> > 1 (11), stat 0
> > Dec  4 09:15:09 login01 automount[3939]: sig_child: found pending iop
> > pid 1649: signalled 1 (sig 11), exit status 0
> > Dec  4 09:15:09 login01 automount[3939]: send_fail: token=3533
> > Dec  4 09:15:09 login01 automount[3939]: handle_packet: type = 0
> > Dec  4 09:15:09 login01 automount[3939]: handle_packet_missing: token
> > 3534, name .raw_data
> > Dec  4 09:15:09 login01 automount[3939]: attempting to mount entry
> > /proj/.raw_data
> > Dec  4 09:15:09 login01 automount[1650]: lookup(yp): looking up .raw_data
> > Dec  4 09:15:09 login01 automount[3939]: mt->key set to .raw_data
> > Dec  4 09:15:09 login01 kernel: automount[1650]: segfault at
> > 0000000000000000 rip 0000002a95916be0 rsp 0000007fbfffd5c8 error 4
> 
> Can you enable core dumps and get us a stack trace of this?
> 
> > Note that .raw_data map does not exist in the yp map so it should go
> > into the negative cache - which happens correctly on login02:
> > ov 28 23:27:43 login01 automount[4201]: attempting to mount entry
> > /proj/.raw_data
> > Nov 28 23:27:43 login01 automount[13394]: lookup(yp): looking up .raw_data
> > Nov 28 23:27:43 login01 automount[4201]: mt->key set to .raw_data
> > Nov 28 23:27:43 login01 automount[13394]: lookup(yp): key ".raw_data"
> > not found in map.
> > Nov 28 23:27:43 login01 automount[13394]: failed to mount /proj/.raw_data
> > Nov 28 23:27:43 login01 automount[13394]: umount_multi:
> > path=/proj/.raw_data incl=1
> > Nov 28 23:27:43 login01 automount[4201]: handle_child: got pid 13394,
> > sig 0 (0), stat 3
> > Nov 28 23:27:43 login01 automount[4201]: sig_child: found pending iop
> > pid 13394: signalled 0 (sig 0), exit status 3
> > Nov 28 23:27:43 login01 automount[4201]: update_negative_cache: key:
> > .raw_data
> > Nov 28 23:27:43 login01 automount[4201]: Adding negative cache entry for
> > key .raw_data
> > Nov 28 23:27:43 login01 automount[4201]: Key .raw_data added to negative
> > cache
> > Nov 28 23:27:43 login01 automount[4201]: send_fail: token=15625
> >
> > I thought all the negative caching is done in user space, but apparently
> > kernel has some influence here....
> > Any explanation?
> 
> The negative caching is done in userspace.  I'm not sure I understand
> your question.

Me neither but I'm working on the negative caching of non-existent map
keys. Turns out to be a bit tricky when multiple nsswitch sources can be
tried for each mount since the map may not exist within a given source
and so we get NSS_STATUS_UNAVAIL instead of NSS_STATUS_NOTFOUND. Maybe I
should just cache the unavailable in the same way?

Ian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: automounter segfaults when using negative cache?
  2008-12-05 16:23   ` Ian Kent
@ 2008-12-05 16:27     ` Jeff Moyer
  2008-12-05 17:12       ` Ian Kent
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff Moyer @ 2008-12-05 16:27 UTC (permalink / raw)
  To: Ian Kent; +Cc: autofs@linux.kernel.org

Ian Kent <raven@themaw.net> writes:

> On Fri, 2008-12-05 at 11:00 -0500, Jeff Moyer wrote:
>> 
>> The negative caching is done in userspace.  I'm not sure I understand
>> your question.
>
> Me neither but I'm working on the negative caching of non-existent map
> keys. Turns out to be a bit tricky when multiple nsswitch sources can be
> tried for each mount since the map may not exist within a given source
> and so we get NSS_STATUS_UNAVAIL instead of NSS_STATUS_NOTFOUND. Maybe I
> should just cache the unavailable in the same way?

Ian, this is autofs version 4.  It's been a while since I implemented
the negative caching there, but I'd be interested in knowing if you
think I got it wrong for the case above!

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: automounter segfaults when using negative cache?
  2008-12-05 16:27     ` Jeff Moyer
@ 2008-12-05 17:12       ` Ian Kent
  0 siblings, 0 replies; 6+ messages in thread
From: Ian Kent @ 2008-12-05 17:12 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: autofs@linux.kernel.org

On Fri, 2008-12-05 at 11:27 -0500, Jeff Moyer wrote:
> Ian Kent <raven@themaw.net> writes:
> 
> > On Fri, 2008-12-05 at 11:00 -0500, Jeff Moyer wrote:
> >> 
> >> The negative caching is done in userspace.  I'm not sure I understand
> >> your question.
> >
> > Me neither but I'm working on the negative caching of non-existent map
> > keys. Turns out to be a bit tricky when multiple nsswitch sources can be
> > tried for each mount since the map may not exist within a given source
> > and so we get NSS_STATUS_UNAVAIL instead of NSS_STATUS_NOTFOUND. Maybe I
> > should just cache the unavailable in the same way?
> 
> Ian, this is autofs version 4.  It's been a while since I implemented
> the negative caching there, but I'd be interested in knowing if you
> think I got it wrong for the case above!

Oh, yeah, lucky it will be Xmas holiday time soon.
I'm going to have a look around the v4 code tomorrow.
I'll spend some time on this as well.

> 
> Cheers,
> Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: automounter segfaults when using negative cache?
  2008-12-05 16:00 ` Jeff Moyer
  2008-12-05 16:23   ` Ian Kent
@ 2008-12-08  9:08   ` Ondrej Valousek
  1 sibling, 0 replies; 6+ messages in thread
From: Ondrej Valousek @ 2008-12-08  9:08 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: autofs@linux.kernel.org


>> I thought all the negative caching is done in user space, but apparently
>> kernel has some influence here....
>> Any explanation?
>>     
>
> The negative caching is done in userspace.  I'm not sure I understand
> your question.
>
> Cheers,
> Jeff
>   


Hi Jeff,
Ok I was thinking this way:
I experience this only on U7 system running U4 kernel - on the same U7
system running U7 kernel I can not replicate this. If the negative
caching is done in the user space and there is a bug there which causes
the autofs to segfault, how is it possible that is won't segfault on the
U7 kernel?
I do not know I am still trying to find out if there is any connection
to the problem with the automounter hanging that I reported earlier as:
- in U7 system running U7 kernel, the autofs does not crash, but hangs
(after few days).
- in U7 system running U4 kernel, the autofs forks crash regularly but
otherwise it works fine (I mean at least the parent autofs fork continue
living...)
You know what I am trying to say?
Thanks,
Ondrej

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-12-08  9:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-12-04 11:15 automounter segfaults when using negative cache? Ondrej Valousek
2008-12-05 16:00 ` Jeff Moyer
2008-12-05 16:23   ` Ian Kent
2008-12-05 16:27     ` Jeff Moyer
2008-12-05 17:12       ` Ian Kent
2008-12-08  9:08   ` Ondrej Valousek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.