* ENOENT on first reference to an automounted file
@ 2007-10-05 21:14 Dan Halbert
2007-10-06 4:48 ` Ian Kent
0 siblings, 1 reply; 21+ messages in thread
From: Dan Halbert @ 2007-10-05 21:14 UTC (permalink / raw)
To: autofs
I have what looks like an automount race condition, and am very puzzled.
Any suggestions would be appreciated.
The first time I reference an automounted file, it is not there
(ENOENT). On the second and later try, the file is there. For instance:
$ cat /net/fileserver/fs/somefile
cat: /net/fileserver/fs/somefile: No such file or directory
$ cat /net/fileserver/fs/somefile
Contents of somefile.
I watched the log on fileserver, and the automount request is logged
seemingly immediately after the first "cat" prints its error.
This causes havoc with our applications, which expect files to be there
the first time they look for them.
I can repeat the problem after umounting the filesystem.
I see this problem on a CentOS 4.x system running their standard
autofs-4.1.3-199.3. I do NOT see it on CentOS 5.x, using
autofs-5.0.1-0.rc2.43.0.2. Instead I see a slight pause before "cat"
prints the contents of the file, presumably as the automount completes.
Both the CentOS4 and CentOS5 systems are completely up-to-date.
I also only see this problem with our Linux NFS servers (FC5 and FC6),
but not with a non-Fedora NAS server we have.
So I am not sure this is an automount problem, per se. Perhaps it's some
kind of NFS version problem?
The automount options include --ghost. At first I thought it might be
due to --ghost, because the very first time I reference the file, say
after a reboot or restarting autofs, I don't get an ENOENT. The first
time, the mountpoint dir does not yet exist. But removing --ghost from
the automount options does not seem to fix it.
Gory details about the automount maps are below.
Thanks for any help,
Dan Halbert
---------------
More details:
Our automount maps are stored in ldap. The entry in auto.master for
fileserver (for cn=/net/fileserver) is:
ldap:ou=auto.fileserver,ou=autofs,dc=example,dc=com --timeout=86400 --ghost -o rw,hard,async,noatime,intr,retrans=4,timeo=100,rsize=8192,wsize=8192
The auto.fileserver is (for cn=*):
fileserver.example.com:/export/&
We are not using the fancy executable /net maps that come with these
systems.
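
To make the wildcard entry concrete, here is a rough Python sketch of the substitution autofs performs for a cn=* entry: the key that was looked up replaces the "&" in the location. The function name is made up for illustration, and the sketch ignores escaping and mount options.

```python
def expand_wildcard_entry(location_template: str, key: str) -> str:
    """Substitute the looked-up key for every '&' in a wildcard map location.

    Simplified illustration only: no handling of escapes or options.
    """
    return location_template.replace("&", key)


# The cn=* entry from auto.fileserver, as quoted above.
template = "fileserver.example.com:/export/&"

# Accessing /net/fileserver/fs looks up the key "fs".
print(expand_wildcard_entry(template, "fs"))
```

So a reference to /net/fileserver/fs/somefile should trigger a mount of fileserver.example.com:/export/fs on /net/fileserver/fs.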
* Re: ENOENT on first reference to an automounted file
2007-10-05 21:14 Dan Halbert
@ 2007-10-06 4:48 ` Ian Kent
2007-10-08 15:15 ` Jeff Moyer
[not found] ` <47081453.7000709@everyzing.com>
0 siblings, 2 replies; 21+ messages in thread
From: Ian Kent @ 2007-10-06 4:48 UTC (permalink / raw)
To: Dan Halbert; +Cc: autofs
On Fri, 2007-10-05 at 17:14 -0400, Dan Halbert wrote:
> I have what looks like an automount race condition, and am very puzzled.
> Any suggestions would be appreciated.
>
> The first time I reference an automounted file, it is not there
> (ENOENT). On the second and later try, the file is there. For instance:
>
> $ cat /net/fileserver/fs/somefile
> cat: /net/fileserver/fs/somefile: No such file or directory
> $ cat /net/fileserver/fs/somefile
> Contents of somefile.
>
> I watched the log on fileserver, and the automount request is logged
> seemingly immediately after the first "cat" prints its error.
>
> This causes havoc with our applications, which expect files to be there
> the first time they look for them.
>
> I can repeat the problem after umounting the filesystem.
>
> I see this problem on a CentOS 4.x system running their standard
> autofs-4.1.3-199.3. I do NOT see it on CentOS 5.x, using
> autofs-5.0.1-0.rc2.43.0.2. Instead I see a slight pause before "cat"
> prints the contents of the file, presumably as the automount completes.
> Both the CentOS4 and CentOS5 systems are completely up-to-date.
>
> I also only see this problem with our Linux NFS servers (FC5 and FC6),
> but not with a non-Fedora NAS server we have.
>
> So I am not sure this is an automount problem, per se. Perhaps it's some
> kind of NFS version problem?
>
> The automount options include --ghost. At first I thought it might be
> due to --ghost, because the very first time I reference the file, say
> after a reboot or restarting autofs, I don't get an ENOENT. The first
> time, the mountpoint dir does not yet exist. But removing --ghost from
> the automount options does not seem to fix it.
We've seen this from time to time for various reasons but to be honest I
have trouble remembering so we'll need to check through a debug log.
Jeff may recall this?
Also, you don't mention the kernel versions?
>
> Gory details about the automount maps are below.
>
> Thanks for any help,
> Dan Halbert
>
> ---------------
> More details:
>
> Our automount maps are stored in ldap. The entry in auto.master for
> fileserver (for cn=/net/fileserver) is:
>
> ldap:ou=auto.fileserver,ou=autofs,dc=example,dc=com --timeout=86400
> --ghost -o
> rw,hard,async,noatime,intr,retrans=4,timeo=100,rsize=8192,wsize=8192
>
>
> The auto.fileserver is (for cn=*):
>
> fileserver.example.com:/export/&
We really must have a debug log. Include everything and give some
indication of when the problem occurred. See
http://people.redhat.com/jmoyer for info.
Ian
* Re: ENOENT on first reference to an automounted file
2007-10-06 4:48 ` Ian Kent
@ 2007-10-08 15:15 ` Jeff Moyer
[not found] ` <47081453.7000709@everyzing.com>
1 sibling, 0 replies; 21+ messages in thread
From: Jeff Moyer @ 2007-10-08 15:15 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs
Ian Kent <raven@themaw.net> writes:
> On Fri, 2007-10-05 at 17:14 -0400, Dan Halbert wrote:
>> I have what looks like an automount race condition, and am very puzzled.
>> Any suggestions would be appreciated.
>>
>> The first time I reference an automounted file, it is not there
>> (ENOENT). On the second and later try, the file is there. For instance:
>>
>> $ cat /net/fileserver/fs/somefile
>> cat: /net/fileserver/fs/somefile: No such file or directory
>> $ cat /net/fileserver/fs/somefile
>> Contents of somefile.
>>
>> I watched the log on fileserver, and the automount request is logged
>> seemingly immediately after the first "cat" prints its error.
>>
>> This causes havoc with our applications, which expect files to be there
>> the first time they look for them.
>>
>> I can repeat the problem after umounting the filesystem.
>>
>> I see this problem on a CentOS 4.x system running their standard
>> autofs-4.1.3-199.3. I do NOT see it on CentOS 5.x, using
>> autofs-5.0.1-0.rc2.43.0.2. Instead I see a slight pause before "cat"
>> prints the contents of the file, presumably as the automount completes.
>> Both the CentOS4 and CentOS5 systems are completely up-to-date.
>>
>> I also only see this problem with our Linux NFS servers (FC5 and FC6),
>> but not with a non-Fedora NAS server we have.
>>
>> So I am not sure this is an automount problem, per se. Perhaps it's some
>> kind of NFS version problem?
>>
>> The automount options include --ghost. At first I thought it might be
>> due to --ghost, because the very first time I reference the file, say
>> after a reboot or restarting autofs, I don't get an ENOENT. The first
>> time, the mountpoint dir does not yet exist. But removing --ghost from
>> the automount options does not seem to fix it.
>
> We've seen this from time to time for various reasons but to be honest I
> have trouble remembering so we'll need to check through a debug log.
>
> Jeff may recall this?
I think that the last time we looked at this, the problem was that
there was a replicated server entry, and the first picked entry failed
to mount. Then, the second succeeded, but we returned the wrong
dentry from lookup. This resulted in a reported failure, even though
the mount was successful.
I'm not convinced this is the same problem. I'll try to reproduce it.
Cheers,
Jeff
* Re: ENOENT on first reference to an automounted file
[not found] ` <47081453.7000709@everyzing.com>
@ 2007-10-08 16:29 ` Jeff Moyer
2007-10-08 16:35 ` Dan Halbert
0 siblings, 1 reply; 21+ messages in thread
From: Jeff Moyer @ 2007-10-08 16:29 UTC (permalink / raw)
To: Dan Halbert; +Cc: autofs
Dan Halbert <halbert@everyzing.com> writes:
> Ian Kent wrote:
> > On Fri, 2007-10-05 at 17:14 -0400, Dan Halbert wrote:
> >> I have what looks like an automount race condition, and am very puzzled...
> >>
> >> The first time I reference an automounted file, it is not there
> >> (ENOENT). On the second and later try, the file is there...
> >
> > We really must have a debug log, include everything and give some
> > indication of when the problem occurred....
>
> Details you requested follow.
> autofs-4.1.3-199.3
> (I also reproduced the problem with autofs-4.1.3-214, the latest version
> at jmoyer's webpage.)
> kernel:
> 2.6.9-55.0.9.ELsmp x86_64
I can't reproduce the problem. Would you be willing to enable
debugging in the kernel module? This will generate oodles of output.
Alternatively, I could try to come up with some pointed kprobes to get
the information we most likely need. The first option will definitely
be quicker.
Cheers,
Jeff
* Re: ENOENT on first reference to an automounted file
2007-10-08 16:29 ` Jeff Moyer
@ 2007-10-08 16:35 ` Dan Halbert
2007-10-08 16:43 ` Jeff Moyer
2007-10-08 17:20 ` Jeff Moyer
0 siblings, 2 replies; 21+ messages in thread
From: Dan Halbert @ 2007-10-08 16:35 UTC (permalink / raw)
To: autofs
Jeff Moyer wrote:
> I can't reproduce the problem. Would you be willing to enable
> debugging in the kernel module?
>
Sure, happy to do this, though you'll need to tell me how to turn that
on. I'm testing on a machine I have complete control over.
Dan
* Re: ENOENT on first reference to an automounted file
2007-10-08 16:35 ` Dan Halbert
@ 2007-10-08 16:43 ` Jeff Moyer
2007-10-08 17:20 ` Jeff Moyer
1 sibling, 0 replies; 21+ messages in thread
From: Jeff Moyer @ 2007-10-08 16:43 UTC (permalink / raw)
To: Dan Halbert; +Cc: autofs
Dan Halbert <halbert@everyzing.com> writes:
> Jeff Moyer wrote:
>> I can't reproduce the problem. Would you be willing to enable
>> debugging in the kernel module?
>>
> Sure, happy to do this, though you'll need to tell me how to turn that
> on. I'm testing on a machine I have complete control over.
Apply the following patch and rebuild the kernel module.
Cheers,
Jeff
--- linux-2.6.9/fs/autofs4/autofs_i.h.orig 2007-10-08 12:43:00.000000000 -0400
+++ linux-2.6.9/fs/autofs4/autofs_i.h 2007-10-08 12:43:06.000000000 -0400
@@ -30,7 +30,7 @@
 #include <asm/current.h>
 #include <asm/uaccess.h>
 
-/* #define DEBUG */
+#define DEBUG
 
 #ifdef DEBUG
 #define DPRINTK(fmt,args...) do { printk(KERN_DEBUG "pid %d: %s: " fmt "\n" , current->pid , __FUNCTION__ , ##args); } while(0)
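
With DEBUG defined, each DPRINTK call logs a kernel message of the form "pid <pid>: <function>: <message>". Since this generates a lot of output, a small filter may help when reading the log. The following Python sketch parses that prefix; the sample line, including its function name and message, is invented for illustration and is not real autofs4 output.

```python
import re

# Matches the prefix emitted by the DPRINTK macro: "pid %d: %s: " fmt
DPRINTK_RE = re.compile(r"pid (?P<pid>\d+): (?P<func>\w+): (?P<msg>.*)")


def parse_dprintk(line):
    """Return (pid, function, message) for a DPRINTK line, else None."""
    match = DPRINTK_RE.search(line)
    if match is None:
        return None
    return int(match.group("pid")), match.group("func"), match.group("msg")


# Hypothetical log line; the function name and message are placeholders.
sample = "kernel: pid 4321: some_function: example message"
print(parse_dprintk(sample))
```

Grouping the parsed tuples by pid should make it easier to follow a single lookup through the module.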
* Re: ENOENT on first reference to an automounted file
2007-10-08 16:35 ` Dan Halbert
2007-10-08 16:43 ` Jeff Moyer
@ 2007-10-08 17:20 ` Jeff Moyer
2007-10-08 18:00 ` Dan Halbert
1 sibling, 1 reply; 21+ messages in thread
From: Jeff Moyer @ 2007-10-08 17:20 UTC (permalink / raw)
To: Dan Halbert; +Cc: ikent, autofs
Dan Halbert <halbert@everyzing.com> writes:
> Jeff Moyer wrote:
>> I can't reproduce the problem. Would you be willing to enable
>> debugging in the kernel module?
>>
> Sure, happy to do this, though you'll need to tell me how to turn that
> on. I'm testing on a machine I have complete control over.
I was able to reproduce it. It turns out that I had a kernel
installed that had a fix for the following bug:
Bugzilla Bug 248126: autofs problem with symbolic links
When I moved to the exact kernel you were running, I hit the problem.
So, it's a known issue, and it had better be addressed in the next
update (kernel 2.6.9-61.EL).
Cheers,
Jeff
* Re: ENOENT on first reference to an automounted file
2007-10-08 17:20 ` Jeff Moyer
@ 2007-10-08 18:00 ` Dan Halbert
2007-10-09 3:11 ` Ian Kent
0 siblings, 1 reply; 21+ messages in thread
From: Dan Halbert @ 2007-10-08 18:00 UTC (permalink / raw)
To: Jeff Moyer; +Cc: ikent, autofs
Jeff Moyer wrote:
> I was able to reproduce it. It turns out that I had a kernel
> installed that had a fix for the following bug:
> Bugzilla Bug 248126: autofs problem with symbolic links
>
> When I moved to the exact kernel you were running, I hit the problem.
> So, it's a known issue, and it had better be addressed in the next
> update (kernel 2.6.9-61.EL).
>
Thanks! You saved me from a module rebuild, which I have not done in
quite a few years.
I had searched the existing bugs and seen 248126, but it did not seem to
me that I could have had a simultaneous expire, since we have such long
timeouts on the automounts (usually --timeout=86400). But I must have
misunderstood what "expire" means in this case.
Also, from 248126 and the bug it references, 174821, it appeared that
these patches were already incorporated into my kernel 2.6.9-55.0.9-smp
(248126 comment #24 mentions 55.0.7, for instance). But apparently not!
Is this bug non-existent in the latest updated RHEL5.0? I am trying to
think of a workaround until 2.6.9-61 comes out. We have tried a cron job
to provoke the automount more often than its timeout, but I am not sure
that would solve the problem.
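
One application-side stopgap, sketched here in Python under assumptions: since the second reference has succeeded in every case reported so far, an application could retry an open that fails with ENOENT. The function name, the retry count, and the delay are all illustrative, not tested values, and the `opener` parameter exists only so the logic can be exercised without an automounter.

```python
import errno
import time


def open_with_retry(path, attempts=2, delay=0.5, opener=open):
    """Open path, retrying when the open fails with ENOENT.

    Any other error, or ENOENT on the final attempt, is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return opener(path)
        except OSError as exc:
            if exc.errno != errno.ENOENT or attempt == attempts - 1:
                raise
            # Give the automount a moment to complete before retrying.
            time.sleep(delay)
```

The obvious cost is that a file that is genuinely missing now takes longer to report, so this only papers over the symptom until a fixed kernel is installed.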
Dan
* Re: ENOENT on first reference to an automounted file
2007-10-08 18:00 ` Dan Halbert
@ 2007-10-09 3:11 ` Ian Kent
0 siblings, 0 replies; 21+ messages in thread
From: Ian Kent @ 2007-10-09 3:11 UTC (permalink / raw)
To: Dan Halbert; +Cc: autofs
On Mon, 2007-10-08 at 14:00 -0400, Dan Halbert wrote:
> Jeff Moyer wrote:
> > I was able to reproduce it. It turns out that I had a kernel
> > installed that had a fix for the following bug:
> > Bugzilla Bug 248126: autofs problem with symbolic links
> >
> > When I moved to the exact kernel you were running, I hit the problem.
> > So, it's a known issue, and it had better be addressed in the next
> > update (kernel 2.6.9-61.EL).
> >
> Thanks! You saved me from a module rebuild, which I have not done in
> quite a few years.
>
> I had searched the existing bugs and seen 248126, but it did not seem to
> me that I could have had a simultaneous expire, since we have such long
> timeouts on the automounts (usually --timeout=86400). But I must have
> misunderstood what "expire" means in this case.
Yes, that's a bit puzzling.
>
> Also, from 248126 and the bug it references, 174821, it appeared that
> these patches were already incorporated into my kernel 2.6.9-55.0.9-smp
> (248126 comment #24 mentions 55.0.7, for instance). But apparently not!
The patch was reverted in this revision. I'm not sure why.
>
> Is this bug non-existent in the latest updated RHEL5.0? I am trying to
> think of a workaround until 2.6.9-61 comes out. We have tried a cron job
> to provoke the automount more often than its timeout, but I am not sure
> that would solve the problem.
I'd need to check but I believe the patch is present in the RHEL5 kernel
(but I think there are a couple of corrections missing).
You could apply the patch(es) yourself.
Ian
* Re: ENOENT on first reference to an automounted file
@ 2007-10-18 10:36 Greg Earle
2007-10-19 8:02 ` Ian Kent
0 siblings, 1 reply; 21+ messages in thread
From: Greg Earle @ 2007-10-18 10:36 UTC (permalink / raw)
To: autofs
Dan, Ian, Jeff:
I work at a large U.S. Government Lab and after updating some of our
Ops systems from RHEL 4 Update 4 to Update 5, all Hell broke loose
as we have been plagued by this bug ever since - we are a Sun and
Red Hat shop, and our software architecture is heavily dependent upon
lots of automounts. (I can very easily replicate it in our own
environment with a simple test script that usually provokes the
race condition in about 10-15 minutes, tops.)
I am getting the impression from the bug reports (and posts in
this thread) that this bug is *not* fixed in 2.6.9-55.0.9; and might
not be until some point in the future when 2.6.9-61 is available via
"up2date". Am I correct in that assumption?
If so, we may have little choice but to roll back to Update 4 by
doing complete reinstalls from scratch (groan). Is there any
info on when this bug first crept in, and is Update 4 - with
autofs-4.1.3-187 - safe to roll back to? The natives are restless,
and they've already shown up outside my office door with torches
and pitchforks. I've got a lot of unhappy Flight Projects reps
on my hands. We need to make a command decision here Real Soon Now.
Any illumination much appreciated.
- Greg Earle
* Re: ENOENT on first reference to an automounted file
2007-10-18 10:36 ENOENT on first reference to an automounted file Greg Earle
@ 2007-10-19 8:02 ` Ian Kent
2007-10-19 13:20 ` Dan Halbert
0 siblings, 1 reply; 21+ messages in thread
From: Ian Kent @ 2007-10-19 8:02 UTC (permalink / raw)
To: autofs
On Thu, 2007-10-18 at 03:36 -0700, Greg Earle wrote:
> Dan, Ian, Jeff:
>
> I work at a large U.S. Government Lab and after updating some of our
> Ops systems from RHEL 4 Update 4 to Update 5, all Hell broke loose
> as we have been plagued by this bug ever since - we are a Sun and
> Red Hat shop, and our software architecture is heavily dependent upon
> lots of automounts. (I can very easily replicate it in our own
> environment with a simple test script that usually provokes the
> race condition in about 10-15 minutes, tops.)
>
> I am getting the impression from the bug reports (and posts in
> this thread) that this bug is *not* fixed in 2.6.9-55.0.9; and might
> not be until some point in the future when 2.6.9-61 is available via
> "up2date". Am I correct in that assumption?
>
> If so, we may have little choice but to roll back to Update 4 by
> doing complete reinstalls from scratch (groan). Is there any
> info on when this bug first crept in, and is Update 4 - with
> autofs-4.1.3-187 - safe to roll back to? The natives are restless,
> and they've already shown up outside my office door with torches
> and pitchforks. I've got a lot of unhappy Flight Projects reps
> on my hands. We need to make a command decision here Real Soon Now.
>
> Any illumination much appreciated.
Well, if we can't confirm the problem and resolution then I have no case
to put for an update.
No-one has volunteered to try the patches I referred to in this thread
and that's why I haven't posted them, so how about it, someone?
Ian
* Re: ENOENT on first reference to an automounted file
2007-10-19 8:02 ` Ian Kent
@ 2007-10-19 13:20 ` Dan Halbert
2007-10-19 14:37 ` Ian Kent
0 siblings, 1 reply; 21+ messages in thread
From: Dan Halbert @ 2007-10-19 13:20 UTC (permalink / raw)
To: autofs
Ian Kent wrote:
>
> Well, if we can't confirm the problem and resolution then I have no case
> to put for an update.
>
> No-one has volunteered to try the patches I referred to in this thread
> and that's why I haven't posted them, so how about it, someone?
Ian (& Greg & Jeff),
Maybe there's a bit of cross-purpose communication here. In an earlier
message, Jeff said he had reproduced the problem by using exactly our
kernel (2.6.9-55.0.9.ELsmp x86_64), and that the problem did NOT happen
with a later kernel he had (which was the one he originally tried). See
http://linux.kernel.org/pipermail/autofs/2007-October/004133.html.
So I think Jeff has confirmed the problem and resolution. Am I telling
you something you already know?
Jeff said his successful test kernel has patches for bug 248126. Comment
#24 in that bug says the patch was put in 2.6.9-55.0.7. So I'd expect
the patch to be in 2.6.9-55.0.9 and for the problem to be fixed already.
Since it isn't fixed, either the patch was pulled between .7 and .9, or
the fix is more complicated than that single patch. Also, the bug
comments refer to several different patch sets and other bugs, so it's
not clear to me which patches Jeff actually has in his test kernel.
My group has various workarounds, so we're not dead in the water. We
also might move up to 5.x, but are waiting for a completely different
fix as well (kernel.org bug #7768), which is not yet in the released
upstream kernels.
Dan
* Re: ENOENT on first reference to an automounted file
2007-10-19 13:20 ` Dan Halbert
@ 2007-10-19 14:37 ` Ian Kent
2007-10-19 15:22 ` Jeff Moyer
0 siblings, 1 reply; 21+ messages in thread
From: Ian Kent @ 2007-10-19 14:37 UTC (permalink / raw)
To: Dan Halbert; +Cc: autofs
On Fri, 2007-10-19 at 09:20 -0400, Dan Halbert wrote:
> Ian Kent wrote:
> >
> > Well, if we can't confirm the problem and resolution then I have no case
> > to put for an update.
> >
> > No-one has volunteered to try the patches I referred to in this thread
> > and that's why I haven't posted them, so how about it, someone?
>
> Ian (& Greg & Jeff),
>
> Maybe there's a bit of cross-purpose communication here. In an earlier
> message, Jeff said he had reproduced the problem by using exactly our
> kernel (2.6.9-55.0.9.ELsmp x86_64), and that the problem did NOT happen
> with a later kernel he had (which was the one he originally tried). See
> http://linux.kernel.org/pipermail/autofs/2007-October/004133.html.
>
> So I think Jeff has confirmed the problem and resolution. Am I telling
> you something you already know?
Well, to be honest, I had forgotten about that comment, but that's
partly good. The curious thing, of course, is that hitting this problem
is quite odd, because it shouldn't be that prone to occur.
>
> Jeff said his successful test kernel has patches for bug 248126. Comment
> #24 in that bug says the patch was put in 2.6.9-55.0.7. So I'd expect
> the patch to be in 2.6.9-55.0.9 and for the problem to be fixed already.
> Since it isn't fixed, either the patch was pulled between .7 and .9, or
> the fix is more complicated than that single patch. Also, the bug
> comments refer to several different patch sets and other bugs, so it's
> not clear to me which patches Jeff actually has in his test kernel.
I mentioned before (although I may not have been clear on exactly what)
that the patch for the mount/expire race had been reverted in
2.6.9-55.0.9 and the patches in the bug Jeff referred to are corrections
to that patch. Anyway, the story just gets worse because there's another
patch that depends on these that should also be included and isn't.
To this end I've built a RHEL4 kernel with all the patches that "should"
be included. If you're interested in testing it we just need to find a way
to get it to you.
It would be good to get some clear information on this because several
people are having, and will continue to have (including possibly RHEL5),
odd little problems that end up being quite serious and I have no solid
case to lobby for inclusion of the reverted or missing patches.
Ian
* Re: ENOENT on first reference to an automounted file
2007-10-19 14:37 ` Ian Kent
@ 2007-10-19 15:22 ` Jeff Moyer
2007-10-19 17:05 ` Dan Halbert
0 siblings, 1 reply; 21+ messages in thread
From: Jeff Moyer @ 2007-10-19 15:22 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs
Ian Kent <raven@themaw.net> writes:
> On Fri, 2007-10-19 at 09:20 -0400, Dan Halbert wrote:
>> Ian Kent wrote:
>> >
>> > Well, if we can't confirm the problem and resolution then I have no case
>> > to put for an update.
>> >
>> > No-one has volunteered to try the patches I referred to in this thread
>> > and that's why I haven't posted them, so how about it, someone?
>>
>> Ian (& Greg & Jeff),
>>
>> Maybe there's a bit of cross-purpose communication here. In an earlier
>> message, Jeff said he had reproduced the problem by using exactly our
>> kernel (2.6.9-55.0.9.ELsmp x86_64), and that the problem did NOT happen
>> with a later kernel he had (which was the one he originally tried). See
>> http://linux.kernel.org/pipermail/autofs/2007-October/004133.html.
>>
>> So I think Jeff has confirmed the problem and resolution. Am I telling
>> you something you already know?
>
> Well, to be honest, I had forgotten about that comment, but that's
> partly good. The curious thing, of course, is that hitting this problem
> is quite odd, because it shouldn't be that prone to occur.
I verified that the latest release-candidate kernel for RHEL 4 U6
fixes the problem.
In the meantime, you can work around the bug by turning off ghosting.
Cheers,
Jeff
* Re: ENOENT on first reference to an automounted file
2007-10-19 15:22 ` Jeff Moyer
@ 2007-10-19 17:05 ` Dan Halbert
2007-10-19 17:21 ` Ian Kent
0 siblings, 1 reply; 21+ messages in thread
From: Dan Halbert @ 2007-10-19 17:05 UTC (permalink / raw)
To: Ian Kent; +Cc: autofs
Ian Kent wrote:
>Well, to be honest, I had forgotten about that comment, but that's
>partly good. The curious thing, of course, is that hitting this problem
>is quite odd, because it shouldn't be that prone to occur.
I think the original diagnosis of an umount/mount race is only one
possible way to hit the bug. We use very long timeouts and would never
have hit that particular race. Since we see it with LDAP but not with a
local map, I wonder if it is due to some slight additional delay caused
by the LDAP lookup.
>To this end I've built a RHEL4 kernel with all the patches that
>"should" be included. If you're interested in testing it we just need to
>find a way to get it to you.
Jeff Moyer wrote:
> I verified that the latest release-candidate kernel for RHEL 4 U6
> fixes the problem.
>
> In the meantime, you can work around the bug by turning off ghosting.
Ian, should I try your test kernel, or is it moot now, given what Jeff
says? I can privately give you an FTP location if you would still like
it tested.
Dan
* Re: ENOENT on first reference to an automounted file
2007-10-19 17:05 ` Dan Halbert
@ 2007-10-19 17:21 ` Ian Kent
2007-11-03 15:27 ` Dan Halbert
0 siblings, 1 reply; 21+ messages in thread
From: Ian Kent @ 2007-10-19 17:21 UTC (permalink / raw)
To: Dan Halbert; +Cc: autofs
On Fri, 2007-10-19 at 13:05 -0400, Dan Halbert wrote:
> Ian Kent wrote:
>
> >Well, to be honest, I had forgotten about that comment, but that's
> >partly good. The curious thing, of course, is that hitting this problem
> >is quite odd, because it shouldn't be that prone to occur.
>
> I think the original diagnosis of an umount/mount race is only one
> possible way to hit the bug. We use very long timeouts and would never
> have hit that particular race. Since we see it with LDAP but not with a
> local map, I wonder if it is due to some slight additional delay caused
> by the LDAP lookup.
We understand what is happening now.
Jeff worked it out.
It is to do with the use of a wildcard entry and interaction of the
daemon with the autofs internal cache and the directory create/remove
done by the daemon. This also explains why the kernel patch prevents the
problem from happening.
>
> >To this end I've built a RHEL4 kernel with all the patches that
> >"should" be included. If you're interested in testing it we just need to
> >find a way to get it to you.
>
> Jeff Moyer wrote:
> > I verified that the latest release-candidate kernel for RHEL 4 U6
> > fixes the problem.
> >
> > In the meantime, you can work around the bug by turning off ghosting.
>
> Ian, should I try your test kernel, or is it moot now, given what Jeff
> says? I can privately give you an FTP location if you would still like
> it tested.
We probably still have to make our case for getting this into the
revision 55 series kernel I think, given that the patches have been
reverted. You're not the only one seeing this and I expect not everyone
will be able or comfortable upgrading to a later revision kernel so
perhaps we should.
It's pretty much up to you as anyone who can't go to the U6 kernel and
needs this will have to test it for themselves anyway.
I have the build now anyway.
Ian
* Re: ENOENT on first reference to an automounted file
2007-10-19 17:21 ` Ian Kent
@ 2007-11-03 15:27 ` Dan Halbert
2007-11-04 5:12 ` Ian Kent
0 siblings, 1 reply; 21+ messages in thread
From: Dan Halbert @ 2007-11-03 15:27 UTC (permalink / raw)
To: autofs
Ian,
I see that http://rhn.redhat.com/errata/RHSA-2007-0939.html, dated
2007-11-01, claims to include a fix for autofs bug 248126, with a new
kernel-2.6.9-55.0.12. Do you know if this kernel contains effectively
the same patches that are in the test kernel I got from you? If so,
great! We have been running your test kernel on about thirty machines
with no problems, no ENOENT's, etc.
Dan
* Re: ENOENT on first reference to an automounted file
2007-11-03 15:27 ` Dan Halbert
@ 2007-11-04 5:12 ` Ian Kent
0 siblings, 0 replies; 21+ messages in thread
From: Ian Kent @ 2007-11-04 5:12 UTC (permalink / raw)
To: Dan Halbert; +Cc: autofs
On Sat, 2007-11-03 at 11:27 -0400, Dan Halbert wrote:
> Ian,
>
> I see that http://rhn.redhat.com/errata/RHSA-2007-0939.html, dated
> 2007-11-01, claims to include a fix for autofs bug 248126, with a new
> kernel-2.6.9-55.0.12. Do you know if this kernel contains effectively
> the same patches that are in the test kernel I got from you? If so,
> great! We have been running your test kernel on about thirty machines
> with no problems, no ENOENT's, etc.
Yep, it looks good to me.
Ian
* Re: ENOENT on first reference to an automounted file
2007-10-20 1:28 ` ENOENT on first reference to an automounted file To: autofs@linux.kernel.org Greg Earle
@ 2007-11-07 1:53 ` Dan Halbert
0 siblings, 0 replies; 21+ messages in thread
From: Dan Halbert @ 2007-11-07 1:53 UTC (permalink / raw)
To: autofs
Greg Earle wrote:
> We use NIS for our maps, but just for fun, I decided to test
> turning "--ghost" *on* (we default to it off, and we also use
> "-nobrowse" on our Suns, so we like to keep them consistent),
> and ...
>
Greg, try the latest kernel, 2.6.9-55.0.12, which is now available from
RedHat and has also gone downstream to various other RH-source-based
distributions. This works for us. I agree your ghost/non-ghost
differences are odd and do not match my experience. But I see
differences also based on client load.
Dan
* Re: ENOENT on first reference to an automounted file
[not found] <mailman.446.1194400455.3098.autofs@linux.kernel.org>
@ 2007-11-17 21:25 ` Greg Earle
2007-11-18 2:46 ` Ian Kent
0 siblings, 1 reply; 21+ messages in thread
From: Greg Earle @ 2007-11-17 21:25 UTC (permalink / raw)
To: autofs
On Nov 6, 2007, at 8:53 PM EST, Dan Halbert <halbert@everyzing.com>
wrote:
> Greg Earle wrote:
>> We use NIS for our maps, but just for fun, I decided to test
>> turning "--ghost" *on* (we default to it off, and we also use
>> "-nobrowse" on our Suns, so we like to keep them consistent),
>> and ...
>>
> Greg, try the latest kernel, 2.6.9-55.0.12, which is now available
> from
> RedHat and has also gone downstream to various other RH-source-based
> distributions. This works for us. I agree your ghost/non-ghost
> differences are odd and do not match my experience. But I see
> differences also based on client load.
I see that Red Hat just announced/released RHEL 4 Update 6
yesterday:
https://www.redhat.com/archives/nahant-list/2007-November/msg00068.html
It appears that this includes kernel 2.6.9-67. Can I safely
assume that this new release quashes this pesky ENOENT bug
once and for all?
More interestingly/importantly, RHEL 4 Update 6 includes autofs5
as a "Technology Preview". How does the autofs5 code in this
new release compare with the mainline code in RHEL 5 Update 1,
and is it considered robust enough to use in a production
environment that depends heavily (as in, "life or death" -
we use the automounter for *everything*) on automounting?
We started our RHEL 4 Update 5 upgrade cycle a month and a
half ago but were stopped dead in our tracks by this bug.
Now that we have a workaround ("--ghost"), we are planning
on pushing ahead, but I need to know whether I should try
recommending that we instead move to Update 6 rather than
continue to use Update 5 with a Band-Aid.
Thanks,
- Greg
* Re: ENOENT on first reference to an automounted file
2007-11-17 21:25 ` Greg Earle
@ 2007-11-18 2:46 ` Ian Kent
0 siblings, 0 replies; 21+ messages in thread
From: Ian Kent @ 2007-11-18 2:46 UTC (permalink / raw)
To: autofs
On Sat, 2007-11-17 at 13:25 -0800, Greg Earle wrote:
> On Nov 6, 2007, at 8:53 PM EST, Dan Halbert <halbert@everyzing.com>
> wrote:
>
> > Greg Earle wrote:
> >> We use NIS for our maps, but just for fun, I decided to test
> >> turning "--ghost" *on* (we default to it off, and we also use
> >> "-nobrowse" on our Suns, so we like to keep them consistent),
> >> and ...
> >>
> > Greg, try the latest kernel, 2.6.9-55.0.12, which is now available from
> > Red Hat and has also gone downstream to various other RH-source-based
> > distributions. This works for us. I agree your ghost/non-ghost
> > differences are odd and do not match my experience. But I see
> > differences also based on client load.
>
> I see that Red Hat just announced/released RHEL 4 Update 6
> yesterday:
>
> https://www.redhat.com/archives/nahant-list/2007-November/msg00068.html
>
> It appears that this includes kernel 2.6.9-67. Can I safely
> assume that this new release quashes this pesky ENOENT bug
> once and for all?
Try it out.
All I can say is that the 4.6 release kernel has the patches that were
used in 2.6.9-55.0.12 to resolve the problem.
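
As a rough way to check a client against the release named above, the sketch below compares the numeric fields of a kernel release string with 2.6.9-55.0.12. The function names are illustrative, and the comparison rests on two assumptions: that the machine runs a RHEL 4 style 2.6.9 kernel, and that a higher build number implies the patches are present. It is not an authoritative test for the fix.

```python
import platform
import re

# Kernel release named in this thread as containing the correction.
FIXED_RELEASE = "2.6.9-55.0.12"


def release_key(release):
    """Return the leading numeric fields of a kernel release string as a tuple."""
    match = re.match(r"\d+(?:[.-]\d+)*", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(field) for field in re.split(r"[.-]", match.group(0)))


def at_least_fixed_release(release, fixed=FIXED_RELEASE):
    """True if `release` compares numerically at or above `fixed`.

    Assumption: build numbers increase monotonically within the 2.6.9 series.
    """
    return release_key(release) >= release_key(fixed)


if __name__ == "__main__":
    running = platform.release()
    print(running, at_least_fixed_release(running))
```

Trailing suffixes such as ".EL" or ".ELsmp" are ignored, so only the numeric part of the release takes part in the comparison.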
>
> More interestingly/importantly, RHEL 4 Update 6 includes autofs5
> as a "Technology Preview". How does the autofs5 code in this
> new release compare with the mainline code in RHEL 5 Update 1,
> and is it considered robust enough to use in a production
> environment that depends heavily (as in, "life or death" -
> we use the automounter for *everything*) on automounting?
It's the same as what is in RHEL 5 U1, except for some changes to allow
autofs and autofs5 to be installed at the same time. You still need to
use "one or the other", not both.
I'll be keeping RHEL 4 autofs5 in sync with RHEL 5 autofs.
Tech Preview was our only option to get this into RHEL4 as autofs
version 4 is already included as a core package, which must continue to
be included.
>
> We started our RHEL 4 Update 5 upgrade cycle a month and a
> half ago but were stopped dead in our tracks by this bug.
>
> Now that we have a workaround ("--ghost"), we are planning
> on pushing ahead, but I need to know whether I should try
> recommending that we instead move to Update 6 rather than
> continue to use Update 5 with a Band-Aid.
The kernel revision 2.6.9-55.0.12 isn't really a band-aid; it contains a
correction.
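
One way to confirm the correction on a given client is to repeat the two-step test from the start of this thread: open a file under a currently unmounted automount point twice and compare the results. The sketch below is a minimal probe along those lines. The function names are hypothetical, the path in the usage note is the example path from the original report, and the filesystem is assumed to have been unmounted (or expired) before the probe runs.

```python
import errno
import os


def probe_first_access(path):
    """Open `path` twice; return the errno of each attempt (None on success)."""
    results = []
    for _ in range(2):
        try:
            with open(path, "rb"):
                pass
            results.append(None)
        except OSError as exc:
            results.append(exc.errno)
    return tuple(results)


def shows_first_reference_enoent(results):
    """True if the first open failed with ENOENT and the second succeeded."""
    first, second = results
    return first == errno.ENOENT and second is None
```

For example, `shows_first_reference_enoent(probe_first_access("/net/fileserver/fs/somefile"))` returning True would match the behaviour originally reported; a result of `(None, None)` from the probe means the first reference succeeded.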
Whether you go 4.6 with autofs5 is a decision you'll need to make
yourself after suitable testing. There is of course the issue that a
Tech Preview isn't officially supported so you may have trouble logging
bugs. But then you can always report them here and I can log bugs if
needed.
Ian