Re: rm hangs on symlinks to down mount points


* Re: rm hangs on symlinks to down mount points
  2003-12-31  3:17           ` Paul Raines
@ 2003-12-30 18:38             ` Ian Kent
  2003-12-31 14:25               ` Paul Raines
  2003-12-31 10:58             ` Christian Vogel
  1 sibling, 1 reply; 16+ messages in thread
From: Ian Kent @ 2003-12-30 18:38 UTC (permalink / raw)
  To: autofs mailing list

On Tue, 30 Dec 2003, Paul Raines wrote:

> On Tue, 30 Dec 2003, H. Peter Anvin wrote:
> > People constantly ask for various kinds of NFS support crap in autofs,
> > and it's *ALWAYS* wrong.  There is really no excuse for a feature
> > working from within autofs and not when mounted normally.
> >
> > None.
>
> Also, one could conceivably expect the kernel NFS developers to say to me
> "that is a feature only people who use autofs would ever want so get those
> guys to do it".
>
> I have sort of considered autofs as a wrapper around that basic NFS that
> improves it.  And this is one way I would like it improved.  And I sort of
> feel like I am really asking for a change in autofs behavior rather than
> NFS.  I want autofs to give up on a mount that is taking too long after a
> shorter time than for mounts done by hand or in /etc/fstab.

While Peter is absolutely right in what he says I'm more inclined to add
this type of behaviour to autofs v4.

You are talking about the way in which autofs handles the umount process
under adverse conditions. For me this falls into the area of stability.
Something that I have always been concerned about.

On the other hand, even though this sounds simple enough to do, it's not.
There is quite a bit of work to do to implement this. For example, if you
forcibly umount after a timeout, what side effects could result? Perhaps
none after the first, but what might happen after the fiftieth? So there's
quite a bit of testing to do here.

>
> >
> > >
> > >>>I am often getting automount processes that are hung and don't die
> > >>>with a simple kill.  I can "kill -9" them but that leaves things in
> > >>>a bad state usually (though sometimes I will just hand edit /etc/mtab
> > >>>so I can get something done).
> > >>
> > >>That's why they don't die with a simple kill.
> > >
> > > Which is a pain.  Just because some remote server went down I had
> > > automounted should not force me into a reboot of the client.
> > >
> >
> > OK... clue call... *AUTOFS ISN'T INVOLVED.*  Autofs *CANNOT* help you
> > when an in-use filesystem has its server removed from underneath it.
> > All autofs does is mount and unmount filesystems... it's not involved in
> > any shape, way or form with the running thereof.  All autofs can see is
> > that the filesystem is still in use, and there is nothing it can do
> > about it.
>
> Yet somehow it is involved.  I never have the problem with any hard mounts.
> On those I can always succeed in unmounting them when they go down. I do
> that by killing processes and doing a 'umount -f'.  A small percentage of
> mounts made by autofs seem to become "permanent".  They seem usually tied
> to an automount subprocess that will not go away.  Again, I can do a 'kill
> -9' on that automount subprocess.  After that, sometimes an 'umount -f'
> will work, and sometimes I still get 'device busy'.

Now you're talking about different problems that appear related. The mounts
are tied to the automount process and if you 'kill -9' the process,
automount can't clean up and you should expect to have problems. One
possibility to help with this is to implement a umount_begin function in
the kernel module to allow it to clean up if possible during a umount -f.
I'm not sure how much mileage this will get but it could prove effective.

>
> fuser is totally useless in these situations since it hangs on down
> NFS mounts.  I have discovered doing "find /proc -lname '/mnt/point/*'"
> is the way to find processes with files open on a mount.  But even
> when that shows nothing, I still get device busy.
>
> I can shutdown autofs totally but that sometimes stops in ways that
> leave the pidXXX entries in mtab.  And that of course disturbs users
> of the system accessing volumes that have nothing to do with the
> problem volume.

Oh another different problem.

Do you happen to have a fairly sizable number of active mounts in these
situations? Using the RedHat package init script perhaps?

Could you be suffering from the frequent signals that this script sends
during shutdown, which cause pending umounts to fail to update the mtab
before exiting?

But it sounds like you don't have that many entries in your master map or
you would have had similar problems at startup as well.

>
> Why am I so keen on getting these volumes unmounted?  Well, too many
> programs stop working as soon as any single NFS volume goes down. df is
> obvious and not unexpected.  But even 'df /mnt' fails when /mnt is a still
> valid mount as for some stupid reason df still goes out and stats every NFS
> mount.  Other problems are quota and rpm which both stat every mount. And
> we have talked about rm and fuser. These are of course not an autofs
> problem.
>
> I guess what I should do is write a 'unautofs' service.  It would
> constantly monitor a specific subset of NFS mounts and when it sees a
> server has been down for X amount of time then it will unmount its volumes
> killing whatever processes it needs to to do it.  However, I am having
> a problem finding a reliable way to do that because of the behavior
> described above.
>

Don't like that idea much.

> >
> > If this sort of things happen to you often you may want to consider soft
> > mounts.  Of course, you take the risk of data loss, but that is the only
> > possible choice -- if the server cannot be accessed, the only options
> > are to wait (hard mount) or return failure and throw the data away (soft
> > mount.)
>
> Yes, using "soft" has been called a bad thing by many, many people
> and I want to avoid it.  And I don't think that would help with my problem
> anyway.  I do use "intr" so I can kill the processes using a dead mount.
>
> Maybe I should give some perspective by describing my situation.
>
> Our center is behind a firewall with incoming ssh access only allowed to a
> Linux box called 'gate'.  Within the center we have over 200 Linux user
> desktops each with various amounts of data volumes.  We are a Biomedical
> Imaging research facility so I am not joking when I say many desktops
> have over 2TB of space.  For backup reasons, partitions are limited to
> 100GB or under so most desktops have several NFS exported volumes.
>
> The user desktop volumes are mounted through an autofs map on the gate
> server.  This is mainly so users can scp or sftp their data from remote
> sites.  As I sit right now there are 91 volumes mounted.  The map has
> over 1200 entries.

Ohh! Let's do a find -ls on some of these mount points. He, he, he.

>
> Of course inevitably users log in to gate and cd into their desktop's
> volumes and then leave that login and forget about it.  Later they may shut
> down their desktop (or with over 200 we always have one or two down for
> hardware reasons).  Then gate becomes unhappy when users start using df,
> quota, etc.

It's a fun life for you, oh yes.

>
> Of course I have tons of problems with users browsing to the autofs
> mountpoint, in nautilus for instance, and then basically reporting to me
> that their desktops are frozen.  The really bad thing about nautilus is it
> constantly caches the mount points it finds for its Trashcan and then they
> all get remounted when the user logs in again. So now I have users who
> cause 60 or so mounts to happen every time they log in to their desktop.
> Again that is not autofs's problem, just a design failure on the nautilus
> writer's part.  Same issue with AFS.

Yep. I subscribed to the Nautilus list to get info on that. Have not
had time to hunt down the code that does it. Anyway I have my hands full
as it is now. Come to think of it I haven't seen any posts, maybe I've
been auto-unsubscribed ???

>
> Well I guess that is more than you wanted to know and I should really
> just go complain to the NFS writers and the writers of fuser, df, rm
> and quota.

Yes, the problems here don't only belong to automount.

Ian


* Re: rm hangs on symlinks to down mount points
  2003-12-31 15:27                   ` Paul Raines
@ 2003-12-30 19:59                     ` Ian Kent
  0 siblings, 0 replies; 16+ messages in thread
From: Ian Kent @ 2003-12-30 19:59 UTC (permalink / raw)
  To: Paul Raines; +Cc: autofs mailing list

On Wed, 31 Dec 2003, Paul Raines wrote:

> On Wed, 31 Dec 2003, Ian Kent wrote:
>
> > On Wed, 31 Dec 2003, Paul Raines wrote:
> >
> > >
> > > I believe I made a mistake and feel pretty stupid now.  I think the hung
> > > automount subprocesses are not hanging on the mount but on the umount.
> > > Does autofs ever call umount with the -f option after the first (few)
> > > regular umount's fail. If autofs has made a mount, the server dies, then
> > > the mount times out from no use so autofs wants to umount it, how does
> > > autofs handle the situation?
> > >
> >
> > No. It never uses the -f option.
> >
> > I think the -f option is safe enough though. It probably should do
> > something like that. But the -l option sounds like it could lead to
> > strife.
>
> Though I hope autofs after the call to the umount does not purely rely
> on its return code to see that a mount is now unmounted but also checks
> the mount table to see if it is still there.
>

No, but that can be unreliable as well. The return code is OK for some
things. For example, when the autofs mount fails to umount due to mtab
contention between mount requests (it actually umounts but fails to update
mtab), the return code can be reliably used to detect this. A second
call to umount generally fixes the mtab.
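
A minimal sketch of the "check the mount table as well" idea raised in
this exchange, assuming Python and assuming the caller feeds it the
kernel's table from /proc/mounts rather than /etc/mtab, since mtab is the
file that can be left stale. The name is_mounted and the sample table are
hypothetical and are not part of autofs.

```python
def is_mounted(mount_point, mounts_text):
    """Return True if mount_point is listed as a mount directory.

    mounts_text is the content of /proc/mounts (or /etc/mtab); the second
    whitespace-separated field of each line is the mount directory.
    """
    wanted = mount_point.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and (fields[1].rstrip("/") or "/") == wanted:
            return True
    return False


# Illustrative table content, not taken from a real system.
SAMPLE = (
    "/dev/hda1 / ext3 rw 0 0\n"
    "foobar:/export/vol1 /autofs/foobar nfs rw,hard,intr 0 0\n"
)
```

A caller could read /proc/mounts after umount returns and treat the mount
as gone only when the return code and this check agree. Mount directories
containing spaces are escaped as \040 in these files and are not handled
here.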

The umount logic is not very smart. This would be why I'm interested in
discussing it.

But what are you saying about the return code?
I have seen situations where the return code is bogus.

Ian


* rm hangs on symlinks to down mount points
@ 2003-12-30 21:47 Paul Raines
  2003-12-30 23:09 ` H. Peter Anvin
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Raines @ 2003-12-30 21:47 UTC (permalink / raw)
  To: autofs

I have RedHat7.3 and RedHat9 systems with autofs-3.1.7 and
2.4.20 kernels on all.

Let's say for instance that /autofs/foobar is an automounted mount point
on server foobar and foobar is down.  On your client box you have
a symlink on your local disk as such:

  linkfile -> /autofs/foobar/somefile

If you try to delete linkfile, it hangs forever.  It never seems to time out
or succeed.  An strace shows it hanging in a stat() system call
on the pointed-to file.

First, it seems fundamentally wrong that doing an unlink() on a symlink
causes a stat() call on the pointed to file.  Seems a huge waste.

One interesting thing: on RH7.3, if the above linkfile is in a directory
and you do a 'rm -rf' on that directory, it works.  On RH9, it still
hangs.

BTW, is there any way, short of a kernel recompile, to configure how
long it takes for an access to a mount point on a down server to
time out?

-- 
---------------------------------------------------------------
Paul Raines                   email: raines@nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street        Charlestown, MA 02129	USA   


* Re: rm hangs on symlinks to down mount points
  2003-12-30 21:47 rm hangs on symlinks to down mount points Paul Raines
@ 2003-12-30 23:09 ` H. Peter Anvin
  2003-12-30 23:19   ` Paul Raines
  0 siblings, 1 reply; 16+ messages in thread
From: H. Peter Anvin @ 2003-12-30 23:09 UTC (permalink / raw)
  To: Paul Raines; +Cc: autofs

Paul Raines wrote:
> 
> If you try to delete linkfile, it hangs forever.  Seems to never timeout
> or work.   Doing a strace shows it hanging to a system stat() call
> on the pointed to file.
> 
> First, it seems fundamentally wrong that doing an unlink() on a symlink
> causes a stat() call on the pointed to file.  Seems a huge waste.
> 

It doesn't; however rm(1) apparently does.

	-hpa
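
A minimal sketch of the distinction drawn above, assuming Python on a
POSIX system: unlink() acts on the link itself, while stat() on the same
path follows the link to its target. lstat() is the call that inspects the
link without touching the target. The function name and the temporary
paths are hypothetical, and the dangling link only stands in for a target
on a down server (it will not reproduce the hang).

```python
import os
import stat
import tempfile


def remove_symlink(path):
    """Delete a symlink using only calls that act on the link itself."""
    # os.lstat() examines the link; os.stat() would follow it to the target
    # and could block if the target sits on an unreachable server.
    if not stat.S_ISLNK(os.lstat(path).st_mode):
        raise ValueError(f"{path} is not a symlink")
    os.unlink(path)  # unlink() removes the link and never resolves the target


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "linkfile")
        # Dangling target stands in for a file on a down server.
        os.symlink(os.path.join(tmp, "no-such-dir", "somefile"), link)
        remove_symlink(link)
        # lexists() also looks at the link only, not at its target.
        return os.path.lexists(link)
```

demo() returns False once the link is gone, and no call in it resolves the
link's target.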


* Re: rm hangs on symlinks to down mount points
  2003-12-30 23:09 ` H. Peter Anvin
@ 2003-12-30 23:19   ` Paul Raines
  2003-12-30 23:23     ` H. Peter Anvin
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Raines @ 2003-12-30 23:19 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: autofs

On Tue, 30 Dec 2003, H. Peter Anvin wrote:
> 
> It doesn't; however rm(1) apparently does.
> 

Yes, so I guess it is mostly rm's fault.  I have already complained to
the fileutils package writers about df which will do a stat on every
mount even if you pass df a specific mount point.  So on my central
system which often has over 50 automounts going at once with one of
the servers almost always offline, df never works.  quota never works
either.

It would be nice if one could configure a timeout in autofs for
mounting though separate from something like 'timeo' in the nfs options.

I am often getting automount processes that are hung and don't die
with a simple kill.  I can "kill -9" them but that leaves things in 
a bad state usually (though sometimes I will just hand edit /etc/mtab
so I can get something done).


* Re: rm hangs on symlinks to down mount points
  2003-12-30 23:19   ` Paul Raines
@ 2003-12-30 23:23     ` H. Peter Anvin
  2003-12-30 23:37       ` Paul Raines
  0 siblings, 1 reply; 16+ messages in thread
From: H. Peter Anvin @ 2003-12-30 23:23 UTC (permalink / raw)
  To: Paul Raines; +Cc: autofs

Paul Raines wrote:
> 
> Yes, so I guess it is mostly rm's fault.  I have already complained to
> the fileutils package writers about df which will do a stat on every
> mount even if you pass df a specific mount point.  So on my central
> system which often has over 50 automounts going at once with one of
> the servers almost always offline, df never works.  quota never works
> either.
> 
> It would be nice if one could configure a timeout in autofs for
> mounting though separate from something like 'timeo' in the nfs options.
 >

s/autofs/mount(8)/.

Not autofs' job.

> I am often getting automount processes that are hung and don't die
> with a simple kill.  I can "kill -9" them but that leaves things in 
> a bad state usually (though sometimes I will just hand edit /etc/mtab
> so I can get something done).

That's why they don't die with a simple kill.

	-hpa


* Re: rm hangs on symlinks to down mount points
  2003-12-30 23:23     ` H. Peter Anvin
@ 2003-12-30 23:37       ` Paul Raines
  2003-12-30 23:57         ` H. Peter Anvin
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Raines @ 2003-12-30 23:37 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: autofs

On Tue, 30 Dec 2003, H. Peter Anvin wrote:
> > It would be nice if one could configure a timeout in autofs for
> > mounting though separate from something like 'timeo' in the nfs options.
>  >
> 
> s/autofs/mount(8)/.
> 
> Not autofs' job.
> 

Maybe not, but it could be done through autofs. There would probably be
a lot more resistance to getting mount's API changed to support a separate
timeout value just for the mount side of things.

The master automount process could just kill the automount subprocess
spawned to do a mount if it doesn't complete in X number of seconds.
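
A rough sketch of the "kill it after X seconds" idea, assuming Python and
its standard subprocess module rather than anything in the automount
daemon itself. The function name run_with_timeout is hypothetical, and the
two Python one-liners in demo() are only stand-ins for a mount that
completes and one that hangs.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv and return its exit status, or None if it ran too long."""
    try:
        return subprocess.run(argv, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return None


def demo():
    # Stand-ins for a mount that completes and one that hangs.
    quick = [sys.executable, "-c", "pass"]
    stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
    return run_with_timeout(quick, 10), run_with_timeout(stuck, 0.5)
```

One caveat on the design: a child blocked in an uninterruptible kernel
wait may not exit when killed, so a timer like this may not be enough on
its own for a mount against a dead server.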

> > I am often getting automount processes that are hung and don't die
> > with a simple kill.  I can "kill -9" them but that leaves things in 
> > a bad state usually (though sometimes I will just hand edit /etc/mtab
> > so I can get something done).
> 
> That's why they don't die with a simple kill.

Which is a pain.  Just because some remote server went down I had 
automounted should not force me into a reboot of the client.


* Re: rm hangs on symlinks to down mount points
  2003-12-30 23:37       ` Paul Raines
@ 2003-12-30 23:57         ` H. Peter Anvin
  2003-12-31  1:37           ` Jim Carter
                             ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: H. Peter Anvin @ 2003-12-30 23:57 UTC (permalink / raw)
  To: Paul Raines; +Cc: autofs

Paul Raines wrote:
> 
> Maybe not, but it could be done through autofs. There would probably be
> a lot more resistance to getting mount's API changed to support a separate
> timeout value just for the mount side of things.
> 
> The master automount process could just kill the automount subprocess
> spawned to do a mount if it doesn't complete in X number of seconds.
> 

People constantly ask for various kinds of NFS support crap in autofs, 
and it's *ALWAYS* wrong.  There is really no excuse for a feature 
working from within autofs and not when mounted normally.

None.

> 
>>>I am often getting automount processes that are hung and don't die
>>>with a simple kill.  I can "kill -9" them but that leaves things in 
>>>a bad state usually (though sometimes I will just hand edit /etc/mtab
>>>so I can get something done).
>>
>>That's why they don't die with a simple kill.
> 
> Which is a pain.  Just because some remote server went down I had 
> automounted should not force me into a reboot of the client.
> 

OK... clue call... *AUTOFS ISN'T INVOLVED.*  Autofs *CANNOT* help you 
when an in-use filesystem has its server removed from underneath it. 
All autofs does is mount and unmount filesystems... it's not involved in 
any shape, way or form with the running thereof.  All autofs can see is 
that the filesystem is still in use, and there is nothing it can do 
about it.

If this sort of things happen to you often you may want to consider soft 
mounts.  Of course, you take the risk of data loss, but that is the only 
possible choice -- if the server cannot be accessed, the only options 
are to wait (hard mount) or return failure and throw the data away (soft 
mount.)

	-hpa


* Re: rm hangs on symlinks to down mount points
  2003-12-30 23:57         ` H. Peter Anvin
@ 2003-12-31  1:37           ` Jim Carter
  2003-12-31  3:17           ` Paul Raines
  2004-01-06 16:23           ` Paul Raines
  2 siblings, 0 replies; 16+ messages in thread
From: Jim Carter @ 2003-12-31  1:37 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: autofs

On Tue, 30 Dec 2003, H. Peter Anvin wrote:
> People constantly ask for various kinds of NFS support crap in autofs,
> and it's *ALWAYS* wrong.  There is really no excuse for a feature
> working from within autofs and not when mounted normally.

However, automount's job is to have the mount points available when needed,
and to make them go away when not needed.  As part of the latter, it's
important for the client to resist being harmed when a server dies.

> >>>I am often getting automount processes that are hung and don't die
> >>>with a simple kill.  I can "kill -9" them but that leaves things in
> >>>a bad state usually (though sometimes I will just hand edit /etc/mtab
> >>>so I can get something done).
> >>
> >>That's why they don't die with a simple kill.
> >
> > Which is a pain.  Just because some remote server went down I had
> > automounted should not force me into a reboot of the client.
> >
>
> OK... clue call... *AUTOFS ISN'T INVOLVED.*  Autofs *CANNOT* help you
> when an in-use filesystem has its server removed from underneath it.

Well...  At UCLA-Mathnet we have a lot of Solaris and Linux boxes, all
cross-mounted via automount (soft mounts) for various purposes.
Fortunately they're very stable, because when a server crashes, a Solaris
client's automounter is totally horked and you end up rebooting it.  On
Linux I can often do umount -f -l as part of a cleanup campaign and save
the client (though not the process cd'd to the NFS mounted directory or
executing NFS mounted software).  This is important because the majority of
machines, particularly compute nodes, are NFS servers and clients at the
same time, so a reboot messes up even more machines.

It would really be helpful if this cleanup process could be automated.  I
attribute responsibility to automount, because automount mounted it, and
automount eventually will try (and possibly fail) to unmount it.
Automount needs to be agile as the sysop wriggles around NFS limitations;
particularly, if the filesystem is forcibly unmounted manually, automount
needs to be aware, and to exit when the use count of the client's
representation of the filesystem drops to zero.  It should not be
advertising that there is still a useable filesystem on that mount point.

Perhaps MNT_FORCE (umount -f) or MNT_DETACH (-l) could be helpful in
getting rid of a horked mount point.  I have thought about using rename()
to move a horked mount point into a garbage dump directory that the users
can't see, after which automount can behave normally for that path name:
ENOENT if the server is still dead, or remounting the filesystem if
feasible.  I guess that won't work, but kernel 2.5.x/2.6.x has MS_MOVE
which may accomplish the desired result.

While in Sun's original design you could reboot a server and have NFS
clients resume as if nothing had happened, that no longer seems to be
really happening, probably because Solaris automount can't deal with the
loss of the server even if temporary.

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA 90095-1555
Email: jimc@math.ucla.edu  http://www.math.ucla.edu/~jimc (q.v. for PGP key)


* Re: rm hangs on symlinks to down mount points
  2003-12-30 23:57         ` H. Peter Anvin
  2003-12-31  1:37           ` Jim Carter
@ 2003-12-31  3:17           ` Paul Raines
  2003-12-30 18:38             ` Ian Kent
  2003-12-31 10:58             ` Christian Vogel
  2004-01-06 16:23           ` Paul Raines
  2 siblings, 2 replies; 16+ messages in thread
From: Paul Raines @ 2003-12-31  3:17 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: autofs

On Tue, 30 Dec 2003, H. Peter Anvin wrote:
> People constantly ask for various kinds of NFS support crap in autofs, 
> and it's *ALWAYS* wrong.  There is really no excuse for a feature 
> working from within autofs and not when mounted normally.
> 
> None.

Also, one could conceivably expect the kernel NFS developers to say to me
"that is a feature only people who use autofs would ever want so get those
guys to do it".

I have sort of considered autofs as a wrapper around that basic NFS that
improves it.  And this is one way I would like it improved.  And I sort of
feel like I am really asking for a change in autofs behavior rather than
NFS.  I want autofs to give up on a mount that is taking too long after a
shorter time than for mounts done by hand or in /etc/fstab.

> 
> > 
> >>>I am often getting automount processes that are hung and don't die
> >>>with a simple kill.  I can "kill -9" them but that leaves things in 
> >>>a bad state usually (though sometimes I will just hand edit /etc/mtab
> >>>so I can get something done).
> >>
> >>That's why they don't die with a simple kill.
> > 
> > Which is a pain.  Just because some remote server went down I had 
> > automounted should not force me into a reboot of the client.
> > 
> 
> OK... clue call... *AUTOFS ISN'T INVOLVED.*  Autofs *CANNOT* help you 
> when an in-use filesystem has its server removed from underneath it. 
> All autofs does is mount and unmount filesystems... it's not involved in 
> any shape, way or form with the running thereof.  All autofs can see is 
> that the filesystem is still in use, and there is nothing it can do 
> about it.

Yet somehow it is involved.  I never have the problem with any hard mounts.
On those I can always succeed in unmounting them when they go down. I do
that by killing processes and doing a 'umount -f'.  A small percentage of
mounts made by autofs seem to become "permanent".  They seem usually tied
to an automount subprocess that will not go away.  Again, I can do a 'kill
-9' on that automount subprocess.  After that, sometimes an 'umount -f'
will work, and sometimes I still get 'device busy'.

fuser is totally useless in these situations since it hangs on down
NFS mounts.  I have discovered doing "find /proc -lname '/mnt/point/*'"
is the way to find processes with files open on a mount.  But even
when that shows nothing, I still get device busy.  
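
The /proc scan described above can be sketched in Python as follows. This
is only an illustration under stated assumptions: the function name
holders and its proc_root parameter are hypothetical, and it relies on
readlink() returning the link text without resolving the target, which is
what lets the find command avoid hanging. Unlike the find pattern, it also
reports links that point at the mount point itself.

```python
import os


def holders(mount_point, proc_root="/proc"):
    """Map pid -> link targets at or under mount_point, found via /proc symlinks."""
    top = mount_point.rstrip("/")
    found = {}
    for pid in filter(str.isdigit, os.listdir(proc_root)):
        base = os.path.join(proc_root, pid)
        links = [os.path.join(base, name) for name in ("cwd", "root", "exe")]
        try:
            fd_dir = os.path.join(base, "fd")
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or its fd directory is not readable
        for link in links:
            try:
                target = os.readlink(link)  # reads the link text only
            except OSError:
                continue
            if target == top or target.startswith(top + "/"):
                found.setdefault(int(pid), []).append(target)
    return found
```

Run as root, holders("/mnt/point") would list the processes to look at
before an unmount. As the message notes, an empty result does not
guarantee that the unmount will succeed.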

I can shutdown autofs totally but that sometimes stops in ways that
leave the pidXXX entries in mtab.  And that of course disturbs users
of the system accessing volumes that have nothing to do with the
problem volume.

Why am I so keen on getting these volumes unmounted?  Well, too many
programs stop working as soon as any single NFS volume goes down. df is
obvious and not unexpected.  But even 'df /mnt' fails when /mnt is a still
valid mount as for some stupid reason df still goes out and stats every NFS
mount.  Other problems are quota and rpm which both stat every mount. And
we have talked about rm and fuser. These are of course not an autofs
problem.

I guess what I should do is write a 'unautofs' service.  It would
constantly monitor a specific subset of NFS mounts and when it sees a
server has been down for X amount of time then it will unmount its volumes
killing whatever processes it needs to to do it.  However, I am having
a problem finding a reliable way to do that because of the behavior
described above.
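
A minimal sketch of the bookkeeping such a service would need, assuming
Python and mtab-format input (device, mount directory, type, options).
Every name here is hypothetical; how a server is judged to be down, and
the unmounting itself, are left out. IPv6 literal server addresses are not
handled.

```python
def nfs_mounts_by_server(mounts_text):
    """Group NFS mount directories by server name from mtab-format text."""
    by_server = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount directory, filesystem type, options, ...
        if len(fields) >= 3 and fields[2].startswith("nfs") and ":" in fields[0]:
            server = fields[0].split(":", 1)[0]
            by_server.setdefault(server, []).append(fields[1])
    return by_server


def mounts_to_drop(mounts_text, down_since, now, grace_s):
    """List mount directories whose server has been down for at least grace_s.

    down_since maps server name -> time it was first seen down (seconds);
    servers absent from the mapping are treated as up.
    """
    doomed = []
    for server, dirs in nfs_mounts_by_server(mounts_text).items():
        first_down = down_since.get(server)
        if first_down is not None and now - first_down >= grace_s:
            doomed.extend(dirs)
    return sorted(doomed)
```

Keeping the decision separate from the unmount step means the policy (the
X in "down for X amount of time") can be tested without touching any real
mounts.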

> 
> If this sort of things happen to you often you may want to consider soft 
> mounts.  Of course, you take the risk of data loss, but that is the only 
> possible choice -- if the server cannot be accessed, the only options 
> are to wait (hard mount) or return failure and throw the data away (soft 
> mount.)

Yes, using "soft" has been called a bad thing by many, many people
and I want to avoid it.  And I don't think that would help with my problem
anyway.  I do use "intr" so I can kill the processes using a dead mount.

Maybe I should give some perspective by describing my situation.

Our center is behind a firewall with incoming ssh access only allowed to a
Linux box called 'gate'.  Within the center we have over 200 Linux user
desktops each with various amounts of data volumes.  We are a Biomedical
Imaging research facility so I am not joking when I say many desktops
have over 2TB of space.  For backup reasons, partitions are limited to
100GB or under so most desktops have several NFS exported volumes. 

The user desktop volumes are mounted through an autofs map on the gate
server.  This is mainly so users can scp or sftp their data from remote
sites.  As I sit right now there are 91 volumes mounted.  The map has
over 1200 entries.

Of course inevitably users log in to gate and cd into their desktop's
volumes and then leave that login and forget about it.  Later they may shut
down their desktop (or with over 200 we always have one or two down for
hardware reasons).  Then gate becomes unhappy when users start using df,
quota, etc.

Of course I have tons of problems with users browsing to the autofs
mountpoint, in nautilus for instance, and then basically reporting to me
that their desktops are frozen.  The really bad thing about nautilus is it
constantly caches the mount points it finds for its Trashcan and then they
all get remounted when the user logs in again. So now I have users who
cause 60 or so mounts to happen every time they log in to their desktop.
Again that is not autofs's problem, just a design failure on the nautilus
writer's part.  Same issue with AFS.

Well I guess that is more than you wanted to know and I should really
just go complain to the NFS writers and the writers of fuser, df, rm
and quota.

-- 
---------------------------------------------------------------
Paul Raines                   email: raines@nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street        Charlestown, MA 02129	USA   


* Re: rm hangs on symlinks to down mount points
  2003-12-31  3:17           ` Paul Raines
  2003-12-30 18:38             ` Ian Kent
@ 2003-12-31 10:58             ` Christian Vogel
  2003-12-31 14:32               ` Paul Raines
  1 sibling, 1 reply; 16+ messages in thread
From: Christian Vogel @ 2003-12-31 10:58 UTC (permalink / raw)
  To: Paul Raines; +Cc: autofs

Hi,

have you ever tried to compile an automount daemon that uses not
/bin/mount and /bin/umount but special shellscripts? You can change the
defines in $autofs_source/include/config.h.

There you could probably apply all the magic you want to the process of
mounting/unmounting filesystems...

	Chris


* Re: rm hangs on symlinks to down mount points
  2003-12-30 18:38             ` Ian Kent
@ 2003-12-31 14:25               ` Paul Raines
  2003-12-31 14:54                 ` Ian Kent
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Raines @ 2003-12-31 14:25 UTC (permalink / raw)
  To: Ian Kent; +Cc: autofs mailing list


I believe I made a mistake and feel pretty stupid now.  I think the hung
automount subprocesses are not hanging on the mount but on the umount.  
Does autofs ever call umount with the -f option after the first (few)
regular umounts fail?  If autofs has made a mount, the server dies, then
the mount times out from no use so autofs wants to umount it, how does
autofs handle the situation?

On Wed, 31 Dec 2003, Ian Kent wrote:
> > Yet somehow it is involved.  I never have the problem with any hard mounts.
> > On those I can always succeed in unmounting them when they go down. I do
> > that by killing processes and doing a 'umount -f'.  A small percentage of
> > mounts made by autofs seem to become "permanent".  They seem usually tied
> > to a automount subprocess that will not go away.  Again, I can do a 'kill
> > -9' on that automount subprocess.  After that, sometimes an 'umount -f'
> > will work, and sometimes I still get 'device busy'.
> 
> Now you're talking about different problems that appear related. The mounts
> are tied to the automount process and if you 'kill -9' the process,
> automount can't clean up and you should expect to have problems. One
> possibility to help with this is to implement a umount_begin function in
> the kernel module to allow it to clean up if possible during a umount -f.
> I'm not sure how much mileage this will get but could prove effective.


* Re: rm hangs on symlinks to down mount points
  2003-12-31 10:58             ` Christian Vogel
@ 2003-12-31 14:32               ` Paul Raines
  0 siblings, 0 replies; 16+ messages in thread
From: Paul Raines @ 2003-12-31 14:32 UTC (permalink / raw)
  To: Christian Vogel; +Cc: autofs

On Wed, 31 Dec 2003, Christian Vogel wrote:
> have you ever tried to compile a automount daemon that uses not
> /bin/mount and /bin/umount but special shellscripts? You can change the
> defines in $autofs_source/include/config.h.
> 
> There you could probably apply all the magic you want to the process of
> mounting/unmounting filesystems...

That sounds like a very good idea.  Mainly, I would want to replace
umount with a script that does a 'umount -f' if a regular umount fails.
Better yet, it could read /etc/mtab for the name of the server and ping
it and do a 'umount -f' from the beginning if it fails to ping.
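
The replacement-umount idea can be sketched like this, in Python rather
than shell. It is an illustration only: umount_with_fallback is a
hypothetical name, server_alive is assumed to come from whatever
reachability check the caller prefers (a ping, for instance), and run is
injectable so the logic can be tried without real mounts.

```python
import subprocess


def umount_with_fallback(mount_point, server_alive, run=subprocess.call):
    """Unmount mount_point, escalating to 'umount -f' when needed.

    Returns the command that succeeded as a list, or None if both failed.
    run is injectable so the logic can be exercised without real mounts.
    """
    attempts = [["umount", "-f", mount_point]]
    if server_alive:
        # Only bother with a plain umount when the server answered.
        attempts.insert(0, ["umount", mount_point])
    for argv in attempts:
        if run(argv) == 0:
            return argv
    return None
```

Whether 'umount -f' is safe to issue automatically is exactly the open
question in this thread, so a wrapper like this would need testing before
being put in place of /bin/umount.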


* Re: rm hangs on symlinks to down mount points
  2003-12-31 14:25               ` Paul Raines
@ 2003-12-31 14:54                 ` Ian Kent
  2003-12-31 15:27                   ` Paul Raines
  0 siblings, 1 reply; 16+ messages in thread
From: Ian Kent @ 2003-12-31 14:54 UTC (permalink / raw)
  To: Paul Raines; +Cc: autofs mailing list

On Wed, 31 Dec 2003, Paul Raines wrote:

> 
> I believe I made a mistake and feel pretty stupid now.  I think the hung
> automount subprocesses are not hanging on the mount but on the umount.  
> Does autofs ever call umount with the -f option after the first (few)
> regular umounts fail?  If autofs has made a mount, the server dies, then
> the mount times out from no use so autofs wants to umount it, how does
> autofs handle the situation?
> 

No. It never uses the -f option.

I think the -f option is safe enough though. It probably should do 
something like that. But the -l option sounds like it could lead to 
strife.

Ian


* Re: rm hangs on symlinks to down mount points
  2003-12-31 14:54                 ` Ian Kent
@ 2003-12-31 15:27                   ` Paul Raines
  2003-12-30 19:59                     ` Ian Kent
  0 siblings, 1 reply; 16+ messages in thread
From: Paul Raines @ 2003-12-31 15:27 UTC (permalink / raw)
  To: Ian Kent; +Cc: autofs mailing list

On Wed, 31 Dec 2003, Ian Kent wrote:

> On Wed, 31 Dec 2003, Paul Raines wrote:
> 
> > 
> > I believe I made a mistake and feel pretty stupid now.  I think the hung
> > automount subprocesses are not hanging on the mount but on the umount.  
> > Does autofs ever call umount with the -f option after the first (few)
> > regular umounts fail?  If autofs has made a mount, the server dies, then
> > the mount times out from no use so autofs wants to umount it, how does
> > autofs handle the situation?
> > 
> 
> No. It never uses the -f option.
> 
> I think the -f option is safe enough though. It probably should do 
> something like that. But the -l option sounds like it could lead to 
> strife.

Though I hope that autofs, after calling umount, does not rely purely on
the return code to decide that a mount is gone, but also checks the mount
table to see whether it is still there.
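
A minimal sketch of that check, assuming the kernel's view in /proc/mounts
is the mount table to consult. This is not how autofs itself does it; the
function names (is_mounted, verified_umount) and the MOUNTS override are
hypothetical.

```shell
#!/bin/sh
# Sketch: confirm an unmount against the mount table, not only the exit code.

# Succeed if the mount point appears in the mount table.
# Fields: device mountpoint fstype options dump pass
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${MOUNTS:-/proc/mounts}"
}

# Run umount, then trust the mount table rather than the return code alone.
verified_umount() {
    umount "$1"
    rc=$?
    if is_mounted "$1"; then
        echo "umount returned $rc but $1 is still mounted" >&2
        return 1
    fi
    return 0
}
```

Reading /proc/mounts rather than /etc/mtab is a choice made here because
/etc/mtab is maintained by the mount and umount programs and can fall out
of step with the kernel.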


* Re: rm hangs on symlinks to down mount points
  2003-12-30 23:57         ` H. Peter Anvin
  2003-12-31  1:37           ` Jim Carter
  2003-12-31  3:17           ` Paul Raines
@ 2004-01-06 16:23           ` Paul Raines
  2 siblings, 0 replies; 16+ messages in thread
From: Paul Raines @ 2004-01-06 16:23 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: autofs

I am now in one of the unmountable mount situations I described
before:

[root@gate /root]# umount /autofs/space/meso_013
umount: /autofs/space/meso_013: device is busy
[root@gate /root]# umount -f /autofs/space/meso_013
umount2: Device or resource busy
umount: /autofs/space/meso_013: Illegal seek
[root@gate /root]# find /proc -lname '*meso*' 2>&1 | grep -v 'No such'
[root@gate /root]# fuser -m /autofs/space/meso_013
/autofs/space/meso_013: Stale NFS file handle
[root@gate /root]# ps auxw | grep auto
root       650  0.0  0.1  1612  580 ?        S     2003   0:02 
/usr/sbin/automount /autofs/space yp auto.space intr,rw,hard,nodev,rs

As you can see, no process is advertising itself as one that is
keeping the mount "busy".  So I see nothing I can kill.  There
are no automount subprocesses, just the parent one.  I can try 
stopping that but I have over 30 users on the system now and I
would rather not cause an interruption.

Is there any other solution?  Any better way to hunt for processes
that are keeping a mount busy?
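
Two further places to look, sketched under assumptions: another filesystem
mounted below the mount point, and processes that have a file from the
mount mapped into memory. Neither necessarily shows up as a symlink under
/proc, so the find command above can miss them. The function names
(submounts_of, mappers_of) and the MOUNTS override are hypothetical, and
nothing here is known to explain this particular case.

```shell
#!/bin/sh
# Sketch: extra checks for what may keep a mount point busy.

# Print mount points nested below the given mount point.
submounts_of() {
    awk -v mp="$1" 'index($2, mp "/") == 1 { print $2 }' "${MOUNTS:-/proc/mounts}"
}

# Print PIDs whose memory maps mention a path under the mount point.
# /proc/PID/maps lists the backing file of each mapping in its last column.
mappers_of() {
    for maps in /proc/[0-9]*/maps; do
        if grep -qF -- " $1/" "$maps" 2>/dev/null; then
            pid=${maps#/proc/}
            echo "${pid%/maps}"
        fi
    done
}
```

For example, submounts_of /autofs/space/meso_013 followed by
mappers_of /autofs/space/meso_013, run as root so that every maps file is
readable.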

-- 
---------------------------------------------------------------
Paul Raines                   email: raines@nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street        Charlestown, MA 02129	USA   

