* RE: Re: broken umount -f
@ 2003-01-15 14:45 Lever, Charles
  2003-01-15 16:32 ` Scott Mcdermott
  0 siblings, 1 reply; 28+ messages in thread
From: Lever, Charles @ 2003-01-15 14:45 UTC (permalink / raw)
  To: 'Scott Mcdermott'; +Cc: nfs

> To nfs@lists.sourceforge.net on Wed 15/01 00:19 -0500:
> > Lever, Charles on Tue 14/01 11:36 -0800:
> > <on client>
> 
> forgot to include
> 
> # grep nfsserver /proc/mounts
> nfsserver:/tmp /mnt/tmp nfs rw,v3,rsize=8192,wsize=8192,hard,udp,lock,addr=nfsserver 0 0

thanks!

what if you try it again with the intr mount option?
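
A rough sketch of how that retest could be checked, assuming the client lists
`intr` in the options column of /proc/mounts when it is in effect (the listing
quoted above shows `hard` but no `intr`). The helper name `has_mount_option`
and its optional third argument are hypothetical and exist only for this
illustration.

```shell
#!/bin/sh
# has_mount_option MOUNTPOINT OPTION [MOUNTS_FILE]
# Succeeds if the entry for MOUNTPOINT lists OPTION in its options field.
# MOUNTS_FILE defaults to /proc/mounts; the override is an assumption made
# here so the check can be tried against a saved copy of the file.
has_mount_option() {
    awk -v mp="$1" -v opt="$2" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == opt) found = 1
        }
        END { exit !found }
    ' "${3:-/proc/mounts}"
}

# Possible use after remounting (run as root on the client):
#   umount /mnt/tmp
#   mount -o hard,intr nfsserver:/tmp /mnt/tmp
#   has_mount_option /mnt/tmp intr && echo "intr is in effect"
```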

if that doesn't help, enable rpc level debugging and send me
the kernel log contents.

echo  3 > /proc/sys/sunrpc/rpc_debug   #  to enable debugging
echo  0 > /proc/sys/sunrpc/rpc_debug   #  to turn it off again
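
The two commands above can be wrapped so that debugging is switched off again
once the reproduction step returns. This is only a sketch: the function name
`with_rpc_debug` and the `RPC_DEBUG_FILE` override are hypothetical, and the
kernel log still has to be collected separately (for example with dmesg).

```shell
#!/bin/sh
# RPC_DEBUG_FILE can be overridden to try the logic without root; on a real
# client it is the sysctl file named in the message above.
RPC_DEBUG_FILE="${RPC_DEBUG_FILE:-/proc/sys/sunrpc/rpc_debug}"

# with_rpc_debug COMMAND [ARGS...]
# Enables RPC debugging, runs COMMAND, disables debugging again, and
# returns COMMAND's exit status.
with_rpc_debug() {
    echo 3 > "$RPC_DEBUG_FILE" || return 1
    "$@"
    status=$?
    echo 0 > "$RPC_DEBUG_FILE"
    return "$status"
}

# Possible use (as root):
#   with_rpc_debug umount -f /mnt/tmp
#   dmesg > rpc-debug.log
```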


-------------------------------------------------------
This SF.NET email is sponsored by: Take your first step towards giving 
your online business a competitive advantage. Test-drive a Thawte SSL 
certificate - our easy online guide will show you how. Click here to get 
started: http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0027en
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: Re: Broken umount -f
@ 2003-01-16 19:45 Cole, Timothy D.
  2003-01-16 19:56 ` Scott Mcdermott
  0 siblings, 1 reply; 28+ messages in thread
From: Cole, Timothy D. @ 2003-01-16 19:45 UTC (permalink / raw)
  To: 'Scott Mcdermott', nfs

> -----Original Message-----
> From: Scott Mcdermott [mailto:smcdermott@questra.com]
> Sent: Wednesday, January 15, 2003 22:38
> To: nfs@lists.sourceforge.net
> Subject: Re: [NFS] Re: Broken umount -f

> But, I still think that with `intr', umount -f should work.

Agreed on that count -- I didn't really mean to suggest otherwise.
Unmounting is orthogonal to signal delivery, so really intr/nointr shouldn't
influence umount -f's behavior.

> btw the NFS HOWTO recommends hard,nointr

I think there was a typo in some earlier versions -- the current version
of the NFS HOWTO (http://nfs.sourceforge.net/nfs-howto/client.html)
recommends hard,intr.



^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: Broken umount -f
@ 2003-01-15 19:59 Heflin, Roger A.
  2003-01-16  3:37 ` Scott Mcdermott
  0 siblings, 1 reply; 28+ messages in thread
From: Heflin, Roger A. @ 2003-01-15 19:59 UTC (permalink / raw)
  To: nfs




> Message: 1
> From: "Cole, Timothy D." <tdcole@northropgrumman.com>
> To: 'Scott Mcdermott' <smcdermott@questra.com>, nfs@lists.sourceforge.net
> Subject: RE: [NFS] Re: broken umount -f
> Date: Wed, 15 Jan 2003 10:46:42 -0800
>
> > -----Original Message-----
> > From: Scott Mcdermott [mailto:smcdermott@questra.com]
> > Sent: Wednesday, January 15, 2003 12:24
> > To: nfs@lists.sourceforge.net
> > Subject: Re: [NFS] Re: broken umount -f
> >
> > User saving his mail spool, sees "nfs server not responding, still
> > trying" and decides to try killing his MUA.  Too bad it works and now
> > his spool is a steaming pile of ASCII.
>
> That's possible with non-NFS filesystems too -- just normally a smaller
> window of opportunity.  It doesn't require a filesystem hang in any case --
> most mailbox operations are not a single atomic write().  Imagine someone
> killing the MUA in the middle of deleting a large mail from a ~40MB mail
> spool on any filesystem, local or remote.
>
> Also consider the nointr case -- process hangs, user can't kill it, user
> naively closes the terminal window.  Server comes back up.  SIGHUP is
> finally handled when the write() returns.  Process dies.  ASCII soup again.
>
> This is of course assuming that the NFS server _can_ come back up.  If not,
> totally unkillable processes are a pain-in-the-ssh.
>
> > `soft' and `intr' are evil and should be banned.
>
> Agreed WRT soft's evil-ness, anyway.  But hard,intr seems to be a pretty
> good combination, as far as safety from data corruption, and from a
> standpoint of not having to reboot-and-kill-week-long-simulations just
> because a few unrelated (but important) processes got wedged by a
> recalcitrant NFS server.
>
	Thinking about it, having nointr won't save you in any way, shape, or
	form.

	If the NFS server pauses for some reason and you have nointr and the
	user hits ctrl-c or something similar, it will just wait for the server
	to come back and then take the ctrl-c and abruptly kill the program.  If
	the operation(s) take any amount of time, you will still have corrupted
	files.

					Roger



^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: Re: broken umount -f
@ 2003-01-15 18:46 Cole, Timothy D.
  0 siblings, 0 replies; 28+ messages in thread
From: Cole, Timothy D. @ 2003-01-15 18:46 UTC (permalink / raw)
  To: 'Scott Mcdermott', nfs

> -----Original Message-----
> From: Scott Mcdermott [mailto:smcdermott@questra.com]
> Sent: Wednesday, January 15, 2003 12:24
> To: nfs@lists.sourceforge.net
> Subject: Re: [NFS] Re: broken umount -f
> 
> User saving his mail spool, sees "nfs server not responding, still
> trying" and decides to try killing his MUA.  Too bad it works and now
> his spool is a steaming pile of ASCII.

That's possible with non-NFS filesystems too -- just normally a smaller
window of opportunity.  It doesn't require a filesystem hang in any case --
most mailbox operations are not a single atomic write().  Imagine someone
killing the MUA in the middle of deleting a large mail from a ~40MB mail
spool on any filesystem, local or remote.
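
To illustrate the window described above: one common way to narrow it is to
write the new contents to a temporary file in the same directory and rename it
over the original, so a kill in the middle leaves the old file untouched. This
is a sketch of that general technique, not what any MUA mentioned in this
thread is known to do, and the function name `replace_file_atomically` is
hypothetical.

```shell
#!/bin/sh
# replace_file_atomically TARGET
# Reads the new contents from stdin into a temporary file next to TARGET,
# then renames it into place. A rename within one directory replaces the
# file in a single step, so readers see either the old or the new contents.
# Note: mktemp creates the file with restrictive permissions, so the
# replacement may not keep TARGET's original mode.
replace_file_atomically() {
    target=$1
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if cat > "$tmp" && mv -f "$tmp" "$target"; then
        return 0
    fi
    rm -f "$tmp"
    return 1
}

# Possible use: drop lines matching a pattern from a file.
#   grep -v 'pattern' spool | replace_file_atomically spool
```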

Also consider the nointr case -- process hangs, user can't kill it, user
naively closes the terminal window.  Server comes back up.  SIGHUP is
finally handled when the write() returns.  Process dies.  ASCII soup again.

This is of course assuming that the NFS server _can_ come back up.  If not,
totally unkillable processes are a pain-in-the-ssh.

> `soft' and `intr' are evil and should be banned.

Agreed WRT soft's evil-ness, anyway.  But hard,intr seems to be a pretty
good combination, as far as safety from data corruption, and from a
standpoint of not having to reboot-and-kill-week-long-simulations just
because a few unrelated (but important) processes got wedged by a
recalcitrant NFS server.



^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: Re: broken umount -f
@ 2003-01-15 17:04 Lever, Charles
  2003-01-15 17:23 ` Scott Mcdermott
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Lever, Charles @ 2003-01-15 17:04 UTC (permalink / raw)
  To: 'Scott Mcdermott'; +Cc: nfs

> I'm sure it *will* work with the `intr' mount option.  But I don't want
> my users to be able to corrupt their own data just because I decided to
> bounce the server for whatever reason.  Their IO to that filesystem
> should hang, uninterruptibly, as is the conventional wisdom (that hard,
> nointr is the Right Way), which I agree with.

do you know what the risk of data corruption is when using "intr"?
seems pretty low to me.



^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: Re: broken umount -f
@ 2003-01-15 15:35 Murata, Dennis W (SAIC)
  0 siblings, 0 replies; 28+ messages in thread
From: Murata, Dennis W (SAIC) @ 2003-01-15 15:35 UTC (permalink / raw)
  To: 'Scott Mcdermott', nfs

If you cd out of /mnt/tmp can you then umount -f?  This looks similar to
what happens on Solaris 8 if the directory happens to be the home directory
of a user logged into a system.  The login session has to be killed before
the umount -f works.  I would not think in your case that the user would
have to be forced off.
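
A rough way to look for processes whose working directory is under the mount
point without running lsof or fuser, assuming a Linux-style /proc with
per-process `cwd` symlinks. The function name `pids_with_cwd_under` and the
optional PROC_ROOT argument are hypothetical. A working directory is only one
way a process can keep a mount busy (open files also count), and how this
behaves against a hung mount has not been verified here.

```shell
#!/bin/sh
# pids_with_cwd_under MOUNTPOINT [PROC_ROOT]
# Prints the PID of every process whose current directory is MOUNTPOINT or
# somewhere below it. PROC_ROOT defaults to /proc.
pids_with_cwd_under() {
    mnt=${1%/}
    proc_root=${2:-/proc}
    for link in "$proc_root"/[0-9]*/cwd; do
        # Skip entries that vanish or cannot be read.
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case "$cwd" in
            "$mnt"|"$mnt"/*)
                pid=${link%/cwd}
                echo "${pid##*/}"
                ;;
        esac
    done
}

# Possible use:
#   pids_with_cwd_under /mnt/tmp
```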

Wayne

-----Original Message-----
From: Scott Mcdermott [mailto:smcdermott@questra.com]
Sent: Tuesday, January 14, 2003 11:19 PM
To: nfs@lists.sourceforge.net
Subject: Re: [NFS] Re: broken umount -f


Lever, Charles on Tue 14/01 11:36 -0800:
> > Last I checked, the programs wouldn't die even with -KILL when they
> > were stuck in device-wait state.  The only way to reboot a machine
> > with such processes is to reboot -f, which is wrong.  The
> > filesystems should be able to have forced umount at sysadmin's
> > discretion.
> 
> do you remember which kernel this was?
> 
> trond fixed a long-standing "processes stuck in 'D' state" bug in
> 2.4.20.  this bug may be the reason these processes didn't die when
> you killed them.

<on client>

# uname -r
2.4.21-pre3-NFS_ALL

# showmount --exports nfsserver
Export list for nfsserver
/tmp           10.0.0.5

# mount nfsserver:/tmp /mnt/tmp

# cd /mnt/tmp

# ssh nfsserver /etc/init.d/nfs stop
Shutting down NFS mountd: [  OK  ]
Shutting down NFS daemon: [  OK  ]
Shutting down NFS services:  [  OK  ]
Shutting down NFS quotas: [  OK  ]

# /bin/pwd
nfs: server nfsserver not responding, still trying

<hangs>

<now on other tty with pwd=/>

# ps -eo state,wchan,pid,command | grep ^D
D end     1472 /bin/pwd

# kill -KILL 1472

# umount -f /mnt/tmp
Cannot MOUNTPROG RPC: RPC: Program not registered
umount2: Device or resource busy
umount: /mnt/tmp: device is busy

# kill -KILL 1472
# kill -KILL 1472

# umount -f /mnt/tmp
Cannot MOUNTPROG RPC: RPC: Program not registered
umount2: Device or resource busy
umount: /mnt/tmp: device is busy

# kill -KILL 1472
# kill -KILL 1472
# kill -KILL 1472
# kill -KILL 1472
# kill -KILL 1472
# kill -KILL 1472
# kill -KILL 1472

# umount -f /mnt/tmp
Cannot MOUNTPROG RPC: RPC: Program not registered
umount2: Device or resource busy
umount: /mnt/tmp: device is busy

# kill -KILL 1472

# ps -eo state,wchan,pid,command | grep ^D
D end     1472 /bin/pwd
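
The `ps | grep ^D` check used in the transcript above can be turned into a
small filter that prints only the PIDs. This is a sketch; the function name
`d_state_pids` is hypothetical, and it assumes the same column order
(state, wchan, pid, command) as the ps invocation shown.

```shell
#!/bin/sh
# d_state_pids
# Reads `ps -eo state,wchan,pid,command` output on stdin and prints the PID
# (third column) of every process whose state starts with D, i.e. one in
# uninterruptible sleep.
d_state_pids() {
    awk '$1 ~ /^D/ { print $3 }'
}

# Possible use on a live system:
#   ps -eo state,wchan,pid,command | d_state_pids
```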



^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: Re: broken umount -f
@ 2003-01-14 19:49 Cole, Timothy D.
  0 siblings, 0 replies; 28+ messages in thread
From: Cole, Timothy D. @ 2003-01-14 19:49 UTC (permalink / raw)
  To: 'trond.myklebust@fys.uio.no'; +Cc: nfs

> -----Original Message-----
> From: Trond Myklebust [mailto:trond.myklebust@fys.uio.no]
> Sent: Tuesday, January 14, 2003 14:36
> To: Scott Mcdermott
> Cc: nfs@lists.sourceforge.net
> Subject: Re: [NFS] Re: broken umount -f
> 
> They will if you mount with 'intr', and make sure that you kill *all*
> programs that are using that mountpoint.

That doesn't always appear to be the case in practice (maybe a long-standing
bug/strange interaction?).  Can there be users not reported by fuser?




^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: Re: broken umount -f
@ 2003-01-14 19:36 Lever, Charles
  2003-01-15  5:19 ` Scott Mcdermott
  0 siblings, 1 reply; 28+ messages in thread
From: Lever, Charles @ 2003-01-14 19:36 UTC (permalink / raw)
  To: 'Scott Mcdermott'; +Cc: nfs

> -----Original Message-----
> From: Scott Mcdermott [mailto:smcdermott@questra.com]
> Sent: Tuesday, January 14, 2003 2:20 PM
> To: Trond Myklebust
> Cc: nfs@lists.sourceforge.net
> Subject: Re: [NFS] Re: broken umount -f
> 
> Last I checked, the programs wouldn't die even with -KILL when they were
> stuck in device-wait state.  The only way to reboot a machine with such
> processes is to reboot -f, which is wrong.  The filesystems should be
> able to have forced umount at sysadmin's discretion.

do you remember which kernel this was?

trond fixed a long-standing "processes stuck in 'D' state" bug in
2.4.20.  this bug may be the reason these processes didn't die
when you killed them.



^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: Re: broken umount -f
@ 2003-01-14 19:30 Cole, Timothy D.
  0 siblings, 0 replies; 28+ messages in thread
From: Cole, Timothy D. @ 2003-01-14 19:30 UTC (permalink / raw)
  To: nfs

> -----Original Message-----
> From: Trond Myklebust [mailto:trond.myklebust@fys.uio.no]
> Sent: Tuesday, January 14, 2003 14:06
> To: Scott Mcdermott
> Cc: nfs@lists.sourceforge.net
> Subject: Re: [NFS] Re: broken umount -f

> Linux will not allow you to unmount without killing those processes,
> and I'd be opposed to any patch that tries to kill active processes
> from within the filesystem.

> This is something that needs to be resolved in userland.

The few times I've needed to use umount -f, the processes in question
weren't killable from userland.  Is there an architectural reason for this?

(i.e. instead of killing them, why can't their pending system calls return
with -EIO if a umount is forced, as I've seen some other unices do in
similar situations [pending RPCs + a dead server pinning a mount]?)



^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: [NFS] Re: broken umount -f
@ 2003-01-14 15:56 Lever, Charles
  2003-01-14 17:07 ` Scott Mcdermott
  0 siblings, 1 reply; 28+ messages in thread
From: Lever, Charles @ 2003-01-14 15:56 UTC (permalink / raw)
  To: 'Peter Åstrand'; +Cc: nfs, linux-kernel

"umount -f" doesn't end pending RPCs.  if there are processes
with pending RPCs, then they are stuck and you will have to
reboot.  "intr" may allow some of these processes to be killed
before trying the "umount."

however, if there are no outstanding RPCs on the client, but
the server is not available, umount -f works as advertised.

> -----Original Message-----
> From: Peter Åstrand [mailto:peter@cendio.se]
> Sent: Monday, January 13, 2003 4:45 AM
> To: Trond Myklebust
> Cc: nfs@lists.sourceforge.net; linux-kernel@vger.kernel.org
> Subject: [NFS] Re: broken umount -f
>
> >>For as long as I remember, umount -f has been broken. I got a reminder
> >>of this fact today when we took an older NFS server out of use. I had
> >>to reboot almost all machines that had mounts from this server. Not
> >>nice.
>
> ...
>
> > AFAICS It works for me.
> >
> > Are you using the 'intr' mount option,
>
> Yes, as often as I can. But IMHO, it should be possible to
> unmount an unreachable NFS fs even if it wasn't mounted with
> "intr". Otherwise we have a quite silly "sysadmin trap".
>
> > and are you remembering to kill
> > those processes that are actually using the mount point first?
>
> On some machines, I killed more or less everything. It
> didn't help. On some other machines, I couldn't kill so
> blindly. Remember, both "lsof" and "fuser" hang.
>
> Also, as far as I understand, Solaris 8 does not require that
> you kill all processes before unmounting, if you use the "-f"
> flag (processes will get EIO). Would it be possible to
> implement this feature in Linux? That would be really nice.
>
> Regards, Peter
>
>
> >>For as long as I remember, umount -f has been broken. I got a reminder
> >>of this fact today when we took an older NFS server out of use. I had
> >>to reboot almost all machines that had mounts from this server. Not
> >>nice.
> >>
> >>Anyone know why -f does not work? When I try, I get:
> >>
> >># umount -f /import/applix
> >>Cannot MOUNTPROG RPC: RPC: Port mapper failure - RPC: Unable to receive
> >>umount2: Device or resource busy
> >>umount: /import/applix: device is busy
> >>
> >>lsof and fuser hang, as do "df" and "du". Really frustrating. It's
> >>not even possible to cleanly reboot the system, since RedHat's shutdown
> >>scripts want to unmount NFS fs's.
> >>
> >>I'm not exactly sure I understand what -f is supposed to do. Is it
> >>correct that it is supposed to unmount without contacting the NFS
> >>server? I assume that I still have to make sure no processes are using
> >>the FS? Would it be possible to add a "-9" flag (or something like
> >>that) that kills off all processes that use the NFS fs automatically?
> >>
> >>(I'm using all kinds of RedHat Linux versions, from 5.0 up to 7.3.
> >>From what I can tell, this problem exists in all versions.)
> >>

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2003-01-16 20:49 UTC | newest]

Thread overview: 28+ messages
2003-01-15 14:45 Re: broken umount -f Lever, Charles
2003-01-15 16:32 ` Scott Mcdermott
  -- strict thread matches above, loose matches on Subject: below --
2003-01-16 19:45 Re: Broken " Cole, Timothy D.
2003-01-16 19:56 ` Scott Mcdermott
2003-01-15 19:59 Heflin, Roger A.
2003-01-16  3:37 ` Scott Mcdermott
2003-01-15 18:46 Re: broken " Cole, Timothy D.
2003-01-15 17:04 Lever, Charles
2003-01-15 17:23 ` Scott Mcdermott
     [not found] ` <20030115130759.B11894@ti19>
2003-01-15 18:22   ` Bill Rugolsky Jr.
2003-01-15 18:24   ` Scott Mcdermott
2003-01-16 20:49 ` Ion Badulescu
2003-01-15 15:35 Murata, Dennis W (SAIC)
2003-01-14 19:49 Cole, Timothy D.
2003-01-14 19:36 Lever, Charles
2003-01-15  5:19 ` Scott Mcdermott
2003-01-15  5:21   ` Scott Mcdermott
2003-01-14 19:30 Cole, Timothy D.
2003-01-14 15:56 [NFS] " Lever, Charles
2003-01-14 17:07 ` Scott Mcdermott
2003-01-14 19:06   ` Trond Myklebust
2003-01-14 19:19     ` Scott Mcdermott
2003-01-14 19:32       ` Brian Tinsley
2003-01-14 19:35       ` Trond Myklebust
2003-01-14 22:17         ` Scott Mcdermott
2003-01-14 22:29           ` Steven N. Hirsch
2003-01-14 22:27         ` Steven N. Hirsch
2003-01-14 19:39     ` Benjamin LaHaise
2003-01-14 19:52       ` Trond Myklebust
2003-01-14 19:56         ` Benjamin LaHaise
