* NFS hang + umount -f: better behaviour requested.
@ 2007-08-20 22:54 Robin Lee Powell
2007-08-20 23:27 ` Neil Brown
2007-08-21 16:43 ` John Stoffel
0 siblings, 2 replies; 22+ messages in thread
From: Robin Lee Powell @ 2007-08-20 22:54 UTC (permalink / raw)
To: linux-kernel
(cc's to me appreciated)
It would be really, really nice if "umount -f" against a hung NFS
mount actually worked on Linux. As much as I hate Solaris, I
consider it the gold standard in this case: If I say
"umount -f /mount/that/is/hung" it just goes away, immediately, and
anything still trying to use it dies (with EIO, I'm told).
If I know the NFS server is down, that really is the correct
behaviour. I very much want this behaviour, and am willing to
bribe/pay for it, although my resources are limited.
Unless you're interested in details of my tests, stop here.
I'm bringing this up again (I know it's been mentioned here before)
because I had been told that NFS support had gotten better in Linux
recently, so I have been (for my $dayjob) testing the behaviour of
NFS (autofs NFS, specifically) under Linux with hard,intr and using
iptables to simulate a hang. fuser hangs, as far as I can tell
indefinitely, as does lsof. umount -f returns after a long time with
"busy", umount -l works after a long time but leaves the system in a
very unfortunate state such that I have to kill things by hand and
manually edit /etc/mtab to get autofs to work again.
The "correct solution" to this situation according to
http://nfs.sourceforge.net/ is cycles of "kill processes" and
"umount -f". This has two problems: 1. It sucks. 2. If fuser
and lsof both hang (and they do: fuser has been on
"stat("/home/rpowell/"," for > 30 minutes now), I have no way to
pick which processes to kill.
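Since fuser and lsof stat() paths on the hung filesystem and block, the processes holding the mount have to be found another way. A workaround along the lines of the /proc-walking wrapper Chakri describes later in the thread can be sketched like this (a minimal illustration, not anything from the thread itself; it only reads /proc symlinks, so it never touches the dead mount):

```python
import os

def procs_using(mountpoint):
    """Return PIDs with an open fd, cwd, or root under mountpoint.

    Reads /proc/<pid>/{cwd,root,fd/*} symlinks instead of stat()ing
    the filesystem itself, so it does not block on a dead NFS server
    the way fuser/lsof do.
    """
    prefix = mountpoint.rstrip("/") + "/"
    pids = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        base = "/proc/%s" % pid
        links = [base + "/cwd", base + "/root"]
        try:
            links += ["%s/fd/%s" % (base, fd)
                      for fd in os.listdir(base + "/fd")]
        except OSError:
            pass          # process exited, or no permission; skip it
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == prefix[:-1] or target.startswith(prefix):
                pids.append(int(pid))
                break
    return pids
```

The resulting PID list is what you would feed to kill before retrying "umount -f".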
I've read every man page I could find, and the only nfs option that
seems even vaguely helpful is "soft", but everything that mentions
"soft" also says to never use it.
This is the single worst aspect of adminning a Linux system that I,
as a career sysadmin, have to deal with. In fact, it's really the
only one I even dislike. At my current work place, we've lost
multiple person-days to this issue, having to go around and reboot
every Linux box that was hanging off a down NFS server.
I know many other admins who also really want Solaris style
"umount -f"; I'm sure if I passed the hat I could get a decent
bounty together for this feature; let me know if you're interested.
Thanks.
-Robin
--
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/
^ permalink raw reply [flat|nested] 22+ messages in thread

* Re: NFS hang + umount -f: better behaviour requested.
2007-08-20 22:54 NFS hang + umount -f: better behaviour requested Robin Lee Powell
@ 2007-08-20 23:27 ` Neil Brown
2007-08-20 23:34 ` Robin Lee Powell
2007-08-21 16:43 ` John Stoffel
1 sibling, 1 reply; 22+ messages in thread
From: Neil Brown @ 2007-08-20 23:27 UTC (permalink / raw)
To: Robin Lee Powell; +Cc: linux-kernel

On Monday August 20, rlpowell@digitalkingdom.org wrote:
> (cc's to me appreciated)
>
> It would be really, really nice if "umount -f" against a hung NFS
> mount actually worked on Linux. As much as I hate Solaris, I
> consider it the gold standard in this case: If I say
> "umount -f /mount/that/is/hung" it just goes away, immediately, and
> anything still trying to use it dies (with EIO, I'm told).

Have you tried "umount -l"? How far is that from your requirements?

Alternately:
  mount --move /problem/path /somewhere/else
  umount -f /somewhere/else
  umount -l /somewhere/else

might be a little closer to what you want.

Though I agree that it would be nice if we could convince all
subsequent requests to a server to fail EIO instead of just the
currently active ones. I'm not sure that just changing "umount -f"
is the right interface though.... Maybe if all the server handles
appeared in sysfs and had an attribute which you could set to cause
all requests to fail...

NeilBrown

^ permalink raw reply [flat|nested] 22+ messages in thread
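The mount --move / umount sequence Neil suggests can be sketched as a guarded helper. This is an illustrative sketch only (it needs root, and the scratch directory name /mnt/.detached is made up for the example, not something from the thread):

```shell
# Detach a hung NFS mount using Neil Brown's suggested sequence.
detach_hung_nfs() {
    mnt="$1"
    scratch=/mnt/.detached            # hypothetical scratch location
    if [ ! -d "$mnt" ]; then
        echo "no such mount point: $mnt" >&2
        return 1
    fi
    mkdir -p "$scratch" || return 1
    mount --move "$mnt" "$scratch" || return 1  # relocate the hung mount
    umount -f "$scratch"              # try to fail active RPCs (may say busy)
    umount -l "$scratch"              # lazily detach whatever is left
}
```

Moving the mount first at least frees the original path (e.g. for autofs to remount), even if the lazy unmount leaves processes hung underneath.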
* Re: NFS hang + umount -f: better behaviour requested.
2007-08-20 23:27 ` Neil Brown
@ 2007-08-20 23:34 ` Robin Lee Powell
2007-08-21 1:51 ` Salah Coronya
0 siblings, 1 reply; 22+ messages in thread
From: Robin Lee Powell @ 2007-08-20 23:34 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-kernel

On Tue, Aug 21, 2007 at 09:27:06AM +1000, Neil Brown wrote:
> On Monday August 20, rlpowell@digitalkingdom.org wrote:
> > (cc's to me appreciated)
> >
> > It would be really, really nice if "umount -f" against a hung
> > NFS mount actually worked on Linux. As much as I hate Solaris,
> > I consider it the gold standard in this case: If I say "umount
> > -f /mount/that/is/hung" it just goes away, immediately, and
> > anything still trying to use it dies (with EIO, I'm told).
>
> Have you tried "umount -l"? How far is that from your
> requirements?

I actually talked about that further down. The short version: quite
far.

The long version: It leaves a bunch of hung processes, with no real
way for me to determine which processes are hung on the
now-non-existent mount, and (at least with autofs) it leaves
/etc/mtab in an inconsistent state, so I had to edit it to restart
autofs. Only a mild improvement on rebooting, says I. Also, it took
a really long time (minutes) to return.

> Alternately:
>   mount --move /problem/path /somewhere/else
>   umount -f /somewhere/else
>   umount -l /somewhere/else
>
> might be a little closer to what you want.

I don't think that would solve the problem: the umount -f would
still hang and eventually return busy, fuser would still hang, and
umount -l would still leave inconsistent crap lying around.

> Though I agree that it would be nice if we could convince all
> subsequent requests to a server to fail EIO instead of just the
> currently active ones. I'm not sure that just changing "umount
> -f" is the right interface though.... Maybe if all the server
> handles appeared in sysfs and have an attribute which you could
> set to cause all requests to fail...

I have no opinion on interface details, I simply know that on
Solaris, "umount -f" Just Works, and I would love to have similar
behaviour on Linux.

-Robin
--
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested.
2007-08-20 23:34 ` Robin Lee Powell
@ 2007-08-21 1:51 ` Salah Coronya
0 siblings, 0 replies; 22+ messages in thread
From: Salah Coronya @ 2007-08-21 1:51 UTC (permalink / raw)
To: linux-kernel

Robin Lee Powell <rlpowell <at> digitalkingdom.org> writes:
> > Though I agree that it would be nice if we could convince all
> > subsequent requests to a server to fail EIO instead of just the
> > currently active ones. I'm not sure that just changing "umount
> > -f" is the right interface though.... Maybe if all the server
> > handles appeared in sysfs and have an attribute which you could
> > set to cause all requests to fail...
>
> I have no opinion on interface details, I simply know that on
> Solaris, "umount -f" Just Works, and I would love to have similar
> behaviour on Linux.
>
> -Robin

What you are looking for is revoke()/frevokeat(), which will yank the
file right out from under the descriptor. It's currently in -mm. Of
course, "mount" will still need to iterate over each open file on the
mount and revoke it.

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested. 2007-08-20 22:54 NFS hang + umount -f: better behaviour requested Robin Lee Powell 2007-08-20 23:27 ` Neil Brown @ 2007-08-21 16:43 ` John Stoffel 2007-08-21 16:55 ` J. Bruce Fields 2007-08-21 17:01 ` Peter Staubach 1 sibling, 2 replies; 22+ messages in thread From: John Stoffel @ 2007-08-21 16:43 UTC (permalink / raw) To: Robin Lee Powell; +Cc: linux-kernel Robin> I'm bringing this up again (I know it's been mentioned here Robin> before) because I had been told that NFS support had gotten Robin> better in Linux recently, so I have been (for my $dayjob) Robin> testing the behaviour of NFS (autofs NFS, specifically) under Robin> Linux with hard,intr and using iptables to simulate a hang. So why are you mouting with hard,intr semantics? At my current SysAdmin job, we mount everything (solaris included) with 'soft,intr' and it works well. If an NFS server goes down, clients don't hang for large periods of time. Robin> fuser hangs, as far as I can tell indefinately, as does Robin> lsof. umount -f returns after a long time with "busy", umount Robin> -l works after a long time but leaves the system in a very Robin> unfortunate state such that I have to kill things by hand and Robin> manually edit /etc/mtab to get autofs to work again. Robin> The "correct solution" to this situation according to Robin> http://nfs.sourceforge.net/ is cycles of "kill processes" and Robin> "umount -f". This has two problems: 1. It sucks. 2. If fuser Robin> and lsof both hand (and they do: fuser has been on Robin> "stat("/home/rpowell/"," for > 30 minutes now), I have no way to Robin> pick which processes to kill. Robin> I've read every man page I could find, and the only nfs option Robin> that semes even vaguely helpful is "soft", but everything that Robin> mentions "soft" also says to never use it. I think the man pages are out of date, or ignoring reality. Try mounting with soft,intr and see how it works for you. 
I think you'll be happy. Robin> This is the single worst aspect of adminning a Linux system that I, Robin> as a carreer sysadmin, have to deal with. In fact, it's really the Robin> only one I even dislike. At my current work place, we've lost Robin> multiple person-days to this issue, having to go around and reboot Robin> every Linux box that was hanging off a down NFS server. Robin> I know many other admins who also really want Solaris style Robin> "umount -f"; I'm sure if I passed the hat I could get a decent Robin> bounty together for this feature; let me know if you're interested. Robin> Thanks. Robin> -Robin Robin> -- Robin> http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ Robin> Reason #237 To Learn Lojban: "Homonyms: Their Grate!" Robin> Proud Supporter of the Singularity Institute - http://singinst.org/ Robin> - Robin> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in Robin> the body of a message to majordomo@vger.kernel.org Robin> More majordomo info at http://vger.kernel.org/majordomo-info.html Robin> Please read the FAQ at http://www.tux.org/lkml/ Robin> !DSPAM:46ca1d9676791030010506! ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested.
2007-08-21 16:43 ` John Stoffel
@ 2007-08-21 16:55 ` J. Bruce Fields
2007-08-21 17:01 ` Peter Staubach
1 sibling, 0 replies; 22+ messages in thread
From: J. Bruce Fields @ 2007-08-21 16:55 UTC (permalink / raw)
To: John Stoffel; +Cc: Robin Lee Powell, linux-kernel

On Tue, Aug 21, 2007 at 12:43:47PM -0400, John Stoffel wrote:
> Robin> I've read every man page I could find, and the only nfs option
> Robin> that seems even vaguely helpful is "soft", but everything that
> Robin> mentions "soft" also says to never use it.
>
> I think the man pages are out of date, or ignoring reality.

No. The price of using "soft" is the chance of data corruption, since
an application may for example be left thinking that a write has
succeeded when it hasn't.

See http://nfs.sourceforge.net/#faq_e4

--b.

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested. 2007-08-21 16:43 ` John Stoffel 2007-08-21 16:55 ` J. Bruce Fields @ 2007-08-21 17:01 ` Peter Staubach 2007-08-21 17:14 ` Chakri n ` (2 more replies) 1 sibling, 3 replies; 22+ messages in thread From: Peter Staubach @ 2007-08-21 17:01 UTC (permalink / raw) To: John Stoffel; +Cc: Robin Lee Powell, linux-kernel John Stoffel wrote: > Robin> I'm bringing this up again (I know it's been mentioned here > Robin> before) because I had been told that NFS support had gotten > Robin> better in Linux recently, so I have been (for my $dayjob) > Robin> testing the behaviour of NFS (autofs NFS, specifically) under > Robin> Linux with hard,intr and using iptables to simulate a hang. > > So why are you mouting with hard,intr semantics? At my current > SysAdmin job, we mount everything (solaris included) with 'soft,intr' > and it works well. If an NFS server goes down, clients don't hang for > large periods of time. > > Wow! That's _really_ a bad idea. NFS READ operations which timeout can lead to executables which mysteriously fail, file corruption, etc. NFS WRITE operations which fail may or may not lead to file corruption. Anything writable should _always_ be mounted "hard" for safety purposes. Readonly mounted file systems _may_ be mounted "soft", depending upon what is located on them. > Robin> fuser hangs, as far as I can tell indefinately, as does > Robin> lsof. umount -f returns after a long time with "busy", umount > Robin> -l works after a long time but leaves the system in a very > Robin> unfortunate state such that I have to kill things by hand and > Robin> manually edit /etc/mtab to get autofs to work again. > > Robin> The "correct solution" to this situation according to > Robin> http://nfs.sourceforge.net/ is cycles of "kill processes" and > Robin> "umount -f". This has two problems: 1. It sucks. 2. 
If fuser > Robin> and lsof both hand (and they do: fuser has been on > Robin> "stat("/home/rpowell/"," for > 30 minutes now), I have no way to > Robin> pick which processes to kill. > > Robin> I've read every man page I could find, and the only nfs option > Robin> that semes even vaguely helpful is "soft", but everything that > Robin> mentions "soft" also says to never use it. > > I think the man pages are out of date, or ignoring reality. Try > mounting with soft,intr and see how it works for you. I think you'll > be happy. > > Please don't. You will end up regretting it in the long run. Taking a chance on corrupted data or critical applications which just fail is not worth the benefit. It would safer for us to implement something which works like the Solaris forced umount support for NFS. Thanx... ps > Robin> This is the single worst aspect of adminning a Linux system that I, > Robin> as a carreer sysadmin, have to deal with. In fact, it's really the > Robin> only one I even dislike. At my current work place, we've lost > Robin> multiple person-days to this issue, having to go around and reboot > Robin> every Linux box that was hanging off a down NFS server. > > Robin> I know many other admins who also really want Solaris style > Robin> "umount -f"; I'm sure if I passed the hat I could get a decent > Robin> bounty together for this feature; let me know if you're interested. > > Robin> Thanks. > > Robin> -Robin > > Robin> -- > Robin> http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ > Robin> Reason #237 To Learn Lojban: "Homonyms: Their Grate!" > Robin> Proud Supporter of the Singularity Institute - http://singinst.org/ > Robin> - > Robin> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > Robin> the body of a message to majordomo@vger.kernel.org > Robin> More majordomo info at http://vger.kernel.org/majordomo-info.html > Robin> Please read the FAQ at http://www.tux.org/lkml/ > > > Robin> !DSPAM:46ca1d9676791030010506! 
> - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested. 2007-08-21 17:01 ` Peter Staubach @ 2007-08-21 17:14 ` Chakri n 2007-08-21 17:14 ` Robin Lee Powell 2007-08-21 18:50 ` John Stoffel 2 siblings, 0 replies; 22+ messages in thread From: Chakri n @ 2007-08-21 17:14 UTC (permalink / raw) To: Peter Staubach; +Cc: John Stoffel, Robin Lee Powell, linux-kernel To add to the pain, lsof or fuser hang on unresponsive shares. I wrote my own wrapper to go through the "/proc/<pid>" file tables and find any process using the unresponsive mounts and kill those processes.This works well. Also, it brings another point. If the unresponsives problem cannot be fixed for some NFS data corruption reasons, is it possible for a mount to have both soft & hard semantics? Some process might want to use the mount point soft and other processes hard. This can be implemented easily in NFS & SUNRPC layers adding timeout to requests, but it becomes tricky in VFS layer. If a soft proces is waiting on an inode locked by a hard process, the soft process gets hard semantics too. Thanks --Chakri On 8/21/07, Peter Staubach <staubach@redhat.com> wrote: > John Stoffel wrote: > > Robin> I'm bringing this up again (I know it's been mentioned here > > Robin> before) because I had been told that NFS support had gotten > > Robin> better in Linux recently, so I have been (for my $dayjob) > > Robin> testing the behaviour of NFS (autofs NFS, specifically) under > > Robin> Linux with hard,intr and using iptables to simulate a hang. > > > > So why are you mouting with hard,intr semantics? At my current > > SysAdmin job, we mount everything (solaris included) with 'soft,intr' > > and it works well. If an NFS server goes down, clients don't hang for > > large periods of time. > > > > > > Wow! That's _really_ a bad idea. NFS READ operations which > timeout can lead to executables which mysteriously fail, file > corruption, etc. NFS WRITE operations which fail may or may > not lead to file corruption. 
> > Anything writable should _always_ be mounted "hard" for safety > purposes. Readonly mounted file systems _may_ be mounted "soft", > depending upon what is located on them. > > > Robin> fuser hangs, as far as I can tell indefinately, as does > > Robin> lsof. umount -f returns after a long time with "busy", umount > > Robin> -l works after a long time but leaves the system in a very > > Robin> unfortunate state such that I have to kill things by hand and > > Robin> manually edit /etc/mtab to get autofs to work again. > > > > Robin> The "correct solution" to this situation according to > > Robin> http://nfs.sourceforge.net/ is cycles of "kill processes" and > > Robin> "umount -f". This has two problems: 1. It sucks. 2. If fuser > > Robin> and lsof both hand (and they do: fuser has been on > > Robin> "stat("/home/rpowell/"," for > 30 minutes now), I have no way to > > Robin> pick which processes to kill. > > > > Robin> I've read every man page I could find, and the only nfs option > > Robin> that semes even vaguely helpful is "soft", but everything that > > Robin> mentions "soft" also says to never use it. > > > > I think the man pages are out of date, or ignoring reality. Try > > mounting with soft,intr and see how it works for you. I think you'll > > be happy. > > > > > > Please don't. You will end up regretting it in the long run. > Taking a chance on corrupted data or critical applications which > just fail is not worth the benefit. > > It would safer for us to implement something which works like > the Solaris forced umount support for NFS. > > Thanx... > > ps > > > Robin> This is the single worst aspect of adminning a Linux system that I, > > Robin> as a carreer sysadmin, have to deal with. In fact, it's really the > > Robin> only one I even dislike. At my current work place, we've lost > > Robin> multiple person-days to this issue, having to go around and reboot > > Robin> every Linux box that was hanging off a down NFS server. 
> > > > Robin> I know many other admins who also really want Solaris style > > Robin> "umount -f"; I'm sure if I passed the hat I could get a decent > > Robin> bounty together for this feature; let me know if you're interested. > > > > Robin> Thanks. > > > > Robin> -Robin > > > > Robin> -- > > Robin> http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ > > Robin> Reason #237 To Learn Lojban: "Homonyms: Their Grate!" > > Robin> Proud Supporter of the Singularity Institute - http://singinst.org/ > > Robin> - > > Robin> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > Robin> the body of a message to majordomo@vger.kernel.org > > Robin> More majordomo info at http://vger.kernel.org/majordomo-info.html > > Robin> Please read the FAQ at http://www.tux.org/lkml/ > > > > > > Robin> !DSPAM:46ca1d9676791030010506! > > - > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > > > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested. 2007-08-21 17:01 ` Peter Staubach 2007-08-21 17:14 ` Chakri n @ 2007-08-21 17:14 ` Robin Lee Powell 2007-08-21 17:18 ` Peter Staubach 2007-08-21 18:50 ` John Stoffel 2 siblings, 1 reply; 22+ messages in thread From: Robin Lee Powell @ 2007-08-21 17:14 UTC (permalink / raw) To: Peter Staubach; +Cc: John Stoffel, linux-kernel On Tue, Aug 21, 2007 at 01:01:44PM -0400, Peter Staubach wrote: > John Stoffel wrote: > >Robin> I'm bringing this up again (I know it's been mentioned here > >Robin> before) because I had been told that NFS support had gotten > >Robin> better in Linux recently, so I have been (for my $dayjob) > >Robin> testing the behaviour of NFS (autofs NFS, specifically) under > >Robin> Linux with hard,intr and using iptables to simulate a hang. > > > >So why are you mouting with hard,intr semantics? At my current > >SysAdmin job, we mount everything (solaris included) with > >'soft,intr' and it works well. If an NFS server goes down, > >clients don't hang for large periods of time. > > Wow! That's _really_ a bad idea. NFS READ operations which > timeout can lead to executables which mysteriously fail, file > corruption, etc. NFS WRITE operations which fail may or may not > lead to file corruption. > > Anything writable should _always_ be mounted "hard" for safety > purposes. Readonly mounted file systems _may_ be mounted "soft", > depending upon what is located on them. Does write + tcp make this any different? -Robin -- http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/ Reason #237 To Learn Lojban: "Homonyms: Their Grate!" Proud Supporter of the Singularity Institute - http://singinst.org/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested. 2007-08-21 17:14 ` Robin Lee Powell @ 2007-08-21 17:18 ` Peter Staubach 0 siblings, 0 replies; 22+ messages in thread From: Peter Staubach @ 2007-08-21 17:18 UTC (permalink / raw) To: Robin Lee Powell; +Cc: John Stoffel, linux-kernel Robin Lee Powell wrote: > On Tue, Aug 21, 2007 at 01:01:44PM -0400, Peter Staubach wrote: > >> John Stoffel wrote: >> >>> Robin> I'm bringing this up again (I know it's been mentioned here >>> Robin> before) because I had been told that NFS support had gotten >>> Robin> better in Linux recently, so I have been (for my $dayjob) >>> Robin> testing the behaviour of NFS (autofs NFS, specifically) under >>> Robin> Linux with hard,intr and using iptables to simulate a hang. >>> >>> So why are you mouting with hard,intr semantics? At my current >>> SysAdmin job, we mount everything (solaris included) with >>> 'soft,intr' and it works well. If an NFS server goes down, >>> clients don't hang for large periods of time. >>> >> Wow! That's _really_ a bad idea. NFS READ operations which >> timeout can lead to executables which mysteriously fail, file >> corruption, etc. NFS WRITE operations which fail may or may not >> lead to file corruption. >> >> Anything writable should _always_ be mounted "hard" for safety >> purposes. Readonly mounted file systems _may_ be mounted "soft", >> depending upon what is located on them. >> > > Does write + tcp make this any different? Nope... TCP may make a difference if the problem is related to the network being slow or lossy, but will not affect anything if the server is just slow or down. Even if TCP would have eventually gotten all of the packets in a request or response through, the client may time out, cease waiting, and corruption may occur again. ps ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested. 2007-08-21 17:01 ` Peter Staubach 2007-08-21 17:14 ` Chakri n 2007-08-21 17:14 ` Robin Lee Powell @ 2007-08-21 18:50 ` John Stoffel 2007-08-21 19:04 ` Peter Staubach ` (3 more replies) 2 siblings, 4 replies; 22+ messages in thread From: John Stoffel @ 2007-08-21 18:50 UTC (permalink / raw) To: Peter Staubach; +Cc: John Stoffel, Robin Lee Powell, linux-kernel >>>>> "Peter" == Peter Staubach <staubach@redhat.com> writes: Peter> John Stoffel wrote: Robin> I'm bringing this up again (I know it's been mentioned here Robin> before) because I had been told that NFS support had gotten Robin> better in Linux recently, so I have been (for my $dayjob) Robin> testing the behaviour of NFS (autofs NFS, specifically) under Robin> Linux with hard,intr and using iptables to simulate a hang. >> >> So why are you mouting with hard,intr semantics? At my current >> SysAdmin job, we mount everything (solaris included) with 'soft,intr' >> and it works well. If an NFS server goes down, clients don't hang for >> large periods of time. Peter> Wow! That's _really_ a bad idea. NFS READ operations which Peter> timeout can lead to executables which mysteriously fail, file Peter> corruption, etc. NFS WRITE operations which fail may or may Peter> not lead to file corruption. Peter> Anything writable should _always_ be mounted "hard" for safety Peter> purposes. Readonly mounted file systems _may_ be mounted Peter> "soft", depending upon what is located on them. Not in my experience. We use NetApps as our backing NFS servers, so maybe my experience isn't totally relevant. But with a mix of Linux and Solaris clients, we've never had problems with soft,intr on our NFS clients. We also don't see file corruption, mysterious executables failing to run, etc. Now maybe those issues are raised when you have a Linux NFS server with Solaris clients. But in my book, reliable NFS servers are key, and if they are reliable, 'soft,intr' works just fine. 
Now maybe if we had NFS exported directories everywhere, and stuff cross mounted all over the place with autofs, then we might change our minds. In any case, I don't dis-agree with the fundamental request to make the NFS client code on Linux easier to work with. I bet Trond (who works at NetApp) will have something to say on this issue. John ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested. 2007-08-21 18:50 ` John Stoffel @ 2007-08-21 19:04 ` Peter Staubach 2007-08-21 19:25 ` J. Bruce Fields ` (2 subsequent siblings) 3 siblings, 0 replies; 22+ messages in thread From: Peter Staubach @ 2007-08-21 19:04 UTC (permalink / raw) To: John Stoffel; +Cc: Robin Lee Powell, linux-kernel John Stoffel wrote: >>>>>> "Peter" == Peter Staubach <staubach@redhat.com> writes: >>>>>> > > Peter> John Stoffel wrote: > Robin> I'm bringing this up again (I know it's been mentioned here > Robin> before) because I had been told that NFS support had gotten > Robin> better in Linux recently, so I have been (for my $dayjob) > Robin> testing the behaviour of NFS (autofs NFS, specifically) under > Robin> Linux with hard,intr and using iptables to simulate a hang. > >>> So why are you mouting with hard,intr semantics? At my current >>> SysAdmin job, we mount everything (solaris included) with 'soft,intr' >>> and it works well. If an NFS server goes down, clients don't hang for >>> large periods of time. >>> > > Peter> Wow! That's _really_ a bad idea. NFS READ operations which > Peter> timeout can lead to executables which mysteriously fail, file > Peter> corruption, etc. NFS WRITE operations which fail may or may > Peter> not lead to file corruption. > > Peter> Anything writable should _always_ be mounted "hard" for safety > Peter> purposes. Readonly mounted file systems _may_ be mounted > Peter> "soft", depending upon what is located on them. > > Not in my experience. We use NetApps as our backing NFS servers, so > maybe my experience isn't totally relevant. But with a mix of Linux > and Solaris clients, we've never had problems with soft,intr on our > NFS clients. > > We also don't see file corruption, mysterious executables failing to > run, etc. > > Now maybe those issues are raised when you have a Linux NFS server > with Solaris clients. 
But in my book, reliable NFS servers are key, > and if they are reliable, 'soft,intr' works just fine. > > Now maybe if we had NFS exported directories everywhere, and stuff > cross mounted all over the place with autofs, then we might change our > minds. > > In any case, I don't dis-agree with the fundamental request to make > the NFS client code on Linux easier to work with. I bet Trond (who > works at NetApp) will have something to say on this issue. Just for the others who may be reading this thread -- If you use sufficient network bandwidth and high quality enough networks and NFS servers with plenty of resources, then you _may_ be able to get away with "soft" mounting for a some period of time. However, any server, including Solaris and NetApp servers, will fail, and those failures may or may not affect the NFS service being provided. In fact, unless the system is being carefully administrated and the applications are written very well, with error detection and recovery in mind, then corruption can occur, and it can be silent and unnoticed until too late. In fact, most failures do occur silently and get chalked up to other causes because it will not be possible to correlate the badness with the NFS client giving up when attempting to communicate with an NFS server. I wish you the best of luck, although with the environment that you describe, it seems like "hard" mounts would work equally well and would not incur the risks. ps ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested. 2007-08-21 18:50 ` John Stoffel 2007-08-21 19:04 ` Peter Staubach @ 2007-08-21 19:25 ` J. Bruce Fields 2007-08-24 15:09 ` Ric Wheeler 2007-08-21 23:04 ` Valdis.Kletnieks 2007-08-31 8:06 ` Ian Kent 3 siblings, 1 reply; 22+ messages in thread From: J. Bruce Fields @ 2007-08-21 19:25 UTC (permalink / raw) To: John Stoffel; +Cc: Peter Staubach, Robin Lee Powell, linux-kernel On Tue, Aug 21, 2007 at 02:50:42PM -0400, John Stoffel wrote: > Not in my experience. We use NetApps as our backing NFS servers, so > maybe my experience isn't totally relevant. But with a mix of Linux > and Solaris clients, we've never had problems with soft,intr on our > NFS clients. > > We also don't see file corruption, mysterious executables failing to > run, etc. > > Now maybe those issues are raised when you have a Linux NFS server > with Solaris clients. But in my book, reliable NFS servers are key, > and if they are reliable, 'soft,intr' works just fine. The NFS server alone can't prevent the problems Peter Staubach refers to. Their frequency also depends on the network and the way you're using the filesystem. (A sufficiently paranoid application accessing the filesystem could function correctly despite the problems caused by soft mounts, but the degree of paranoia required probably isn't common.) In practice, you may get away with soft mounts and never see problems. But other people considering them should probably make sure they understand the issues before trusting anything important to them. --b. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: NFS hang + umount -f: better behaviour requested.
From: Ric Wheeler @ 2007-08-24 15:09 UTC (permalink / raw)
To: J. Bruce Fields
Cc: John Stoffel, Peter Staubach, Robin Lee Powell, linux-kernel

J. Bruce Fields wrote:
> The NFS server alone can't prevent the problems Peter Staubach refers
> to.  Their frequency also depends on the network and the way you're
> using the filesystem.  (A sufficiently paranoid application accessing
> the filesystem could function correctly despite the problems caused by
> soft mounts, but the degree of paranoia required probably isn't
> common.)

Would it be sufficient to ensure that the application always issues an
fsync() before closing any recently written/updated file?  Are there
other subtle paranoid techniques that should be used?

ric
* Re: NFS hang + umount -f: better behaviour requested.
From: Peter Staubach @ 2007-08-24 15:37 UTC (permalink / raw)
To: Ric Wheeler; +Cc: J. Bruce Fields, John Stoffel, Robin Lee Powell, linux-kernel

Ric Wheeler wrote:
> Would it be sufficient to ensure that the application always issues
> an fsync() before closing any recently written/updated file?  Are
> there other subtle paranoid techniques that should be used?

I suspect that this is not sufficient.  The application should be
prepared to rewrite data if it can determine what data did not get
written.  Using fsync will tell the application when data was not
written to the server correctly, but not which part of the data.
Perhaps O_SYNC or fsync following each write would help, but either
one of these options will also cause a large performance degradation.

The right solution is the use of TCP and hard mounting.

		ps
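The "fsync before close, and check everything" discipline being discussed can be sketched in a few lines. This is an illustrative sketch, not code from the thread; the file path in the usage line is made up:

```python
import os

def careful_write(path, data):
    """Write data and force it out, reporting any failure to the caller.

    On a soft NFS mount, an error from a timed-out WRITE may only
    surface at fsync() or close() time, so all three calls must be
    checked.  Even then, a failure only tells you that *some* of the
    data may not have reached the server -- not which part -- so the
    caller must be prepared to rewrite the whole file.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(data):
            # os.write may do a short write; loop until done.
            # Raises OSError (e.g. EIO on a failed soft mount) on error.
            written += os.write(fd, data[written:])
        os.fsync(fd)  # flush client-side cached pages to the server
    finally:
        os.close(fd)  # close can also report deferred write errors

# Hypothetical usage:
careful_write("/tmp/nfs-demo.txt", b"important data\n")
```

Note that Python's buffered `file.write()` hides short writes but adds another layer whose `flush()` return must also be checked; the raw `os` calls above keep the error paths visible.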
* Re: NFS hang + umount -f: better behaviour requested.
From: J. Bruce Fields @ 2007-08-24 15:53 UTC (permalink / raw)
To: Ric Wheeler; +Cc: John Stoffel, Peter Staubach, Robin Lee Powell, linux-kernel

On Fri, Aug 24, 2007 at 11:09:14AM -0400, Ric Wheeler wrote:
> Would it be sufficient to ensure that the application always issues
> an fsync() before closing any recently written/updated file?  Are
> there other subtle paranoid techniques that should be used?

NFS already syncs on close (and on unlock), so you should just need to
check the return values from any writes, fsyncs, closes, etc. (and
realize that an error there may mean some or all of the previous
writes to this file descriptor failed).

And operations like mkdir have the same problem--a timeout leaves you
not knowing whether the directory was created, because you don't know
whether the operation reached the server or not.

I assume the problems with executables that Peter Staubach refers to
are due to reads on mmap'd files timing out.

I don't use soft mounts myself and haven't had to debug user problems
with them, so my understanding of it all is purely theoretical--others
will have a better idea when and how these kinds of failures actually
manifest themselves in practice.

--b.
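The mkdir ambiguity mentioned above -- a timeout leaves you unsure whether the operation reached the server -- is the classic argument for making such operations idempotent. A small sketch (illustrative only, not from the thread):

```python
import errno
import os

def mkdir_idempotent(path):
    """Create a directory, tolerating the soft-mount timeout ambiguity.

    If a MKDIR times out, the client cannot tell whether the server
    performed it.  Making the operation safe to retry -- by treating
    EEXIST as success -- lets the caller simply repeat it.

    Returns True if this call created the directory, False if it
    already existed (possibly from our own earlier timed-out attempt).
    """
    try:
        os.mkdir(path)
        return True
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False
        raise  # some other failure (EACCES, EIO, ...)
```

The same pattern applies to unlink (tolerate ENOENT on retry) and rename; non-idempotent operations are exactly where "did it reach the server?" uncertainty bites hardest.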
* Re: NFS hang + umount -f: better behaviour requested.
From: Valdis.Kletnieks @ 2007-08-21 23:04 UTC (permalink / raw)
To: John Stoffel; +Cc: Peter Staubach, Robin Lee Powell, linux-kernel

On Tue, 21 Aug 2007 14:50:42 EDT, John Stoffel said:
> Now maybe those issues are raised when you have a Linux NFS server
> with Solaris clients.  But in my book, reliable NFS servers are key,
> and if they are reliable, 'soft,intr' works just fine.

And you don't need all that ext3 journal overhead if your disk drives
are reliable too.  Gotcha.  :)
* Re: NFS hang + umount -f: better behaviour requested.
From: Theodore Tso @ 2007-08-22 10:03 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: John Stoffel, Peter Staubach, Robin Lee Powell, linux-kernel

On Tue, Aug 21, 2007 at 07:04:16PM -0400, Valdis.Kletnieks@vt.edu wrote:
> And you don't need all that ext3 journal overhead if your disk drives
> are reliable too.  Gotcha.  :)

Err, no.  The ext3 journal overhead buys you not needing to fsck after
an unclean shutdown, and safety against crap getting written to the
inode table on an unclean power hit while the disk drive is writing
and the memory goes insane before the DMA engine and disk drive stop
working from the voltage on the power supply rails.  (Hence my advice
that if you use XFS on Linux, make *sure* you have a UPS; on machines
such as the SGI Indy they added bigger capacitors to the PSU and a
real power fail interrupt, but PC-class hardware is
inexpensive/crappy, so it doesn't have such niceties.)

						- Ted
* Re: NFS hang + umount -f: better behaviour requested.
From: John Stoffel @ 2007-08-22 15:26 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: John Stoffel, Peter Staubach, Robin Lee Powell, linux-kernel

>>>>> "Valdis" == Valdis Kletnieks <Valdis.Kletnieks@vt.edu> writes:

Valdis> And you don't need all that ext3 journal overhead if your disk
Valdis> drives are reliable too.  Gotcha.  :)

Yeah yeah... you got me.  *grin*  In a way.

How to say this.  NFS is like ext2 in some ways: no real protection
from errors unless you turn on possibly performance-killing aspects of
the code.  Ext3 takes it to a higher level of consistency without
compromising as much on the performance.  RAID can be the base of both
of these things, and that helps a lot -- if your RAID is reliable.

So, my NetApps are reliable because they have NVRAM for performance,
and it's battery backed for reliability.  On that they build the
volume and filesystem stuff, which also has performance and
reliability built in.  On top of this, they have NFS (or CIFS or other
protocols, but I use only NFS).

And we actually default to "proto=tcp,soft,intr" for all our mounts.
We do this for performance, because we're confident of the underlying
reliability of the layers below it, all the way down to the network
switches in a way.  Though I admit we don't dual-path everything since
we don't have enough need for that level of reliability.

So that's where I'm coming from.

Now, I'd be happy to be proven wrong, but I'd like to see people
provide test scripts which can be run on a client to simulate failures
and such, so I can run them here in my environment as a test.  Maybe
I'll change my mind.  Maybe I won't.  At least we've got choice.  :]

John
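The failure injection John asks for is roughly what Robin described doing at the top of the thread with iptables. A rough sketch of that approach (assumes root, a test NFS mount, and iptables; the server address and mount point below are placeholders, and this should only be run on a disposable test client):

```shell
#!/bin/sh
# Simulate an unreachable NFS server by black-holing traffic to it,
# then observe client behaviour and restore connectivity.
# NFS_SERVER and MNT are hypothetical -- adjust for your environment.
NFS_SERVER=192.0.2.10
MNT=/mnt/nfs-test

iptables -I OUTPUT -d "$NFS_SERVER" -j DROP   # start the "outage"

ls "$MNT" &          # on a hard mount this blocks; on soft it
LS_PID=$!            # eventually fails with an I/O error

sleep 120            # let RPC retransmissions and timeouts accumulate

iptables -D OUTPUT -d "$NFS_SERVER" -j DROP   # end the "outage"
wait "$LS_PID"       # on a hard mount, the ls should now complete
```

Dropping packets this way models a hang (no RST/ICMP errors), which is the case where hard and soft mounts diverge most sharply; rebooting the server or stopping nfsd exercises different recovery paths.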
* Re: NFS hang + umount -f: better behaviour requested.
From: Ian Kent @ 2007-08-31 8:06 UTC (permalink / raw)
To: John Stoffel; +Cc: Peter Staubach, Robin Lee Powell, linux-kernel

On Tue, 21 Aug 2007, John Stoffel wrote:

Robin> I'm bringing this up again (I know it's been mentioned here
Robin> before) because I had been told that NFS support had gotten
Robin> better in Linux recently, so I have been (for my $dayjob)
Robin> testing the behaviour of NFS (autofs NFS, specifically) under
Robin> Linux with hard,intr and using iptables to simulate a hang.

>> So why are you mounting with hard,intr semantics?  At my current
>> SysAdmin job, we mount everything (solaris included) with 'soft,intr'
>> and it works well.  If an NFS server goes down, clients don't hang
>> for large periods of time.

Peter> Wow!  That's _really_ a bad idea.  NFS READ operations which
Peter> timeout can lead to executables which mysteriously fail, file
Peter> corruption, etc.  NFS WRITE operations which fail may or may
Peter> not lead to file corruption.

Peter> Anything writable should _always_ be mounted "hard" for safety
Peter> purposes.  Readonly mounted file systems _may_ be mounted
Peter> "soft", depending upon what is located on them.

> Not in my experience.  We use NetApps as our backing NFS servers, so
> maybe my experience isn't totally relevant.  But with a mix of Linux
> and Solaris clients, we've never had problems with soft,intr on our
> NFS clients.

So, there's a power outage and the UPS had a glitch.  Oops, you've got
to recover multiple TB and tell users everything since the last
incremental backup is gone.

Or: you use UPS in the computer room but management, in its
cost-cutting wisdom, hasn't provided UPS for your Unix workstations,
and there's a power outage.  Oops, you've got lots of corrupt files
but you don't know which ones they are, so you've got to recover
multiple TB and tell users everything since the last incremental
backup is gone.

Ok, so hard mounting may not always save you in these circumstances,
but soft mounting will surely get you in the neck.

Ian
* Re: NFS hang + umount -f: better behaviour requested.
From: Valdis.Kletnieks @ 2007-08-31 15:10 UTC (permalink / raw)
To: Ian Kent; +Cc: John Stoffel, Peter Staubach, Robin Lee Powell, linux-kernel

On Fri, 31 Aug 2007 16:06:36 +0800, Ian Kent said:
> So, there's a power outage and the UPS had a glitch.

Murphy can get a *lot* more creative than that.

We'd outgrown the capacity of our UPS and diesel generator and decided
to replace them, so we scheduled downtime for a Saturday.  Rather
scary: we had a Sun E10K that had been powered up for several years,
and just as expected, a good fraction of its 400+ drives failed to
spin back up.  While recovering from that, we discovered that although
the vast majority of the 400 drives were either mirrors or raidsets,
due to a config error the boot volume wasn't mirrored (fortunately, it
spun up OK, so we dodged the bullet), so we fixed that.

Literally the next Friday, not even a week later, a contractor
relocating a door into our machine room shorted out a sensor circuit
in our fire suppression system, triggering a Halon dump.  Of course,
no amount of UPS and diesel was going to save us now, because there
was a safety interlock that killed the power feeds if the Halon
dumped.  This time, since they'd all been stressed just a week before,
only 2 of the 400+ disks on the E10K failed to spin up.

Guess which two.  ;)
* Re: NFS hang + umount -f: better behaviour requested.
From: Ian Kent @ 2007-08-31 15:30 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: John Stoffel, Peter Staubach, Robin Lee Powell, linux-kernel

On Fri, 2007-08-31 at 11:10 -0400, Valdis.Kletnieks@vt.edu wrote:
> Murphy can get a *lot* more creative than that.
>
> Literally the next Friday, not even a week later, a contractor
> relocating a door into our machine room shorted out a sensor circuit
> in our fire suppression system, triggering a Halon dump.  [...]  This
> time, since they'd all been stressed just a week before, only 2 of
> the 400+ disks on the E10K failed to spin up.
>
> Guess which two.  ;)

Eeeeeekkkk!!  The mirrors, of course.

Ian
end of thread, other threads: [~2007-08-31 15:30 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --

2007-08-20 22:54 NFS hang + umount -f: better behaviour requested. Robin Lee Powell
2007-08-20 23:27 ` Neil Brown
2007-08-20 23:34   ` Robin Lee Powell
2007-08-21  1:51     ` Salah Coronya
2007-08-21 16:43 ` John Stoffel
2007-08-21 16:55   ` J. Bruce Fields
2007-08-21 17:01   ` Peter Staubach
2007-08-21 17:14     ` Chakri n
2007-08-21 17:14     ` Robin Lee Powell
2007-08-21 17:18       ` Peter Staubach
2007-08-21 18:50         ` John Stoffel
2007-08-21 19:04           ` Peter Staubach
2007-08-21 19:25           ` J. Bruce Fields
2007-08-24 15:09             ` Ric Wheeler
2007-08-24 15:37               ` Peter Staubach
2007-08-24 15:53               ` J. Bruce Fields
2007-08-21 23:04           ` Valdis.Kletnieks
2007-08-22 10:03             ` Theodore Tso
2007-08-22 15:26             ` John Stoffel
2007-08-31  8:06           ` Ian Kent
2007-08-31 15:10             ` Valdis.Kletnieks
2007-08-31 15:30               ` Ian Kent