* Re: network storage solutions
[not found] <1053018023.2883.168.camel@protein.scalableinformatics.com>
@ 2003-05-15 17:50 ` Jeff Layton
2003-05-15 18:19 ` Brian Pawlowski
2003-05-15 18:56 ` Joe Landman
0 siblings, 2 replies; 9+ messages in thread
From: Jeff Layton @ 2003-05-15 17:50 UTC (permalink / raw)
To: Beowulf; +Cc: Joseph Landman, nfs

Joseph Landman wrote:

> On Thu, 2003-05-15 at 12:24, Jeff Layton wrote:
> > Jeff Layton wrote:
> >
> > > Joe Landman wrote:
> > >
> > >> Note: the soft vs hard mount is a matter of "religion" to some
> > >> folk. I usually specify
> > >>
> > >
> > > I don't think it's really a religion. From what I've read,
> > > the NFS gurus say that you have to use hard mounts to
> > > guarantee data integrity (which I'm sure everyone wants
> > > for a rw mounted filesystem). Here is one reference:
> > >
> > > http://www.netapp.com/tech_library/3183.html#3.
>
> I still maintain it is a religious preference. Hard mounts can and will
> crash client machines in the event of a server being permanently down.
> Some folks want that behavior. Some do not. This is also a religious
> war.
>

I'm cc-ing the NFS mailing list to get their input on this.
However, let me say that I don't really view it as a religious
preference. If I lose my server in a cluster, I don't mind
losing the nodes (however, we've lost the NFS server
before and never lost any of the nodes on a 288 node cluster
even though they are hard mounted - strange).
Since we use our cluster for production work (please, I'm
not trying to offend anyone), we HAVE to have non-corrupted
data. This is why we use hard mounts with 'sync' as well as
a few other options. The URL above to Chuck's paper has
several examples of "good" mount options.

> Amazing how many of them occur.
>
> The way I and others who use soft mounts view it, data lossage occurs
> when the server crashes, as you cannot guarantee (except with sync),
> that the data was committed to disk.
>

However, if I read Chuck's paper correctly, with a soft mount
you can get a soft time-out that can interrupt an operation,
but the client will then continue with corrupted data. Am I
understanding this correctly? If so, the clients may be
up, but the data is corrupt and the application doesn't
know it.

> Worse, if you are using a
> journaling fs on the NFS server side, recovering the fs may
> require a roll-back of the fs state. This would crash a transaction in
> progress on the client with a hard mount and sync, and in a number of
> cases, crash the kernel. With a soft mount, and sync, you would get an
> error. Please note that this is a highly oversimplified version of what
> really happens, and some may disagree with the statements. Refer to the
> source to see what happens. Won't be reproduced here.
>
> Which one is more relevant to you is more a matter of preference than of
> data security. If your server crashes, you are going to lose
> transactions in flight, written but not committed. How the client
> responds to those is a matter of preference. This is where the
> religious aspect crops up.
>

I'm not sure... If the server crashes, I think this is true.
But what if you get an interrupt? Soft mounts will allow
the application to continue with corrupted data while hard
mounts will produce an error, but not corrupt data (I think).

Jeff

> [...]
> > >> as options on my mounts. I prefer the soft mount for a number of
> > >> reasons, most notably stability of the whole cluster is not a
> > >> function of the least stable server.
>
> This really opens up some of the points of how to handle errors in the
> cluster shared file system.
>
> --
> Joseph Landman <landman@scalableinformatics.com>
>

--
Jeff Layton
Senior Engineer - Aerodynamics and CFD
Lockheed-Martin Aeronautical Company - Marietta

"Is it possible to overclock a cattle prod?"
- Irv Mullins ------------------------------------------------------- Enterprise Linux Forum Conference & Expo, June 4-6, 2003, Santa Clara The only event dedicated to issues related to Linux enterprise solutions www.enterpriselinuxforum.com _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Re: network storage solutions
2003-05-15 17:50 ` network storage solutions Jeff Layton
@ 2003-05-15 18:19 ` Brian Pawlowski
2003-05-16 6:07 ` Brian Pawlowski
2003-05-15 18:56 ` Joe Landman
1 sibling, 1 reply; 9+ messages in thread
From: Brian Pawlowski @ 2003-05-15 18:19 UTC (permalink / raw)
To: jeffrey.b.layton; +Cc: beowulf, landman, nfs

It's not religious. It's simple.

NFS servers (that I use) commit their data to persistent storage
before replying to the client. This protects against simple data loss
in face of server reboots. If they didn't do this, I could get silent
data loss or corruptions of data that my application may not be
aware of or recover from.

That's expected behaviour from hard mounts on a client to a server.

Soft mounts say "Try, but after N errors of transmission give up."

People use soft mounts for (1) improved performance (you can juice
up cheap servers by caching data), or (2) prevent hung clients
in face of unreliable networks and servers (when client is accessing
many NFS servers).

At Sun, I felt in the end soft mounts were a bad idea. Better
was "intr" where at least user interaction could override
"hard" mount guarantees, and the user can make a choice
of "screw my data".

Today, though, even the reboot persistence of data is inadequate
for many critical apps. Commercial servers have RAID or mirroring,
clustered configs for eliminating single points of failure
(and hung mounts), etc.
* Re: Re: network storage solutions
2003-05-15 18:19 ` Brian Pawlowski
@ 2003-05-16 6:07 ` Brian Pawlowski
0 siblings, 0 replies; 9+ messages in thread
From: Brian Pawlowski @ 2003-05-16 6:07 UTC (permalink / raw)
To: Brian Pawlowski; +Cc: jeffrey.b.layton, beowulf, landman, nfs

> People use soft mounts for (1) improved performance (you can juice
> up cheap servers by caching data), or (2) prevent hung clients
> in face of unreliable networks and servers (when client is accessing
> many NFS servers).

Skip (1) - that is (a)sync on server (not soft mounts).

I should think before I type:-)

So, Windows CIFS has dramatic "soft mount" like behaviour (some popup
like "Delayed writes lost" on session disconnect). Always a pain - makes
me want hard mount NFS behaviour.
* Re: network storage solutions
2003-05-15 17:50 ` network storage solutions Jeff Layton
2003-05-15 18:19 ` Brian Pawlowski
@ 2003-05-15 18:56 ` Joe Landman
2003-05-15 22:01 ` Trent Piepho
1 sibling, 1 reply; 9+ messages in thread
From: Joe Landman @ 2003-05-15 18:56 UTC (permalink / raw)
To: jeffrey.b.layton; +Cc: Beowulf, nfs

On Thu, 2003-05-15 at 13:50, Jeff Layton wrote:

> Since we use our cluster for production work (please, I'm
> not trying to offend anyone), we HAVE to have non-corrupted
> data. This is why we use hard mounts with 'sync' as well as
> a few other options. The URL above to Chuck's paper has
> several examples of "good" mount options.

Hmmm. I am reasonably sure that when the IO system returns an error, it
does in fact get propagated to the appropriate user-land calling
program. The program then makes the determination as to whether or not
to continue. There are quite a few programs that rarely inspect return
codes from file operations. If you really require uncorrupted data, then
you are probably using synchronous/unbuffered file writes anyway
(the O_SYNC, and possibly O_DIRECT, options, though from reading the
notes around Trond's patches, NFS has experimental support for O_DIRECT).

> > The way I and other who use soft mounts view it, data lossage occurs
> > when the server crashes, as you cannot guarantee (except with sync),
> > that the data was committed to disk.
> >
>
> However, if I read Chuck's paper correctly, with soft mount
> you can get a soft time-out that can interrupt an operation
> but the client will continue then with corrupted data. Am I
> understanding this correctly? Therefore, the clients may be
> up, but now the data is corrupt and the application doesn't
> know it.

I would like to know that as well. I would like to believe it will not
continue with corrupt data, but return an error code/condition which
should be handled.

[...]

> I'm not sure... If the server crashes, I think this is true.
> But what if you get an interrupt. Soft mounts will allow
> the application to continue with corrupted data while hard
> mounts will produce an error, but not corrupt data (I think).

I hope not. The programs that I send an INTR to on an NFS system (with
the intr flag allowed) seem to accept the signal and die. I guess the
question here is: what should be the state of the filesystem upon
acceptance of that signal? Can you assume it is in a known state?

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 612 4615
* Re: network storage solutions
2003-05-15 18:56 ` Joe Landman
@ 2003-05-15 22:01 ` Trent Piepho
0 siblings, 0 replies; 9+ messages in thread
From: Trent Piepho @ 2003-05-15 22:01 UTC (permalink / raw)
Cc: Beowulf, nfs

On 15 May 2003, Joe Landman wrote:

> > However, if I read Chuck's paper correctly, with soft mount
> > you can get a soft time-out that can interrupt an operation
> > but the client will continue then with corrupted data. Am I
> > understanding this correctly? Therefore, the clients may be
> > up, but now the data is corrupt and the application doesn't
> > know it.
>
> I would like to know that as well. I would like to believe it will not
> continue with corrupt data, but return an error code/condition which
> should be handled.

That was my experience. We had a problem with soft NFS timing out during
huge IO loads to large raid arrays. With a large server side cache getting
flushed, some NFS requests could take several tens of seconds before the
server got around to processing them. The soft NFS timeout limit turned
out to be quite small.

When this happened, there was both a message to syslog from the kernel
about nfs timeout exceeded, and the application returned an error (read:
I/O error or something of that nature). I can see how a poorly coded
(though not uncommon) program that doesn't check the return value of read
and write calls would not detect the failure.

I raised the timeout to a more reasonable value, and have had no problems
since.
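To get a rough feel for how "small" a soft-mount limit can be, here is a minimal Python sketch of the arithmetic. Everything in it is an assumption rather than something stated in the thread: it assumes the first wait is `timeo` (in tenths of a second, taken here as 7), that each retransmission doubles the wait up to a 60 second cap, and that the request fails once `retrans` retransmissions go unanswered. Real clients differ by kernel version and transport, so treat the numbers as illustrative only.

```python
def seconds_until_major_timeout(timeo_tenths=7, retrans=3, cap_tenths=600):
    """Return the assumed wait, in seconds, before a soft mount gives up.

    Assumed model (not taken from the thread): the initial request waits
    timeo_tenths, every retransmission doubles the wait up to cap_tenths,
    and the request fails after `retrans` retransmissions go unanswered.
    """
    total_tenths = 0
    wait_tenths = timeo_tenths
    for _ in range(retrans + 1):  # initial send plus each retransmission
        total_tenths += min(wait_tenths, cap_tenths)
        wait_tenths *= 2
    return total_tenths / 10


if __name__ == "__main__":
    # Compare the default quoted in the thread (3) with the "hardened" 15.
    for retrans in (3, 15):
        print(retrans, seconds_until_major_timeout(retrans=retrans))
```

Under this assumed model, a server that needs several tens of seconds to answer would exceed the limit with `retrans=3`, while `retrans=15` stretches the limit to minutes. It lengthens the window; it does not remove the possibility of the client giving up.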
[parent not found: <3EC2A740.9060902@cert.ucr.edu>]
[parent not found: <Pine.LNX.3.96.1030514135224.2430H-100000@Maggie.Linux-Consulting.com>]
[parent not found: <20030515070359.GB1912@greglaptop.attbi.com>]
[parent not found: <3EC3ECC6.6000802@cert.ucr.edu>]
[parent not found: <3EC40815.9040504@lanl.gov>]
* Re: network storage solutions
[not found] ` <3EC40815.9040504@lanl.gov>
@ 2003-05-15 18:12 ` Jeffrey B. Layton
2003-05-16 13:21 ` Robert G. Brown
0 siblings, 1 reply; 9+ messages in thread
From: Jeffrey B. Layton @ 2003-05-15 18:12 UTC (permalink / raw)
To: beowulf; +Cc: Josip Loncaric, nfs, Charles.Lever

Josip Loncaric wrote:

> Glen Kaukola wrote:
>
>> Greg Lindahl wrote:
>>
>>> There are soft and hard mounts, and there are interruptible mounts
>>> ("intr" -- check out "man nfs").
>>>
>>> A hard mount will never time out. If you make it interruptible, then
>>> the user can choose to ^C. This is the safe option.
>>>
>>
>> You know, I thought that's how it was supposed to work too. I do use
>> the intr option, but even with that option, when an NFS drive is down,
>> and something like a df command gets stuck, hitting ctrl-c doesn't
>> seem to do a thing. All I can ever do is just kill my xterm or
>> whatever.
>
> We've had similar problems while I was at ICASE. "Hard" mounts would
> lock up client processes (even unmount) when the NFS server went down,
> but "soft" mounts were "too soft" for some of our users. A reasonable
> solution is to "harden" your soft mounts by insisting on longer major
> timeouts, as in "retrans=15" (the default is 3).

I still think this is dangerous. With soft mounts you can
still get silent data corruption despite the longer timeouts.
Chuck, do you agree?

Jeff

> Sincerely,
> Josip
>
> P.S. Our NFS servers virtually never went down, except due to
> hardware problems or service, so indefinite retransmissions were
> highly undesirable.
* Re: network storage solutions
2003-05-15 18:12 ` Jeffrey B. Layton
@ 2003-05-16 13:21 ` Robert G. Brown
0 siblings, 0 replies; 9+ messages in thread
From: Robert G. Brown @ 2003-05-16 13:21 UTC (permalink / raw)
To: Jeffrey B. Layton; +Cc: beowulf, Josip Loncaric, nfs, Charles.Lever

On Thu, 15 May 2003, Jeffrey B. Layton wrote:

> > We've had similar problems while I was at ICASE. "Hard" mounts would
> > lock up client processes (even unmount) when the NFS server went down,
> > but "soft" mounts were "too soft" for some of our users. A reasonable
> > solution is to "harden" your soft mounts by insisting on longer major
> > timeouts, as in "retrans=15" (the default is 3).
>
> I still think this is dangerous. With soft mounts you can
> still get silent data corruption despite the longer timeouts.
> Chuck, do you agree?
>
> Jeff

Perhaps it is a question of probability, and what people are willing to
accept in terms of data loss in a given environment. It is a
cost-benefit equation, as always, so acceptable solutions do have to at
least examine the cost of a corrupted file against other costs
associated with using hard mounts everywhere.

In one somewhat jaded view, one says "crashes happen, and if a crash
occurs in the middle of a file write there is a distinct chance of
losing that file". This is (I suspect) true anyway for both hard and
soft mounts, depending on the cause of the crash and what has to be done
to fix it. If the exported filesystem is left in an inconsistent state
post-crash and is modified before being re-exported to the clients, they
are likely to see a stale mount and not be able to complete the ongoing
write transaction.

Nowadays (within Linux) it is indeed pretty rare, as Greg noted, for a
client not to recover gracefully from a server crash and reboot,
although I confess to being less lucky -- we still see stale NFS mounts
after certain crashes, and generally plan on being ABLE to reboot all
the clients in the department after any major, planned downtime of our
principal servers. There seems to be a bit of state dependence here --
"most" clients recover, but one or two sometimes seem to hang and need
either a therapeutic reboot or at least a remount to clear some
state-dependent problem.

A question for the experts out there -- does the use of a journalling
filesystem affect the probability of NFS file corruption on a soft
mount? As in, is there any interaction between the journal and the NFS
server that causes an incomplete or corrupted transaction to be
interpreted as cause for invoking some of the protections journalling
provides? I'm just curious...one would think that NFS would effectively
"journal" itself to consistently end up in a "reliable" state (which
might well cost one the latest writes to the file!) even on a soft
mount.

The probability and cost-benefit issues are often related to LAN
architecture. In a common architecture, one has a single (or perhaps
2-3) "major server(s)" that have lots of capacity in all dimensions.
This is where users manipulate "critical data" (e.g. home directories,
project directories), and one EXPECTS the LAN to effectively go down
when these servers are down so the mounts should definitely be hard
mounts (although they might well be automounts, so your system isn't
hung if YOUR home directory server stays up). To protect against
anomalous amounts of downtime (which DEFINITELY costs one work at a
fixed rate, compared to the stochastic expectation of loss in the case
of possible data corruption) one makes the servers as reliable as
possible -- they are architected "not to go down" and have things like
four-hour service or hot mirror spares.

In a few cases, as in Greg's example, lots of people with desktop
workstations export workspace and crossmount it all over the place.
Then the issue becomes one of cumulative stability of the workstation
space. Because of the nasty behaviors of e.g. stat, it is quite common
for a system or at least a session to effectively hang when ANY of its
hard mounts go down -- perhaps not to crash, and to recover gracefully
when the offending server comes back up, but in the meantime you can
lose access to your workstation and ability to do work -- a real cost,
potentially multiplied by N on a big network where NOBODY gets to do
work until the workstation is back up. The downed workstation might NOT
be so reliable and might NOT have hot and cold running service and might
stay down a day or more, and a decision might well be made to take
draconian measures to free up the (not really) "hung" clients. This
also can cost work, e.g. work in progress, where a user may have to
choose between not being able to work interactively while their
background task completes or e.g. killing the background task with a
reboot to come back up without the hung mount.

Obviously the "best" solution to this sort of situation is to not put
NFS exports on your path where you can avoid it, and to use the
automounter to effectively reduce the number of mounts that can hang on
a general path stat or df to the unavoidable main (hopefully reliable)
filesystems plus those exported spaces belonging to your buddies that
you actually are using NOW.

However, in a small/informal LAN (like a home network), where the
workstations that are providing the mounts aren't horribly overloaded at
either the network or CPU or memory level (so they aren't at all likely
to timeout on an NFS request) and where the admin either doesn't want to
figure out the automounter or just isn't that concerned about the (low)
probability of data corruption, one might choose as a "quick and dirty"
solution to use a soft mount and bet that data corruption never occurs.

Back a LONG time ago when NFS recovery on a hard mount was basically
nonexistent (e.g. SunOS, Irix, etc.) I sometimes used soft mounts on
crossmounted workstation spaces and our (much slower, much less powerful
client "servers") LAN never knowingly had a problem with data actually
being corrupted -- although files were sometimes lost, leaving one of
those lovely .nfs323112114 tags -- so I'd guess that the >>probability<<
of silent corruption is actually pretty low.

On the other hand, even a soft mount was never really all that
recoverable either -- NFS just plain had a way of stubbornly hanging
whenever a server went down, no matter what. It proved smarter, more
cost-beneficial, and more professional in the long run (in a production
LAN environment, with real costs associated with EVERYTHING) to
consolidate exported space, including e.g. project space, into a very
few, very reliable servers, period, and just not LET "everybody mount
everybody else", soft OR hard. Um, so to speak;-) I AM talking about
networking here, after all...

In summary, while soft mounts exist(ed) "for a reason", they never in
the past and still don't work terribly well or reliably, and the reasons
themselves for using them have mostly passed on. There are better ways
to cope with the cost/benefit dilemma between de facto hung workstations
and possibly lost/corrupt data.

The vast improvements in automounters (which back in those same old days
sucked incredibly and were as likely to produce problems as to solve
them:-) make automounters with hard mounts, from a few, reliable,
consolidated servers, by far the preferable solution to the problem.
The fact is that most users of single user workstations are most
unlikely to have more than one or two automount directories mounted at
any one time (within the mount timeout window) simply because they will
typically be "working" at one path location or perhaps two at a time.

Server consolidation also makes it MUCH easier to back things up,
another "chronic" problem in cowboy networks where there otherwise would
be project directories on fifty workstations, most of them with
relatively unreliable IDE disks, every one being used by somebody that
would whine or bluster and threaten if their data went away upon the
crash of their cheap, three year old disk.

Then there are the "control and security issues" -- NFS is a bleeding
wound as far as security is concerned anyway (or at least has been
historically) and all those crossmounts on private workstations offer a
cracker or evil employee numerous opportunities to be naughty. True,
one generally keeps a sucker rod handy to school the latter, but
cleaning it afterwards is such a mess. Workstations just aren't
architected (without effort and additional expense) to be good, secure,
reliable servers in a LAN serving hundreds of clients.

The "best" solution for "most" LAN architectures is thus to automount
basically everything but the home directories or other "critical"
filesystems mounted from a few reliable servers -- maybe even automount
the home directories (if you have more than one home server, e.g.)!
That way a desktop client system doesn't (generally) "hang"
(recoverably, but hang nonetheless as far as the user is concerned) or
become otherwise difficult to work with if a non-critical or currently
unused server dies -- at most one might lose a tty window when one tries
to access an automount, but if one keeps the automounts off of one's
path then the path stat won't hang (almost) every shell transaction.

Administrative control is concentrated in a relatively few points of
failure and systems to secure and back up. You get data reliability and
protection against loss of work time and access at the same time.

rgb

--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu
* RE: network storage solutions
@ 2003-05-16 16:16 Lever, Charles
0 siblings, 0 replies; 9+ messages in thread
From: Lever, Charles @ 2003-05-16 16:16 UTC (permalink / raw)
To: Jeffrey B. Layton, beowulf; +Cc: Josip Loncaric, nfs
> > We've had similar problems while I was at ICASE. "Hard" mounts would
> > lock up client processes (even unmount) when the NFS server went down,
> > but "soft" mounts were "too soft" for some of our users. A reasonable
> > solution is to "harden" your soft mounts by insisting on longer major
> > timeouts, as in "retrans=15" (the default is 3).
>
>
> I still think this is dangerous. With soft mounts you can
> still get silent data corruption despite the longer timeouts.
> Chuck, do you agree?
yes. there is always a probability of corruption if there is
the possibility that the client will give up before the
operation has completed.
you can reduce that probability by following the suggestions
i posted yesterday.
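
Whatever the mount options, the application can at least avoid ignoring an error that the client does report. The Python sketch below is illustrative only: `write_checked` is a hypothetical helper name, and the assumption that a soft-mount timeout surfaces as an `OSError` from `write`, `fsync`, or `close` depends on the client implementation. It does not remove the window described above, it only makes a reported failure visible.

```python
import os


def write_checked(path, data):
    """Write bytes to path and raise OSError if any step reports a failure."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may be a short write
            view = view[written:]
        os.fsync(fd)  # deferred write errors can surface here
    finally:
        os.close(fd)  # close can also report an error, which propagates
```

A caller would wrap `write_checked(...)` in its own `try`/`except OSError` and decide whether to retry, abort, or flag the output as suspect.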
* RE: network storage solutions
@ 2003-05-16 16:23 Lever, Charles
0 siblings, 0 replies; 9+ messages in thread
From: Lever, Charles @ 2003-05-16 16:23 UTC (permalink / raw)
To: Robert G. Brown, Jeffrey B. Layton; +Cc: beowulf, Josip Loncaric, nfs
> -----Original Message-----
> From: Robert G. Brown [mailto:rgb@phy.duke.edu]
> Sent: Friday, May 16, 2003 9:22 AM
> To: Jeffrey B. Layton
> Cc: beowulf@beowulf.org; Josip Loncaric; nfs@lists.sourceforge.net;
> Lever, Charles
> Subject: Re: network storage solutions
>
> Perhaps it is a question of probability, and what people are willing to
> accept in terms of data loss in a given environment. It is a
> cost-benefit equation, as always, so acceptable solutions do have to at
> least examine the cost of a corrupted file against other costs
> associated with using hard mounts everywhere.
right. we're dealing with probabilities here. there is always
a non-zero probability of data corruption, even with local
file systems.
using soft mounts increases the probability of silent data
corruption. if you can live with that, or you have solid
recovery mechanisms, then soft is a reasonable choice.
but it's best to be informed about this choice, rather than
just stabbing at using soft mounts because it makes other
problems go away.
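
One possible shape for a "recovery mechanism" on the application side is to repeat a whole operation when an I/O error is reported. This Python sketch is only an illustration under stated assumptions: `run_with_retry` is a hypothetical helper, the failure is assumed to arrive as an `OSError`, and the operation is assumed to be safe to redo from scratch. It does nothing for corruption that is never reported.

```python
import time


def run_with_retry(operation, attempts=3, delay_seconds=1.0):
    """Call operation() until it succeeds or the attempts are exhausted.

    operation must be safe to repeat from scratch (for example, rewriting
    a whole output file), because a failed attempt may leave partial data.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds * attempt)  # simple linear backoff
    raise last_error
```

The helper re-raises the last error once the attempts run out, so a persistent server outage still ends in a visible failure rather than a silent one.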