* RE: Re: NFS as a Cluster File System.
@ 2003-01-10 14:51 Lever, Charles
2003-01-10 15:23 ` Brian Tinsley
0 siblings, 1 reply; 19+ messages in thread
From: Lever, Charles @ 2003-01-10 14:51 UTC (permalink / raw)
To: 'Brian Tinsley', Lorn Kay; +Cc: nfs, linux-ha
> -----Original Message-----
> From: Brian Tinsley [mailto:btinsley@emageon.com]
> Sent: Thursday, January 09, 2003 4:11 PM
> To: Lorn Kay
> Cc: nfs@lists.sourceforge.net; linux-ha@muc.de
> Subject: [NFS] Re: NFS as a Cluster File System.
>
> > Linux clients can use TCP instead of UDP.
>
> Although I haven't had problems with this in our lab, I
> believe the NFS authors still consider this experimental.
the Linux NFS client support for TCP is not experimental.
perhaps less mature than UDP, but definitely not experimental.
the server, OTOH, sports brand new support for TCP. is that
what you were referring to?
-------------------------------------------------------
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
_______________________________________________
NFS maillist - NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-10 14:51 Re: NFS as a Cluster File System Lever, Charles
@ 2003-01-10 15:23 ` Brian Tinsley
0 siblings, 0 replies; 19+ messages in thread
From: Brian Tinsley @ 2003-01-10 15:23 UTC (permalink / raw)
To: Lever, Charles; +Cc: Lorn Kay, nfs, linux-ha
Lever, Charles wrote:
>>-----Original Message-----
>>From: Brian Tinsley [mailto:btinsley@emageon.com]
>>Sent: Thursday, January 09, 2003 4:11 PM
>>To: Lorn Kay
>>Cc: nfs@lists.sourceforge.net; linux-ha@muc.de
>>Subject: [NFS] Re: NFS as a Cluster File System.
>>
>>> Linux clients can use TCP instead of UDP.
>>>
>>>
>>Although I haven't had problems with this in our lab, I
>>believe the NFS authors still consider this experimental.
>>
>>
>
>the Linux NFS client support for TCP is not experimental.
>perhaps less mature than UDP, but definitely not experimental.
>
>the server, OTOH, sports brand new support for TCP. is that
>what you were referring to?
>
>
Yes, sorry to confuse the issue.
--
-[========================]-
-[ Brian Tinsley ]-
-[ Chief Systems Engineer ]-
-[ Emageon ]-
-[========================]-
^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: Re: NFS as a Cluster File System.
@ 2003-01-14 16:01 Lever, Charles
0 siblings, 0 replies; 19+ messages in thread
From: Lever, Charles @ 2003-01-14 16:01 UTC (permalink / raw)
To: 'Neil Brown', Alan Robertson; +Cc: Lorn Kay, nfs, linux-ha
> On Thursday January 9, alanr@unix.sh wrote:
> >
> > NFS V3 and before have problems with "cache coherency".
> That is, the
> > different nodes in the cluster are not guaranteed to see
> the same contents.
> >
> > I think this is supposed to be fixed in v4.
> >
>
> NFSv4 does not try to "fix" this. It makes no attempts at
> "cache coherency" beyond what NFSv2/3 provide which is "close
> to open" coherence, meaning that if only one process has a
> file open at a time, then everything will appear coherent,
> and if multiple processes have the file open at the same
> time, they need to use record locking.
well, coherency is partially addressed in NFSv4 with delegations.
a server can delegate a file to a client, allowing the client
to cache the file and trust that the server will notify it when
another client wants to access the file (read or write).
for an aggressively shared file, this doesn't perform well, but
NFS has always assumed that there is little concurrent sharing
of files.
this paradigm probably doesn't fit well with typical file
usage in clusters, where files are very very large, and many
nodes may be working on independent pieces of the same file
at the same time. in that case, record locking might be
best. however, on Linux, the client purges the entire file
from its cache when a file is locked, rather than just the
areas that were byte-range locked.
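
To illustrate the record-locking approach mentioned above for nodes working on independent pieces of one large file, here is a minimal Python sketch. It is not from the original message: the names `locked_region` and `update_piece` are hypothetical, and it assumes that POSIX byte-range locks taken with `fcntl.lockf` are forwarded to the NFS server by the client's lock manager.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked_region(fd, offset, length):
    """Hold an exclusive POSIX byte-range lock on [offset, offset + length)."""
    fcntl.lockf(fd, fcntl.LOCK_EX, length, offset, os.SEEK_SET)
    try:
        yield
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN, length, offset, os.SEEK_SET)


def update_piece(path, offset, data):
    """Overwrite one node's piece of a shared file under a byte-range lock."""
    fd = os.open(path, os.O_RDWR)
    try:
        with locked_region(fd, offset, len(data)):
            os.pwrite(fd, data, offset)
            os.fsync(fd)  # flush the piece before the lock is released
    finally:
        os.close(fd)
```

Each node would call `update_piece` with its own offset, so locks on disjoint ranges do not block one another. As noted above, the Linux client still drops its whole cached copy of the file when a lock is taken, so this buys correctness rather than cache efficiency.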
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
@ 2003-01-10 17:19 Lorn Kay
2003-01-12 21:29 ` Trond Myklebust
0 siblings, 1 reply; 19+ messages in thread
From: Lorn Kay @ 2003-01-10 17:19 UTC (permalink / raw)
To: trond.myklebust; +Cc: alanr, nfs, linux-ha
>
> >> NFS V3 and before have problems with "cache coherency". That
> >> is, the different nodes in the cluster are not guaranteed to
> >> see the same contents.
>
> > I think this can be resolved with the "noac" mount option
> > (prior to V4).
>
>Nope. It can only be resolved using file locking.
>
>Cheers,
> Trond
>
Meaning if you don't lock a file and just read it you may not see what
another client has written to it, or is that not an issue because the other
client will have locked and then unlocked the file when it is done making
changes?
Thanks,
--K
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-10 17:19 Lorn Kay
@ 2003-01-12 21:29 ` Trond Myklebust
0 siblings, 0 replies; 19+ messages in thread
From: Trond Myklebust @ 2003-01-12 21:29 UTC (permalink / raw)
To: Lorn Kay; +Cc: alanr, nfs
>>>>> " " == Lorn Kay <lorn_kay@hotmail.com> writes:
> Meaning if you don't lock a file and just read it you may not
> see what another client has written to it, or is that not
> an issue because the other client will have locked and then
> unlocked the file when it is done making changes?
Meaning that unless *all* clients lock the file prior to reading or
writing from it (and then unlock it when done), there will be
caching inconsistencies.
Cheers,
Trond
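
A minimal sketch of the "every client locks around every access" discipline described above, written in Python. It is not part of the original message: `locked_read` and `locked_write` are hypothetical helper names, and the sketch assumes every reader and writer on every node goes through these helpers, since a single client that skips the lock reintroduces the inconsistency.

```python
import fcntl
import os


def locked_read(path):
    """Read a whole file while holding a shared (read) lock on it."""
    with open(path, "rb") as f:
        fcntl.lockf(f, fcntl.LOCK_SH)   # blocks while a writer holds the lock
        try:
            return f.read()             # first read happens after the lock is granted
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


def locked_write(path, data):
    """Replace a file's contents while holding an exclusive (write) lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)   # blocks while any reader or writer holds it
        try:
            f.truncate(0)
            f.write(data)
            f.flush()
            os.fsync(f.fileno())        # push data out before unlocking
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
```

The shared lock lets several readers proceed together while still excluding a writer; whether that is worth the extra lock traffic depends on how often the file is actually shared.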
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: NFS as a Cluster File System.
@ 2003-01-09 23:13 Lorn Kay
2003-01-09 23:45 ` Donavan Pantke
0 siblings, 1 reply; 19+ messages in thread
From: Lorn Kay @ 2003-01-09 23:13 UTC (permalink / raw)
To: lmb, nfs, linux-ha
>However, it is NOT a "CFS", which people commonly use to refer to a
>filesystem
>which is distributed and usually shares the same storage system connected
>to
>all nodes.
>
>I believe there might be a confusion of words here ;-)
>
>
>Sincerely,
> Lars Marowsky-Brée <lmb@suse.de>
Sorry, still confused about what a "CFS" really is. In "In Search Of
Clusters" Gregory Pfister takes the position that a distributed file system
is what he calls a valid "single system image" file system, what I would
take to mean a cluster file system (though he doesn't use those words).
I guess you are saying a clustered file system isn't necessarily supporting
a cluster of application servers but is itself stored on a cluster. (A
single server can be the only server using a cluster file system.) ?
--K
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-09 23:13 Lorn Kay
@ 2003-01-09 23:45 ` Donavan Pantke
0 siblings, 0 replies; 19+ messages in thread
From: Donavan Pantke @ 2003-01-09 23:45 UTC (permalink / raw)
To: Lorn Kay, lmb, nfs, linux-ha
On Thursday 09 January 2003 18:13, Lorn Kay wrote:
>
> Sorry, still confused about what a "CFS" really is. In "In Search Of
> Clusters" Gregory Pfister takes the position that a distributed file system
> is what he calls a valid "single system image" file system, what I would
> take to mean a cluster file system (though he doesn't use those words).
>
> I guess you are saying a clustered file system isn't necessarily supporting
> a cluster of application servers but is itself stored on a cluster. (A
> single server can be the only server using a cluster file system.) ?
>
	Typically, the term CFS refers to a set of servers that work on the same
storage at the same time. What this means is that my central storage system
could have multiple servers mounting a file system at the same time.
Currently, performing this is still in development; the difficulty is that
all nodes have to tell each other in some way about exactly what they're
doing to keep from corrupting the data on storage. Until this matures for
production use (It's my personal opinion that there are still too many bugs
in current implementations), the next best answer is a highly available
cluster of servers that handle data requests. This is where NFS comes in.
Although I agree that in some applications this isn't workable with NFS, I've
found it to be quite a boon. At my workplace, we have a great many machines
accessing common data. We started with a M$ cluster using SMB, but the nature
of the protocol means that when the cluster fails, each client can't access
currently open files. They have to close and re-open each handle. NFSv3 and
v2, by contrast, are stateless. This means that a cluster can
fail over and clients simply pause requests until the filesystem is
accessible again. Not a good solution for applications with very high response
time requirements, but it works for us. BTW: When describing clusters here
I'm referring to an active-passive pair of servers where resources move when a
failure has been detected.
	Donavan Pantke
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
@ 2003-01-09 22:51 Lorn Kay
2003-01-10 15:01 ` Trond Myklebust
0 siblings, 1 reply; 19+ messages in thread
From: Lorn Kay @ 2003-01-09 22:51 UTC (permalink / raw)
To: alanr; +Cc: nfs, linux-ha
>NFS V3 and before have problems with "cache coherency". That is, the
>different nodes in the cluster are not guaranteed to see the same contents.
>
>I think this is supposed to be fixed in v4.
>
>
>--
> Alan Robertson <alanr@unix.sh>
>
>"Openness is the foundation and preservative of friendship.... Let me
>claim from you at all times your undisguised opinions." - William
>Wilberforce
>
I think this can be resolved with the "noac" mount option (prior to V4).
--K
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-09 22:51 Lorn Kay
@ 2003-01-10 15:01 ` Trond Myklebust
2003-01-10 17:38 ` Greg Lindahl
0 siblings, 1 reply; 19+ messages in thread
From: Trond Myklebust @ 2003-01-10 15:01 UTC (permalink / raw)
To: Lorn Kay; +Cc: alanr, nfs, linux-ha
>>>>> " " == Lorn Kay <lorn_kay@hotmail.com> writes:
>> NFS V3 and before have problems with "cache coherency". That
>> is, the different nodes in the cluster are not guaranteed to
>> see the same contents.
> I think this can be resolved with the "noac" mount option
> (prior to V4).
Nope. It can only be resolved using file locking.
Cheers,
Trond
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-10 15:01 ` Trond Myklebust
@ 2003-01-10 17:38 ` Greg Lindahl
2003-01-12 21:23 ` Trond Myklebust
0 siblings, 1 reply; 19+ messages in thread
From: Greg Lindahl @ 2003-01-10 17:38 UTC (permalink / raw)
To: Trond Myklebust; +Cc: nfs, linux-ha
> > I think this can be resolved with the "noac" mount option
> > (prior to V4).
>
> Nope. It can only be resolved using file locking.
There are 2 consistency problems: metadata and data.
Metadata is solved by noac. And yes, some MPI programs do things like
"node 0 writes out a bunch of files, then tells all the other nodes to
read one file each." This means that you have about 1/100 of a second
window between creation on one client and reading on a different client.
Data consistency is solved by file locking.
-- greg
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-10 17:38 ` Greg Lindahl
@ 2003-01-12 21:23 ` Trond Myklebust
0 siblings, 0 replies; 19+ messages in thread
From: Trond Myklebust @ 2003-01-12 21:23 UTC (permalink / raw)
To: Greg Lindahl; +Cc: nfs
>>>>> " " == Greg Lindahl <lindahl@keyresearch.com> writes:
> Metadata is solved by noac.
Nope. 'noac' doesn't even do that for you:
'noac' just increases the frequency of GETATTR calls. It will not
suffice to provide consistent metadata for those operations that
actually depend on the returned values, as there will be no atomicity
guarantees between the GETATTR call and the subsequent
READ/WRITE/READDIR/...
A common example where this is relevant: You have absolutely no
guarantee with 'noac' that open("foo", O_APPEND|O_WRONLY) will work as
expected.
So, again I repeat: File locking is the only way to ensure cache
coherency.
Cheers,
Trond
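
As a hedged illustration of the append case above, the following Python sketch serializes appends with a whole-file lock instead of relying on `O_APPEND`. It is not from the original message: `locked_append` is a hypothetical helper, and the sketch assumes the client revalidates the file size once the lock has been granted, so that seeking to end-of-file under the lock finds the real end.

```python
import fcntl
import os


def locked_append(path, record):
    """Append a record at end-of-file, serialized by an exclusive lock."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)        # whole-file lock
        try:
            os.lseek(fd, 0, os.SEEK_END)      # locate EOF only after the lock is held
            view = memoryview(record)
            while view:                       # os.write may write fewer bytes than asked
                written = os.write(fd, view)
                view = view[written:]
            os.fsync(fd)                      # flush before another client can lock
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

The point of the design is that the size lookup and the write happen inside one critical section, which is the atomicity that a GETATTR followed by a separate WRITE cannot provide.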
^ permalink raw reply [flat|nested] 19+ messages in thread
* NFS as a Cluster File System.
@ 2003-01-09 19:39 Lorn Kay
2003-01-09 21:11 ` Brian Tinsley
2003-01-09 21:29 ` Alan Robertson
0 siblings, 2 replies; 19+ messages in thread
From: Lorn Kay @ 2003-01-09 19:39 UTC (permalink / raw)
To: nfs, linux-ha
Is NFS a viable CFS? (I'm cross posting this due to a discussion on the
linux-ha list recently.)
NFS has a bad reputation probably due to (at least) the following:
It has been used in networking environments where different server hardware
configurations (NICS, drivers, etc.) running different operating systems
have connected to each other (in many-to-many configurations).
It "grew up" on networks that were perhaps unstable, or immature
("Someone's kicked the token ring coax cable laying on the floor again")
long before switches were common place, and the network was loaded down with
all kinds of network traffic.
It wasn't understood very well. Since the default mount options worked,
system administrators often didn't fully understand the ramifications of
their NFS client mount option choices.
It relied on UDP, which is susceptible to huge retransmission efforts on
noisy or lossy networks.
NFS was used over many-hop WAN connections.
NFS servers were often used for many other tasks, not just NFS.
A cluster configuration, however, offers several advantages over the typical
NFS configuration:
All NFS clients (the cluster nodes) run the same operating system (Linux).
All clients run the same version of NFS and the kernel.
All clients use the same network tuned configuration.
A physical network can be dedicated to NFS. (Using a high quality switch,
with short data-center-only cable runs.)
All clients connect to one NFS server.
The NFS server is a high-quality dedicated machine (Net App, EMC, etc.)
Only one mount point need be used with one set of mount options.
Linux clients can use TCP instead of UDP.
Except for the vagaries of the load placed on the cluster nodes, this sounds
like a test lab environment. If NFS can't work in this environment where
will it ever work?
--K
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: NFS as a Cluster File System.
2003-01-09 19:39 Lorn Kay
@ 2003-01-09 21:11 ` Brian Tinsley
2003-01-09 22:04 ` Brian Jackson
2003-01-09 21:29 ` Alan Robertson
1 sibling, 1 reply; 19+ messages in thread
From: Brian Tinsley @ 2003-01-09 21:11 UTC (permalink / raw)
To: Lorn Kay; +Cc: nfs, linux-ha
Lorn Kay wrote:
>
> Is NFS a viable CFS? (I'm cross posting this due to a discussion on
> the linux-ha list recently.)
Since there is not a really good cluster filesystem for Linux that is
not either "half baked" (IMHO - I'm probably going to get smacked over
that statement!) or cost an arm and a leg, this is exactly the route we
have taken.
> The NFS server is a high-quality dedicated machine (Net App, EMC,
> etc.)
We've had great success with just using SMP Linux servers. We do have
one EMC IP4700 in production, and it's a nice system, but I prefer the
Linux based alternative.
> Linux clients can use TCP instead of UDP.
Although I haven't had problems with this in our lab, I believe the NFS
authors still consider this experimental.
--
-[========================]-
-[ Brian Tinsley ]-
-[ Chief Systems Engineer ]-
-[ Emageon ]-
-[========================]-
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: NFS as a Cluster File System.
2003-01-09 21:11 ` Brian Tinsley
@ 2003-01-09 22:04 ` Brian Jackson
2003-01-09 23:02 ` Brian Tinsley
0 siblings, 1 reply; 19+ messages in thread
From: Brian Jackson @ 2003-01-09 22:04 UTC (permalink / raw)
To: nfs, linux-ha
On Thursday 09 January 2003 03:11 pm, Brian Tinsley wrote:
> Lorn Kay wrote:
> > Is NFS a viable CFS? (I'm cross posting this due to a discussion on
> > the linux-ha list recently.)
>
> Since there is not a really good cluster filesystem for Linux that is
> not either "half baked"
Hey, we're working on it ;)
--Brian Jackson
OpenGFS Project
> (IMHO - I'm probably going to get smacked over
> that statement!) or cost an arm and a leg, this is exactly the route we
> have taken.
>
> > The NFS server is a high-quality dedicated machine (Net App, EMC,
> > etc.)
>
> We've had great success with just using SMP Linux servers. We do have
> one EMC IP4700 in production, and it's a nice system, but I prefer the
> Linux based alternative.
>
> > Linux clients can use TCP instead of UDP.
>
> Although I haven't had problems with this in our lab, I believe the NFS
> authors still consider this experimental.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: NFS as a Cluster File System.
2003-01-09 19:39 Lorn Kay
2003-01-09 21:11 ` Brian Tinsley
@ 2003-01-09 21:29 ` Alan Robertson
2003-01-13 19:36 ` Neil Brown
1 sibling, 1 reply; 19+ messages in thread
From: Alan Robertson @ 2003-01-09 21:29 UTC (permalink / raw)
To: Lorn Kay; +Cc: nfs, linux-ha
Lorn Kay wrote:
>
> Is NFS a viable CFS? (I'm cross posting this due to a discussion on the
> linux-ha list recently.)
>
> NFS has a bad reputation probably due to (at least) the following:
>
> It has been used in networking environments where different server
> hardware configurations (NICS, drivers, etc.) running different
> operating systems have connected to each other (in many-to-many
> configurations).
>
> It “grew up” on networks that were perhaps unstable, or immature
> (“Someone’s kicked the token ring coax cable laying on the floor again”)
> long before switches were common place, and the network was loaded down
> with all kinds of network traffic.
>
> It wasn’t understood very well. Since the default mount options
> worked, system administrators often didn’t fully understand the
> ramifications of their NFS client mount option choices.
>
> It relied on UDP, which is susceptible to huge retransmission
> efforts on noisy or lossy networks.
>
> NFS was used over many-hop WAN connections.
>
> NFS servers were often used for many other tasks, not just NFS.
>
>
> A cluster configuration, however, offers several advantages over the
> typical NFS configuration:
>
> All NFS clients (the cluster nodes) run the same operating system
> (Linux).
>
> All clients run the same version of NFS and the kernel.
>
> All clients use the same network tuned configuration.
>
> A physical network can be dedicated to NFS. (Using a high quality
> switch, with short data-center-only cable runs.)
>
> All clients connect to one NFS server.
>
> The NFS server is a high-quality dedicated machine (Net App, EMC, etc.)
>
> Only one mount point need be used with one set of mount options.
>
> Linux clients can use TCP instead of UDP.
>
> Except for the vagaries of the load placed on the cluster nodes, this
> sounds like a test lab environment. If NFS can’t work in this
> environment where will it ever work?
NFS V3 and before have problems with "cache coherency". That is, the
different nodes in the cluster are not guaranteed to see the same contents.
I think this is supposed to be fixed in v4.
--
Alan Robertson <alanr@unix.sh>
"Openness is the foundation and preservative of friendship.... Let me claim
from you at all times your undisguised opinions." - William Wilberforce
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-09 21:29 ` Alan Robertson
@ 2003-01-13 19:36 ` Neil Brown
2003-01-13 20:25 ` David B. Ritch
2003-01-14 15:46 ` Trond Myklebust
0 siblings, 2 replies; 19+ messages in thread
From: Neil Brown @ 2003-01-13 19:36 UTC (permalink / raw)
To: Alan Robertson; +Cc: Lorn Kay, nfs, linux-ha
On Thursday January 9, alanr@unix.sh wrote:
>
> NFS V3 and before have problems with "cache coherency". That is, the
> different nodes in the cluster are not guaranteed to see the same contents.
>
> I think this is supposed to be fixed in v4.
>
NFSv4 does not try to "fix" this. It makes no attempts at "cache
coherency" beyond what NFSv2/3 provide which is "close to open"
coherence, meaning that if only one process has a file open at a
time, then everything will appear coherent, and if multiple processes
have the file open at the same time, they need to use record locking.
I really don't think total cache coherency is a sensible goal for a
network filesystem, even a cluster filesystem. It imposes lots of
extra network traffic that most of the time will be of no value.
If an application needs some degree of coherence, it should be
explicit about its needs (using open/close or locking) so that the
protocol can provide it then, and only then.
NeilBrown
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-13 19:36 ` Neil Brown
@ 2003-01-13 20:25 ` David B. Ritch
2003-01-13 20:40 ` Neil Brown
2003-01-14 15:46 ` Trond Myklebust
1 sibling, 1 reply; 19+ messages in thread
From: David B. Ritch @ 2003-01-13 20:25 UTC (permalink / raw)
To: Neil Brown; +Cc: Alan Robertson, Lorn Kay, NFS mailing list, linux-ha
I agree that cache coherency is not a sensible goal for a cluster
filesystem. However, cache coherency of metadata is rather important.
For example, when one node creates a file of intermediate data, it is
important for the other nodes to be able to see that. Using actimeo=0 is
the conventional mechanism for allowing file creation and deletion to be
propagated quickly. Usually, one can tweak that a bit to reduce the
burden on the server. However, it might be nice if there were a
mechanism to propagate this sort of metadata change without dumping all
metadata over a second or two old.
dbr
On Mon, 2003-01-13 at 14:36, Neil Brown wrote:
> On Thursday January 9, alanr@unix.sh wrote:
> >
> > NFS V3 and before have problems with "cache coherency". That is, the
> > different nodes in the cluster are not guaranteed to see the same contents.
> >
> > I think this is supposed to be fixed in v4.
> >
>
> NFSv4 does not try to "fix" this. It makes no attempts at "cache
> coherency" beyond what NFSv2/3 provide which is "close to open"
> coherence, meaning that if only one process has a file open at a
> time, then everything will appear coherent, and if multiple processes
> have the file open at the same time, they need to use record locking.
>
> I really don't think total cache coherency is a sensible goal for a
> network filesystem, even a cluster filesystem. It imposes lots of
> extra network traffic that most of the time will be of no value.
> If an application needs some degree of coherence, it should be
> explicit about its needs (using open/close or locking) so that the
> protocol can provide it then, and only then.
>
> NeilBrown
--
David B. Ritch
High Performance Technologies, Inc.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-13 20:25 ` David B. Ritch
@ 2003-01-13 20:40 ` Neil Brown
2003-01-13 20:50 ` David B. Ritch
0 siblings, 1 reply; 19+ messages in thread
From: Neil Brown @ 2003-01-13 20:40 UTC (permalink / raw)
To: David B. Ritch; +Cc: Alan Robertson, Lorn Kay, NFS mailing list, linux-ha
On January 13, dritch@hpti.com wrote:
> I agree that cache coherency is not a sensible goal for a cluster
> filesystem. However, cache coherency of metadata is rather important.
> For example, when one node creates a file of intermediate data, it is
> important for the other nodes to be able to see that. Using actimeo=0 is
> the conventional mechanism for allowing file creation and deletion to be
> propagated quickly. Usually, one can tweak that a bit to reduce the
> burden on the server. However, it might be nice if there were a
> mechanism to propagate this sort of metadata change without dumping all
> metadata over a second or two old.
If the 'other nodes' open the file and look in it, then they should
see current data (if they don't it's a bug). If they just 'stat' it
to see if it has changed then they may see an old timestamp.
I recommend opening the file. It is an explicit way for the
application to say "I really want to know the current state of this
file".
NeilBrown
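
The recommendation above (open the file rather than just stat it) can be sketched in Python as follows. This is an illustration, not part of the original message: `open_then_inspect` is a hypothetical name, and the sketch assumes the client follows the close-to-open behaviour described in this thread, where the open is the point at which it checks with the server.

```python
import os


def open_then_inspect(path):
    """Open the file first, then take attributes and data from the open file."""
    with open(path, "rb") as f:             # the open is the explicit revalidation point
        st = os.fstat(f.fileno())           # attributes of the file just opened
        return st.st_mtime, f.read()
```

A caller that only wants to know whether the file changed can compare the returned modification time with the one from its previous call, instead of polling with `os.stat(path)` and trusting a possibly cached answer.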
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Re: NFS as a Cluster File System.
2003-01-13 20:40 ` Neil Brown
@ 2003-01-13 20:50 ` David B. Ritch
2003-01-13 22:11 ` Neil Brown
0 siblings, 1 reply; 19+ messages in thread
From: David B. Ritch @ 2003-01-13 20:50 UTC (permalink / raw)
To: Neil Brown; +Cc: Alan Robertson, Lorn Kay, NFS mailing list, linux-ha
That makes sense. However, it is common practice in many shops to write
intermediate data files with some sort of serial number or timestamp in
the name, and for the next step in the process to look for data using
"ls" with a wildcard. When doing that, you don't know what the name of
the next file might be, so you can't simply open it.
While I agree that this is not the most ideal method for coordinating
processing, it is widely used and I have found a need to support it.
We've also had processes fail with a "file not found" error when trying
to read a file recently written by a process on another node. It has
always been my belief that this was a failure when a process tried to
open the file, and the local metadata cache had not yet been updated.
Just to clarify - are you saying that the open system call should have
contacted the server, even if the local cached information said that the
file (and perhaps its parent directory) did not exist?
Thanks,
dbr
On Mon, 2003-01-13 at 15:40, Neil Brown wrote:
> On January 13, dritch@hpti.com wrote:
> > I agree that cache coherency is not a sensible goal for a cluster
> > filesystem. However, cache coherency of metadata is rather important.
> > For example, when one node creates a file of intermediate data, it is
> > important for the other nodes to be able to see that. Using actimeo=0 is
> > the conventional mechanism for allowing file creation and deletion to be
> > propagated quickly. Usually, one can tweak that a bit to reduce the
> > burden on the server. However, it might be nice if there were a
> > mechanism to propagate this sort of metadata change without dumping all
> > metadata over a second or two old.
>
> If the 'other nodes' open the file and look in it, then they should
> see current data (if they don't it's a bug). If they just 'stat' it
> to see if it has changed then they may see an old timestamp.
>
> I recommend opening the file. It is an explicit way for the
> application to say "I really want to know the current state of this
> file".
>
> NeilBrown
--
David B. Ritch
High Performance Technologies, Inc.
* Re: Re: NFS as a Cluster File System.
2003-01-13 20:50 ` David B. Ritch
@ 2003-01-13 22:11 ` Neil Brown
0 siblings, 0 replies; 19+ messages in thread
From: Neil Brown @ 2003-01-13 22:11 UTC (permalink / raw)
To: David B. Ritch; +Cc: Alan Robertson, Lorn Kay, NFS mailing list, linux-ha
On January 13, dritch@hpti.com wrote:
> That makes sense. However, it is common practice in many shops to write
> intermediate data files with some sort of serial number or timestamp in
> the name, and for the next step in the process to look for data using
> "ls" with a wildcard. When doing that, you don't know what the name of
> the next file might be, so you can't simply open it.
I don't think that this should be a problem for NFS. To do the 'ls'
or to expand the wildcard you need to open the directory (and do a
readdir) and this should cause the client to check with the server.
Once you have the name the open should work.
>
> While I agree that this is not the most ideal method for coordinating
> processing, it is widely used and I have found a need to support it.
It seems reasonable to me.
>
> We've also had processes fail with a "file not found" error when trying
> to read a file recently written by a process on another node. It has
> always been my belief that this was a failure when a process tried to
> open the file, and the local metadata cache had not yet been updated.
> Just to clarify - are you saying that the open system call should have
> contacted the server, even if the local cached information said that the
> file (and perhaps its parent directory) did not exist?
Hmmm... My understanding of NFS and 'close to open' semantics is that
on 'open' the client should definitely contact the server, at least to
do a GETATTR on the file, and possibly to do a LOOKUP if there is
doubt as to the current information in the name cache.
However the Linux VFS is not very friendly to network filesystems in
this respect. The NFS client doesn't know if a given name lookup is
for an "open" or for a "stat" and so it cannot impose its subtly
different semantics.
So I can well imagine an "open" on a file that another client has just
created failing, if a recent name lookup has said that it didn't
exist.
However if you always do an opendir/readdir first, and only try to
open files that were found in the readdir, then the client should be
able to reliably detect the change to the directory and submit a new
LOOKUP request. I don't know if the Linux NFS client does this or
not.
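
The readdir-then-open pattern described above could be sketched in Python roughly as follows. The function name `read_new_files` and the `seen` set are hypothetical, and whether a given NFS client actually revalidates the directory on each scan is exactly the open question raised here.

```python
import fnmatch
import os


def read_new_files(directory, pattern, seen):
    """Read files in `directory` that match `pattern` and are not in `seen`.

    os.listdir() performs the opendir/readdir, so only names that the
    directory listing actually returned are ever passed to open().
    """
    found = {}
    for name in sorted(os.listdir(directory)):
        if name in seen or not fnmatch.fnmatch(name, pattern):
            continue
        try:
            with open(os.path.join(directory, name), "rb") as f:
                found[name] = f.read()
        except FileNotFoundError:
            # The name was listed but could not be opened (removed again,
            # or a stale lookup result); leave it for the next scan.
            continue
        seen.add(name)
    return found
```

A consumer would call this in a loop with a pattern such as `"step-*.dat"`, keeping the same `seen` set between calls so each intermediate file is processed once.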
I think it is fair to say that Linux isn't really ready for this sort
of tightly-coupled-network-filesystem thing yet. The VFS just isn't
ready. It doesn't even allow O_CREAT|O_EXCL to work over NFS even
though the NFSv3 protocol supports it.
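
Where exclusive create cannot be relied on, the usual workaround (the one described in the open(2) manual page) is to create a uniquely named file and hard-link it to the lock name. A rough Python sketch, with `acquire_lock` and `release_lock` as made-up helper names:

```python
import os
import socket


def acquire_lock(lockfile):
    """Try to take `lockfile` without relying on O_CREAT|O_EXCL.

    Returns True if this process now holds the lock, False otherwise.
    """
    # A name unique to this host and process, on the same filesystem.
    unique = "%s.%s.%d" % (lockfile, socket.gethostname(), os.getpid())
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT, 0o644)
    os.close(fd)
    try:
        try:
            os.link(unique, lockfile)
            return True
        except OSError:
            # The link may have succeeded on the server even though the
            # call reported an error; a link count of 2 means we hold it.
            return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)


def release_lock(lockfile):
    """Drop the lock by removing the lock name."""
    os.unlink(lockfile)
```

This is a sketch of the lockfile technique only; it does not handle stale locks left behind by a crashed holder.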
The implementers of Lustre have enhanced the VFS for their use. This
may get into mainline in 2.7 (too late for 2.6), or something else
might be developed.
With careful coding it should be possible to achieve any particular
result, but you really need to know exactly what functionality the NFS
client does, and does not, provide.
[[ NOTE: these replies aren't making it to linux-ha@muc.de as I am
not a subscriber. Feel free to forward them if you think that is
appropriate]]
NeilBrown
* Re: Re: NFS as a Cluster File System.
2003-01-13 19:36 ` Neil Brown
2003-01-13 20:25 ` David B. Ritch
@ 2003-01-14 15:46 ` Trond Myklebust
2003-01-14 16:01 ` Kumaran Rajaram
1 sibling, 1 reply; 19+ messages in thread
From: Trond Myklebust @ 2003-01-14 15:46 UTC (permalink / raw)
To: Neil Brown; +Cc: Alan Robertson, Lorn Kay, nfs, linux-ha
>>>>> " " == Neil Brown <neilb@cse.unsw.edu.au> writes:
> NFSv4 does not try to "fix" this. It makes no attempts at
> "cache coherency" beyond what NFSv2/3 provide which is "close
> to open" coherence, meaning that if only one process has a
> file open at a time, then everything will appear coherent, and
> if multiple processes have the file open at the same time, they
> need to use record locking.
Note, though, that in addition to supporting file locking, NFSv4 adds
support for file 'delegation' which allows the NFS client to do locking
entirely as a local operation (i.e. there is no need to contact the
server). For most applications, this will make locking a much faster
operation...
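
For reference, the record locking mentioned in the quoted text looks roughly like this from an application's point of view. This is a sketch in Python using POSIX locks via `fcntl.lockf`; the helper name `append_record` is made up, and whether the lock is handled locally or by the server is invisible at this level.

```python
import fcntl
import os


def append_record(path, record):
    """Append `record` (bytes) to `path` while holding an exclusive lock."""
    with open(path, "ab") as f:
        # Blocks until no other process holds a conflicting lock.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.write(record)
            f.flush()
            # Push the data out before another locker can look at the file.
            os.fsync(f.fileno())
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
```

The application code is the same with or without a delegation; only the cost of taking and dropping the lock would change.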
Cheers,
Trond
* Re: Re: NFS as a Cluster File System.
2003-01-14 15:46 ` Trond Myklebust
@ 2003-01-14 16:01 ` Kumaran Rajaram
2003-01-14 16:08 ` Trond Myklebust
0 siblings, 1 reply; 19+ messages in thread
From: Kumaran Rajaram @ 2003-01-14 16:01 UTC (permalink / raw)
To: Trond Myklebust; +Cc: nfs
> Note, though, that in addition to supporting file locking, NFSv4 adds
> support for file 'delegation' which allows the NFS client to do locking
> entirely as a local operation (i.e. there is no need to contact the
> server). For most applications, this will make locking a much faster
> operation...
If file-locking is made local, how do other NFS-clients get to know the
locking info. I suspect this might lead to multiple NFS-clients holding
the lock on the same file simultaneously, leading to file-inconsistency.
Please correct me if I am wrong.
Thanks,
-Kums
-- Kumaran Rajaram, Mississippi State University --
kums@cs.msstate.edu <http://www.cs.msstate.edu/~kums>
* Re: Re: NFS as a Cluster File System.
2003-01-14 16:01 ` Kumaran Rajaram
@ 2003-01-14 16:08 ` Trond Myklebust
0 siblings, 0 replies; 19+ messages in thread
From: Trond Myklebust @ 2003-01-14 16:08 UTC (permalink / raw)
To: Kumaran Rajaram; +Cc: nfs
>>>>> " " == Kumaran Rajaram <kums@CS.MsState.EDU> writes:
> If file-locking is made local, how do other NFS-clients get
> to know the
> locking info. I suspect this might lead to multiple NFS-clients
> holding the lock on the same file simultaneously, leading to
> file-inconsistency. Please correct me if I am wrong.
I suggest you read RFC3010, as this is what the file delegation takes
care of.
Delegation is a way for the server to tell the client that it is the
only current user of that particular file. If another client comes
along and opens the same file, then the server must notify the first
client that it is about to revoke the delegation, and give it a short
period of time in which to flush back all changes (including any locks
that are being held).
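
As a toy illustration of that sequence (not the protocol itself), the following Python model grants a delegation to the only opener of a file and recalls it when a second client opens the same file. Every class, method, and field name here is invented for the sketch; the real mechanism is specified in RFC 3010.

```python
class ToyClient:
    def __init__(self, name):
        self.name = name
        self.local_locks = set()  # paths locked locally under a delegation

    def recall(self, server, path):
        # Flush locally held lock state back to the server.
        if path in self.local_locks:
            self.local_locks.discard(path)
            server.locks[path] = self.name


class ToyServer:
    def __init__(self):
        self.openers = {}      # path -> list of clients with the file open
        self.delegations = {}  # path -> client holding the delegation
        self.locks = {}        # path -> name of client holding a server lock

    def open(self, client, path):
        """Open `path`; return True if `client` was given a delegation."""
        users = self.openers.setdefault(path, [])
        holder = self.delegations.get(path)
        if holder is not None and holder is not client:
            # Another client arrived: recall the delegation first.
            holder.recall(self, path)
            del self.delegations[path]
        if client not in users:
            users.append(client)
        if users == [client]:
            self.delegations[path] = client
        return self.delegations.get(path) is client

    def lock(self, client, path):
        """Return 'local', 'server', or 'denied'."""
        if self.delegations.get(path) is client:
            client.local_locks.add(path)  # no server round trip needed
            return "local"
        if self.locks.get(path) not in (None, client.name):
            return "denied"
        self.locks[path] = client.name
        return "server"
```

In this model a second client can never take a lock that the first client still holds locally, because the recall moves the lock to the server before the second open completes.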
Cheers,
Trond
Thread overview: 19+ messages
2003-01-10 14:51 Re: NFS as a Cluster File System Lever, Charles
2003-01-10 15:23 ` Brian Tinsley
-- strict thread matches above, loose matches on Subject: below --
2003-01-14 16:01 Lever, Charles
2003-01-10 17:19 Lorn Kay
2003-01-12 21:29 ` Trond Myklebust
2003-01-09 23:13 Lorn Kay
2003-01-09 23:45 ` Donavan Pantke
2003-01-09 22:51 Lorn Kay
2003-01-10 15:01 ` Trond Myklebust
2003-01-10 17:38 ` Greg Lindahl
2003-01-12 21:23 ` Trond Myklebust
2003-01-09 19:39 Lorn Kay
2003-01-09 21:11 ` Brian Tinsley
2003-01-09 22:04 ` Brian Jackson
2003-01-09 23:02 ` Brian Tinsley
2003-01-09 21:29 ` Alan Robertson
2003-01-13 19:36 ` Neil Brown
2003-01-13 20:25 ` David B. Ritch
2003-01-13 20:40 ` Neil Brown
2003-01-13 20:50 ` David B. Ritch
2003-01-13 22:11 ` Neil Brown
2003-01-14 15:46 ` Trond Myklebust
2003-01-14 16:01 ` Kumaran Rajaram
2003-01-14 16:08 ` Trond Myklebust