[Fwd: Re: NFS-HOWTO]

Linux NFS development

* [Fwd: Re: NFS-HOWTO]
@ 2002-03-19 18:18 Tavis Barr
  2002-03-19 18:47 ` Trond Myklebust
From: Tavis Barr @ 2002-03-19 18:18 UTC
  To: nfs

[-- Attachment #1: Type: text/plain, Size: 750 bytes --]

Andrew Ryan had a good question for me below that I don't know the
answer to.  When a file gets modified and left the same size twice
within one second, its mtime stays the same and all other attributes
stay the same, so the NFS server does not see that it has been altered. 
At least this is my understanding of the bug.  Some things that I don't
know because I'm not familiar enough with the NFS internals:

* What data structure reflects that the file has been altered?  Is it the
  inode number, or some field within the inode?
* This was supposedly a 2.5 fix item; the issue is that mtime does not
  have a granularity finer than one second.  What subsystem does the fix
  go into?  The VFS layer?  Has there been any work done on it?

Thanks,
Tavis

[-- Attachment #2: Forwarded message - Re: [NFS] NFS-HOWTO --]
[-- Type: message/rfc822, Size: 2304 bytes --]

From: Andrew Ryan <andrewr@collab.net>
To: Tavis Barr <tb62@columbia.edu>
Subject: Re: [NFS] NFS-HOWTO
Date: Sat, 16 Mar 2002 10:38:42 -0800
Message-ID: <3C939132.F11E3D66@collab.net>

Nice job on the FAQ. It's been helpful for us setting up our NFS clients.

In 7.10, "File Corruption When Using Multiple Clients", you state that "If a
file has been modified within one second of its previous modification and left
the same size, it will continue to generate the same inode number." I don't
understand this statement -- it seems to me that the inode number of a file
should not change when a file is modified. I can see the mtime changing, but
not the inode number.

Also, this bug is new to me -- it's not in the previous NFS HOWTO, and I think
it deserves a lot more explanation as to why it happens now and also some
ideas for workarounds, both in 2.5 (where it will presumably be solved the
right way) and before 2.5 (where perhaps there is a hack that could be
applied).

Finally, is this bug a client-side bug, or does it just affect people using
linux as an NFS server?

thanks,
andrew
p.s. It would be nice if you could get the Netapp folks to contribute a
section in the interop chapter.

Tavis Barr wrote:

> Attached is a draft of the latest NFS-HOWTO, in HTML format.  (Put it
> all in the same directory and go to index.html).  Comments welcome.
>
> Cheers,
> Tavis


* Re: [Fwd: Re: NFS-HOWTO]
  2002-03-19 18:18 [Fwd: Re: NFS-HOWTO] Tavis Barr
@ 2002-03-19 18:47 ` Trond Myklebust
  2002-03-19 19:20   ` Trond Myklebust
From: Trond Myklebust @ 2002-03-19 18:47 UTC
  To: Tavis Barr; +Cc: nfs

>>>>> " " == Tavis Barr <tb62@columbia.edu> writes:

     > Andrew Ryan had a good question for me below that I don't know
     > the answer to.  When a file gets modified and left the same
     > size twice within one second, its mtime stays the same and all
     > other attributes stay the same, so the NFS server does not see
     > that it has been altered. At least this is my understanding of
     > the bug.  Some things that I don't know because I'm not
     > familiar enough with the NFS internals:

     > *What data structure reflects that the file has been altered?

inode->i_size + inode->i_mtime    ;-)

NFSv4 has support for a new 64-bit opaque value that can be used to
tell if the file has changed (that doesn't have to be i_mtime).
For NFSv2/v3 though, file size and mtime are all we have available to
tell whether or not the file has changed.
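
To make the point above concrete, here is a minimal Python sketch of
change detection that has only the file size and a whole-second mtime
to go on, as described for NFSv2/v3. The names Attrs, looks_changed and
write_same_size are hypothetical and not taken from any NFS
implementation, and the timestamps are made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attrs:
    """Hypothetical stand-in for the two attributes that matter here."""
    size: int   # models inode->i_size
    mtime: int  # models inode->i_mtime, whole seconds only


def looks_changed(cached: Attrs, fresh: Attrs) -> bool:
    # With only size and mtime available, this comparison is all
    # that can be done to decide whether the file changed.
    return cached.size != fresh.size or cached.mtime != fresh.mtime


def write_same_size(attrs: Attrs, now: float) -> Attrs:
    # A same-size overwrite: size is unchanged, and mtime is
    # truncated to whole seconds (one-second granularity).
    return Attrs(size=attrs.size, mtime=int(now))


start = Attrs(size=4096, mtime=1016561879)

# Two same-size writes inside the same second.
after_first = write_same_size(start, 1016561880.2)
after_second = write_same_size(after_first, 1016561880.7)
missed = not looks_changed(after_first, after_second)

# A third write in the following second is visible again.
after_third = write_same_size(after_second, 1016561881.1)
detected = looks_changed(after_second, after_third)
```

In this model the second write is invisible to anyone who sampled the
attributes after the first write, while the write in the next second is
picked up as usual.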

     > Is it the inode number, or some field within the inode?  *This
     > was supposedly a 2.5 fix item; the issue is that mtime does not
     > have a granularity finer than one second.  What subsystem does
     > the fix go into?  The VFS layer?  Has there been any work done
     > on it?

Neil was talking about fixing this in 2.5.x (it is after all a server
issue). The problem is that several filesystems (most notably
ext2/ext3) don't have the space in their on-disk inodes for <1s time
resolution.
There are some ideas floating around on how to get around this, but I
do not believe that consensus has yet been achieved...

Cheers,
  Trond

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: [Fwd: Re: NFS-HOWTO]
  2002-03-19 18:47 ` Trond Myklebust
@ 2002-03-19 19:20   ` Trond Myklebust
  2002-03-19 23:06     ` Ragnar Kjørstad
From: Trond Myklebust @ 2002-03-19 19:20 UTC
  To: Tavis Barr, nfs

>>>>> " " == Trond Myklebust <trond.myklebust@fys.uio.no> writes:

    >> Is it the inode number, or some field within the inode?  *This
    >> was supposedly a 2.5 fix item; the issue is that mtime does not
    >> have a granularity finer than one second.  What subsystem does
    >> the fix go into?  The VFS layer?  Has there been any work done
    >> on it?

     > Neil was talking about fixing this in 2.5.x (it is after all a
     > server issue). The problem is that several filesystems
     > (most notably ext2/ext3) don't have the space in their
     > on-disk inodes for <1s time resolution.  There are some ideas
     > floating around on how to get around this, but I do not believe
     > that consensus has yet been achieved...

Perhaps I should expand a little on this. The changes are twofold:

   - Change the VFS structures to support 64-bit (a|c|m)time values.
     This is not really a big deal, and nothing is stopping us from
     doing it today...

   - Changes to the individual filesystems so that they can save and
     retrieve the extra 96 bits (== 32 bits * (mtime + atime + ctime))
     as part of the on-disk metadata.
     This is non-trivial, since a lot of these filesystems have not
     got much padding left in their inodes (particularly once acls
     etc. have grabbed their share of real-estate). Even finding 32
     free bits is a real problem for ext[23]...

One solution might be to only keep the full 64-bit data in the VFS
inode cache, and to zero the low 32-bits whenever we have to reload
the metadata from the disk.
That means that each time the file falls out of cache, then the mtime
would appear to change on the client (which might then proceed to
invalidate its data cache). Not entirely satisfactory, but probably
better than nothing...
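
A rough Python sketch of the cache-only idea above, under stated
assumptions: the 64-bit mtime is modelled as whole seconds in the high
32 bits and a sub-second fraction in the low 32 bits, and the disk has
room for the seconds only. The helper names (pack_mtime, store_to_disk,
reload_from_disk) and the sample values are hypothetical; this is not
kernel code.

```python
FRAC_BITS = 32
FRAC_MASK = (1 << FRAC_BITS) - 1


def pack_mtime(seconds: int, fraction: int) -> int:
    # Assumed layout: seconds in the high 32 bits, sub-second
    # fraction in the low 32 bits.
    return (seconds << FRAC_BITS) | (fraction & FRAC_MASK)


def store_to_disk(mtime64: int) -> int:
    # The on-disk inode only has room for whole seconds.
    return mtime64 >> FRAC_BITS


def reload_from_disk(disk_seconds: int) -> int:
    # Reloading metadata zeroes the low 32 bits.
    return pack_mtime(disk_seconds, 0)


# Two writes within the same second get distinct in-cache values.
first_write = pack_mtime(1016565632, 0x33333333)
second_write = pack_mtime(1016565632, 0xB3333333)
distinguishable = first_write != second_write

# The inode falls out of cache and is reloaded from disk.
on_disk = store_to_disk(second_write)
after_reload = reload_from_disk(on_disk)

# The file itself did not change, but its 64-bit mtime did.
spurious_change = after_reload != second_write
same_second = (after_reload >> FRAC_BITS) == (second_write >> FRAC_BITS)
```

In this model a client comparing the full 64-bit value sees a new mtime
after the reload even though the data is the same, which is the extra
cache invalidation mentioned above.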

Cheers,
  Trond


* Re: [Fwd: Re: NFS-HOWTO]
  2002-03-19 19:20   ` Trond Myklebust
@ 2002-03-19 23:06     ` Ragnar Kjørstad
From: Ragnar Kjørstad @ 2002-03-19 23:06 UTC
  To: Trond Myklebust; +Cc: Tavis Barr, nfs

On Tue, Mar 19, 2002 at 08:20:32PM +0100, Trond Myklebust wrote:
>    - Changes to the individual filesystems so that they can save and
>      retrieve the extra 96 bits (== 32 bits * (mtime + atime + ctime))
>      as part of the on-disk metadata.
>      This is non-trivial, since a lot of these filesystems have not
>      got much padding left in their inodes (particularly once acls
>      etc. have grabbed their share of real-estate). Even finding 32
>      free bits is a real problem for ext[23]...

I think Ted was talking about doing a disk-format change for ext[23]
soon (to improve resizing support and to extend fields like timestamps
and link-counters).

For reiserfs adding more data in the on-disk metadata is perhaps less of
a problem than for other filesystems, because reiserfs can handle
multiple inode-types on the same filesystem. (Of course old kernels
would not work with the new format, so it's still not trivial).

I haven't checked xfs, jfs or any of the other filesystems.

> One solution might be to only keep the full 64-bit data in the VFS
> inode cache, and to zero the low 32-bits whenever we have to reload
> the metadata from the disk.
> That means that each time the file falls out of cache, then the mtime
> would appear to change on the client (which might then proceed to
> invalidate its data cache). Not entirely satisfactory, but probably
> better than nothing...

Filesystems that _do_ have support for 64-bit data on-disk could still
take advantage, right?


-- 
Ragnar Kjørstad
Big Storage
