NFSv4 pseudo filesystem

* NFSv4 pseudo filesystem
@ 2002-05-10 18:12   ` Kendrick M. Smith
  2002-05-10 18:18     ` Christoph Hellwig
                       ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Kendrick M. Smith @ 2002-05-10 18:12 UTC (permalink / raw)
  To: nfs, linux-fsdevel; +Cc: nfsv4-wg

Hi all,

I'm one of the NFSv4 developers at the University of Michigan.  I'm
currently trying to settle on the best way to implement the NFSv4
"pseudo filesystem" in Linux 2.5, and I'm hoping to solicit feedback
from some other developers.

Background: In NFSv2/v3, the server's exports are more or less independent
of each other, and must be mounted separately by the client.  NFSv4
introduced the requirement that the server must export a 'root filehandle'
(which must be a directory), and that all the exports be obtainable by
browsing the subtree rooted at the root filehandle.  In other words, the
server must present the client with fictitious directories, which live
above the exports and serve to tie them all together into one tree.  (The
term "pseudo filesystem" is used to refer to this collection of fictitious
directories.)

  Proposal 1: Have the server export a pseudofs which "mirrors" the actual
  namespace on the server, or at least enough of it to cover all the
  exports.  In other words, if the server's exports are named
  /home/kmsmith and /usr/local/src, then the server will present the
  client with the following pseudo filesystem:

                             /
       /home                                 /usr
       /home/kmsmith                         /usr/local
          ...                                /usr/local/src
                                                  ...

This is the approach suggested by RFC 3010 for Unix servers, but it
seems like a nice feature to relax the requirement that pathnames in
the pseudofs be the same as pathnames in the server's filesystem.
The next 2 proposals allow the possibility of setting up an
arbitrarily-named tree of fictitious directories, for the server
to export as the pseudofs.  (This would require changing the
/etc/exports file format, presumably in a backward-compatible way,
such as adding an export option pseudo_pathname=...)
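
As a rough illustration of the difference between mirroring the real
namespace and using a pseudo_pathname-style option, the sketch below
computes which fictitious directories the server would have to present.
The function name and the dictionary layout are assumptions made for this
example only; nothing here is taken from nfsd or exportfs.

```python
from pathlib import PurePosixPath


def pseudofs_dirs(exports):
    """Return the sorted pseudo directories needed to reach every export.

    `exports` maps a real export path to the pathname it should have in the
    pseudofs, or None to mirror the real path (as in Proposal 1).
    """
    dirs = {"/"}
    for real_path, pseudo_path in exports.items():
        visible = PurePosixPath(pseudo_path or real_path)
        # Every proper ancestor of the export must exist as a pseudo directory.
        for parent in visible.parents:
            dirs.add(str(parent))
    return sorted(dirs)


# Proposal 1: the pseudofs mirrors the server's own pathnames.
mirrored = pseudofs_dirs({"/home/kmsmith": None, "/usr/local/src": None})

# Proposals 2 and 3: the pseudofs names are chosen independently
# (the names on the right-hand side are made up for the example).
renamed = pseudofs_dirs({"/home/kmsmith": "/users/kendrick",
                         "/usr/local/src": "/src"})
```

In the mirrored case the result is the set of directories drawn in the
tree above, not counting the exports themselves.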

  Proposal 2: Require the pseudofs to be built up somewhere on disk,
  presumably in a well-known location such as /etc/pseudofs.  The
  exportfs utility would create real on-disk subdirectories, then
  mount --bind the exports onto the leaves of the tree, before starting
  nfsd.  (Some mechanism would have to be introduced whereby the
  top-level pathname /etc/pseudofs could be communicated to the kernel.)

  Proposal 3: Build up the pseudofs inside the 2.5 'nfsd' filesystem,
  say in a directory nfsd/pseudofs which is created when the nfsd
  filesystem is mounted.  The exportfs utility would be responsible
  for creating the necessary subdirectories, then hanging the exports
  off the leaves with mount --bind, before starting nfsd.
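
The exportfs steps common to proposals 2 and 3 could look roughly like
the sketch below. It only builds the list of commands and does not run
them. The function name, the root directory argument, and the pseudofs
names in the example are assumptions; for proposal 3 the root would be
wherever nfsd/pseudofs ends up being mounted.

```python
import posixpath
import shlex


def bind_plan(root, exports):
    """Return the shell commands that would build a pseudofs under `root`.

    `exports` maps a real export path to its pathname inside the pseudofs.
    Nothing is executed; the caller decides what to do with the commands.
    """
    commands = []
    for real_path, pseudo_path in sorted(exports.items()):
        leaf = posixpath.join(root, pseudo_path.lstrip("/"))
        # Create the fictitious directories down to the leaf...
        commands.append(f"mkdir -p {shlex.quote(leaf)}")
        # ...then hang the real export off the leaf.
        commands.append(
            f"mount --bind {shlex.quote(real_path)} {shlex.quote(leaf)}")
    return commands


plan = bind_plan("/etc/pseudofs", {"/home/kmsmith": "/users/kendrick",
                                   "/usr/local/src": "/src"})
```

All of this would have to happen before nfsd is started, as described
above.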

As I see it, the disadvantage of proposal 3 is that it is a little
tricky to construct persistent filehandles ("persistent" in the sense
that an old filehandle is still recognized after the server is rebooted).
One solution would be to use an MD5 or SHA hash of the pathname as the
filehandle.  The hash could be computed in userspace and passed into
the kernel somehow.
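
A minimal userspace sketch of the hashing idea follows. The function name
and the path normalization are assumptions made for the example, and how
the result would be handed to the kernel is left open, as above. The point
is only that the same pseudo pathname yields the same bytes on every boot.

```python
import hashlib


def pseudo_filehandle(pseudo_path, algorithm="sha1"):
    """Derive a reboot-stable filehandle from a pseudofs pathname."""
    # Normalize so that "/usr//local/" and "/usr/local" hash identically.
    parts = [part for part in pseudo_path.split("/") if part]
    normalized = "/" + "/".join(parts)
    return hashlib.new(algorithm, normalized.encode("utf-8")).digest()


fh_before_reboot = pseudo_filehandle("/usr/local")
fh_after_reboot = pseudo_filehandle("/usr//local/")
```

A SHA-1 digest is 20 bytes and an MD5 digest is 16 bytes, so either one
fits inside an NFSv4 filehandle. Renaming a pseudo directory changes its
hash, so handles derived this way only persist for as long as the pseudofs
layout itself stays the same.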

This disadvantage doesn't seem to exist under the other two proposals,
since in both cases each pseudo directory is "backed" by an on-disk
directory, and we can use this directory's filehandle.

Of course, it's possible that no one is interested in having a pseudofs
namespace which is independent of the namespace in the server's
filesystem.  If the consensus is that this is not a useful feature, it's
probably easiest to adopt proposal 1.

Feedback/comments welcome!

Cheers,
 Kendrick

_______________________________________________________________

Have big pipes? SourceForge.net is looking for download mirrors. We supply
the hardware. You get the recognition. Email Us: bandwidth@sourceforge.net
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: NFSv4 pseudo filesystem
  2002-05-10 18:12   ` NFSv4 pseudo filesystem Kendrick M. Smith
@ 2002-05-10 18:18     ` Christoph Hellwig
  2002-05-10 18:18     ` Christoph Hellwig
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2002-05-10 18:18 UTC (permalink / raw)
  To: Kendrick M. Smith; +Cc: nfs, linux-fsdevel, nfsv4-wg

On Fri, May 10, 2002 at 02:12:12PM -0400, Kendrick M. Smith wrote:
> Background: In NFSv2/v3, the server's exports are more or less independent
> of each other, and must be mounted separately by the client.  NFSv4
> introduced the requirement that the server must export a 'root filehandle'
> (which must be a directory), and that all the exports be obtainable by
> browsing the subtree rooted at the root filehandle.  In other words, the
> server must present the client with fictitious directories, which live
> above the exports and serve to tie them all together into one tree.  (The
> term "pseudo filesystem" is used to refer to this collection of fictitious
> directories.)

Create the NFSv4 daemons using clone(.., CLONE_NEWNS, ..) and build
your own per-process namespace to fit the exports.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: NFSv4 pseudo filesystem
  2002-05-10 18:12   ` NFSv4 pseudo filesystem Kendrick M. Smith
  2002-05-10 18:18     ` Christoph Hellwig
  2002-05-10 18:18     ` Christoph Hellwig
@ 2002-05-10 23:14     ` H. Peter Anvin
  2002-05-11  6:31     ` Neil Brown
  2002-05-11  6:31     ` Neil Brown
  4 siblings, 0 replies; 21+ messages in thread
From: H. Peter Anvin @ 2002-05-10 23:14 UTC (permalink / raw)
  To: linux-fsdevel

Followup to:  <Pine.SOL.4.44.0205101340370.27306-100000@mspacman.gpcc.itd.umich.edu>
By author:    "Kendrick M. Smith" <kmsmith@umich.edu>
In newsgroup: linux.dev.fs.devel
> 
> Background: In NFSv2/v3, the server's exports are more or less independent
> of each other, and must be mounted separately by the client.  NFSv4
> introduced the requirement that the server must export a 'root filehandle'
> (which must be a directory), and that all the exports be obtainable by
> browsing the subtree rooted at the root filehandle.  In other words, the
> server must present the client with fictitious directories, which live
> above the exports and serve to tie them all together into one tree.  (The
> term "pseudo filesystem" is used to refer to this collection of fictitious
> directories.)
> 
>   Proposal 1: Have the server export a pseudofs which "mirrors" the actual
>   namespace on the server, or at least enough of it to cover all the
>   exports.  In other words, if the server's exports are named
>   /home/kmsmith and /usr/local/src, then the server will present the
>   client with the following pseudo filesystem:
> 
>                              /
>        /home                                 /usr
>        /home/kmsmith                         /usr/local
>           ...                                /usr/local/src
>                                                   ...
> 
> 
> This is the approach suggested by RFC 3010 for Unix servers, but it
> seems like a nice feature to relax the requirement that pathnames in
> the pseudofs be the same as pathnames in the server's filesystem.
> The next 2 proposals allow the possibility of setting up an
> arbitrarily-named tree of fictitious directories, for the server
> to export as the pseudofs.  (This would require changing the
> /etc/exports file format, presumably in a backward-compatible way,
> such as adding an export option pseudo_pathname=...)
> 

I would really like to suggest making it possible to map these
arbitrarily, although the default should presumably be the "real"
path.

This, in fact, applies to NFSv2/v3 as well: the path that you want to
mount a client with shouldn't need to be so closely tied to the path
on the physical filesystem.

For example, I should be able to specify something like:

/exports/clients/dorkface/ready \
	somehost.bigcorp.com(ro,path=/clients/bigcorp) \
	*.mycorp.com(rw)

... and have somehost.bigcorp.com mount this filesystem as
myhost.mycorp.com:/clients/bigcorp whereas my own local hosts would
see it as /exports/clients/dorkface/ready.
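
The per-client mapping in this example can be sketched as follows. The
path= option is the hypothetical option from the example entry, not an
existing exports option, and the parser, the function names, and the test
host names are likewise assumptions that handle just enough syntax for the
illustration.

```python
import fnmatch


def parse_client(spec):
    """Split 'host(opt,key=value)' into (host_pattern, options)."""
    host, _, rest = spec.partition("(")
    options = {}
    for item in rest.rstrip(")").split(","):
        if item:
            key, _, value = item.partition("=")
            options[key] = value or True
    return host, options


def visible_path(real_path, client_specs, host):
    """Return the path `host` would mount, or None if it has no access."""
    for spec in client_specs:
        pattern, options = parse_client(spec)
        if fnmatch.fnmatchcase(host.lower(), pattern.lower()):
            # Without path= the client sees the real path, which stays
            # the default.
            return options.get("path", real_path)
    return None


specs = ["somehost.bigcorp.com(ro,path=/clients/bigcorp)", "*.mycorp.com(rw)"]
real = "/exports/clients/dorkface/ready"
remote_view = visible_path(real, specs, "somehost.bigcorp.com")
local_view = visible_path(real, specs, "build1.mycorp.com")
```

Keeping the real path as the default means existing exports entries keep
their current meaning.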

	-hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: NFSv4 pseudo filesystem
@ 2002-05-11  0:13 Bryan Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Bryan Henderson @ 2002-05-11  0:13 UTC (permalink / raw)
  To: Kendrick M. Smith; +Cc: linux-fsdevel, nfsv4-wg


>Proposal 3: Build up the pseudofs inside the 2.5 'nfsd' filesystem,
>say in a directory nfsd/pseudofs which is created when the nfsd
>filesystem is mounted.

>As I see it, the disadvantage of proposal 3 is that it is a little
>tricky to construct persistent filehandles ("persistent" in the sense
>that an old filehandle is still recognized after the server is rebooted).
>...
>This disadvantage doesn't seem to exist under the other two proposals,
>since in both cases each pseudo directory is "backed" by an on-disk
>directory, and we can use this directory's filehandle.

But we already recognize a problem with NFSv3 in that the real directory's
filehandle isn't persistent enough.  So there's work being done in Linux
2.5 to allow the user to make up a suitably persistent identifier for an
export at export time.

That makes this "disadvantage" of proposal 3 look like an advantage.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: NFSv4 pseudo filesystem
  2002-05-10 18:12   ` NFSv4 pseudo filesystem Kendrick M. Smith
                       ` (3 preceding siblings ...)
  2002-05-11  6:31     ` Neil Brown
@ 2002-05-11  6:31     ` Neil Brown
  2002-05-11 17:39       ` David Chow
                         ` (2 more replies)
  4 siblings, 3 replies; 21+ messages in thread
From: Neil Brown @ 2002-05-11  6:31 UTC (permalink / raw)
  To: Kendrick M. Smith; +Cc: nfs, linux-fsdevel, nfsv4-wg

On Friday May 10, kmsmith@umich.edu wrote:
> 
>   Proposal 3: Build up the pseudofs inside the 2.5 'nfsd' filesystem,
>   say in a directory nfsd/pseudofs which is created when the nfsd
>   filesystem is mounted.  The exportfs utility would be responsible
>   for creating the necessary subdirectories, then hanging the exports
>   off the leaves with mount --bind, before starting nfsd.
> 
> As I see it, the disadvantage of proposal 3 is that it is a little
> tricky to construct persistent filehandles ("persistent" in the sense
> that an old filehandle is still recognized after the server is rebooted).
> One solution would be to use an MD5 or SHA hash of the pathname as the
> filehandle.  The hash could be computed in userspace and passed into
> the kernel somehow.

I would go for 3, and don't care about persistent file handles.  Just
use volatile filehandles for this bit of the namespace.

NeilBrown

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: NFSv4 pseudo filesystem
  2002-05-11  6:31     ` Neil Brown
@ 2002-05-11 17:39       ` David Chow
  2002-05-11 17:39       ` David Chow
  2002-05-11 20:19       ` NFS export operations question and BUG report Anton Altaparmakov
  2 siblings, 0 replies; 21+ messages in thread
From: David Chow @ 2002-05-11 17:39 UTC (permalink / raw)
  To: Neil Brown; +Cc: Kendrick M. Smith, nfs, linux-fsdevel, nfsv4-wg

Neil Brown wrote:

>On Friday May 10, kmsmith@umich.edu wrote:
>
>>  Proposal 3: Build up the pseudofs inside the 2.5 'nfsd' filesystem,
>>  say in a directory nfsd/pseudofs which is created when the nfsd
>>  filesystem is mounted.  The exportfs utility would be responsible
>>  for creating the necessary subdirectories, then hanging the exports
>>  off the leaves with mount --bind, before starting nfsd.
>>
>>As I see it, the disadvantage of proposal 3 is that it is a little
>>tricky to construct persistent filehandles ("persistent" in the sense
>>that an old filehandle is still recognized after the server is rebooted).
>>One solution would be to use an MD5 or SHA hash of the pathname as the
>>filehandle.  The hash could be computed in userspace and passed into
>>the kernel somehow.
>>
>
>I would go for 3, and don't care about persistent file handles.  Just
>use volatile filehandles for this bit of the namespace.
>
>NeilBrown
>
I am a bit confused about compatibility with NFSv3. Doesn't the nohide
option allow filesystems mounted at subdirectories of an exported
filesystem to be mounted automatically by clients? If 2.5 is proposing an
arbitrary identifier (fs=somepersistentnumber), how is nohide going to
know the next filesystem's number? In 2.4 we use the device major and
minor numbers, so only block-device filesystems are exportable. I think
there was a long discussion about fs= months ago, to allow exporting
non-block-device filesystems. I have worked hard on making my filesystem
a fake block-device filesystem in 2.4, using an arbitrary block device
number, but I don't think it is a good idea to drop the fs=
implementation, because we often want to export a non-block-device
filesystem with a persistent file handle. I don't think using arbitrary
devices just to cope with NFS is a good idea. Many systems rely on the
stateless and persistent design of NFS, and since NFS is a de facto
standard for Unix systems, I am not happy to see NFSv4 break this.


regards,

David


^ permalink raw reply	[flat|nested] 21+ messages in thread

* NFS export operations question and BUG report
  2002-05-11  6:31     ` Neil Brown
  2002-05-11 17:39       ` David Chow
  2002-05-11 17:39       ` David Chow
@ 2002-05-11 20:19       ` Anton Altaparmakov
  2002-05-11 20:21         ` Anton Altaparmakov
  2002-05-11 21:18         ` Neil Brown
  2 siblings, 2 replies; 21+ messages in thread
From: Anton Altaparmakov @ 2002-05-11 20:19 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-fsdevel

Hi,

I am implementing export operations for NTFS in 2.5.15 at the moment and am 
looking at the default implementation of encode_fh in fs/exportfs/expfs.c, 
i.e. export_encode_fh().

This does if (connectable && !S_ISDIR(inode->i_mode)) and only then does it 
store information about the parent in the @fh. Why is this? Is this 
something inherent in NFS? Or should ntfs not do the !S_ISDIR and do this 
regardless? Perhaps replacing the check by !IS_ROOT(dentry) instead?

Also dentry->d_parent is dereferenced without the dcache_lock being held in 
the export_encode_fh() function. I believe this is a BUG...

Best regards,

         Anton

-- 
   "I've not lost my mind. It's backed up on tape somewhere." - Unknown
-- 
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS Maintainer / IRC: #ntfs on irc.openprojects.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: NFS export operations question and BUG report
  2002-05-11 20:19       ` NFS export operations question and BUG report Anton Altaparmakov
@ 2002-05-11 20:21         ` Anton Altaparmakov
  2002-05-11 21:08           ` Anton Altaparmakov
  2002-05-11 21:18         ` Neil Brown
  1 sibling, 1 reply; 21+ messages in thread
From: Anton Altaparmakov @ 2002-05-11 20:21 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-fsdevel

At 21:19 11/05/02, Anton Altaparmakov wrote:
>Hi,
>
>I am implementing export operations for NTFS in 2.5.15 at the moment and 
>am looking at the default implementation of encode_fh in 
>fs/exportfs/expfs.c, i.e. export_encode_fh().
>
>This does if (connectable && !S_ISDIR(inode->i_mode)) and only then does 
>it store information about the parent in the @fh. Why is this? Is this 
>something inherent in NFS? Or should ntfs not do the !S_ISDIR and do this 
>regardless? Perhaps replacing the check by !IS_ROOT(dentry) instead?
>
>Also dentry->d_parent is dereferenced without the dcache_lock being held 
>in the export_encode_fh() function. I believe this is a BUG...

Argh. s/dcache_lock/dparent_lock/


>Best regards,
>
>         Anton

-- 
   "I've not lost my mind. It's backed up on tape somewhere." - Unknown
-- 
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS Maintainer / IRC: #ntfs on irc.openprojects.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: NFS export operations question and BUG report
  2002-05-11 20:21         ` Anton Altaparmakov
@ 2002-05-11 21:08           ` Anton Altaparmakov
  2002-05-11 21:43             ` Neil Brown
  0 siblings, 1 reply; 21+ messages in thread
From: Anton Altaparmakov @ 2002-05-11 21:08 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-fsdevel

May I request that Documentation/filesystems/Locking be updated for the 
export operations?

That would save me having to ask "is this a bug?" questions...

I tried to follow the code paths and it seems the parent directory is 
locked but I am not sure. (I only saw this mentioned in a comment, I got 
lost in the nfsd code and gave up)

Cheers,

         Anton

-- 
   "I've not lost my mind. It's backed up on tape somewhere." - Unknown
-- 
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS Maintainer / IRC: #ntfs on irc.openprojects.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: NFS export operations question and BUG report
  2002-05-11 20:19       ` NFS export operations question and BUG report Anton Altaparmakov
  2002-05-11 20:21         ` Anton Altaparmakov
@ 2002-05-11 21:18         ` Neil Brown
  2002-05-11 22:38           ` Anton Altaparmakov
  1 sibling, 1 reply; 21+ messages in thread
From: Neil Brown @ 2002-05-11 21:18 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: linux-fsdevel

On Saturday May 11, aia21@cantab.net wrote:
> Hi,
> 
> I am implementing export operations for NTFS in 2.5.15 at the moment and am 
> looking at the default implementation of encode_fh in fs/exportfs/expfs.c, 
> i.e. export_encode_fh().
> 
> This does if (connectable && !S_ISDIR(inode->i_mode)) and only then does it 
> store information about the parent in the @fh. Why is this? Is this 
> something inherent in NFS? Or should ntfs not do the !S_ISDIR and do this 
> regardless? Perhaps replacing the check by !IS_ROOT(dentry) instead?

This is because most filesystems have a link from each directory to
the parent (via '..') so storing the parent information is not
necessary.

If NTFS does not have anything resembling a '..' link, then you are in
trouble, as you need to be able to walk the '..' links all the way up
to the root, and you cannot store *all* of the parents in a file
handle.
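
To make the rule concrete, here is a small userspace model of it. This is
not the kernel's export_encode_fh(); the Inode class, the tuple layout of
the handle, and the inode numbers are simplifications assumed for the
example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Inode:
    ino: int
    generation: int
    is_dir: bool


def encode_fh(inode: Inode, parent: Optional[Inode],
              connectable: bool) -> Tuple[int, ...]:
    """Model of a file handle: the object itself, plus parent if needed."""
    fh = [inode.ino, inode.generation]
    # A directory can reach its parent through '..', so only
    # non-directories carry parent information in a connectable handle.
    if connectable and not inode.is_dir and parent is not None:
        fh.extend([parent.ino, parent.generation])
    return tuple(fh)


root = Inode(2, 1, True)
subdir = Inode(40, 7, True)
regular = Inode(41, 3, False)

dir_fh = encode_fh(subdir, root, connectable=True)
file_fh = encode_fh(regular, subdir, connectable=True)
```

The handle stays a fixed, small size because only one level of parent is
ever stored; the rest of the path to the root has to be recovered by
walking '..'.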

> 
> Also dentry->d_parent is dereferenced without the dcache_lock being held in 
> the export_encode_fh() function. I believe this is a BUG...

Yep, I think that is a bug (though the lock to take is dparent_lock, as
you later say).  I'll see about getting it fixed.

Thanks.

NeilBrown


* Re: NFS export operations question and BUG report
  2002-05-11 21:08           ` Anton Altaparmakov
@ 2002-05-11 21:43             ` Neil Brown
  2002-05-11 23:09               ` Anton Altaparmakov
  0 siblings, 1 reply; 21+ messages in thread
From: Neil Brown @ 2002-05-11 21:43 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: linux-fsdevel

On Saturday May 11, aia21@cantab.net wrote:
> May I request that Documentation/filesystems/Locking be updated for the 
> export operations?

There is not much to say, and it is mostly said in fs.h:
 *
 * Locking rules:
 *  get_parent is called with child->d_inode->i_sem down
 *  get_name is not (which is possibly inconsistent)

get_parent is called with i_sem held.
No locks are held for any other calls.

But I will try to update .../Locking soon.

NeilBrown


* Re: NFS export operations question and BUG report
  2002-05-11 21:18         ` Neil Brown
@ 2002-05-11 22:38           ` Anton Altaparmakov
  2002-05-12  7:39             ` Alexander Viro
       [not found]             ` <Pine.GSO.4.21.0205120330410.23398-100000@weyl.math.psu.edu >
  0 siblings, 2 replies; 21+ messages in thread
From: Anton Altaparmakov @ 2002-05-11 22:38 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-fsdevel

At 22:18 11/05/02, Neil Brown wrote:
>On Saturday May 11, aia21@cantab.net wrote:
> > I am implementing export operations for NTFS in 2.5.15 at the moment and am
> > looking at the default implementation of encode_fh in fs/exportfs/expfs.c,
> > i.e. export_encode_fh().
> >
> > This does if (connectable && !S_ISDIR(inode->i_mode)) and only then does it
> > store information about the parent in the @fh. Why is this? Is this
> > something inherent in NFS? Or should ntfs not do the !S_ISDIR and do this
> > regardless? Perhaps replacing the check by !IS_ROOT(dentry) instead?
>
>This is because most filesystems have a link from each directory to
>its parent (via '..'), so storing the parent information is not
>necessary.
>
>If NTFS does not have anything resembling a '..' link, then you are in
>trouble, as you need to be able to walk the '..' links all the way up
>to the root, and you cannot store *all* of the parents in a file
>handle.

Eeek! In ntfs each inode (no matter what kind of inode) has links to the 
parent directory which I guess is just like '..' so in theory I would never 
have to store any info about the parent, HOWEVER ntfs has directory 
hardlinks(!!!), which means that it is very easily possible that there are 
multiple '..' entries, so given a disconnected dentry it is possible that 
it becomes reconnected at the wrong place if I just choose the first '..' 
entry found.

As you say, obviously I can't just store the whole path to the root to have 
an unambiguous way of determining the correct tree structure.

Hm, I have to think about this a bit. There probably is a kludge I can use 
in the depths of the ntfs metadata to identify the correct hard link...

Best regards,

         Anton



* Re: NFS export operations question and BUG report
  2002-05-11 21:43             ` Neil Brown
@ 2002-05-11 23:09               ` Anton Altaparmakov
  0 siblings, 0 replies; 21+ messages in thread
From: Anton Altaparmakov @ 2002-05-11 23:09 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-fsdevel

At 22:43 11/05/02, Neil Brown wrote:
>On Saturday May 11, aia21@cantab.net wrote:
> > May I request that Documentation/filesystems/Locking be updated for the
> > export operations?
>
>There is not much to say, and it is mostly said in fs.h:
>  *
>  * Locking rules:
>  *  get_parent is called with child->d_inode->i_sem down
>  *  get_name is not (which is possibly inconsistent)
>
>get_parent is called with i_sem held.
>No locks are held for any other calls.
>
>But I will try to update .../Locking soon.

Ok, cool. Although I find it a bit hard to believe that ->encode_fh is 
called without any locks held... The code must at the very least be holding 
a reference on the dentry or dentry->d_inode (or both?) so it doesn't 
disappear... If it isn't I would have thought that it should do. But that 
is just me being naive about the NFS code...

Best regards,

Anton





* Re: NFS export operations question and BUG report
  2002-05-11 22:38           ` Anton Altaparmakov
@ 2002-05-12  7:39             ` Alexander Viro
       [not found]             ` <Pine.GSO.4.21.0205120330410.23398-100000@weyl.math.psu.edu >
  1 sibling, 0 replies; 21+ messages in thread
From: Alexander Viro @ 2002-05-12  7:39 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: Neil Brown, linux-fsdevel

On Sat, 11 May 2002, Anton Altaparmakov wrote:

> Eeek! In ntfs each inode (no matter what kind of inode) has links to the 
> parent directory which I guess is just like '..' so in theory I would never 
> have to store any info about the parent, HOWEVER ntfs has directory 
> hardlinks(!!!), which means that it is very easily possible that there are 
> multiple '..' entries

... which means that rename() will merrily allow you to create loops and
detach parts of the directory graph from the root.  Which would count as fs
corruption even on NTFS.

And that's fairly serious - the only way to prevent it is to prohibit
cross-directory rename() on NTFS.  Loops in directory graph lead to
all sorts of trouble...

The thing being, safe cross-directory rename() == ability to tell if FOO is
ancestor of BAR.  Unless you have a way to do that, you are in all sorts of
trouble.
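
The ancestor test mentioned above can be sketched in user-space C as
follows. This is only a sketch under the assumption that every directory
has exactly one parent (the root being its own parent); struct toy_dir,
toy_is_ancestor() and toy_rename_allowed() are hypothetical names, not VFS
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory node: one parent each, root points to itself. */
struct toy_dir {
    const char *name;
    struct toy_dir *parent;
};

/* True if 'anc' is 'd' itself or lies on the parent chain from 'd' to the root. */
bool toy_is_ancestor(const struct toy_dir *anc, const struct toy_dir *d)
{
    assert(anc != NULL && d != NULL);
    for (;;) {
        if (d == anc)
            return true;
        if (d->parent == d)   /* reached the root without meeting 'anc' */
            return false;
        d = d->parent;
    }
}

/*
 * Moving directory 'victim' under 'new_parent' would create a loop
 * (and detach a subtree from the root) exactly when 'victim' is
 * 'new_parent' or one of its ancestors.
 */
bool toy_rename_allowed(const struct toy_dir *victim,
                        const struct toy_dir *new_parent)
{
    return !toy_is_ancestor(victim, new_parent);
}
```

With directory hard links the single parent pointer this walk relies on no
longer exists, so there is no well-defined chain to follow.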


* Re: NFS export operations question and BUG report
       [not found]             ` <Pine.GSO.4.21.0205120330410.23398-100000@weyl.math.psu.edu >
@ 2002-05-12 12:00               ` Anton Altaparmakov
  2002-05-12 16:20                 ` Jan Harkes
  2002-05-13  6:54                 ` Neil Brown
  0 siblings, 2 replies; 21+ messages in thread
From: Anton Altaparmakov @ 2002-05-12 12:00 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Neil Brown, linux-fsdevel

At 08:39 12/05/02, Alexander Viro wrote:
>On Sat, 11 May 2002, Anton Altaparmakov wrote:
>
> > Eeek! In ntfs each inode (no matter what kind of inode) has links to the
> > parent directory which I guess is just like '..' so in theory I would never
> > have to store any info about the parent, HOWEVER ntfs has directory
> > hardlinks(!!!), which means that it is very easily possible that there are
> > multiple '..' entries
>
>... which means that rename() will merrily allow you to create loops and
>detach parts of the directory graph from the root.  Which would count as fs
>corruption even on NTFS.
>
>And that's fairly serious - the only way to prevent it is to prohibit
>cross-directory rename() on NTFS.  Loops in directory graph lead to
>all sorts of trouble...
>
>The thing being, safe cross-directory rename() == ability to tell if FOO is
>ancestor of BAR.  Unless you have a way to do that, you are in all sorts of
>trouble.

Indeed. I just tried all hard link creating Windows utilities I could find 
(including Microsoft's own utility) and all of them made a check and refused 
to work on directories. So unless someone is being truly evil they will be 
unable to create a directory hard link in Windows. And if they do it 
anyway, they deserve what they get... I will just enforce no hardlinks on 
directories inside the Linux ntfs kernel driver which should make it safe 
to use. The only hardlinks we always have are the short and long file names 
of a directory but by default I am hiding the short file names (readdir 
just skips them) so that should be ok, too.

If I store the name space of the file name of both the dentry and the 
parent dentry together with the inode numbers and generation numbers I can 
make unique lookups for dparent (assuming a non-corrupt fs).

The only problem then remaining for ntfs is that I cannot distinguish which 
hard link to a file a file handle encodes. Unless I encode the name of the 
hard link in the file handle... but file handles are supposed to work 
across rename so that doesn't work. Heck, how does this work for other 
file systems?!?

If you have two hard links both with the same parent directory (e.g. 
/bin/vi and /bin/vim), then their inode number, generation, and their 
parent inode and generation are all the same. So even ext2/3 cannot 
distinguish which one is being looked for with the present default 
encode_fh/decode_fh in exportfs...

What am I missing? Or is this just a hole in the file handle export scheme 
we have to live with?

Best regards,

         Anton



* Re: NFS export operations question and BUG report
  2002-05-12 12:00               ` Anton Altaparmakov
@ 2002-05-12 16:20                 ` Jan Harkes
  2002-05-13  6:54                 ` Neil Brown
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Harkes @ 2002-05-12 16:20 UTC (permalink / raw)
  To: linux-fsdevel

On Sun, May 12, 2002 at 01:00:28PM +0100, Anton Altaparmakov wrote:
> The only problem then remaining for ntfs is that I cannot distinguish which 
> hard link to a file a file handle encodes. Unless I encode the name of the 
> hard link in the file handle... but file handles are supposed to work 
> across rename so that doesn't work. Heck, how does this work for other 
> file systems?!?
> 
> If you have two hard links both with the same parent directory (e.g. 
> /bin/vi and /bin/vim), then their inode number, generation, and their 
> parent inode and generation are all the same. So even ext2/3 cannot 
> distinguish which one is being looked for with the present default 
> encode_fh/decode_fh in exportfs...

Isn't the whole point of this to open the 'object'? So whether you open
/bin/vi or /usr/bin/vim or /private_namespace/vim_executable doesn't
really matter.

The part that I haven't figured out is why we need the parents at all.
Coda has a 96-bit file identifier, which is enough to find the object
as long as it hasn't been removed. To get the parent, though, I would
either have to add an upcall to fetch the parent object at any given
time (this info is available in the userspace daemon), or add the FID
of the parent directory at encoding time and from there rely on
lookup('..'). But the latter breaks when the object is renamed to
another directory.

Is it possible to just create a dangling dentry for objects that are
accessed through NFS? The inodes will be shared correctly.

Why does the dentry need to be 'well connected' in the first place?

Jan


* Re: NFS export operations question and BUG report
  2002-05-12 12:00               ` Anton Altaparmakov
  2002-05-12 16:20                 ` Jan Harkes
@ 2002-05-13  6:54                 ` Neil Brown
  1 sibling, 0 replies; 21+ messages in thread
From: Neil Brown @ 2002-05-13  6:54 UTC (permalink / raw)
  To: Anton Altaparmakov; +Cc: Alexander Viro, linux-fsdevel

On Sunday May 12, aia21@cantab.net wrote:
> 
> The only problem then remaining for ntfs is that I cannot distinguish which 
> hard link to a file a file handle encodes. Unless I encode the name of the 
> hard link in the file handle... but file handles are supposed to work 
> across rename so that doesn't work. Heck, how does this work for other 
> file systems?!?

You don't need to distinguish which hard link to a file is encoded.
The file handle encodes a file, not a link to the file.
There are two reasons why you need to get a link (i.e. a name) of the
file though.

1/ To support the "subtree_check" export option, which allows you to
   export a subtree of a filesystem and ensure that access is only
   provided to files within that subtree.  For nfsd to ensure this it
   needs a full path to the file.

   This is where the "acceptable" function comes in that gets passed
   to ->decode_fh.
   If decode_fh finds that there are multiple possible dentries that
   refer to the same inode, it should return one (any one) that
   satisfies the "acceptable" function.

2/ To satisfy VFS requirements about dentries of directories.
   The VFS really likes each directory to have exactly one dentry, and
   for this dentry to be properly connected to the root.

   One reason that it wants precisely one dentry per directory is
   because the list of names of children is stored in the dentry
   rather than the inode.  If there were two dentries for the one
   inode, then it would be harder to guard against multiple concurrent
   creates of the same name.

   The reason that it wants the directory's dentry to be properly
   connected is that it needs to be able to check if one directory is
   an ancestor of another so that it can disallow directory renames
   that would cause closed loops.

   This is why ->get_parent is needed - so that nfsd can build a
   proper path up to the root.
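
The "return any one alias that satisfies acceptable" rule from point 1 can
be sketched in user-space C. The types and names here (struct toy_dentry,
toy_pick_alias(), toy_under_export()) are hypothetical, paths are plain
strings purely for illustration, and the callback shape is only loosely
modeled on the acceptable/context pair passed to ->decode_fh.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias of an inode: one name (path) under which it is reachable. */
struct toy_dentry {
    const char *path;
};

typedef bool (*toy_acceptable_fn)(void *context, const struct toy_dentry *de);

/*
 * Given all known aliases of one inode, return the first alias the
 * caller's policy accepts, or NULL if none qualifies.
 */
const struct toy_dentry *toy_pick_alias(const struct toy_dentry *aliases,
                                        size_t n,
                                        toy_acceptable_fn acceptable,
                                        void *context)
{
    assert(acceptable != NULL);
    for (size_t i = 0; i < n; i++) {
        if (acceptable(context, &aliases[i]))
            return &aliases[i];
    }
    return NULL;
}

/*
 * Example policy in the spirit of subtree_check: accept only paths
 * inside the exported subtree. 'context' is the export root as a string.
 */
bool toy_under_export(void *context, const struct toy_dentry *de)
{
    const char *prefix = context;
    size_t len = strlen(prefix);

    /* Require a whole-component match so "/exp" does not match "/export". */
    return strncmp(de->path, prefix, len) == 0 &&
           (de->path[len] == '/' || de->path[len] == '\0');
}
```

Which alias comes back does not matter to the caller; the policy only
needs one name that lies inside the export.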

NeilBrown


end of thread, other threads:[~2002-05-13  6:54 UTC | newest]

Thread overview: 21+ messages
     [not found] <message from Anton Altaparmakov on Saturday May 11>
     [not found] ` <message from Kendrick M. Smith on Friday May 10>
2002-05-10 18:12   ` NFSv4 pseudo filesystem Kendrick M. Smith
2002-05-10 18:18     ` Christoph Hellwig
2002-05-10 23:14     ` H. Peter Anvin
2002-05-11  6:31     ` Neil Brown
2002-05-11 17:39       ` David Chow
2002-05-11 20:19       ` NFS export operations question and BUG report Anton Altaparmakov
2002-05-11 20:21         ` Anton Altaparmakov
2002-05-11 21:08           ` Anton Altaparmakov
2002-05-11 21:43             ` Neil Brown
2002-05-11 23:09               ` Anton Altaparmakov
2002-05-11 21:18         ` Neil Brown
2002-05-11 22:38           ` Anton Altaparmakov
2002-05-12  7:39             ` Alexander Viro
     [not found]             ` <Pine.GSO.4.21.0205120330410.23398-100000@weyl.math.psu.edu >
2002-05-12 12:00               ` Anton Altaparmakov
2002-05-12 16:20                 ` Jan Harkes
2002-05-13  6:54                 ` Neil Brown
2002-05-11  0:13 NFSv4 pseudo filesystem Bryan Henderson
  -- strict thread matches above, loose matches on Subject: below --
2002-05-10 18:12 Kendrick M. Smith
