linux-fsdevel.vger.kernel.org archive mirror
* [RFC] Pathname Semantics with //
@ 2004-09-09 10:41 David Dabbs
From: David Dabbs @ 2004-09-09 10:41 UTC
  To: linux-fsdevel, 'ReiserFS List'



During the recent reiser4-related namespace/semantics discussions, Alan Cox
[http://marc.theaimsgroup.com/?l=reiserfs&m=109405544711435&w=2] and others
referred to the Single UNIX Specification v3 (SuS) provision for
implementation-specific pathname resolution. After a close reading of the
SuS, here is a proposal for how we might flexibly and legally accommodate
new filesystem features.

Assumption
* "Named file streams" (file-as-dir/dir-as-file/whatever) are worth
  implementing, if only to provide Windows interoperability for Samba.

Goals
* No breakage for naïve applications.
* No new APIs, e.g. openat().
* Maintain POSIX/SuS compliance.
* Expose the "named streams" capability via the natural "collection of
  files" approach as Linus called it.
  [http://marc.theaimsgroup.com/?l=reiserfs&m=109353819826980&w=2]

Caveat
Other than punting and prohibiting linking to objects created with 
a container attribute, this proposal doesn't present a solution to thorny
issues such as locking.

In A Nutshell

     //:a-directory/SomeName      regular dir entry
     //:a-directory//SomeName     named stream
     //:foo.txt//SomeName         named stream
     //:/.//                      root's named stream dir


Still interested? Read on.



What the Spec Says

All Directories Are Files, but Not All Files Are Directories
In SuS (3.163 File), a file is "an object that can be written to, or read
from, or both. A file has certain attributes, including access permissions
and type. File types include regular file, character special file, block
special file, FIFO special file, symbolic link, socket, and directory. Other
types of files may be supported by the implementation." This last bit is
encouraging, as we are free to implement new file types.

Can An Object Be More Than One Type?
Recent debate hovered around notions of "file-as-dir" or "dir-as-file," but
is this legal? Everywhere the SuS mentions file types they are referred to
as "distinct file types." I didn't scan all the threads for technical
objections to file type duality, but it seems like we should steer clear of
objects that are more than one S_IFMT type. Suppose we implement a container
type. Is it permitted for an object to be both a file and a container?

Pathname Resolution
Whatever innovations people dream up, applications still need to provide a
pathname to the existing APIs. All manner of proposals were floated for
"named streams" -- from reiser4's "/metas" to prefixes such as ".$" or
"...". [http://marc.theaimsgroup.com/?l=reiserfs&m=109413602520921&w=2]

According to SuS 4.11 Pathname Resolution,
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_11
   ...
   A pathname that begins with two successive slashes may be interpreted in 
   an implementation-defined manner, although more than two leading slashes 
   shall be treated as a single slash.
   ...

// seems relatively clean, but if more than two leading slashes collapse to
a single slash, how would an absolute path be specified? How about:

    //:foo/bar/baz      relative path to foo/bar/baz
    //:/foo/bar/baz     absolute path to /foo/bar/baz

While other characters would work, colon seems to fit because "//:" is
"://" (as in "file://") reversed.
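
As a rough illustration of the prefix rule, here is a small Python sketch.
The function name classify() and the labels it returns are made up for this
example, and treating every other run of leading slashes as a single slash
is an assumption (SuS leaves exactly two leading slashes
implementation-defined).

```python
ALT_PREFIX = "//:"  # the prefix proposed above


def classify(pathname):
    """Return (kind, path) for a pathname under the proposed //: rule."""
    if pathname.startswith(ALT_PREFIX):
        rest = pathname[len(ALT_PREFIX):]
        kind = "alt-absolute" if rest.startswith("/") else "alt-relative"
        return kind, rest
    stripped = pathname.lstrip("/")
    if len(stripped) != len(pathname):
        # Assumption: any other run of leading slashes acts as one slash.
        return "absolute", "/" + stripped
    return "relative", pathname


print(classify("//:foo/bar/baz"))   # relative path in the new namespace
print(classify("//:/foo/bar/baz"))  # absolute path in the new namespace
```

Note that "///:foo" does not start with "//:", so it falls through to the
ordinary "more than two leading slashes" case.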


The Directory Problem
Among the issues raised is disambiguating a stream object name from an
identically named object within a directory. With pathnames prefixed by //:,
one could use something as simple as //, e.g.

     //:a-directory/SomeName      regular dir entry
     //:a-directory//SomeName     named stream
     //:a-directory///SomeName    == //:a-directory/SomeName

The same applies to files as well...
      
     //:foo.txt/SomeName          ERROR
     //:foo.txt//SomeName         named stream
     //:foo.txt///SomeName        == //:foo.txt/SomeName == ERROR

And for the root dir also...

     //:/                         root
     //://                        root
     //:///                       root

     //:/.                        root
     //:/..                       root
     //:/./                       root
     //:/.//                      root's named stream dir
     //:/.///                     root
     //:/./.                      root
     //:/./..                     root

     //:./                        curr dir
     //:.//                       curr dir's named stream dir
     //:.///                      curr dir
     //:./.                       curr dir

No characters or words are removed or reserved from the 'normal' namespace.
A person named Metas can still use his/her name on files, and script kiddies
can still use "...". Applications that got away with pathnames with two
embedded slashes would need to be precise when using the new pathnames. 
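
To make the tables above concrete, here is a hedged Python sketch of how the
part of a pathname after the "//:" prefix might be split into steps. The
function name walk() and the step labels are invented for illustration, and
treating any run of three or more slashes like a single slash is an
assumption generalized from the /// rows. Type checks (such as the ERROR for
foo.txt/SomeName) need inode information and are not modeled here.

```python
import re


def walk(rest):
    """Split the text after '//:' into (kind, name) steps."""
    parts = re.split(r"(/+)", rest)  # names interleaved with slash runs
    steps = []
    if parts[0] == "" and len(parts) > 1:
        # Any run of leading slashes just means "start at the root".
        steps.append(("root", "/"))
        parts = parts[2:]
    kind = "entry"
    for i, tok in enumerate(parts):
        if i % 2 == 1:
            # Separator: exactly two slashes select the named stream dir.
            kind = "stream" if tok == "//" else "entry"
        elif tok:
            steps.append((kind, tok))
        elif kind == "stream":
            # Trailing '//' names the stream directory itself.
            steps.append(("stream-dir", ""))
    return steps


for p in ("a-directory/SomeName", "a-directory//SomeName", "/.//", ".///"):
    print(p, walk(p))
```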

Named Streams on Named Streams?
If we want to do no more than allow named stream compatibility with NTFS,
then we can prohibit "names on names". The directory referenced by foo//
would be flat, with only one level of files allowed. Attempts to create
something like //:/a-directory//SomeName//Another would return an error. It
doesn't seem like there's any reason to disallow names on names, though.
Stream-aware applications would simply discard the deeper streams (or
subdirectories) in the same way that named streams are discarded when
copying from NTFS to Linux. 
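
If the flat, NTFS-compatible variant were chosen, the check could be as
simple as counting stream separators. The following Python sketch assumes a
separator is exactly two slashes that follow a name; stream_depth() and
allowed_when_flat() are hypothetical helper names, and no particular error
code is implied.

```python
import re

# Exactly two slashes, preceded by a name character and not followed by a
# third slash. Leading slashes and runs of three or more never match.
STREAM_SEP = re.compile(r"(?<=[^/])//(?!/)")


def stream_depth(rest):
    """Count named-stream levels in the text after the '//:' prefix."""
    return len(STREAM_SEP.findall(rest))


def allowed_when_flat(rest):
    """True if the path uses at most one level of named streams."""
    return stream_depth(rest) <= 1


print(allowed_when_flat("/a-directory//SomeName"))
print(allowed_when_flat("/a-directory//SomeName//Another"))
```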

VFS Support
Apps written to take advantage of the new pathname semantics will need to
know whether it is supported under the current configuration. Would
statvfs() be the place to inquire whether "alternate namespace" support is
available? Would it also be necessary to require apps to query for the
"alternate pathname prefix character" e.g. ":"? Doesn't seem necessary. 

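If statvfs() were the answer, an application-side probe might look like the
Python sketch below. os.statvfs() and its f_flag field are real, but
ST_ALTNAMESPACE and its bit value are purely hypothetical; no such flag
exists today.

```python
import os

# Hypothetical mount flag bit; not defined by any kernel or libc.
ST_ALTNAMESPACE = 1 << 30


def supports_alt_namespace(path, flag=ST_ALTNAMESPACE):
    """Report whether the filesystem holding path advertises the flag."""
    return bool(os.statvfs(path).f_flag & flag)


print(supports_alt_namespace("."))
```
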
link_path_walk() and other namei.c functions will need changes to support
the // semantics. If a path component is followed by // then a macro such as
S_ISCONTAINER(), applied to that component's st_mode, must be true.
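
For illustration only, here is how such a test could look if modeled on the
existing S_IS*() macros, written in Python against the stat module. S_IFCONT
and its value 0o110000 are assumptions, picked only because the value does
not collide with the standard S_IF* type bits; S_ISCONTAINER() is the
hypothetical macro named above.

```python
import stat

# Hypothetical type bits for a "container" object (an assumption).
S_IFCONT = 0o110000


def S_ISCONTAINER(mode):
    """Hypothetical analogue of S_ISDIR() for the proposed type."""
    return stat.S_IFMT(mode) == S_IFCONT


container_mode = S_IFCONT | 0o644
print(S_ISCONTAINER(container_mode))             # container test
print(stat.S_ISREG(container_mode))              # not a regular file
print(S_ISCONTAINER(stat.S_IFREG | 0o644))       # regular file is no container
```

Because the type lives in the S_IFMT bits, an object defined this way is
exactly one type, which keeps to the "distinct file types" wording of the
SuS.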

super_operations would need a way for filesystems to expose their support
for the "named streams" or "container" capability.
 
Any named stream object storage must count toward quotas.


The Linking/Locking Problem
Al Viro points out a number of serious, unaddressed issues
[http://marc.theaimsgroup.com/?l=reiserfs&m=109463622427391&w=2]
including

> Locking: see above - links to regular files would create directories 
> seen in many places.  With all related issues...

If this cannot be solved then we can simply punt and decide that the
container property is not automatic for all files and directories on
filesystems that support the capability. This would be specified when the
file or dir is created and linking to such an object would be prohibited. 

Linking to a container or to a contained object presents problems, but
creating links out of the container seems like it would be just like any
other link object, yes?


Other Unaddressed Issues
Alan Cox:
> Another interesting question btw with substreams of a file is what the
> semantics of fchown, fstat, fchmod, fchdir etc are, or of mounting a 
> file over a substream.

Al Viro:
> Note that we also have fun issues with device nodes (Linus' "show
> partitions" vs. "show metadata from hosting filesystem"), which makes
> it even more dubious. We also have symlinks to deal with (do they have 
> attributes?  where should that be exported?).



David Dabbs

* RE: [RFC] Pathname Semantics with //
@ 2004-09-10 17:49 David Dabbs
From: David Dabbs @ 2004-09-10 17:49 UTC
  To: reiserfs-list, linux-fsdevel



> Jamie Lokier
> 
> David Dabbs wrote:
> 
> > Shooting from the hip here. If we want to unify namespaces in a UNIXy
> > way, what if we make the VFS expose all the non-file "protocol"
> > namespaces through one mount point, device node or whatever. A
> > filesystem, perhaps something built using FiST
> > [http://www.filesystems.org/], would "handle" a protocol. Another,
> > perhaps preferred, option is to steer in the direction of Plan9, where
> > ftp can be mounted and handled by a user-space filesystem, ftpfs.
> > See http://plan9.bell-labs.com/magic/man2html/4/ftpfs
> 
> You can already do it, something like this:
> 
>     mkdir /http:; mount none /http: -t uservfs -o view=http
>     mkdir /ftp:;  mount none /ftp:  -t uservfs -o view=ftp
> 
> I don't see any compelling reason to make "//" special for this.
> However, if there is such a reason, then you could just mount protocol 
> handlers on "//http:" and so on, and make "//" a normal directory with 
> a special name.
> 
> -- Jamie

Jamie, we _definitely_ agree, except apps that want to create links to URLs
will prepend one slash to the URL instead of two. Is your reference to
uservfs a "foo" reference or do you mean
http://sourceforge.net/projects/uservfs/? It looks a little dusty. But we
are pulling in the same direction.

The /file: node could simply be a symlink. Thus we have

      cd /
      ln -s / file:
      mkdir http:; mount none /http: -t uservfs -o view=http
      mkdir ftp:;  mount none /ftp:  -t uservfs -o view=ftp
      #etc...

Pathnames would be resolved with the existing code in namei.c. I can
understand mounting a URL whose protocol looks like a fs tree (e.g. ftp),
but http? namei() parses the pathname one component at a time, checks the
dcache, and goes to the fs when that fails. Let's trace through how a URL
might get resolved.

      ln -s /http://sourceforge.net/projects/uservfs
      cat uservfs

Pathname would be resolved as 

      /http:          
      sourceforge.net/
      projects/
      uservfs

I need to look closer at namei (or the uservfs code if it really supports a
view=http). As long as a fs can generate meaningful, stateful values in
response to VFS calls to real_lookup(), this may work.
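
A toy Python model of that component-by-component walk, with a handler that
rebuilds the URL from the pieces it is handed. This only mimics the idea;
components() and rebuild_url() are hypothetical names and nothing here
reflects how namei.c or uservfs is actually written.

```python
def components(pathname):
    """Split a pathname into components, collapsing repeated slashes."""
    return [c for c in pathname.split("/") if c]


def rebuild_url(pathname):
    """Treat the first component as 'scheme:' and rejoin the rest."""
    parts = components(pathname)
    scheme = parts[0].rstrip(":")
    return scheme + "://" + "/".join(parts[1:])


path = "/http://sourceforge.net/projects/uservfs"
print(components(path))
print(rebuild_url(path))
```

The point is that the "//" after "http:" is lost during resolution, so the
handler mounted on /http: has to reassemble the URL from successive lookups.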


David


