* [RFC] Pathname Semantics with //
@ 2004-09-09 10:41 David Dabbs
2004-09-08 16:13 ` Hans Reiser
2004-09-09 17:33 ` Christian Mayrhuber
0 siblings, 2 replies; 22+ messages in thread
From: David Dabbs @ 2004-09-09 10:41 UTC (permalink / raw)
To: linux-fsdevel, 'ReiserFS List'
During the recent reiser4-related namespace/semantics discussions, Alan Cox
[http://marc.theaimsgroup.com/?l=reiserfs&m=109405544711435&w=2] and others
referred to the Single UNIX Specification v3 (SuS) provision for
implementation-specific pathname resolution. After a close reading of the
SuS, here is a proposal for how we might flexibly and legally accommodate
new filesystem features.
Assumption
* "Named file streams" (file-as-dir/dir-as-file/whatever) are worth
implementing, if only to provide Windows interoperability for Samba.
Goals
* No breakage for naïve applications.
* No new APIs e.g. openat().
* Maintain POSIX/SuS compliance.
* Expose the "named streams" capability via the natural "collection of
files" approach as Linus called it.
[http://marc.theaimsgroup.com/?l=reiserfs&m=109353819826980&w=2]
Caveat
Other than punting and prohibiting linking to objects created with
a container attribute, this proposal doesn't present a solution to thorny
issues such as locking.
In A Nutshell
  //:a-directory/SomeName     regular dir entry
  //:a-directory//SomeName    named stream
  //:foo.txt//SomeName        named stream
  //:/.//                     root's named stream dir
Still interested? Read on.
What the Spec Says
All Directories Are Files, but Not All Files Are Directories
In SuS (3.163 File), a file is "an object that can be written to, or read
from, or both. A file has certain attributes, including access permissions
and type. File types include regular file, character special file, block
special file, FIFO special file, symbolic link, socket, and directory. Other
types of files may be supported by the implementation." This last bit is
encouraging, as we are free to implement new file types.
Can An Object Be More Than One Type?
Recent debate hovered around notions of "file-as-dir" or "dir-as-file," but
is this legal? Everywhere the SuS mentions file types they are referred to
as "distinct file types." I didn't scan all the threads for technical
objections to file type duality, but it seems like we should steer clear of
objects that are more than one S_IFMT type. Suppose we implement a container
type. Is it permitted for an object to be both a file and a container?
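
To make the "more than one S_IFMT type" concern concrete: the file type is stored as one value in a multi-bit field of st_mode, not as independent flag bits, so OR-ing two type codes does not mean "both". The following is only a small illustration using Python's standard stat module with the conventional Linux values; it is not part of the proposal.

```python
import stat

# The S_IFMT mask selects the type field; each file type is one value in it.
regular = stat.S_IFREG | 0o644
directory = stat.S_IFDIR | 0o755

print(stat.S_ISREG(regular), stat.S_ISDIR(regular))
print(stat.S_ISREG(directory), stat.S_ISDIR(directory))

# OR-ing the two type codes does not produce "file and directory".
# With the Linux values the combined pattern is the code for a socket.
both = stat.S_IFREG | stat.S_IFDIR
print(stat.S_ISREG(both), stat.S_ISDIR(both), stat.S_ISSOCK(both))
```

So an object that is both a file and a container would need either a new distinct type value or an attribute stored outside the S_IFMT field.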
Pathname Resolution
Whatever innovations people dream up, applications still need to provide a
pathname to the existing APIs. All manner of proposals were floated for
"named streams" -- from reiser4's "/metas" to prefixes such as ".$" or
"...". [http://marc.theaimsgroup.com/?l=reiserfs&m=109413602520921&w=2]
According to SuS 4.11 Pathname Resolution,
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_11
...
A pathname that begins with two successive slashes may be interpreted in
an implementation-defined manner, although more than two leading slashes
shall be treated as a single slash.
....
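
Existing userspace code already honors this rule. As an illustration (not part of the proposal), Python's standard posixpath.normpath() keeps exactly two leading slashes and collapses three or more to one:

```python
import posixpath

# One, two, three and four leading slashes.
for p in ("/foo/bar", "//foo/bar", "///foo/bar", "////foo/bar"):
    print(p, "->", posixpath.normpath(p))

# Slashes that are not at the start are always collapsed.
print(posixpath.normpath("/foo//bar"))
```

Only the two-slash form survives normalization, which is why a prefix built on exactly two leading slashes is the one place the standard leaves open.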
// seems relatively clean, but if more than two leading slashes equate to
one, how would an absolute path be specified? How about:
  //:foo/bar/baz     relative path to foo/bar/baz
  //:/foo/bar/baz    absolute path to /foo/bar/baz
While other characters would work, colon seems to fit because "//:" is the
reverse of "file://" minus file.
The Directory Problem
Among the issues raised is disambiguating a stream object name from an
identically named object within a directory. With pathnames prefixed by //:,
one could use something as simple as //, e.g.
  //:a-directory/SomeName      regular dir entry
  //:a-directory//SomeName     named stream
  //:a-directory///SomeName    == //:a-directory/SomeName
The same applies to files as well...
  //:foo.txt/SomeName      ERROR
  //:foo.txt//SomeName     named stream
  //:foo.txt///SomeName    == //:foo.txt/SomeName == ERROR
And for the root dir also...
  //:/        root
  //://       root
  //:///      root

  //:/.       root
  //:/..      root
  //:/./      root
  //:/.//     root's named stream dir
  //:/.///    root
  //:/./.     root
  //:/./..    root

  //:./       curr dir
  //:.//      curr dir's named stream dir
  //:.///     curr dir
  //:./.      curr dir
No characters or words are removed or reserved from the 'normal' namespace.
A person named Metas can still use his/her name on files, and script kiddies
can still use "...". Applications that got away with pathnames with two
embedded slashes would need to be precise when using the new pathnames.
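
The examples above can be read as a small grammar. What follows is a minimal Python sketch of one way to classify the steps of a "//:" pathname, assuming the rules as stated: after the escape, any run of leading slashes means "absolute"; between components, exactly two slashes select a named stream and any other run behaves like a single slash; a trailing "//" names the stream directory itself. The function name parse_alt_path and the 'entry' / 'stream' / 'stream-dir' labels are invented for illustration and are not part of the proposal or of any kernel interface.

```python
import re

PREFIX = "//:"  # the escape proposed in this RFC


def parse_alt_path(path):
    """Split a '//:' pathname into (absolute, steps).

    Each step is (kind, name): 'entry' is a regular directory entry,
    'stream' is a named stream of the preceding object, and a trailing
    '//' yields a final ('stream-dir', '') step.
    """
    if not path.startswith(PREFIX):
        raise ValueError("not an alternate-namespace pathname")
    rest = path[len(PREFIX):]
    stripped = rest.lstrip("/")
    absolute = stripped != rest  # any leading slashes mean "start at root"

    steps = []
    kind = "entry"
    for token in re.findall(r"/+|[^/]+", stripped):
        if token[0] == "/":
            # Exactly two slashes switch to the stream namespace;
            # one slash, or three or more, act as a plain separator.
            kind = "stream" if len(token) == 2 else "entry"
        else:
            steps.append((kind, token))
            kind = "entry"
    if kind == "stream":
        steps.append(("stream-dir", ""))
    return absolute, steps


print(parse_alt_path("//:a-directory//SomeName"))
print(parse_alt_path("//:/.//"))
```

The sketch only classifies syntax. Whether //:foo.txt/SomeName is an error depends on the type of foo.txt, which only the filesystem knows at lookup time.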
Named Streams on Named Streams?
If we want to do no more than allow named stream compatibility with NTFS,
then we can prohibit "names on names". The directory referenced by foo//
would be flat, with only one level of files allowed. Attempts to create
something like //:/a-directory//SomeName//Another would return an error. It
doesn't seem like there's any reason to disallow names on names, though.
Stream-aware applications would simply discard the deeper streams (or
subdirectories) in the same way that named streams are discarded when
copying from NTFS to Linux.
VFS Support
Apps written to take advantage of the new pathname semantics will need to
know whether it is supported under the current configuration. Would
statvfs() be the place to inquire whether "alternate namespace" support is
available? Would it also be necessary to require apps to query for the
"alternate pathname prefix character" e.g. ":"? Doesn't seem necessary.
link_path_walk() and other namei.c functions will need changes to support
the // semantics. If a path component is followed by // then a macro such as
S_ISCONTAINER(stat_t) must be true.
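
As a rough illustration of that check, here is a Python stand-in for the S_ISCONTAINER() test. Everything specific in it is an assumption: the S_IFCONTAINER value is hypothetical (picked only because it is not one of the seven type codes Linux defines), the helper names are invented, and returning ENOTDIR on failure is a guess, since the RFC does not say which error applies.

```python
import errno
import stat

# HYPOTHETICAL type code, chosen for illustration only.
S_IFCONTAINER = 0o110000


def s_iscontainer(mode):
    """Stand-in for the S_ISCONTAINER() macro named in the RFC."""
    return stat.S_IFMT(mode) == S_IFCONTAINER


def check_stream_step(mode):
    """Return 0 if '//' may follow an object with this mode, else an errno.

    ENOTDIR is an assumed choice of error, not something the RFC specifies.
    """
    return 0 if s_iscontainer(mode) else errno.ENOTDIR


print(check_stream_step(S_IFCONTAINER | 0o644))
print(errno.errorcode[check_stream_step(stat.S_IFREG | 0o644)])
```

Modeling the container as its own S_IFMT value is only one reading. If it were an attribute of an ordinary file or directory, the test would have to look somewhere other than the type field.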
super_operations would need a way for filesystems to expose their support
for the "named streams" or "container" capability.
Any named stream object storage must count toward quotas.
The Linking/Locking Problem
Al Viro points out a number of serious, unaddressed issues
[http://marc.theaimsgroup.com/?l=reiserfs&m=109463622427391&w=2]
including
> Locking: see above - links to regular files would create directories
> seen in many places. With all related issues...
If this cannot be solved then we can simply punt and decide that the
container property is not automatic for all files and directories on
filesystems that support the capability. This would be specified when the
file or dir is created and linking to such an object would be prohibited.
Linking to a container or to a contained object presents problems, but
creating links out of the container seems like it would be just like any
other link object, yes?
Other Unaddressed Issues
Alan Cox
> Another interesting question btw with substreams of a file is what the
> semantics of fchown, fstat, fchmod, fchdir etc are, or of mounting a
> file over a substream.
Al Viro
> Note that we also have fun issues with device nodes (Linus' "show
> partitions" vs. "show metadata from hosting filesystem"), which makes
> it even more dubious. We also have symlinks to deal with (do they have
> attributes? where should that be exported?).
David Dabbs
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 10:41 [RFC] Pathname Semantics with // David Dabbs
@ 2004-09-08 16:13 ` Hans Reiser
2004-09-09 16:36 ` Peter Foldiak
` (2 more replies)
2004-09-09 17:33 ` Christian Mayrhuber
1 sibling, 3 replies; 22+ messages in thread
From: Hans Reiser @ 2004-09-08 16:13 UTC (permalink / raw)
To: David Dabbs; +Cc: linux-fsdevel, 'ReiserFS List'
Use of : in addition to / is a bad idea, see The Hideous Name by Rob
Pike for why.
Hans
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-08 16:13 ` Hans Reiser
@ 2004-09-09 16:36 ` Peter Foldiak
2004-09-09 19:21 ` David Dabbs
2004-09-09 21:51 ` David Dabbs
2 siblings, 0 replies; 22+ messages in thread
From: Peter Foldiak @ 2004-09-09 16:36 UTC (permalink / raw)
To: David Dabbs; +Cc: reiser, David Dabbs, linux-fsdevel, 'ReiserFS List'
see e.g.
http://www.cs.bell-labs.com/cm/cs/doc/85/1-05.ps.gz
On Wed, 2004-09-08 at 17:13, Hans Reiser wrote:
> Use of : in addition to / is a bad idea, see The Hideous Name by Rob
> Pike for why.
>
> Hans
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [RFC] Pathname Semantics with //
2004-09-08 16:13 ` Hans Reiser
2004-09-09 16:36 ` Peter Foldiak
@ 2004-09-09 19:21 ` David Dabbs
2004-09-10 0:49 ` Hans Reiser
2004-09-09 21:51 ` David Dabbs
2 siblings, 1 reply; 22+ messages in thread
From: David Dabbs @ 2004-09-09 19:21 UTC (permalink / raw)
To: 'Hans Reiser'; +Cc: linux-fsdevel, 'ReiserFS List'
>
> Use of : in addition to / is a bad idea, see The Hideous Name by Rob
> Pike for why.
>
> Hans
>
I've read The Hideous Name, and I think you're taking Pike out of context.
He wrote that document when device files were still only a part of a
research version of UNIX. His main point is that programs should not be
saddled with understanding physical devices when constructing filenames.
Regarding use of colons in file name spaces, I find the following:
>name of a file is a string that identifies the disk drive holding the file.
>Syntax separates these components of the name; MS-DOS uses a colon following
>the disk name, a single character:
>
>A:FILE is a file on disk drive A, while
>B:FILE is a file on disk drive B.
>
>The advantage of putting such information in names is that software need
>not know about disks to manipulate files. Internally, of course, system
>software must use the syntax of the name to locate the file,
The colon itself is not in question here; the problem is that MS-DOS naming
forces software to know about disks to manipulate files.
>ucbvax were a gateway we could access files on a distant machine as
>UCBVAX::KREMVAX::file, it is only because the semantics of :: explicitly
>permit such access. The :: operator is implemented by passing the string
>after it to the remote machine, but first checking its syntax, so the file
>name parser must have special code for multiple ::'s.
So what if VMS used two colons? They could have been tildes. The colon was
incidental to his primary point:
>Instead, if names had multiple components (that is, syntax), where the
>components did not necessarily correspond to physical devices, the name
>space would have the advantages of that of both MVS and MS-DOS, with the
>disadvantages of neither. A good example: UNIX(R)
He later gripes about the colon w.r.t the Ibis remote file system. Again, it
is the same complaint. But we will not be changing paths to the objects
every application understands today. An opaque escape is added to the
beginning and again at the end. Everything in between is standard UNIX
pathnames, though sometimes the use of dot is required where it may be
elided in practice today.
Regarding slashes, I can't find anywhere that Pike criticizes slashes.
Now, I can see where the proposal runs afoul of his advice that:
>The syntax should be clean and uniform; every new syntactic rule requires
>at least one, and usually many, semantic rules to resolve peculiarities
>introduced by the new syntax. If the name space is a tree or any other kind
>of graph, a single character should be used to separate nodes in a name.
Well, the only reason two trailing slashes are required is that we need some
way to distinguish a directory's named streams from its entries. What is
worse, one extra slash at the end of a directory (or two for a file) or
"/metas"?
The SuS is the standards document to which Linux and other Unices attempt to
conform. The proposal inherits the pre-existing double leading slash and the
slash pathname delimiter. If I understand you correctly, you're saying there
is a problem with the slash character. Could you be more specific? I don't
think you're proposing to change the Linux path delimiter, are you?
To what the SuS allows, only the dreaded colon is added. As I said in the
proposal, the colon is arbitrary. Also, the colon could be elided for
relative paths, but consistency seemed important. If we patched the VFS,
etc. to support Linux-specific pathname semantics you should (with hope) be
able to do the following without modifying the applications:
mkdir test
cd test
touch foo.txt # dir entry
emacs //.//foo.txt # named stream file
All you had to do was take the existing path, "foo.txt", prepend a special
escape and append two slashes. The new named streams are there only for the
initiated applications and humans. Every other naïve application is happily
ignorant.
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 19:21 ` David Dabbs
@ 2004-09-10 0:49 ` Hans Reiser
2004-09-10 3:06 ` David Dabbs
0 siblings, 1 reply; 22+ messages in thread
From: Hans Reiser @ 2004-09-10 0:49 UTC (permalink / raw)
To: David Dabbs; +Cc: linux-fsdevel, 'ReiserFS List'
David Dabbs wrote:
>
>
>>Use of : in addition to / is a bad idea, see The Hideous Name by Rob
>>Pike for why.
>>
>>Hans
>>
>>
>>
>
>I've read The Hideous Name, and I think you're taking Pike out of context.
>He wrote that document when device files were still only a part of a
>research version of UNIX. His main point is that programs should not be
>saddled with understanding physical devices when constructing filenames.
>
No, it was more than that. It was that hierarchy requires only one
delimiter.
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [RFC] Pathname Semantics with //
2004-09-10 0:49 ` Hans Reiser
@ 2004-09-10 3:06 ` David Dabbs
2004-09-10 5:40 ` Hans Reiser
0 siblings, 1 reply; 22+ messages in thread
From: David Dabbs @ 2004-09-10 3:06 UTC (permalink / raw)
To: 'Hans Reiser'; +Cc: linux-fsdevel, 'ReiserFS List'
> Hans Reiser wrote:
>
> David Dabbs wrote:
>
> >>Use of : in addition to / is a bad idea, see The Hideous Name by Rob
> >>Pike for why.
> >>
> >>Hans
> >>
> >
> >I've read The Hideous Name, and I think you're taking Pike out of
> context.
> >He wrote that document when device files were still only a part of a
> >research version of UNIX. His main point is that programs should not be
> >saddled with understanding physical devices when constructing filenames.
> >
> No, it was more than that. It was that hierarchy requires only one
> delimiter.
Of course his essay was about more than that! Where you quote me above, I
was referring to his main point /about use of the colon/. Perhaps my message
was truncated and you didn't get to read a bit later where I acknowledge
>Regarding slashes, I can't find anywhere that Pike criticizes slashes.
>Now, I can see where the proposal runs afoul of his advice that:
>
>The syntax should be clean and uniform; every new syntactic rule requires
>at least one, and usually many, semantic rules to resolve peculiarities
>introduced by the new syntax. If the name space is a tree or any other kind
>of graph, a single character should be used to separate nodes in a name.
>
>Well, the only reason two trailing slashes are required is that we need
>some way to distinguish a directory's named streams from its entries. What
>is worse, one extra slash at the end of a directory (or two for a file) or
>"/metas"?
I completely agree with Pike on the one delimiter thing -- it is certainly
much more elegant. And we can have that simplicity/elegance as long as
directories are prohibited from having metadata. Do you have a proposal to
expose metadata on a directory such that it
a) allows one to distinguish a directory entry from directory metadata,
b) uses only already-reserved pathname character(s),
c) doesn't require any reserved name,
d) is the same delimiter used for file metadata and
e) doesn't butt heads with SuS?
In a different post you said
>Oh, and yes, I understand that minimizing the cost of change by
>being artful is desirable.
Isn't it artful to try for the most innovation with the least breakage;
taking as little from the namespace and breaking only as many eggs as are
needed to make that innovative new omelet? I'm certainly willing to break as
many eggs as needed, but no more.
I don't claim to know as much about closure (or free trade) as do you. My
mental gifts are in other matters. ;-)
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-10 3:06 ` David Dabbs
@ 2004-09-10 5:40 ` Hans Reiser
0 siblings, 0 replies; 22+ messages in thread
From: Hans Reiser @ 2004-09-10 5:40 UTC (permalink / raw)
To: David Dabbs; +Cc: linux-fsdevel, 'ReiserFS List', Peter Foldiak
David Dabbs wrote:
>
>
>
> Do you have a proposal to
>expose metadata on a directory such that it
>
>a) allows one to distinguish a directory entry from directory metadata,
>
>
this should be only a style convention, not a deep semantic difference.
Maybe Peter can comment on this.
>b) uses only already-reserved pathname character(s),
>
>
this is not important in practice
>c) doesn't require any reserved name,
>
>
this is not important in practice
>d) is the same delimiter used for file metadata and
>
>
?
>e) doesn't butt heads with Sus?
>
>
standards are not the future, they are efforts to make the past less
painful.
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [RFC] Pathname Semantics with //
2004-09-08 16:13 ` Hans Reiser
2004-09-09 16:36 ` Peter Foldiak
2004-09-09 19:21 ` David Dabbs
@ 2004-09-09 21:51 ` David Dabbs
2004-09-09 6:10 ` Hans Reiser
2 siblings, 1 reply; 22+ messages in thread
From: David Dabbs @ 2004-09-09 21:51 UTC (permalink / raw)
To: linux-fsdevel, 'ReiserFS List'
Before we get too far into the merits of implementation-specific pathname
resolution for paths starting with //, it seems wise to address the POSIX
implications of any duality implied by this (or any other) semantic change.
This is the first issue raised in my original post. Gunnar Ritter also
addressed it in this lkml post
[http://marc.theaimsgroup.com/?l=linux-kernel&m=109475512921055&w=2]
POSIX demands that open() must fail with ENOTDIR if "a component of the path
prefix is not a directory." If resolving a path using alternate pathname
resolution permits files to "act as" directories, is it legal to fail with
ENOTDIR when the VFS/filesystem does not support the capability even though
the final pathname component is of type S_IFREG?
Also open() must fail with EACCES if "search permission is denied on a
component of the path prefix..." Under normal circumstances this means no
execute permission on a /directory/ path component. What does it mean when
path prefixes may be of type S_IFREG? Does this mean that normally
non-executable files must have execute permissions set? This is an important
security issue.
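
For reference, today's behavior is easy to observe. The sketch below (Python, standard library only; the file names are arbitrary) creates a regular file and tries to open a path that uses it as a prefix component, with one and with two slashes. It shows current semantics only and says nothing about how an alternate resolution mode would behave.

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    regular = os.path.join(d, "foo.txt")
    with open(regular, "w") as f:
        f.write("data\n")

    results = {}
    for suffix in ("/SomeName", "//SomeName"):
        try:
            # foo.txt is a regular file used here as a path prefix component.
            fd = os.open(regular + suffix, os.O_RDONLY)
            os.close(fd)
            results[suffix] = 0
        except OSError as e:
            results[suffix] = e.errno

    # Show symbolic errno names where available.
    print({k: errno.errorcode.get(v, v) for k, v in results.items()})
```

On Linux both attempts fail with ENOTDIR, because an embedded "//" is currently equivalent to "/". The proposal would need the second form to resolve differently only under the "//:" escape.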
BTW, a "Strictly Conforming POSIX Application"
[http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap02.html#tag_02_02_01]
#Shall accept any implementation behavior that results from actions it takes
#in areas described in IEEE Std 1003.1-2001 as implementation-defined or
#unspecified, or where IEEE Std 1003.1-2001 indicates that implementations
#may vary.
How should we interpret this?
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 21:51 ` David Dabbs
@ 2004-09-09 6:10 ` Hans Reiser
0 siblings, 0 replies; 22+ messages in thread
From: Hans Reiser @ 2004-09-09 6:10 UTC (permalink / raw)
To: David Dabbs; +Cc: linux-fsdevel, 'ReiserFS List'
David Dabbs wrote:
>Before we get too far into the merits of implementation-specific pathname
>resolution for paths starting with //, it seems wise to address the POSIX
>implications of any duality implied by this (or any other) semantic change.
>
>This is the first issue raised in my original post. Gunnar Ritter also
>addressed it this lkml post
>[http://marc.theaimsgroup.com/?l=linux-kernel&m=109475512921055&w=2]
>
>
>POSIX demands that open() must fail with ENOTDIR if "a component of the path
>prefix is not a directory." If resolving a path using alternate pathname
>resolution
>
what is alternate pathname resolution? I don't understand this
paragraph somehow, probably my error.
>permits files to "act as" directories, is it legal to fail with
>ENOTDIR when the VFS/filesystem does not support the capability even though
>the final pathname component is of type S_IFREG?
>
>Also open() must fail with EACCES if "search permission is denied on a
>component of the path prefix..." Under normal circumstances this means no
>execute permission on a /directory/ path component. What does it mean when
>path prefixes may be of type S_IFREG? Does this mean that normally
>non-executable files must have execute permissions set? This is an important
>security issue.
>
>
I think we need two executable bits, one for searching directories and
one for executing the default file of the directory.
>BTW, a "Strictly Conforming POSIX Application"
>[http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap02.html#tag_02_02_01]
>
>#Shall accept any implementation behavior that results from actions it takes
>#in areas described in IEEE Std 1003.1-2001 as implementation-defined or
>#unspecified, or where IEEE Std 1003.1-2001 indicates that implementations
>#may vary.
>
>How should we interpret this?
>
>
>David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 10:41 [RFC] Pathname Semantics with // David Dabbs
2004-09-08 16:13 ` Hans Reiser
@ 2004-09-09 17:33 ` Christian Mayrhuber
2004-09-09 20:17 ` David Dabbs
2004-09-09 23:03 ` Jamie Lokier
1 sibling, 2 replies; 22+ messages in thread
From: Christian Mayrhuber @ 2004-09-09 17:33 UTC (permalink / raw)
To: reiserfs-list; +Cc: David Dabbs, linux-fsdevel
What about using // as some URI entry point?
A URI looks like:
PROTOCOL://PROTOCOL_SPECIFIC_NAMESPACE_IMPLEMENTATION
As no one can guarantee Unix semantics in a URI space, only symbolic
links are allowed to and from the URI namespace.
The "protocol" names are issued by the kernel to prevent clashes and
fragmentation.
Some example URIs for a userspace implementation like FUSE:
//http://somehost:port/foo/bla
//tar:///path/to/some/file.tar.gz/sub/directories
That could unify the namespace for userspace apps like GNOME, KDE and mc.
The / could even be a shortcut for the //file:// URI, but at the cost of
losing the capability to hardlink.
If someone wants to expose filesystem metadata
something like //metas:// instead of //: could do it.
  //metas://a-directory/SomeName      regular dir entry
  //metas://a-directory//SomeName     named stream
  //metas://a-directory///SomeName    == //metas://a-directory/SomeName
The same applies to files as well...
  //metas://foo.txt/SomeName      ERROR
  //metas://foo.txt//SomeName     named stream
  //metas://foo.txt///SomeName    == //metas://foo.txt/SomeName == ERROR
And for the root dir also...
  //metas:///        root
  //metas:////       root's named stream dir; looks odd, but has //
                     after //metas://
  //metas://///      root; has /// after //metas://, which means
                     / == root dir

  //metas:///.       root
  //metas:///..      root
  //metas:///./      root
  //metas:///.//     root's named stream dir
  //metas:///.///    root
  //metas:///./.     root
  //metas:///./..    root

  //metas://./       curr dir
  //metas://.//      curr dir's named stream dir
  //metas://.///     curr dir
  //metas://./.      curr dir
On Thursday 09 September 2004 12:41, David Dabbs wrote:
>
> During the recent reiser4-related namespace/semantics discussions, Alan Cox
> [http://marc.theaimsgroup.com/?l=reiserfs&m=109405544711435&w=2] and others
> referred to the Single UNIX Specification v3 (SuS) provision for
> implementation-specific pathname resolution. After a close reading of the
> SuS, here is a proposal for how we might flexibly and legally accommodate
> new filesystem features.
>
> Assumption
> * "Named file streams" (file-as-dir/dir-as-file/whatever) are worth
> implementing, if only to provide Windows interoperability for Samba.
>
> Goals
> * No breakage for naïve applications.
> * No new APIs e.g. openat().
> * Maintain POSIX/SuS compliance.
> * Expose the "named streams" capability via the natural "collection of
> files" approach as Linus called it.
> [http://marc.theaimsgroup.com/?l=reiserfs&m=109353819826980&w=2]
>
> Caveat
> Other than punting and prohibiting linking to objects created with
> a container attribute, this proposal doesn't present a solution to thorny
> issues such as locking.
>
> In A Nutshell
>
> //:a-directory/SomeName regular dir entry
> //:a-directory//SomeName named stream
> //:foo.txt//SomeName named stream
> //:/.// root's named stream dir
>
>
> Still interested? Read on.
>
>
>
> What the Spec Says
>
> All Directories Are Files, but Not All Files Are Directories
> In SuS (3.163 File), a file is "an object that can be written to, or read
> from, or both. A file has certain attributes, including access permissions
> and type. File types include regular file, character special file, block
> special file, FIFO special file, symbolic link, socket, and directory. Other
> types of files may be supported by the implementation." This last bit is
> encouraging, as we are free to implement new file types.
>
> Can An Object Be More Than One Type?
> Recent debate hovered around notions of "file-as-dir" or "dir-as-file," but
> is this legal? Everywhere the SuS mentions file types they are referred to
> as "distinct file types." I didn't scan all the threads for technical
> objections to file type duality, but it seems like we should steer clear of
> objects that are more than one S_IFMT type. Suppose we implement a container
> type. Is it permitted for an object to be both a file and a container?
>
> Pathname Resolution
> Whatever innovations people dream up, applications still need to provide a
> pathname to the existing APIs. All manner of proposals were floated for
> "named streams" -- from reiser4's "/metas" to prefixes such as ".$" or
> "...". [http://marc.theaimsgroup.com/?l=reiserfs&m=109413602520921&w=2]
>
> According to SuS 4.11 Pathname Resolution,
> http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_11
> ...
> A pathname that begins with two successive slashes may be interpreted in
> an implementation-defined manner, although more than two leading slashes
> shall be treated as a single slash.
> ....
>
> // seems relatively clean, but if more than two leading slashes equate to
> one, how would an absolute path be specified? How about:
>
> //:foo/bar/baz relative path to foo/bar/baz
> //:/foo/bar/baz absolute path to /foo/bar/baz
>
> While other characters would work, colon seems to fit because "//:" is the
> reverse of "file://" minus file.
>
>
> The Directory Problem
> Among the issues raised is disambiguating a stream object name from an
> identically named object within a directory. With pathnames prefixed by //:,
> one could use something as simple as //, e.g.
>
> //:a-directory/SomeName regular dir entry
> //:a-directory//SomeName named stream
> //:a-directory///SomeName == //:a-directory/SomeName
>
> The same applies to files as well...
>
> //:foo.txt/SomeName ERROR
> //:foo.txt//SomeName named stream
> //:foo.txt///SomeName == //:foo.txt/SomeName == ERROR
>
> And for the root dir also...
>
> //:/ root
> //:// root
> //:/// root
>
> //:/. root
> //:/.. root
> //:/./ root
> //:/.// root's named stream dir
> //:/./// root
> //:/./. root
> //:/./.. root
>
> //:./ curr dir
> //:.// curr dir's named stream dir
> //:./// curr dir
> //:./. curr dir
>
> No characters or words are removed or reserved from the 'normal' namespace.
> A person named Metas can still use his/her name on files, and script kiddies
> can still use "...". Applications that got away with pathnames with two
> embedded slashes would need to be precise when using the new pathnames.
>
> Named Streams on Named Streams?
> If we want to do no more than allow named stream compatibility with NTFS,
> then we can prohibit "names on names". The directory referenced by foo//
> would be flat, with only one level of files allowed. Attempts to create
> something like //:/a-directory//SomeName//Another would return an error. It
> doesn't seem like there's any reason to disallow names on names, though.
> Stream-aware applications would simply discard the deeper streams (or
> subdirectories) in the same way that named streams are discarded when
> copying from NTFS to Linux.
>
> VFS Support
> Apps written to take advantage of the new pathname semantics will need to
> know whether it is supported under the current configuration. Would
> statvfs() be the place to inquire whether "alternate namespace" support is
> available? Would it also be necessary to require apps to query for the
> "alternate pathname prefix character" e.g. ":"? Doesn't seem necessary.
>
> link_path_walk() and other namei.c functions will need changes to support
> the // semantics. If a path component is followed by // then a macro such as
> S_ISCONTAINER(stat_t) must be true.
>
> super_operations would need a way for filesystems to expose their support
> for the "named streams" or "container" capability.
>
> Any named stream object storage must count toward quotas.
>
>
> The Linking/Locking Problem
> Al Viro points out a number of serious, unaddressed issues
> [http://marc.theaimsgroup.com/?l=reiserfs&m=109463622427391&w=2]
> including
>
> > Locking: see above - links to regular files would create directories
> > seen in many places. With all related issues...
>
> If this cannot be solved then we can simply punt and decide that the
> container property is not automatic for all files and directories on
> filesystems that support the capability. This would be specified when the
> file or dir is created and linking to such an object would be prohibited.
>
> Linking to a container or to contained object presents problems, but
> creating links out of the container seems like it would be just like any
> other link object, yes?
>
>
> Other Unaddressed Issues
> Alan Cox
> > Another interesting question btw with substreams of a file is what the
> > semantics of fchown, fstat, fchmod, fchdir etc are, or of mounting a
> > file over a substream."
>
> Al Viro
> > Note that we also have fun issues with device nodes (Linus' "show
> > partitions" vs. "show metadata from hosting filesystem"), which makes
> > it even more dubious. We also have symlinks to deal with (do they have
> > attributes? where should that be exported?)."
>
>
>
> David Dabbs
>
>
--
lg, Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [RFC] Pathname Semantics with //
2004-09-09 17:33 ` Christian Mayrhuber
@ 2004-09-09 20:17 ` David Dabbs
2004-09-09 20:41 ` Andreas Dilger
2004-09-10 10:37 ` Christian Mayrhuber
2004-09-09 23:03 ` Jamie Lokier
1 sibling, 2 replies; 22+ messages in thread
From: David Dabbs @ 2004-09-09 20:17 UTC (permalink / raw)
To: 'Christian Mayrhuber'; +Cc: linux-fsdevel, reiserfs-list
> Christian Mayrhuber
>
> What about using // as some URI entry point?
> An URI looks like:
> PROTOCOL://PROTOCOL_SPECIFIC_NAMESPACE_IMPLEMENTATION
>
I considered that in that //: is implicitly file://, but didn't make it
explicit in the proposal. Perhaps //: could be a legal alias for //file://.
> ...
>
> If someone wants to expose filesystem metadata
> something like //metas:// instead of //: could do it.
>
> //metas://a-directory/SomeName regular dir entry
> //metas://a-directory//SomeName named stream
> //metas://a-directory///SomeName == //metas://a-directory/SomeName
>
But this means that metas:// becomes an alias for file:// depending upon
what you want to access. If we want to explicitly require a URI for files,
which may be the way to go, then there should be one protocol for file
access, file://. The semantics for that would be such that // on the end of
a file or directory means "access the named stream if the a) VFS supports
this and b) the device where the named object resides supports named
streams." Programs that want to access named streams would use
//file://a-directory//SomeName
to access the streams as well as normal objects. Apps that exclusively use
the URI-style path would need to be more precise than is required today.
Today, concatenating "/home/" and "/myfile" works and references the
correct file. Under URI semantics, this would fail or not behave as
expected.
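
A rough sketch of that pitfall, assuming the //file:// prefix and the "exactly two slashes means named stream" rule from the proposal. The helper count_stream_separators is hypothetical and only counts how many stream lookups a pathname would trigger.

```python
import re

PREFIX = "//file://"   # URI-style prefix discussed above; not an existing kernel feature

def count_stream_separators(path):
    """Count the exact '//' runs that would select a named stream."""
    body = path[len(PREFIX):] if path.startswith(PREFIX) else path
    body = body.lstrip("/")            # the leading run only marks an absolute path
    return sum(1 for run in re.findall(r"/+", body) if len(run) == 2)

joined = "/home/" + "/myfile"          # sloppy concatenation gives '/home//myfile'
print(count_stream_separators(PREFIX + joined))            # one stream lookup
print(count_stream_separators(PREFIX + "/home/myfile"))    # none
```

With a plain pathname the doubled slash is harmless; with the prefix it would ask for a stream named myfile on /home instead of the file.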
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 20:17 ` David Dabbs
@ 2004-09-09 20:41 ` Andreas Dilger
2004-09-10 9:11 ` Markus Törnqvist
2004-09-10 10:37 ` Christian Mayrhuber
1 sibling, 1 reply; 22+ messages in thread
From: Andreas Dilger @ 2004-09-09 20:41 UTC (permalink / raw)
To: David Dabbs; +Cc: 'Christian Mayrhuber', linux-fsdevel, reiserfs-list
Christian Mayrhuber wrote:
> What about using // as some URI entry point?
One problem that using "//" may have (though it is personally my favourite
option right now) is that "realpath(3)" may cause the "//" to be eaten, and
this is used by many programs to "resolve" pathnames to remove symlinks,
bogus "/./" etc. This may need a small fix in glibc, but at least it is
still central instead of teaching a million apps about different semantics.
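
The same effect can be seen with any purely lexical normalizer. As an illustration (this is Python's posixpath.normpath, not realpath(3), so treat it as an analogy only): it keeps exactly two leading slashes, as POSIX allows, but collapses every embedded "//".

```python
import posixpath

def normalized(path):
    """Lexically normalize a pathname the way many path-cleaning helpers do."""
    return posixpath.normpath(path)

print(normalized("//:a-directory//SomeName"))   # embedded '//' is collapsed
print(normalized("///a-directory/SomeName"))    # three leading slashes become one
```

So a program that cleans up pathnames before opening them would silently turn a named-stream reference into a regular directory entry.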
Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://members.shaw.ca/adilger/ http://members.shaw.ca/golinux/
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 20:41 ` Andreas Dilger
@ 2004-09-10 9:11 ` Markus Törnqvist
0 siblings, 0 replies; 22+ messages in thread
From: Markus Törnqvist @ 2004-09-10 9:11 UTC (permalink / raw)
To: David Dabbs, 'Christian Mayrhuber', linux-fsdevel,
reiserfs-list
On Thu, Sep 09, 2004 at 02:41:02PM -0600, Andreas Dilger wrote:
>One problem that using "//" may have (though it is personally my favourite
>option right now) is that "realpath(3)" may cause the "//" to be eaten, and
Exactly.
However, I must say that adding // somewhere is IMO _much_ uglier than
doing ./file/..metas/ for real.
I'd also imagine that it's easier to teach progs ..metas/ than ^//(.*)
but I may be proven wrong and my opinion cat'ed to /dev/null ;)
>this is used by many programs to "resolve" pathnames to remove symlinks,
>bogus "/./" etc. This may need a small fix in glibc, but at least it is
>still central instead of teaching a million apps about different semantics.
Different semantics?
Like you wouldn't teach old programs the // syntax? Or what am I missing :)
You don't have to teach them "streams as files-as-dirs" if you have a mirror
namespace with //?
Well, I'd rather stick to teaching progs about ..metas/ :P
--
mjt
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 20:17 ` David Dabbs
2004-09-09 20:41 ` Andreas Dilger
@ 2004-09-10 10:37 ` Christian Mayrhuber
1 sibling, 0 replies; 22+ messages in thread
From: Christian Mayrhuber @ 2004-09-10 10:37 UTC (permalink / raw)
To: reiserfs-list; +Cc: linux-fsdevel
On Thursday 09 September 2004 22:17, David Dabbs wrote:
>
> > Christian Mayrhuber
> >
> > What about using // as some URI entry point?
> > An URI looks like:
> > PROTOCOL://PROTOCOL_SPECIFIC_NAMESPACE_IMPLEMENTATION
> >
> I considered that in that //: is implicitly file://, but didn't make it
> explicit in the proposal. Perhaps //: could be a legal alias for //file://.
Sorry, I wasn't clear about this.
I thought more of the way //meta:// would mean //: == filesystem + metadata
exposed.
//file:///some/path equals /some/path == filesystem strictly posix conformant
How the entry points are named does not really matter as long they are
standardized.
I just wanted to keep the door open for userspace hooks that work on all
filesystems, not just reiser4.
Even some ext3 user may want to "cd //tar://myarchive.tgz" at some time.
--
lg, Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 17:33 ` Christian Mayrhuber
2004-09-09 20:17 ` David Dabbs
@ 2004-09-09 23:03 ` Jamie Lokier
2004-09-10 1:37 ` David Dabbs
2004-09-10 11:06 ` Christian Mayrhuber
1 sibling, 2 replies; 22+ messages in thread
From: Jamie Lokier @ 2004-09-09 23:03 UTC (permalink / raw)
To: Christian Mayrhuber; +Cc: reiserfs-list, David Dabbs, linux-fsdevel
Christian Mayrhuber wrote:
> //http://somehost:port/foo/bla
While we're here, I'll point out that http://somehost/foo/bla and
http://somehost/foo/bla/ are valid, distinct URLs.
If http://somehost/foo/bla/ exists, many HTTP servers will return it
as the target of a redirect for http://somehost/foo/bla. The reason
for this is that it's important for any relative URLs in a document to
be resolved relative to the correct base URL of the document.
If we do actually allow file-and-directory objects, it makes sense for
the path _with_ slash on the end (//http://somehost/foo/bla/) to be
accessible as both a file and a directory, mapping to that HTTP
resource, and resources below it in the path hierarchy.
Then relative paths in the resource resolve sensibly in programs which
think they're in the local filesystem.
It also makes sense for the path _without_ slash to be recognisable to
programs as needing the slash. That could be done by making the path
without slash be a symlink to one with (symlinks are a natural
representation of redirects). But that seems like a gross violation
of POSIX semantics: we can't really return different objects depending
on the trailing slash.
So instead, it makes sense for either form of the path to be a
file-and-directory object whose type is S_IFDIR, and specifically not
S_IFREG -- but only in the cases where the HTTP transaction to the URL
without a trailing slash returns a redirect to the same URL with a
slash appended. When the HTTP transaction to the URL without a
trailing slash returns a resource, then its type in the filesystem
should be S_IFREG, but still a file-as-directory object.
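
As a sketch of that rule (the function file_type_for and its arguments are invented for illustration; no HTTP request is made, the caller supplies the status and Location header it saw for the URL without the trailing slash):

```python
import stat

REDIRECTS = (301, 302, 303, 307, 308)

def file_type_for(url, status, location=None):
    """Pick the S_IFMT type a proxy filesystem might report for url."""
    if status in REDIRECTS and location == url + "/":
        # the server redirects to the same URL with a slash appended
        return stat.S_IFDIR
    if 200 <= status < 300:
        # the server returned a resource directly
        return stat.S_IFREG
    raise FileNotFoundError(url)

mode = file_type_for("http://somehost/foo/bla", 301, "http://somehost/foo/bla/")
print(stat.S_ISDIR(mode))
```

Either way the object would still be a file-and-directory object; only the reported type differs.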
Food for thought.
-- Jamie
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [RFC] Pathname Semantics with //
2004-09-09 23:03 ` Jamie Lokier
@ 2004-09-10 1:37 ` David Dabbs
2004-09-10 9:53 ` [SPAM] " Jamie Lokier
2004-09-10 11:47 ` Christian Mayrhuber
2004-09-10 11:06 ` Christian Mayrhuber
1 sibling, 2 replies; 22+ messages in thread
From: David Dabbs @ 2004-09-10 1:37 UTC (permalink / raw)
To: 'Jamie Lokier', 'Christian Mayrhuber'
Cc: reiserfs-list, linux-fsdevel
> From: Jamie Lokier [mailto:jamie@shareable.org]
>
> Christian Mayrhuber wrote:
> > //http://somehost:port/foo/bla
>
> While we're here, I'll point out that http://somehost/foo/bla and
> http://somehost/foo/bla/ are valid, distinct URLs.
>
>... snip
>
> Food for thought.
>
> -- Jamie
While I think there are bigger fish to fry before tackling URLish stuff
beyond filesystem access, I'll dive in.
Handwaving over changes to create() and many other apis that enable this,
what happens when you type the following commands?
ln -s //http://dabbs.net/foo/ MyURL
cat MyURL
The first command works just fine today sans /any/ changes. Cat of course
fails (today) because it calls open("MyURL") which will follow the link
provided O_NOFOLLOW was not passed. After our magic mods, VFS will parse the
// escape, see that it is NOT file:// and then what? When we were talking
strictly about filesystem access this wasn't an issue.
We certainly don't want the kernel in the business of being a protocol
proxy, so any non-file:// pathnames could simply return EFAULT or ENOTDIR,
or ELNKURL, which indicates that the target is a URL. Apps that want to do
something /useful/ with the target resource (i.e. desktops) will have
http/ftp/etc libraries. They are free to stat() the file, get the linked-to
URL and use whatever protocol client lib they wish to access the URL.
Shooting from the hip here. If we want to unify namespaces in a UNIXy way,
what if we make the VFS expose all the non-file "protocol" namespaces
through one mount point, device node or whatever. A filesystem, perhaps
something built using FiST [http://www.filesystems.org/], would "handle" a
protocol. Another, perhaps preferred, option is to steer in the direction of
Plan9, where ftp can be mounted and handled by a user-space filesystem,
ftpfs.
See http://plan9.bell-labs.com/magic/man2html/4/ftpfs
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [SPAM] RE: [RFC] Pathname Semantics with //
2004-09-10 1:37 ` David Dabbs
@ 2004-09-10 9:53 ` Jamie Lokier
2004-09-10 17:11 ` David Dabbs
2004-09-10 11:47 ` Christian Mayrhuber
1 sibling, 1 reply; 22+ messages in thread
From: Jamie Lokier @ 2004-09-10 9:53 UTC (permalink / raw)
To: David Dabbs; +Cc: 'Christian Mayrhuber', reiserfs-list, linux-fsdevel
David Dabbs wrote:
> After our magic mods, VFS will parse the // escape, see that it is
> NOT file:// and then what? When we were talking strictly about
> filesystem access this wasn't an issue.
Obviously it'll access whatever is mounted on //http:/, which is
probably a proxy to a daemon which handles HTTP.
Note: you can do this _today_ with no changes. Just mount such a
daemon on /http: .
Similarly, you can make the suggestions of //file:/ and //:/ resolve
_today_ by simply putting the appropriate mounts on /file: and /: .
So is there any point in making // special?
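
A quick way to check this on a Linux box, assuming (as Linux does today) that a leading "//" is resolved exactly like "/". The temporary directory is only there to have a real path to test against.

```python
import os
import tempfile

def resolves_like_single_slash(abs_path):
    """True if '/' + abs_path (two leading slashes) names the same object."""
    return os.path.samefile("/" + abs_path, abs_path)

with tempfile.TemporaryDirectory() as d:
    # d is absolute, so "/" + d starts with "//"
    result = resolves_like_single_slash(os.path.abspath(d))

print(result)
```

On such a system, anything mounted on /http: is already reachable as //http:/.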
> We certainly don't want the kernel in the business of being a protocol
> proxy, so any non-file:// pathnames could simply return EFAULT or ENOTDIR,
> or ELNKURL, which indicates that the target is a URL. Apps that want to do
> something /useful/ with the target resource (i.e. desktops) will have
> http/ftp/etc libraries. They are free to stat() the file, get the linked-to
> URL and use whatever protocol client lib they wish to access the URL.
What is the point in putting anything at all in the common filesystem,
if the visible content of pseudo-files depends on the application
accessing them?
If you go down the road of requiring every application to use a
library, then there is simply no need for anything at all in the VFS.
Of course, that way we get the Gnome-VFS / KDE / emacs mess that we
have now, where paths that work in one program don't work in another,
making them a whole lot less useful. There's also the unreliability
of having every filesystem operation go through a library prior to
calling the kernel -- what happens when a new fd-using syscall is
added to libc (like fgetxattr) and the intercepting library doesn't
know about it? What about chdir and fchdir? If you intercept those,
then you must intercept every syscall that uses a path for
consistency, even mount, fadvise and so on. Or alternatively, when
programs are required to call uservfs_open, uservfs_read etc., then
you have problems passing fds to libraries which don't call those.
In other words, doing it without kernel assistance is flaky. Although
it does work, it's very hard to do it well and consistently, and it's
subject to silent breakage as libcs and the kernel evolve.
That's really why the kernel has to act as a proxy (even if it's only
using NFS, FiST, CODA or whatever mounts). To get a unified
namespace, to ensure all accesses to a named object call the same code
(i.e. a daemon), and that path semantics are consistent even with
chdir and links inside a proxied directory.
> Shooting from the hip here. If we want to unify namespaces in a UNIXy way,
> what if we make the VFS expose all the non-file "protocol" namespaces
> through one mount point, device node or whatever. A filesystem, perhaps
> something built using FiST [http://www.filesystems.org/], would "handle" a
> protocol. Another, perhaps preferred, option is to steer in the direction of
> Plan9, where ftp can be mounted and handled by a user-space filesystem,
> ftpfs.
> See http://plan9.bell-labs.com/magic/man2html/4/ftpfs
You can already do it, something like this:
mkdir /http:; mount none /http: -t uservfs -o view=http
mkdir /ftp:; mount none /ftp: -t uservfs -o view=ftp
I don't see any compelling reason to make "//" special for this.
However, if there is such a reason, then you could just mount protocol
handlers on "//http:" and so on, and make "//" a normal directory with
a special name.
-- Jamie
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [SPAM] RE: [RFC] Pathname Semantics with //
2004-09-10 9:53 ` [SPAM] " Jamie Lokier
@ 2004-09-10 17:11 ` David Dabbs
0 siblings, 0 replies; 22+ messages in thread
From: David Dabbs @ 2004-09-10 17:11 UTC (permalink / raw)
To: 'Jamie Lokier'
Cc: 'Christian Mayrhuber', reiserfs-list, linux-fsdevel
> Jamie Lokier
>
> David Dabbs wrote:
>
> > Shooting from the hip here. If we want to unify namespaces in a UNIXy way,
> > what if we make the VFS expose all the non-file "protocol" namespaces
> > through one mount point, device node or whatever. A filesystem, perhaps
> > something built using FiST [http://www.filesystems.org/], would "handle" a
> > protocol. Another, perhaps preferred, option is to steer in the direction of
> > Plan9, where ftp can be mounted and handled by a user-space filesystem,
> > ftpfs.
> > See http://plan9.bell-labs.com/magic/man2html/4/ftpfs
>
> You can already do it, something like this:
>
> mkdir /http:; mount none /http: -t uservfs -o view=http
> mkdir /ftp:; mount none /ftp: -t uservfs -o view=ftp
>
> I don't see any compelling reason to make "//" special for this.
> However, if there is such a reason, then you could just mount protocol
> handlers on "//http:" and so on, and make "//" a normal directory with
> a special name.
>
> -- Jamie
Jamie, we _definitely_ agree, except apps that want to create links to URLs
will prepend one slash to the URL instead of two. Is your reference to
uservfs a "foo" reference or do you mean
http://sourceforge.net/projects/uservfs/? It looks a little dusty. But we
are pulling in the same direction.
The /file: node could simply be a symlink. Thus we have
cd /
ln -s / file:
mkdir http:; mount none /http: -t uservfs -o view=http
mkdir ftp:; mount none /ftp: -t uservfs -o view=ftp
#etc...
Pathnames would be resolved with the existing code in namei.c. I can
understand mounting a URL whose protocol looks like a fs tree (e.g. ftp),
but http? Namei() parses the pathname one component at a time, checks the
dcache, and goes to the fs when that fails. Let's trace through how a URL
might get resolved.
ln -s /http://sourceforge.net/projects/uservfs
cat uservfs
Pathname would be resolved as
/http:
sourceforge.net/
projects/
uservfs
I need to look closer at namei (or the uservfs code if it really supports a
view=http). As long as a fs can generate meaningful, stateful values in
response to VFS calls to real_lookup(), then this may work.
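
The component split above can be reproduced with a few lines. This is only a sketch of the splitting, not of namei itself: empty components produced by repeated slashes are dropped, which is why the "//" inside the URL disappears.

```python
def components(pathname):
    """Split a pathname into the names a component-wise lookup would see."""
    return [name for name in pathname.split("/") if name]

for name in components("/http://sourceforge.net/projects/uservfs"):
    print(name)
```

The filesystem mounted on /http: would then be asked to look up sourceforge.net, projects and uservfs in turn.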
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-10 1:37 ` David Dabbs
2004-09-10 9:53 ` [SPAM] " Jamie Lokier
@ 2004-09-10 11:47 ` Christian Mayrhuber
1 sibling, 0 replies; 22+ messages in thread
From: Christian Mayrhuber @ 2004-09-10 11:47 UTC (permalink / raw)
To: reiserfs-list; +Cc: linux-fsdevel
On Friday 10 September 2004 03:37, David Dabbs wrote:
>
> > From: Jamie Lokier [mailto:jamie@shareable.org]
> >
> > Christian Mayrhuber wrote:
> > > //http://somehost:port/foo/bla
> >
> > While we're here, I'll point out that http://somehost/foo/bla and
> > http://somehost/foo/bla/ are valid, distinct URLs.
> >
> >... snip
> >
> > Food for thought.
> >
> > -- Jamie
>
> While I think there are bigger fish to fry before tackling URLish stuff
> beyond filesystem access, I'll dive in.
>
> Handwaving over changes to create() and many other apis that enable this,
> what happens when you type the following commands?
>
> ln -s //http://dabbs.net/foo/ MyURL
> cat MyURL
>
> The first command works just fine today sans /any/ changes. Cat of course
> fails (today) because it calls open("MyURL") which will follow the link
> provided O_NOFOLLOW was not passed. After our magic mods, VFS will parse the
> // escape, see that it is NOT file:// and then what? When we were talking
> strictly about filesystem access this wasn't an issue.
>
> We certainly don't want the kernel in the business of being a protocol
> proxy, so any non-file:// pathnames could simply return EFAULT or ENOTDIR,
> or ELNKURL, which indicates that the target is a URL. Apps that want to do
> something /useful/ with the target resource (i.e. desktops) will have
> http/ftp/etc libraries. They are free to stat() the file, get the linked-to
> URL and use whatever protocol client lib they wish to access the URL.
I'd say return EFAULT for protocols the kernel has no userspace helper for.
If there is a userspace helper, ask it.
For now this means EFAULT all the time, except //file:// or //:, whatever,
because the kernel knows nothing about userspace helpers.
I'm suggesting //somename:// as an entry point for hooks to gnome-vfs or kio.
Providing these features to all userspace applications,
not just the ones linked with the gnome or kde libraries.
For ex.: cp "//audiocd:/Ogg Vorbis/*.ogg" ./
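
Roughly what I have in mind, as a sketch only: the HELPERS table, split_entry_point and resolve are invented names, and the only "helper" registered is the trivial file one. Anything else fails with EFAULT until a userspace helper is registered for it.

```python
import errno
import os

# Hypothetical registry of userspace helpers, keyed by entry-point name.
HELPERS = {"file": lambda rest: rest}

def split_entry_point(path):
    """Split '//name://rest' (or the '//:' shorthand) into (name, rest)."""
    if not path.startswith("//"):
        raise ValueError("not an entry-point pathname: %r" % path)
    name, colon, rest = path[2:].partition(":")
    if not colon or "/" in name:
        raise ValueError("not an entry-point pathname: %r" % path)
    if name and rest.startswith("//"):
        rest = rest[2:]              # drop the '//' of 'name://'
    return name or "file", rest      # '//:' is treated as shorthand for file

def resolve(path):
    name, rest = split_entry_point(path)
    helper = HELPERS.get(name)
    if helper is None:
        raise OSError(errno.EFAULT, os.strerror(errno.EFAULT), path)
    return helper(rest)

print(resolve("//file:///some/path"))
```

With this table, //audiocd:/... fails with EFAULT, while //file:///some/path and //:/some/path both come back as /some/path.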
>
> Shooting from the hip here. If we want to unify namespaces in a UNIXy way,
> what if we make the VFS expose all the non-file "protocol" namespaces
> through one mount point, device node or whatever. A filesystem, perhaps
> something built using FiST [http://www.filesystems.org/], would "handle" a
> protocol. Another, perhaps preferred, option is to steer in the direction of
> Plan9, where ftp can be mounted and handled by a user-space filesystem,
> ftpfs.
> See http://plan9.bell-labs.com/magic/man2html/4/ftpfs
>
>
> David
>
>
That's the most powerful option, but does not fit all use cases.
Sometimes you just may want to do "less //tar://myarchive.tar/README"
for read only content browsing.
Mounting a webdav or ftp server as userspace filesystem would be definitely
nice, but the filesystem has to guarantee posix conformance, because it is
mounted into a posix namespace.
--
lg, Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-09 23:03 ` Jamie Lokier
2004-09-10 1:37 ` David Dabbs
@ 2004-09-10 11:06 ` Christian Mayrhuber
1 sibling, 0 replies; 22+ messages in thread
From: Christian Mayrhuber @ 2004-09-10 11:06 UTC (permalink / raw)
To: reiserfs-list; +Cc: linux-fsdevel
On Friday 10 September 2004 01:03, Jamie Lokier wrote:
> Christian Mayrhuber wrote:
> > //http://somehost:port/foo/bla
>
> While we're here, I'll point out that http://somehost/foo/bla and
> http://somehost/foo/bla/ are valid, distinct URLs.
>
> If http://somehost/foo/bla/ exists, many HTTP servers will return it
> as the target of a redirect for http://somehost/foo/bla. The reason
> for this is that it's important for any relative URLs in a document to
> be resolved relative to the correct base URL of the document.
>
> If we do actually allow file-and-directory objects, it makes sense for
> the path _with_ slash on the end (//http://somehost/foo/bla/) to be
> accessible as both a file and a directory, mapping to that HTTP
> resource, and resources below it in the path hierarchy.
>
> Then relative paths in the resource resolve sensibly in programs which
> think they're in the local filesystem.
>
> It also makes sense for the path _without_ slash to be recognisable to
> programs as needing the slash. That could be done by making the path
> without slash be a symlink to one with (symlinks are a natural
> representation of redirects). But that seems like a gross violation
> of POSIX semantics: we can't really return different objects depending
> on the trailing slash.
Agreed this is a problem.
I'd say that the kernel cannot guarantee to be posix conformant in the
protocol space, because most protocols are not designed to be
posix compliant.
Excerpt from David's first email:
>4_11
> ...
> A pathname that begins with two successive slashes may be interpreted in
> an implementation-defined manner, although more than two leading slashes
> shall be treated as a single slash.
> ....
That would mean no posix conformance for paths starting with //, because it
depends on the implementation. Except maybe //: or //file:// or //metas:// or
what ever name the metadata advanced namespace carries.
>
> So instead, it makes sense for either form of the path to be a
> file-and-directory object whose type is S_IFDIR, and specifically not
> S_IFREG -- but only in the cases where the HTTP transaction to the URL
> without a trailing slash returns a redirect to the same URL with a
> slash appended. When the HTTP transaction to the URL without a
> trailing slash returns a resource, then its type in the filesystem
> should be S_IFREG, but still a file-as-directory object.
>
> Food for thought.
>
> -- Jamie
>
I'd say the userspace protocol handler should answer the question
where the url is pointing to and if it's a file or directory, if it follows
redirects, etc.
A http implementation may well only provide S_IFREG types as return values.
I don't think that cd //http://somehost/some/path/ will ever work, because
one will see http://somehost/some/path/index.html|php|cgi|jsp for sure.
I think for //ftp:// posix conformity could be guaranteed for names, but not
for more advanced features like locking.
Don't let the kernel decide how to deal with the different protocols; it only
needs to know what userspace handler to ask.
--
lg, Chris
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [RFC] Pathname Semantics with //
@ 2004-09-10 17:49 David Dabbs
2004-09-09 18:03 ` Hans Reiser
0 siblings, 1 reply; 22+ messages in thread
From: David Dabbs @ 2004-09-10 17:49 UTC (permalink / raw)
To: reiserfs-list, linux-fsdevel
> Jamie Lokier
>
> David Dabbs wrote:
>
> > Shooting from the hip here. If we want to unify namespaces in a UNIXy way,
> > what if we make the VFS expose all the non-file "protocol" namespaces
> > through one mount point, device node or whatever. A filesystem, perhaps
> > something built using FiST [http://www.filesystems.org/], would "handle" a
> > protocol. Another, perhaps preferred, option is to steer in the direction of
> > Plan9, where ftp can be mounted and handled by a user-space filesystem,
> > ftpfs.
> > See http://plan9.bell-labs.com/magic/man2html/4/ftpfs
>
> You can already do it, something like this:
>
> mkdir /http:; mount none /http: -t uservfs -o view=http
> mkdir /ftp:; mount none /ftp: -t uservfs -o view=ftp
>
> I don't see any compelling reason to make "//" special for this.
> However, if there is such a reason, then you could just mount protocol
> handlers on "//http:" and so on, and make "//" a normal directory with
> a special name.
>
> -- Jamie
Jamie, we _definitely_ agree, except apps that want to create links to URLs
will prepend one slash to the URL instead of two. Is your reference to
uservfs a "foo" reference or do you mean
http://sourceforge.net/projects/uservfs/? It looks a little dusty. But we
are pulling in the same direction.
The /file: node could simply be a symlink. Thus we have
cd /
ln -s / file:
mkdir http:; mount none /http: -t uservfs -o view=http
mkdir ftp:; mount none /ftp: -t uservfs -o view=ftp
#etc...
Pathnames would be resolved with the existing code in namei.c. I can
understand mounting a URL whose protocol looks like a fs tree (e.g. ftp),
but http? Namei() parses the pathname one component at a time, checks the
dcache, and goes to the fs when that fails. Let's trace through how a URL
might get resolved.
ln -s /http://sourceforge.net/projects/uservfs
cat uservfs
Pathname would be resolved as
/http:
sourceforge.net/
projects/
uservfs
I need to look closer at namei (or the uservfs code if it really supports a
view=http). As long as a fs can generate meaningful, stateful values in
response to VFS calls to real_lookup(), then this may work.
David
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [RFC] Pathname Semantics with //
2004-09-10 17:49 David Dabbs
@ 2004-09-09 18:03 ` Hans Reiser
0 siblings, 0 replies; 22+ messages in thread
From: Hans Reiser @ 2004-09-09 18:03 UTC (permalink / raw)
To: David Dabbs; +Cc: reiserfs-list, linux-fsdevel, cliff
David Dabbs wrote:
>>Jamie Lokier
>>
>>David Dabbs wrote:
>>
>>>Shooting from the hip here. If we want to unify namespaces in a UNIXy way,
>>>what if we make the VFS expose all the non-file "protocol" namespaces
>>>through one mount point, device node or whatever. A filesystem, perhaps
>>>something built using FiST [http://www.filesystems.org/], would "handle" a
>>>protocol. Another, perhaps preferred, option is to steer in the direction of
>>>Plan9, where ftp can be mounted and handled by a user-space filesystem,
>>>ftpfs.
>>>See http://plan9.bell-labs.com/magic/man2html/4/ftpfs
>>
>>You can already do it, something like this:
>>
>> mkdir /http:; mount none /http: -t uservfs -o view=http
>> mkdir /ftp:; mount none /ftp: -t uservfs -o view=ftp
>>
>>I don't see any compelling reason to make "//" special for this.
>>However, if there is such a reason, then you could just mount protocol
>>handlers on "//http:" and so on, and make "//" a normal directory with
>>a special name.
>>
>>-- Jamie
>>
Jamie, I like your approach, and I think it should go into LSB and be
used by all distros. It improves closure in the OS.
>
>Jamie, we _definitely_ agree, except apps that want to create links to URLs
>will prepend one slash to the URL instead of two. Is your reference to
>uservfs a "foo" reference or do you mean
>http://sourceforge.net/projects/uservfs/? It looks a little dusty. But we
>are pulling in the same direction.
>
>The /file: node could simply be a symlink. Thus we have
>
> cd /
> ln -s / file:
> mkdir http:; mount none /http: -t uservfs -o view=http
> mkdir ftp:; mount none /ftp: -t uservfs -o view=ftp
> #etc...
>
>Pathnames would be resolved with the existing code in namei.c. I can
>understand mounting a URL whose protocol looks like a fs tree (e.g. ftp),
>but http? Namei() parses the pathname one component at a time, checks the
>dcache, and goes to the fs when that fails. Let's trace through how a URL
>might get resolved.
>
> ln -s /http://sourceforge.net/projects/uservfs
> cat uservfs
>
>Pathname would be resolved as
>
> /http:
> sourceforge.net/
> projects/
> uservfs
>
>I need to look closer at namei (or the uservfs code if it really supports a
>view=http). As long as a fs can generate meaningful, stateful values in
>response to VFS calls to real_lookup(), then this may work.
>
>
>David
Thread overview: 22+ messages
2004-09-09 10:41 [RFC] Pathname Semantics with // David Dabbs
2004-09-08 16:13 ` Hans Reiser
2004-09-09 16:36 ` Peter Foldiak
2004-09-09 19:21 ` David Dabbs
2004-09-10 0:49 ` Hans Reiser
2004-09-10 3:06 ` David Dabbs
2004-09-10 5:40 ` Hans Reiser
2004-09-09 21:51 ` David Dabbs
2004-09-09 6:10 ` Hans Reiser
2004-09-09 17:33 ` Christian Mayrhuber
2004-09-09 20:17 ` David Dabbs
2004-09-09 20:41 ` Andreas Dilger
2004-09-10 9:11 ` Markus Törnqvist
2004-09-10 10:37 ` Christian Mayrhuber
2004-09-09 23:03 ` Jamie Lokier
2004-09-10 1:37 ` David Dabbs
2004-09-10 9:53 ` [SPAM] " Jamie Lokier
2004-09-10 17:11 ` David Dabbs
2004-09-10 11:47 ` Christian Mayrhuber
2004-09-10 11:06 ` Christian Mayrhuber
-- strict thread matches above, loose matches on Subject: below --
2004-09-10 17:49 David Dabbs
2004-09-09 18:03 ` Hans Reiser