File as a directory - Ordered Relations

* File as a directory - Ordered Relations
@ 2005-05-28  0:46 Alexander G. M. Smith
  2005-05-28  4:56 ` David Masover
  0 siblings, 1 reply; 51+ messages in thread
From: Alexander G. M. Smith @ 2005-05-28  0:46 UTC (permalink / raw)
  To: Leo Comerford; +Cc: reiserfs-list

Leo Comerford wrote on Wed, 18 May 2005 12:50:38 +0100:
> But if you have relation-directories and the ability to find the
> pathnames of a given file, you can do everything you can do with
> subfiles, just as nicely, and more besides. And if subfiles are
> completely redundant and bad news anyway, we shouldn't have them.

I prefer subfiles (or fildirutes) as being easier to understand.  But
maybe that's just due to lots of experience with using file hierarchies.
I can see having a relational system, but I'd always want to also have
a directory hierarchy namespace, so that all files can be named.

Having those relationship directories seems kind of clunky - since
they're not located near the object being investigated.  Though
that's a GUI matter of making the system file browser pop up a
"Show Relationships..." menu item as contrasted with drilling down
to a subfile directory listing by clicking on an item.

> The idea is that if you want to assert a single-place predicate of a
> file, like "file x is important", you just give the file an
> appropriate full path"name" ('~/important' or whatever). If you want to
> assert a multi-place predicate - a relation - like "file x is more
> important than file y" then you use a relation-directory. That goes
> for every kind of multi-way relation/association you might want to
> assert between files - one to one, one to many, many to many.

Good point about sorted relationships.  Reminds me a bit about
attributes in BeOS and using them to sort file listings.  There may
be a duality between relation directories and a file system with
indexed attributes, like BeOS's BFS or a true file-is-a-directory
system.

One system has a relations directory stuffed with property values of
a similar kind (such as short text descriptions for photos).  The
directory implies the contents type (short text description), while
the items also link to the thing they are expressing a relationship
about (the photos).

The other system treats the object (the photo file) both as readable
data and as a directory with attribute-ish sub-things, like a sub-file
containing the text description of the photo.  File types are done as
meta-data (a couple of bytes attached to the object nodes), marking
the photo as JPEG data and the description subfile as text.  For bonus
points, the file type can also appear as a virtual child object to make
accessing the file type the same as accessing other data (no new APIs
needed).  Another advantage is that doing a directory listing of an
object gives you all its relations too (multiple parent objects listed).

Ordered Relations?

Now how to do an ordered relation?  For example, say you have the
shooting date for each photo as a property and you want to find
all photos shot on a given day, or range of days.  Either system
can be used to find them.

With relations, look in the relation directory that stores shooting
dates and sort by name (assuming that the naming of the items reflects
the date).  Say, just how do you name the relation items?  You had
aardvark and other animal names in your examples.  Should the actual
value be used (like the whole photo description text) as a name?
Or is the directory magically sorted by property value somehow?
Or is the relation directory just a concept, not actually browsable?

In the file/directory/attribute system, one of the file types would
be "shooting date" and all files with that type would be automatically
indexed, if you had created an index for "shooting date" earlier.  The
index merely stores the relevant value (a date) and a link (inode number?)
to the file.  To find the actual photo with a given date you'd have to
find the parent of the attribute-ish subfile thing the index gives you.
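
A rough sketch of that lookup, in Python rather than kernel code.  It
assumes the index is nothing more than a sorted list of (value, inode)
pairs; AttributeIndex, parent_of and the inode numbers are all made up
for illustration, and a real file system would use an on-disk B-tree.

```python
import bisect
from datetime import date


class AttributeIndex:
    """Toy index: sorted (value, inode) pairs standing in for a B-tree."""

    def __init__(self):
        self._entries = []  # kept sorted by (value, inode)

    def add(self, value, inode):
        bisect.insort(self._entries, (value, inode))

    def range(self, low, high):
        """Return inodes whose value lies in [low, high], in value order."""
        # (low,) sorts before every (low, inode) pair, so this finds the
        # first entry with value >= low.
        start = bisect.bisect_left(self._entries, (low,))
        inodes = []
        for value, inode in self._entries[start:]:
            if value > high:
                break
            inodes.append(inode)
        return inodes


shooting_date = AttributeIndex()
parent_of = {}  # attribute subfile inode -> inode of the photo owning it

# (photo inode, attribute subfile inode, shooting date), all hypothetical
for photo, attr, day in [(10, 11, date(2005, 5, 27)),
                         (20, 21, date(2005, 5, 26)),
                         (30, 31, date(2005, 5, 29))]:
    shooting_date.add(day, attr)
    parent_of[attr] = photo

# The index yields attribute subfiles; one more hop finds the photos.
hits = shooting_date.range(date(2005, 5, 26), date(2005, 5, 27))
photos = [parent_of[attr] for attr in hits]
print(photos)
```

The extra parent_of hop is the "find the parent" step: the index only
knows about the small date subfile, not about the photo itself.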

Come to think of it, if you display the index as a directory, you
kind of have your relation directory.  I had that as a feature in
AGMSRAMFileSystem, using the attribute values as the name of a
symbolic link pointing to the related item.  But it didn't occur to
me that it was like your relation directories until now.  Here's an
example - the last modified index for all files.  The date is in
microseconds for better sorting, and lack of time zone printing
functions in the BeOS kernel:

Fri May 27 20:12:36 24 /RAMDisk/.Indices>ls -l last_modified
total 1497
lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 10  2001 1000158923000000 #604f52f8 -> /RAMDisk/PineappleData/news/Servers/NLZ/music.in_fidelity
lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 10  2001 1000159028000000 #6094a790 -> /RAMDisk/PineappleData/saved/Keepsakes/PM999697.pmf
lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 10  2001 1000172295000000 #608fb640 -> /RAMDisk/PineappleData/saved/Keepsakes/PM999685.pmf
lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 18  2001 1000849618000000 #60bd8278 -> /RAMDisk/mozilla/res/samples/toolbarTest1.xul
lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 18  2001 1000849618000000 #60edc108 -> /RAMDisk/mozilla/res/samples/scrollbarTest1.xul
lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 18  2001 1000849627000000 #608eb2e0 -> /RAMDisk/mozilla/res/samples/tab.xul
...
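
The same idea can be imitated on an ordinary file system.  This is only
a userland sketch in Python, not the AGMSRAMFileSystem code:
build_index_dir is a hypothetical helper, and it assumes the "#" suffix
can be any unique per-file number (the inode number is used here as a
stand-in).

```python
import os
import tempfile


def build_index_dir(paths, index_dir):
    """Create one symlink per file, named '<mtime in microseconds> #<inode hex>'."""
    os.makedirs(index_dir, exist_ok=True)
    for path in paths:
        st = os.stat(path)
        micros = st.st_mtime_ns // 1000
        name = f"{micros:016d} #{st.st_ino:x}"
        os.symlink(os.path.abspath(path), os.path.join(index_dir, name))
    # Zero-padded decimal names sort the same way as the timestamps do.
    return sorted(os.listdir(index_dir))


root = tempfile.mkdtemp()
older = os.path.join(root, "toolbarTest1.xul")
newer = os.path.join(root, "tab.xul")
for path, seconds in [(newer, 1000849627), (older, 1000849618)]:
    open(path, "w").close()
    # Force a known modification time so the listing order is predictable.
    os.utime(path, ns=(seconds * 10**9, seconds * 10**9))

index_dir = os.path.join(root, ".Indices", "last_modified")
names = build_index_dir([newer, older], index_dir)
first_target = os.readlink(os.path.join(index_dir, names[0]))
print(names[0], "->", first_target)
```

A plain sorted listing of the index directory then gives the files in
last-modified order, which is all that "ls" needs.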

However, that brings up the extra power of queries.  In BeOS you can build
a query string something like "ShootingDate=May 27 2005 & Location=*Home*"
for all photos shot on May 27th at locations that contain the word "Home".
It's a combination of relations.  Internally the simple query processor
just finds the files matching one of the indices (Shooting Date would be
the better choice since it is more constrained) then checks the resulting
set of files against the rest of the query.
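
In outline, that strategy looks something like the following Python
sketch.  It is not the BeOS query engine: the files table, the
attribute values and the query() signature are invented for the
example, and wildcard matching is borrowed from fnmatch.

```python
from fnmatch import fnmatchcase

# Hypothetical files, keyed by inode number, with their attributes.
files = {
    1: {"ShootingDate": "2005-05-27", "Location": "Home office"},
    2: {"ShootingDate": "2005-05-27", "Location": "Beach"},
    3: {"ShootingDate": "2005-05-20", "Location": "Home garden"},
}


def build_index(attr):
    """Map each attribute value to the set of inodes carrying it."""
    index = {}
    for inode, attrs in files.items():
        index.setdefault(attrs[attr], set()).add(inode)
    return index


indices = {"ShootingDate": build_index("ShootingDate")}


def query(indexed_attr, value, other_patterns):
    """Use one index to get candidates, then filter them on the rest."""
    candidates = indices[indexed_attr].get(value, set())
    return sorted(
        inode for inode in candidates
        if all(fnmatchcase(files[inode][attr], pattern)
               for attr, pattern in other_patterns.items())
    )


result = query("ShootingDate", "2005-05-27", {"Location": "*Home*"})
print(result)
```

Starting from the date index means only two candidate files are checked
against the Location pattern, instead of every file on the volume.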

If it's a duality, I suspect you'd be able to do something similar with
your relation directories.  For that matter, I was planning to add a
feature for showing query results as a virtual directory, so that queries
would be available to ordinary file namespace programs, like "ls".
Unfortunately the real world intervened and I had to end my sabbatical.
Lots of nifty topics still waiting there for someone to research and
implement!

- Alex

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - Ordered Relations
  2005-05-28  0:46 File as a directory - Ordered Relations Alexander G. M. Smith
@ 2005-05-28  4:56 ` David Masover
  2005-05-28 19:42   ` Valdis.Kletnieks
  0 siblings, 1 reply; 51+ messages in thread
From: David Masover @ 2005-05-28  4:56 UTC (permalink / raw)
  To: Alexander G. M. Smith; +Cc: Leo Comerford, reiserfs-list

Hans, comment please?  Is this approaching v5 / v6 / Future Vision?  It
does seem more than a little "clunky" when applied to v4...

Alexander G. M. Smith wrote:
> Leo Comerford wrote on Wed, 18 May 2005 12:50:38 +0100:
> 
>>But if you have relation-directories and the ability to find the
>>pathnames of a given file, you can do everything you can do with
>>subfiles, just as nicely, and more besides. And if subfiles are
>>completely redundant and bad news anyway, we shouldn't have them.
> 
> 
> I prefer subfiles (or fildirutes) as being easier to understand.  But
> maybe that's just due to lots of experience with using file hierarchies.
> I can see having a relational system, but I'd always want to also have
> a directory hierarchy namespace, so that all files can be named.
> 
> Having those relationship directories seems kind of clunky - since
> they're not located near the object being investigated.  Though
> that's a GUI matter of making the system file browser pop up a
> "Show Relationships..." menu item as contrasted with drilling down
> to a subfile directory listing by clicking on an item.
> 
> 
>>The idea is that if you want to assert a single-place predicate of a
>>file, like "file x is important", you just give the file an
>>appropriate full path"name" ('~/important' or whatever). If you want to
>>assert a multi-place predicate - a relation - like "file x is more
>>important than file y" then you use a relation-directory. That goes
>>for every kind of multi-way relation/association you might want to
>>assert between files - one to one, one to many, many to many.
> 
> 
> Good point about sorted relationships.  Reminds me a bit about
> attributes in BeOS and using them to sort file listings.  There may
> be a duality between relation directories and a file system with
> indexed attributes, like BeOS's BFS or a true file-is-a-directory
> system.
> 
> One system has a relations directory stuffed with property values of
> a similar kind (such as short text descriptions for photos).  The
> directory implies the contents type (short text description), while
> the items also link to the thing they are expressing a relationship
> about (the photos).
> 
> The other system treats the object (the photo file) both as readable
> data and as a directory with attribute-ish sub-things, like a sub-file
> containing the text description of the photo.  File types are done as
> meta-data (a couple of bytes attached to the object nodes), marking
> the photo as JPEG data and the description subfile as text.  For bonus
> points, the file type can also appear as a virtual child object to make
> accessing the file type the same as accessing other data (no new APIs
> needed).  Another advantage is that doing a directory listing of an
> object gives you all its relations too (multiple parent objects listed).
> 
> 
> Ordered Relations?
> 
> Now how to do an ordered relation?  For example, say you have the
> shooting date for each photo as a property and you want to find
> all photos shot on a given day, or range of days.  Either system
> can be used to find them.
> 
> With relations, look in the relation directory that stores shooting
> dates and sort by name (assuming that the naming of the items reflects
> the date).  Say, just how do you name the relation items?  You had
> aardvark and other animal names in your examples.  Should the actual
> value be used (like the whole photo description text) as a name?
> Or is the directory magically sorted by property value somehow?
> Or is the relation directory just a concept, not actually browsable?
> 
> In the file/directory/attribute system, one of the file types would
> be "shooting date" and all files with that type would be automatically
> indexed, if you had created an index for "shooting date" earlier.  The
> index merely stores the relevant value (a date) and a link (inode number?)
> to the file.  To find the actual photo with a given date you'd have to
> find the parent of the attribute-ish subfile thing the index gives you.
> 
> Come to think of it, if you display the index as a directory, you
> kind of have your relation directory.  I had that as a feature in
> AGMSRAMFileSystem, using the attribute values as the name of a
> symbolic link pointing to the related item.  But it didn't occur to
> me that it was like your relation directories until now.  Here's an
> example - the last modified index for all files.  The date is in
> microseconds for better sorting, and lack of time zone printing
> functions in the BeOS kernel:
> 
> Fri May 27 20:12:36 24 /RAMDisk/.Indices>ls -l last_modified
> total 1497
> lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 10  2001 1000158923000000 #604f52f8 -> /RAMDisk/PineappleData/news/Servers/NLZ/music.in_fidelity
> lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 10  2001 1000159028000000 #6094a790 -> /RAMDisk/PineappleData/saved/Keepsakes/PM999697.pmf
> lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 10  2001 1000172295000000 #608fb640 -> /RAMDisk/PineappleData/saved/Keepsakes/PM999685.pmf
> lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 18  2001 1000849618000000 #60bd8278 -> /RAMDisk/mozilla/res/samples/toolbarTest1.xul
> lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 18  2001 1000849618000000 #60edc108 -> /RAMDisk/mozilla/res/samples/scrollbarTest1.xul
> lrwxrwxrwx   0 agmsmith agmsmith        2 Sep 18  2001 1000849627000000 #608eb2e0 -> /RAMDisk/mozilla/res/samples/tab.xul
> ...
> 
> However, that brings up the extra power of queries.  In BeOS you can build
> a query string something like "ShootingDate=May 27 2005 & Location=*Home*"
> for all photos shot on May 27th at locations that contain the word "Home".
> It's a combination of relations.  Internally the simple query processor
> just finds the files matching one of the indices (Shooting Date would be
> the better choice since it is more constrained) then checks the resulting
> set of files against the rest of the query.
> 
> If it's a duality, I suspect you'd be able to do something similar with
> your relation directories.  For that matter, I was planning to add a
> feature for showing query results as a virtual directory, so that queries
> would be available to ordinary file namespace programs, like "ls".
> Unfortunately the real world intervened and I had to end my sabbatical.
> Lots of nifty topics still waiting there for someone to research and
> implement!
> 
> - Alex


* Re: File as a directory - Ordered Relations
  2005-05-28  4:56 ` David Masover
@ 2005-05-28 19:42   ` Valdis.Kletnieks
  2005-05-29 17:58     ` File as a directory - VFS Changes Alexander G. M. Smith
  2005-05-30  8:19     ` File as a directory - Ordered Relations Hans Reiser
  0 siblings, 2 replies; 51+ messages in thread
From: Valdis.Kletnieks @ 2005-05-28 19:42 UTC (permalink / raw)
  To: David Masover; +Cc: Alexander G. M. Smith, Leo Comerford, reiserfs-list

On Fri, 27 May 2005 23:56:35 CDT, David Masover said:

> Hans, comment please?  Is this approaching v5 / v6 / Future Vision?  It
> does seem more than a little "clunky" when applied to v4...

I'm not Hans, but I *will* ask "How much of this is *rationally* doable
without some help from the VFS?".  At the very least, some of this stuff
will require the FS to tell the VFS to suspend its disbelief (for starters,
doing this without confusing the VFS's concepts of dentries/inodes/reference
counts is going to be.... interesting... :)


* Re: File as a directory - VFS Changes
  2005-05-28 19:42   ` Valdis.Kletnieks
@ 2005-05-29 17:58     ` Alexander G. M. Smith
  2005-05-30  8:25       ` Hans Reiser
  2005-05-30 11:00       ` Nikita Danilov
  2005-05-30  8:19     ` File as a directory - Ordered Relations Hans Reiser
  1 sibling, 2 replies; 51+ messages in thread
From: Alexander G. M. Smith @ 2005-05-29 17:58 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: leocomerford, reiserfs-list, ninja

Valdis.Kletnieks@vt.edu wrote on Sat, 28 May 2005 15:42:35 -0400:
> I'm not Hans, but I *will* ask "How much of this is *rationally* doable
> without some help from the VFS?".  At the very least, some of this stuff
> will require the FS to tell the VFS to suspend its disbelief (for starters,
> doing this without confusing the VFS's concepts of dentries/inodes/reference
> counts is going to be.... interesting... :)

Good point.  One way would be to cram it into the existing VFS (the
operating system's interface to file systems) as directories representing
the objects, containing a specially named file for the raw data, mixed in
with child items and symbolic links to parent objects.  Some inodes would
be fake ones, generated as needed to represent the old style view of the
file / directory / attribute thing (such as the parent symbolic links).

But what would I (Hans likely has other views) like to see in a new VFS
to support files / directories / attributes all being the same kind of
object?  I'll talk about the user level API view of the VFS, rather than
the flip side for file systems or the gritty VFS internals, since it
doesn't need to be Linux specific.

For one, it would be almost the same as the existing VFS.  But when you
open a fildirute-thing, you can use the same file handle to read and
write its data and to list its children.

Thus open() and opendir() are combined into plain open().  It takes a
conventional hierarchical path (or later some of Hans Reiser's more
sophisticated namespaces?).  Returns a file handle.

The resulting file handle can be used with read(), write(), seek(),
readdir(), rewinddir() and the rest of the usual directory and file
basic operations.  And of course, close() it when you're done.
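
To make that concrete, here is a toy in-memory model in Python.  It is
not a VFS patch; Node, Handle and open_object are names invented for
the sketch, and the photo contents are dummy bytes.

```python
class Node:
    """One object that has both data and named children."""

    def __init__(self, data=b""):
        self.data = data
        self.children = {}


class Handle:
    """What a combined open() might return: readable and listable."""

    def __init__(self, node):
        self._node = node
        self._pos = 0

    def read(self, size=-1):
        end = len(self._node.data) if size < 0 else self._pos + size
        chunk = self._node.data[self._pos:end]
        self._pos += len(chunk)
        return chunk

    def readdir(self):
        return sorted(self._node.children)


def open_object(root, path):
    """Walk a conventional hierarchical path and return a handle."""
    node = root
    for part in filter(None, path.split("/")):
        node = node.children[part]
    return Handle(node)


root = Node()
photo = Node(b"JPEG-bytes-go-here")
photo.children["description"] = Node(b"Sunset at the lake")
root.children["photo.jpg"] = photo

handle = open_object(root, "/photo.jpg")
header = handle.read(4)           # file-style access
entries = handle.readdir()        # directory-style access, same handle
text = open_object(root, "/photo.jpg/description").read()
```

The one handle answers both kinds of request, so there's no need for a
separate opendir() at the API level.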

Stat() would disappear.  All the miscellaneous stat data would be
stored as sub-files, things like the date last modified, access
permissions and so on.  There would be a standard filename and file
type for those metadata subfiles to distinguish them from user created
subfiles (such as file/.meta.last_modified).  That also makes it
easier to add new kinds of metadata.

And that's about it for the basics.

Standard utilities, like "ls" would have to be changed to use the new
object structure - listing the contents of a thing and avoiding
recursion down paths that lead to parent objects (just like "ls"
currently avoids listing ".." recursively).  That may involve more
work than the kernel changes!

I'd add a multi-read function to replace stat().  Give it a list of
sub-file names to read and it returns their names and contents in a
packed list (like a dirent structure).  That way bulk reading date
stamps, permissions and other attributish small metadata as subfiles
won't have as much overhead as opening them individually.  Particularly
if under the hood they are stored as fields in the file's inode rather
than as totally separate files (this is what BeOS's BFS does for small
attributes).  Though conceptually you treat them as separate subfiles.
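
A sketch of what such a call could hand back, written in Python for
brevity.  The record layout (2-byte name length, 4-byte data length,
then the bytes) and the names multi_read, unpack and .meta.permissions
are assumptions made up for this example, not an existing API.

```python
import struct

HEADER = struct.Struct("<HI")  # name length, data length (little-endian)


def multi_read(subfiles, names):
    """Pack the requested subfiles into one buffer; missing names are skipped."""
    out = bytearray()
    for name in names:
        data = subfiles.get(name)
        if data is None:
            continue
        encoded = name.encode("utf-8")
        out += HEADER.pack(len(encoded), len(data)) + encoded + data
    return bytes(out)


def unpack(buffer):
    """Walk the packed records and rebuild a name -> contents mapping."""
    result, offset = {}, 0
    while offset < len(buffer):
        name_len, data_len = HEADER.unpack_from(buffer, offset)
        offset += HEADER.size
        name = buffer[offset:offset + name_len].decode("utf-8")
        offset += name_len
        result[name] = buffer[offset:offset + data_len]
        offset += data_len
    return result


# Hypothetical subfiles of one photo object.
photo_subfiles = {
    ".meta.last_modified": b"1117389480000000",
    ".meta.permissions": b"0644",
    "description": b"Sunset at the lake",
}

packed = multi_read(photo_subfiles,
                    [".meta.last_modified", ".meta.permissions", ".meta.missing"])
metadata = unpack(packed)
```

One call and one buffer replace three open/read/close round trips,
which is where the saving over individual subfile reads would come from.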

I'd also like to add indexing.  That could be done by creating a magic
directory with an associated file type to index.  Then whenever a file
with that file type is changed, the index is updated using the file's
contents as the key, and a link to the file as the value.  The file
type also implies the interpretation of the values for sorting
purposes - as strings, binary numbers, etc.  Unlike BeOS, I'd expose
the indices directly (appearing as a directory full of hard links)
and have query languages implemented in userland libraries that make
use of the indices, rather than as part of the file system.  Now should
indices be system wide and maintained by the VFS, or per-volume and
maintained by the file system?  How about indices for things on network
drives?  Things on public web sites for a web-view file system?
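
A small model of such an index directory, again only a Python sketch.
IndexDirectory, the "image-width" file type and the file names are
hypothetical; the point is that the type picks the key interpretation.

```python
class IndexDirectory:
    """Toy magic directory indexing every file of one file type."""

    def __init__(self, file_type, parse_key):
        self.file_type = file_type
        self.parse_key = parse_key   # how to interpret contents for sorting
        self._key_of = {}            # file id -> current key

    def file_changed(self, file_id, file_type, contents):
        """Called whenever any file changes; other types are ignored."""
        if file_type != self.file_type:
            return
        self._key_of[file_id] = self.parse_key(contents)

    def file_deleted(self, file_id):
        self._key_of.pop(file_id, None)

    def listing(self):
        """The directory view: entries ordered by key, then by file id."""
        return sorted((key, file_id) for file_id, key in self._key_of.items())


widths = IndexDirectory("image-width", int)   # keys compared as numbers
widths.file_changed("a.jpg/width", "image-width", "1024")
widths.file_changed("b.jpg/width", "image-width", "640")
widths.file_changed("b.jpg/description", "text", "not indexed here")
widths.file_changed("a.jpg/width", "image-width", "320")   # value rewritten

entries = widths.listing()
print(entries)
```

Had the same values been indexed as strings, "1024" would sort before
"640", which is why the file type has to carry the interpretation.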

I'd also like to add change notification.  If a file system object's
child list changes, then a notification message gets sent to interested
listeners.  Similarly for an object's data content change.  BeOS had
useful notifications for live changes to a query - I'd punt this to
the userland query library and have it build on the change notifications
from an index directory.  The VFS and other parts of the OS would need
to support change notification (BeOS used inter-process message queues).
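
The listener side might look roughly like this in Python.  Notifier and
the event names are invented for the sketch, and queue.Queue merely
stands in for whatever message queue the OS would provide.

```python
import queue


class Notifier:
    """Toy change-notification hub: one message queue per listener."""

    def __init__(self):
        self._listeners = {}  # object path -> list of listener queues

    def watch(self, path):
        """Register interest in one object and get a queue to read from."""
        listener = queue.Queue()
        self._listeners.setdefault(path, []).append(listener)
        return listener

    def notify(self, path, event, detail):
        """Called by the file system when an object changes."""
        for listener in self._listeners.get(path, []):
            listener.put((event, path, detail))


notifier = Notifier()
# A userland query library would watch the index directory it relies on.
inbox = notifier.watch("/.Indices/last_modified")

notifier.notify("/.Indices/last_modified", "child_added", "1000849627000000")
notifier.notify("/somewhere/else", "data_changed", None)   # nobody listening

message = inbox.get_nowait()
print(message)
```

A live query then amounts to re-checking each new index entry against
the query whenever a child_added message arrives.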

Can a file-as-directory system fit into Linux, or some other OS?
I expect that it will only happen if the new system also exposes a
backwards compatible view for old software, using the old APIs.
After that's done, the first big user program that needs to be
updated is the desktop file browser.  Once there's a good GUI for
browsing file-as-directory file systems, the general public might
become more aware of their advantages (easily drilling down inside
files to attach a description subfile or add a bunch of MP3 tags,
magic query directories and indexing to find things quickly, multiple
parents to put the same file in multiple folders without the
breakability of symbolic links or Mac aliases).  Then I can sit back
and enjoy using the system rather than spending all this time debating
and implementing it :-).

- Alex


* Re: File as a directory - Ordered Relations
  2005-05-28 19:42   ` Valdis.Kletnieks
  2005-05-29 17:58     ` File as a directory - VFS Changes Alexander G. M. Smith
@ 2005-05-30  8:19     ` Hans Reiser
  2005-05-31 16:46       ` Jonathan Briggs
  1 sibling, 1 reply; 51+ messages in thread
From: Hans Reiser @ 2005-05-30  8:19 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: David Masover, Alexander G. M. Smith, Leo Comerford,
	reiserfs-list

Valdis.Kletnieks@vt.edu wrote:

>On Fri, 27 May 2005 23:56:35 CDT, David Masover said:
>
>>Hans, comment please?  Is this approaching v5 / v6 / Future Vision?  It
>>does seem more than a little "clunky" when applied to v4...

Well, if you read our whitepaper, we consider relational algebra to be a
functional subset of what we will implement (which implies we think
relational algebra should be possible in the filesystem naming.)

>
>I'm not Hans, but I *will* ask "How much of this is *rationally* doable
>without some help from the VFS?".
>
Think of VFS as a standards committee.  That means that 5-15 years after
we implement it, they will copy it, break it, and then demand that we
conform to their breakage. 

Anytime someone says it should go into VFS, what they really mean is,
nobody should get ahead of them because it will increase their workload. ;-)

VFS is a baseline.  Once you support VFS, and your performance is good,
you can start to innovate.  Next year we finally start to seriously
innovate, after 10 years of groundwork.  The storage layer was never the
interesting part of our plans, not to me.....

BeFS is way cool by the way, and I am really interested in what Dominic
and Alexander do in the future.....


>At the very least, some of this stuff
>will require the FS to tell the VFS to suspend its disbelief (for starters,
>doing this without confusing the VFS's concepts of dentries/inodes/reference
>counts is going to be.... interesting... :)



* Re: File as a directory - VFS Changes
  2005-05-29 17:58     ` File as a directory - VFS Changes Alexander G. M. Smith
@ 2005-05-30  8:25       ` Hans Reiser
  2005-05-30 11:00       ` Nikita Danilov
  1 sibling, 0 replies; 51+ messages in thread
From: Hans Reiser @ 2005-05-30  8:25 UTC (permalink / raw)
  To: Alexander G. M. Smith
  Cc: Valdis.Kletnieks, leocomerford, reiserfs-list, ninja

I think what Alex is suggesting below is reasonable and something
resembling it should be done, though I will not go into details on it
until we have some working code....

Hans

Alexander G. M. Smith wrote:

>Valdis.Kletnieks@vt.edu wrote on Sat, 28 May 2005 15:42:35 -0400:
>
>>I'm not Hans, but I *will* ask "How much of this is *rationally* doable
>>without some help from the VFS?".  At the very least, some of this stuff
>>will require the FS to tell the VFS to suspend its disbelief (for starters,
>>doing this without confusing the VFS's concepts of dentries/inodes/reference
>>counts is going to be.... interesting... :)
>
>Good point.  One way would be to cram it into the existing VFS (the
>operating system's interface to file systems) as directories representing
>the objects, containing a specially named file for the raw data, mixed in
>with child items and symbolic links to parent objects.  Some inodes would
>be fake ones, generated as needed to represent the old style view of the
>file / directory / attribute thing (such as the parent symbolic links).
>
>But what would I (Hans likely has other views) like to see in a new VFS
>to support files / directories / attributes all being the same kind of
>object?  I'll talk about the user level API view of the VFS, rather than
>the flip side for file systems or the gritty VFS internals, since it
>doesn't need to be Linux specific.
>
>For one, it would be almost the same as the existing VFS.  But when you
>open a fildirute-thing, you can use the same file handle to read and
>write its data and to list its children.
>
>Thus open() and opendir() are combined into plain open().  It takes a
>conventional hierarchical path (or later some of Hans Reiser's more
>sophisticated namespaces?).  Returns a file handle.
>
>The resulting file handle can be used with read(), write(), seek(),
>readdir(), rewinddir() and the rest of the usual directory and file
>basic operations.  And of course, close() it when you're done.
>
>Stat() would disappear.  All the miscellaneous stat data would be
>stored as sub-files, things like the date last modified, access
>permissions and so on.  There would be a standard filename and file
>type for those metadata subfiles to distinguish them from user created
>subfiles (such as file/.meta.last_modified).  That also makes it
>easier to add new kinds of metadata.
>
>And that's about it for the basics.
>
>Standard utilities, like "ls" would have to be changed to use the new
>object structure - listing the contents of a thing and avoiding
>recursion down paths that lead to parent objects (just like "ls"
>currently avoids listing ".." recursively).  That may involve more
>work than the kernel changes!
>
>I'd add a multi-read function to replace stat().  Give it a list of
>sub-file names to read and it returns their names and contents in a
>packed list (like a dirent structure).  That way bulk reading date
>stamps, permissions and other attributish small metadata as subfiles
>won't have as much overhead as opening them individually.  Particularly
>if under the hood they are stored as fields in the file's inode rather
>than as totally separate files (this is what BeOS's BFS does for small
>attributes).  Though conceptually you treat them as separate subfiles.
>
>I'd also like to add indexing.  That could be done by creating a magic
>directory with an associated file type to index.  Then whenever a file
>with that file type is changed, the index is updated using the file's
>contents as the key, and a link to the file as the value.  The file
>type also implies the interpretation of the values for sorting
>purposes - as strings, binary numbers, etc.  Unlike BeOS, I'd expose
>the indices directly (appearing as a directory full of hard links)
>and have query languages implemented in userland libraries that make
>use of the indices, rather than as part of the file system.  Now should
>indices be system wide and maintained by the VFS, or per-volume and
>maintained by the file system?  How about indices for things on network
>drives?  Things on public web sites for a web-view file system?
>
>I'd also like to add change notification.  If a file system object's
>child list changes, then a notification message gets sent to interested
>listeners.  Similarly for an object's data content change.  BeOS had
>useful notifications for live changes to a query - I'd punt this to
>the userland query library and have it build on the change notifications
>from an index directory.  The VFS and other parts of the OS would need
>to support change notification (BeOS used inter-process message queues).
>
>Can a file-as-directory system fit into Linux, or some other OS?
>I expect that it will only happen if the new system also exposes a
>backwards compatible view for old software, using the old APIs.
>After that's done, the first big user program that needs to be
>updated is the desktop file browser.  Once there's a good GUI for
>browsing file-as-directory file systems, the general public might
>become more aware of their advantages (easily drilling down inside
>files to attach a description subfile or add a bunch of MP3 tags,
>magic query directories and indexing to find things quickly, multiple
>parents to put the same file in multiple folders without the
>breakability of symbolic links or Mac aliases).  Then I can sit back
>and enjoy using the system rather than spending all this time debating
>and implementing it :-).
>
>- Alex



* Re: File as a directory - VFS Changes
  2005-05-29 17:58     ` File as a directory - VFS Changes Alexander G. M. Smith
  2005-05-30  8:25       ` Hans Reiser
@ 2005-05-30 11:00       ` Nikita Danilov
  2005-05-31  0:20         ` Alexander G. M. Smith
  1 sibling, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-05-30 11:00 UTC (permalink / raw)
  To: Alexander G. M. Smith; +Cc: leocomerford, reiserfs-list, ninja

Alexander G. M. Smith writes:
 > Valdis.Kletnieks@vt.edu wrote on Sat, 28 May 2005 15:42:35 -0400:
 > > I'm not Hans, but I *will* ask "How much of this is *rationally* doable
 > > without some help from the VFS?".  At the very least, some of this stuff
 > > will require the FS to tell the VFS to suspend its disbelief (for starters,
 > > doing this without confusing the VFS's concepts of dentries/inodes/reference
 > > counts is going to be.... interesting... :)
 > 
 > Good point.  One way would be to cram it into the existing VFS (the
 > operating system's interface to file systems) as directories representing
 > the objects, containing a specially named file for the raw data, mixed in
 > with child items and symbolic links to parent objects.  Some inodes would
 > be fake ones, generated as needed to represent the old style view of the
 > file / directory / attribute thing (such as the parent symbolic links).
 > 
 > But what would I (Hans likely has other views) like to see in a new VFS
 > to support files / directories / attributes all being the same kind of
 > object?  I'll talk about the user level API view of the VFS, rather than
 > the flip side for file systems or the gritty VFS internals, since it
 > doesn't need to be Linux specific.
 > 
 > For one, it would be almost the same as the existing VFS.  But when you
 > open a fildirute-thing, you can use the same file handle to read and
 > write its data and to list its children.

This is doable with the current VFS.

 > 
 > Thus open() and opendir() are combined into plain open().  It takes a
 > conventional hierarchical path (or later some of Hans Reiser's more
 > sophisticated namespaces?).  Returns a file handle.

opendir(3) is a user-level function. It calls the open(2) system
call. telldir(3) and seekdir(3) are also library functions that call
lseek(2) under the hood.
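
For illustration, a small Python sketch (Linux assumed; the temporary
directory and the file name "child" are arbitrary) that opens a
directory with the plain open(2) wrapper, lists it through the
resulting descriptor, and then tries to read from it:

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    open(os.path.join(directory, "child"), "w").close()

    # os.open() is a thin wrapper around open(2); no opendir() involved.
    fd = os.open(directory, os.O_RDONLY)
    try:
        names = os.listdir(fd)       # directory listing via the same fd
        try:
            os.read(fd, 16)          # read(2) on a directory descriptor
            read_error = None
        except OSError as exc:
            read_error = exc.errno
    finally:
        os.close(fd)

print(names, errno.errorcode.get(read_error))
```

On the usual Linux file systems the read fails with EISDIR, so the
refusal comes from the file system's directory operations rather than
from how the descriptor was opened.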

 > 
 > The resulting file handle can be used with read(), write(), seek(),
 > readdir(), rewinddir() and the rest of the usual directory and file
 > basic operations.  And of course, close() it when you're done.

Nothing in the VFS prevents files from supporting both read(2) and
readdir(3). The problem is with link(2): the VFS assumes that directories
form a _tree_, that is, every directory has a well-defined parent.

 > 
 > Stat() would disappear.  All the miscellaneous stat data would be
 > stored as sub-files, things like the date last modified, access
 > permissions and so on.  There would be a standard filename and file
 > type for those metadata subfiles to distinguish them from user created
 > subfiles (such as file/.meta.last_modified).  That also makes it
 > easier to add new kinds of metadata.
 > 
 > And that's about it for the basics.

The problem with that is that in "/etc/passwd/..foo-meta-thing",
"/etc/passwd" is both a regular file (possibly with multiple names) and a
directory at the same time, which is a problem for VFS, see above. Read
Documentation/filesystems/directory-locking and imagine the following:

$ touch a
$ ln a b
$ mv a/..uid b/..uid

(and yes, rename has to lock the parent directories _before_ ever calling
into the file system back-end, so reiser4 code cannot somehow magically
hint to VFS that "a" and "b" are to be treated in a special way).

Nikita.


* Re: File as a directory - VFS Changes
  2005-05-30 11:00       ` Nikita Danilov
@ 2005-05-31  0:20         ` Alexander G. M. Smith
  2005-05-31  9:34           ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Alexander G. M. Smith @ 2005-05-31  0:20 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: leocomerford, reiserfs-list, ninja

Nikita Danilov wrote on Mon, 30 May 2005 15:00:52 +0400:
> Nothing in VFS prevents files from supporting both read(2) and
> readdir(3). The problem is with link(2): VFS assumes that directories
> form a _tree_, that is, every directory has a well-defined parent.

At least that's one problem that's solvable.  Just define one of
the parents as the master parent directory, with a guaranteed path
up to the root, and have the others as auxiliary parents.  That
also gives you a good path name to each and every file-thing.

The VFS or the file system (depending on where the designers want
to split the work) will still have to handle cycles in the graph
to recompute the new master parents, when an old one gets deleted
or moved.
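
A minimal sketch of that idea, with all names (struct node,
canonical_path, MAX_AUX) invented for illustration.  Each object keeps one
master parent plus a small set of auxiliary parents, and the canonical
path name is built by following master parents only.  Recomputing the
master parent after a delete or move is deliberately left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_AUX 4

struct node {
        const char  *name;              /* name under the master parent */
        struct node *master;            /* NULL for the root */
        struct node *aux[MAX_AUX];      /* extra parents, not used for paths */
        int          naux;
};

/* Append "/name" for every node from just below the root down to n. */
static int append_path(const struct node *n, char *buf, size_t size)
{
        size_t len;

        if (n->master == NULL) {        /* the root contributes nothing */
                buf[0] = '\0';
                return 0;
        }
        if (append_path(n->master, buf, size) != 0)
                return -1;
        len = strlen(buf);
        if (len + 1 + strlen(n->name) + 1 > size)
                return -1;              /* buffer too small */
        snprintf(buf + len, size - len, "/%s", n->name);
        return 0;
}

/* Write the canonical path of n into buf; 0 on success, -1 on overflow. */
int canonical_path(const struct node *n, char *buf, size_t size)
{
        assert(n != NULL && buf != NULL && size >= 2);
        if (n->master == NULL) {
                strcpy(buf, "/");
                return 0;
        }
        return append_path(n, buf, size);
}
```

An object reachable as both /a/f and /b/f, with /a as its master parent,
always reports /a/f as its path name, whichever way it was reached.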

- Alex


* Re: File as a directory - VFS Changes
  2005-05-31  0:20         ` Alexander G. M. Smith
@ 2005-05-31  9:34           ` Nikita Danilov
  2005-05-31 15:04             ` Hans Reiser
  2005-06-01  2:11             ` Alexander G. M. Smith
  0 siblings, 2 replies; 51+ messages in thread
From: Nikita Danilov @ 2005-05-31  9:34 UTC (permalink / raw)
  To: Alexander G. M. Smith; +Cc: leocomerford, reiserfs-list, ninja

Alexander G. M. Smith writes:
 > Nikita Danilov wrote on Mon, 30 May 2005 15:00:52 +0400:
 > > Nothing in VFS prevents files from supporting both read(2) and
 > > readdir(3). The problem is with link(2): VFS assumes that directories
 > > form a _tree_, that is, every directory has a well-defined parent.
 > 
 > At least that's one problem that's solvable.  Just define one of
 > the parents as the master parent directory, with a guaranteed path
 > up to the root, and have the others as auxiliary parents.  That
 > also gives you a good path name to each and every file-thing.
 > 
 > The VFS or the file system (depending on where the designers want
 > to split the work) will still have to handle cycles in the graph
 > to recompute the new master parents, when an old one gets deleted
 > or moved.

A cycle may consist of more graph nodes than fit into memory. Cycle
detection is crucial for rename semantics, and if a
cycle-just-about-to-be-formed doesn't fit into memory it's not clear how
to detect it, because the tree has to be locked while it is checked for
cycles, and one definitely doesn't want to hold such a lock over IO.
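
For comparison, here is a sketch of the check when directories do form a
tree, using a made-up struct dir that holds only a parent pointer.  Moving
src under new_parent creates a cycle exactly when src is new_parent or one
of its ancestors, so a single walk towards the root is enough.  With
several parents per directory there is no single chain to walk, which is
where the trouble starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dir {
        struct dir *parent;     /* NULL for the root */
};

/* Would moving src underneath new_parent create a cycle?  True iff src
 * is new_parent itself or one of its ancestors. */
bool rename_creates_cycle(const struct dir *src, const struct dir *new_parent)
{
        const struct dir *d;

        assert(src != NULL && new_parent != NULL);
        for (d = new_parent; d != NULL; d = d->parent)
                if (d == src)
                        return true;
        return false;
}
```

In this sketch every step of the walk is a pointer dereference in memory;
in a real file system each step may need a parent that is not cached.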

 > 
 > - Alex

Nikita.


* Re: File as a directory - VFS Changes
  2005-05-31  9:34           ` Nikita Danilov
@ 2005-05-31 15:04             ` Hans Reiser
  2005-05-31 16:00               ` Nikita Danilov
  2005-05-31 16:30               ` Valdis.Kletnieks
  2005-06-01  2:11             ` Alexander G. M. Smith
  1 sibling, 2 replies; 51+ messages in thread
From: Hans Reiser @ 2005-05-31 15:04 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: Alexander G. M. Smith, leocomerford, reiserfs-list, ninja

Nikita Danilov wrote:

>Alexander G. M. Smith writes:
> > Nikita Danilov wrote on Mon, 30 May 2005 15:00:52 +0400:
> > > Nothing in VFS prevents files from supporting both read(2) and
> > > readdir(3). The problem is with link(2): VFS assumes that directories
> > > form a _tree_, that is, every directory has a well-defined parent.
> > 
> > At least that's one problem that's solvable.  Just define one of
> > the parents as the master parent directory, with a guaranteed path
> > up to the root, and have the others as auxiliary parents.  That
> > also gives you a good path name to each and every file-thing.
> > 
> > The VFS or the file system (depending on where the designers want
> > to split the work) will still have to handle cycles in the graph
> > to recompute the new master parents, when an old one gets deleted
> > or moved.
>
>A cycle may consist of more graph nodes than fit into memory.
>
There are pathname length restrictions already in the kernel that should
prevent that, yes?

>Cycle
>detection is crucial for rename semantics, and if
>cycle-just-about-to-be-formed doesn't fit into memory it's not clear how
>to detect it, because tree has to be locked while checked for cycles, and
>one definitely doesn't want to keep such a lock over IO.
>
> > 
> > - Alex
>
>Nikita.
>
>
>  
>



* Re: File as a directory - VFS Changes
  2005-05-31 15:04             ` Hans Reiser
@ 2005-05-31 16:00               ` Nikita Danilov
  2005-05-31 16:30               ` Valdis.Kletnieks
  1 sibling, 0 replies; 51+ messages in thread
From: Nikita Danilov @ 2005-05-31 16:00 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Alexander G. M. Smith, leocomerford, reiserfs-list, ninja

Hello Hans,

Hans Reiser writes:
 > Nikita Danilov wrote:
 > 
 > >Alexander G. M. Smith writes:
 > > > Nikita Danilov wrote on Mon, 30 May 2005 15:00:52 +0400:
 > > > > Nothing in VFS prevents files from supporting both read(2) and
 > > > > readdir(3). The problem is with link(2): VFS assumes that directories
 > > > > form a _tree_, that is, every directory has a well-defined parent.
 > > > 
 > > > At least that's one problem that's solvable.  Just define one of
 > > > the parents as the master parent directory, with a guaranteed path
 > > > up to the root, and have the others as auxiliary parents.  That
 > > > also gives you a good path name to each and every file-thing.
 > > > 
 > > > The VFS or the file system (depending on where the designers want
 > > > to split the work) will still have to handle cycles in the graph
 > > > to recompute the new master parents, when an old one gets deleted
 > > > or moved.
 > >
 > >A cycle may consist of more graph nodes than fit into memory.
 > >
 > There are pathname length restrictions already in the kernel that should
 > prevent that, yes?

UNIX namespaces are not _that_ retarded. :-)

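#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        int i;

        /* keep descending into an ever deeper chain of "foo" directories */
        for (i = 0; ; ++ i) {
                mkdir("foo", 0777);
                chdir("foo");
                if ((i % 1000) == 0)
                        printf("%i\n", i);
        }
        return 0;
}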

Run it for a while, interrupt it, and then do

$ find foo
$ rm -frv foo

 > 
 > >Cycle
 > >detection is crucial for rename semantics, and if
 > >cycle-just-about-to-be-formed doesn't fit into memory it's not clear how
 > >to detect it, because tree has to be locked while checked for cycles, and
 > >one definitely doesn't want to keep such a lock over IO.
 > >
 > > > 
 > > > - Alex
 > >

Nikita.

 > >
 > >
 > >  
 > >


* Re: File as a directory - VFS Changes
  2005-05-31 15:04             ` Hans Reiser
  2005-05-31 16:00               ` Nikita Danilov
@ 2005-05-31 16:30               ` Valdis.Kletnieks
  2005-05-31 16:55                 ` Jonathan Briggs
  1 sibling, 1 reply; 51+ messages in thread
From: Valdis.Kletnieks @ 2005-05-31 16:30 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Nikita Danilov, Alexander G. M. Smith, leocomerford,
	reiserfs-list, ninja


On Tue, 31 May 2005 08:04:42 PDT, Hans Reiser said:

> >A cycle may consist of more graph nodes than fit into memory.
> >
> There are pathname length restrictions already in the kernel that should
> prevent that, yes?

The problem is that although a *single* pathname can't be longer than some
length, you can still create a cycle.  Consider for instance a pathname restriction
of 1024 chars.  Filenames A, B, and C are all 400 characters long.  A points at B,
B points at C - and C points back to A.

Also, although the set of inodes *in the cycle* fits in memory, the set of
inodes *in the entire graph* that has to be searched to verify the presence of
a cycle may not (in general, you have to be ready to examine *all* the inodes
unless you can do some pruning (unallocated, provably un-cycleable, and so
on)).  This is the sort of thing that you can afford to do in userspace during
an fsck, but certainly can't do in the kernel on every syscall that might
create a cycle...
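
To make the cost concrete, here is a minimal sketch of a generic
depth-first cycle check over a small, fixed-size graph.  The struct graph
layout and the limits MAX_NODES and MAX_EDGES are assumptions made for the
example, not anything a real file system uses.  Note that the state array
needs one slot per node in the whole graph, not just per node in the
cycle.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8
#define MAX_EDGES 4

struct graph {
        int nnodes;
        int nedges[MAX_NODES];           /* number of links leaving node i */
        int edge[MAX_NODES][MAX_EDGES];  /* edge[i][k]: target of link k of i */
};

enum colour { WHITE, GREY, BLACK };      /* unvisited, on the stack, done */

static bool visit(const struct graph *g, int v, enum colour *state)
{
        int k;

        state[v] = GREY;
        for (k = 0; k < g->nedges[v]; ++k) {
                int w = g->edge[v][k];

                if (state[w] == GREY)
                        return true;     /* back edge: a cycle */
                if (state[w] == WHITE && visit(g, w, state))
                        return true;
        }
        state[v] = BLACK;
        return false;
}

/* True if any cycle exists anywhere in the graph. */
bool has_cycle(const struct graph *g)
{
        enum colour state[MAX_NODES] = { WHITE };
        int v;

        assert(g->nnodes <= MAX_NODES);
        for (v = 0; v < g->nnodes; ++v)
                if (state[v] == WHITE && visit(g, v, state))
                        return true;
        return false;
}
```

With nodes 0, 1 and 2 standing for A, B and C and links A to B, B to C and
C to A, has_cycle() reports a cycle; dropping the link out of C removes
it.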



* Re: File as a directory - Ordered Relations
  2005-05-30  8:19     ` File as a directory - Ordered Relations Hans Reiser
@ 2005-05-31 16:46       ` Jonathan Briggs
  2005-05-31 17:07         ` Hans Reiser
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Briggs @ 2005-05-31 16:46 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Valdis.Kletnieks, David Masover, Alexander G. M. Smith,
	Leo Comerford, reiserfs-list


On Mon, 2005-05-30 at 01:19 -0700, Hans Reiser wrote:
> Valdis.Kletnieks@vt.edu wrote:
> 
> >On Fri, 27 May 2005 23:56:35 CDT, David Masover said:
> >
> >  
> >
> >>Hans, comment please?  Is this approaching v5 / v6 / Future Vision?  It
> >>does seem more than a little "clunky" when applied to v4...
> >>    
> >>
> Well, if you read our whitepaper, we consider relational algebra to be a
> functional subset of what we will implement (which implies we think
> relational algebra should be possible in the filesystem naming.)
> 
> >
> >I'm not Hans, but I *will* ask "How much of this is *rationally* doable
> >without some help from the VFS?".
> >
> Think of VFS as a standards committee.  That means that 5-15 years after
> we implement it, they will copy it, break it, and then demand that we
> conform to their breakage. 
> 
> Anytime someone says it should go into VFS, what they really mean is,
> nobody should get ahead of them because it will increase their workload.;-)
> 
> VFS is a baseline.  Once you support VFS, and your performance is good,
> you can start to innovate.  Next year we finally start to seriously
> innovate, after 10 years of groundwork.  The storage layer was never the
> interesting part of our plans, not to me.....

Why innovate in the filesystem though, when it would work just as well
or better in the VFS layer?  Files as directories and meta-files would
work for all filesystems.  Ext3 with extended attributes could support
the same file structures as Reiser4.  Reiser4 would then be the most
efficient implementation of the general case.

From the last LKML discussion, it didn't look to me as if the kernel
maintainers are going to accept Reiser4's stranger features into the
mainline kernel, so if you're going to be implementing and maintaining
them separately anyway, why not do it in the implementation of all
namespaces, in the VFS code?
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.



* Re: File as a directory - VFS Changes
  2005-05-31 16:30               ` Valdis.Kletnieks
@ 2005-05-31 16:55                 ` Jonathan Briggs
  2005-05-31 16:59                   ` Hans Reiser
  2005-05-31 18:23                   ` Nikita Danilov
  0 siblings, 2 replies; 51+ messages in thread
From: Jonathan Briggs @ 2005-05-31 16:55 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Hans Reiser, Nikita Danilov, Alexander G. M. Smith, leocomerford,
	reiserfs-list, ninja


On Tue, 2005-05-31 at 12:30 -0400, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 31 May 2005 08:04:42 PDT, Hans Reiser said:
> 
> > >A cycle may consist of more graph nodes than fit into memory.
> > >
> > There are pathname length restrictions already in the kernel that should
> > prevent that, yes?
> 
> The problem is that although a *single* pathname can't be longer than some
> length, you can still create a cycle.  Consider for instance a pathname restriction
> of 1024 chars.  Filenames A, B, and C are all 400 characters long.  A points at B,
> B points at C - and C points back to A.
> 
> Also, although the set of inodes *in the cycle* fits in memory, the set of
> inodes *in the entire graph* that has to be searched to verify the presence of
> a cycle may not (in general, you have to be ready to examine *all* the inodes
> unless you can do some pruning (unallocated, provably un-cycleable, and so
> on)).  This is the sort of thing that you can afford to do in userspace during
> an fsck, but certainly can't do in the kernel on every syscall that might
> create a cycle...

You can avoid cycles by redefining the problem.

Every file or "data object" has one single True Name which is their
inode or OID.  Each data object then has one or more "names" as
properties.  Names are either single strings with slash separators for
directories, or each directory element is a unique object in an object
list.  Directories then become queries that return the set of objects
holding that directory name.  The query results are of course cached and
updated whenever a name property changes.

Now there are no cycles, although a naive Unix "find" program could get
stuck in a loop.
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.



* Re: File as a directory - VFS Changes
  2005-05-31 16:55                 ` Jonathan Briggs
@ 2005-05-31 16:59                   ` Hans Reiser
  2005-05-31 17:13                     ` Jonathan Briggs
  2005-05-31 18:23                   ` Nikita Danilov
  1 sibling, 1 reply; 51+ messages in thread
From: Hans Reiser @ 2005-05-31 16:59 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Valdis.Kletnieks, Nikita Danilov, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja

What happens when you unlink the True Name?

Hans

Jonathan Briggs wrote:

>
>You can avoid cycles by redefining the problem.
>
>Every file or "data object" has one single True Name which is their
>inode or OID.  Each data object then has one or more "names" as
>properties.  Names are either single strings with slash separators for
>directories, or each directory element is a unique object in an object
>list.  Directories then become queries that return the set of objects
>holding that directory name.  The query results are of course cached and
>updated whenever a name property changes.
>
>Now there are no cycles, although a naive Unix "find" program could get
>stuck in a loop.
>  
>



* Re: File as a directory - Ordered Relations
  2005-05-31 16:46       ` Jonathan Briggs
@ 2005-05-31 17:07         ` Hans Reiser
  0 siblings, 0 replies; 51+ messages in thread
From: Hans Reiser @ 2005-05-31 17:07 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Valdis.Kletnieks, David Masover, Alexander G. M. Smith,
	Leo Comerford, reiserfs-list

Jonathan Briggs wrote:

>
>Why innovate in the filesystem though, when it would work just as well
>or better in the VFS layer?
>
Why don't we just have one filesystem, think of the advantages.....
;-)

I don't try to get other people to follow my lead anymore, I just ship
code that works.  Putting it into VFS requires getting others to follow
my lead.  Ain't gonna happen.  Getting them to leave me alone to
innovate in my corner of the kernel?  Might happen if I fight for it,
but it will be a real struggle.

Hans


* Re: File as a directory - VFS Changes
  2005-05-31 16:59                   ` Hans Reiser
@ 2005-05-31 17:13                     ` Jonathan Briggs
  2005-05-31 18:27                       ` Hans Reiser
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Briggs @ 2005-05-31 17:13 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Valdis.Kletnieks, Nikita Danilov, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja


Either that isn't allowed, or it immediately vanishes from all
directories.

If deleting by OID isn't allowed, then every name property must be
removed in order to delete the file.

Personally, I would allow deleting the OID.  It would be a convenient
way to be sure every instance of a file was deleted.

On Tue, 2005-05-31 at 09:59 -0700, Hans Reiser wrote:
> What happens when you unlink the True Name?
> 
> Hans
> 
> Jonathan Briggs wrote:
> 
> >
> >You can avoid cycles by redefining the problem.
> >
> >Every file or "data object" has one single True Name which is their
> >inode or OID.  Each data object then has one or more "names" as
> >properties.  Names are either single strings with slash separators for
> >directories, or each directory element is a unique object in an object
> >list.  Directories then become queries that return the set of objects
> >holding that directory name.  The query results are of course cached and
> >updated whenever a name property changes.
> >
> >Now there are no cycles, although a naive Unix "find" program could get
> >stuck in a loop.
> >  
> >
> 
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.



* Re: File as a directory - VFS Changes
  2005-05-31 16:55                 ` Jonathan Briggs
  2005-05-31 16:59                   ` Hans Reiser
@ 2005-05-31 18:23                   ` Nikita Danilov
  2005-05-31 18:32                     ` Hans Reiser
  1 sibling, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-05-31 18:23 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Valdis.Kletnieks, Hans Reiser, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja

Jonathan Briggs writes:
 > On Tue, 2005-05-31 at 12:30 -0400, Valdis.Kletnieks@vt.edu wrote:
 > > On Tue, 31 May 2005 08:04:42 PDT, Hans Reiser said:
 > > 
 > > > >A cycle may consist of more graph nodes than fit into memory.
 > > > >
 > > > There are pathname length restrictions already in the kernel that should
 > > > prevent that, yes?
 > > 
 > > The problem is that although a *single* pathname can't be longer than some
 > > length, you can still create a cycle.  Consider for instance a pathname restriction
 > > of 1024 chars.  Filenames A, B, and C are all 400 characters long.  A points at B,
 > > B points at C - and C points back to A.
 > > 
 > > Also, although the set of inodes *in the cycle* fits in memory, the set of
 > > inodes *in the entire graph* that has to be searched to verify the presence of
 > > a cycle may not (in general, you have to be ready to examine *all* the inodes
 > > unless you can do some pruning (unallocated, provably un-cycleable, and so
 > > on)).  This is the sort of thing that you can afford to do in userspace during
 > > an fsck, but certainly can't do in the kernel on every syscall that might
 > > create a cycle...
 > 
 > You can avoid cycles by redefining the problem.
 > 
 > Every file or "data object" has one single True Name which is their
 > inode or OID.  Each data object then has one or more "names" as
 > properties.  Names are either single strings with slash separators for
 > directories, or each directory element is a unique object in an object
 > list.  Directories then become queries that return the set of objects
 > holding that directory name.  The query results are of course cached and
 > updated whenever a name property changes.
 > 
 > Now there are no cycles, although a naive Unix "find" program could get
 > stuck in a loop.

Huh? Cycles are still here.

Query D0 returns D1, query D1 returns D2, ... query DN returns D0. The
problem is not in the mechanism used to encode tree/graph structure. The
problem is in the limitations imposed by required semantics:

   (R) every object except some selected root is Reachable. (No leaks.)

   (G) unused objects are sooner or later discarded. (Garbage
   collection.)

Neither requirement is compatible with cycles in the directory
structure:

 - from (R) it follows that an object can be discarded only if it is empty (as
 a directory). All nodes in a cycle are not empty (because each of them
 contains at least a reference to the next one), and hence none of them
 can be ever removed;

 - if garbage collection is implemented through the reference counting
 (which is the only known way tractable for a file system), then cycles
 are never collected.
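
The second point can be shown with a toy reference counter.  This is only
a sketch: struct obj, obj_get, obj_put and set_link are names invented for
the example, and each object holds at most one outgoing link.

```c
#include <assert.h>
#include <stddef.h>

struct obj {
        int         refs;       /* number of links pointing at this object */
        struct obj *child;      /* single outgoing link, NULL if none */
        int         freed;      /* set once the object has been discarded */
};

void obj_get(struct obj *o)
{
        o->refs++;
}

/* Make "from" link to "to"; the link counts as a reference on "to". */
void set_link(struct obj *from, struct obj *to)
{
        from->child = to;
        obj_get(to);
}

/* Drop one reference.  An object whose count reaches zero is discarded
 * and gives up its own outgoing link in turn. */
void obj_put(struct obj *o)
{
        while (o != NULL) {
                struct obj *next;

                assert(o->refs > 0);
                if (--o->refs > 0)
                        return;
                o->freed = 1;
                next = o->child;
                o->child = NULL;
                o = next;
        }
}
```

A chain root -> D0 -> D1 is fully discarded once the root link is put.
If D1 also links back to D0, dropping the root link leaves both counts at
one, and neither object is ever freed.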

Unless you are talking about a two-level naming scheme, where "One True
Names" are visible to the user. In that case the reachability problem
evaporates, because manipulations of the normal directory structure never
make a node unreachable---it is always accessible through its True
Name.

But the garbage collection problem is still there. You are more than
welcome to solve it by implementing generational mark-and-sweep GC on a
file system scale. :-)

Nikita.


* Re: File as a directory - VFS Changes
  2005-05-31 17:13                     ` Jonathan Briggs
@ 2005-05-31 18:27                       ` Hans Reiser
  2005-05-31 21:01                         ` Jonathan Briggs
  0 siblings, 1 reply; 51+ messages in thread
From: Hans Reiser @ 2005-05-31 18:27 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Valdis.Kletnieks, Nikita Danilov, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Well, if you allow multiple true names, then you start to resemble
something I suggested a few years ago, in which I outlined a taxonomy of
links, and suggested that some links would count towards the reference
count and some would not.

Of course, that does nothing for the cycle problem......

How are cycles handled for symlinks currently?

Hans

Jonathan Briggs wrote:

>Either that isn't allowed, or it immediately vanishes from all
>directories.
>
>If deleting by OID isn't allowed, then every name property must be
>removed in order to delete the file.
>
>Personally, I would allow deleting the OID.  It would be a convenient
>way to be sure every instance of a file was deleted.
>
>On Tue, 2005-05-31 at 09:59 -0700, Hans Reiser wrote:
>  
>
>>What happens when you unlink the True Name?
>>
>>Hans
>>
>>Jonathan Briggs wrote:
>>
>>    
>>
>>>You can avoid cycles by redefining the problem.
>>>
>>>Every file or "data object" has one single True Name which is their
>>>inode or OID.  Each data object then has one or more "names" as
>>>properties.  Names are either single strings with slash separators for
>>>directories, or each directory element is a unique object in an object
>>>list.  Directories then become queries that return the set of objects
>>>holding that directory name.  The query results are of course cached and
>>>updated whenever a name property changes.
>>>
>>>Now there are no cycles, although a naive Unix "find" program could get
>>>stuck in a loop.
>>> 
>>>
>>>      
>>>



* Re: File as a directory - VFS Changes
  2005-05-31 18:23                   ` Nikita Danilov
@ 2005-05-31 18:32                     ` Hans Reiser
  2005-06-02  1:27                       ` Alexander G. M. Smith
  2005-06-02  9:11                       ` Nikita Danilov
  0 siblings, 2 replies; 51+ messages in thread
From: Hans Reiser @ 2005-05-31 18:32 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Jonathan Briggs, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja

What if only the first name a directory is created with counts towards
its reference count, and, if the directory is moved from its first name,
the new name becomes the one that counts towards the reference count?
A bit of a hack, but it would work.
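
A rough sketch of the bookkeeping this would need, with every identifier
(struct dir, add_name, move_name, dir_refcount, MAX_WEAK) made up for
illustration.  It assumes names are plain strings owned by the caller and
that the first name added is the counted one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WEAK 4

struct dir {
        const char *counted;            /* the one name holding a reference */
        const char *weak[MAX_WEAK];     /* further names, not counted */
        int         nweak;
};

/* The reference count is 1 while the counted name exists, else 0. */
int dir_refcount(const struct dir *d)
{
        return d->counted != NULL;
}

/* The first name given to a directory is counted; later ones are not. */
void add_name(struct dir *d, const char *name)
{
        if (d->counted == NULL) {
                d->counted = name;
                return;
        }
        assert(d->nweak < MAX_WEAK);
        d->weak[d->nweak++] = name;
}

/* Rename one of the directory's names.  Moving the counted name makes
 * the new name the counted one.  Returns -1 if old_name is unknown. */
int move_name(struct dir *d, const char *old_name, const char *new_name)
{
        int i;

        if (d->counted != NULL && strcmp(d->counted, old_name) == 0) {
                d->counted = new_name;
                return 0;
        }
        for (i = 0; i < d->nweak; ++i)
                if (strcmp(d->weak[i], old_name) == 0) {
                        d->weak[i] = new_name;
                        return 0;
                }
        return -1;
}
```

Because uncounted names never contribute to the count, a loop closed
through one of them cannot keep a directory alive on its own.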

Hans

Nikita Danilov wrote:

>Jonathan Briggs writes:
> > On Tue, 2005-05-31 at 12:30 -0400, Valdis.Kletnieks@vt.edu wrote:
> > > On Tue, 31 May 2005 08:04:42 PDT, Hans Reiser said:
> > > 
> > > > >A cycle may consist of more graph nodes than fit into memory.
> > > > >
> > > > There are pathname length restrictions already in the kernel that should
> > > > prevent that, yes?
> > > 
> > > The problem is that although a *single* pathname can't be longer than some
> > > length, you can still create a cycle.  Consider for instance a pathname restriction
> > > of 1024 chars.  Filenames A, B, and C are all 400 characters long.  A points at B,
> > > B points at C - and C points back to A.
> > > 
> > > Also, although the set of inodes *in the cycle* fits in memory, the set of
> > > inodes *in the entire graph* that has to be searched to verify the presence of
> > > a cycle may not (in general, you have to be ready to examine *all* the inodes
> > > unless you can do some pruning (unallocated, provably un-cycleable, and so
> > > on)).  This is the sort of thing that you can afford to do in userspace during
> > > an fsck, but certainly can't do in the kernel on every syscall that might
> > > create a cycle...
> > 
> > You can avoid cycles by redefining the problem.
> > 
> > Every file or "data object" has one single True Name which is their
> > inode or OID.  Each data object then has one or more "names" as
> > properties.  Names are either single strings with slash separators for
> > directories, or each directory element is a unique object in an object
> > list.  Directories then become queries that return the set of objects
> > holding that directory name.  The query results are of course cached and
> > updated whenever a name property changes.
> > 
> > Now there are no cycles, although a naive Unix "find" program could get
> > stuck in a loop.
>
>Huh? Cycles are still here.
>
>Query D0 returns D1, query D1 returns D2, ... query DN returns D0. The
>problem is not in the mechanism used to encode tree/graph structure. The
>problem is in the limitations imposed by required semantics:
>
>   (R) every object except some selected root is Reachable. (No leaks.)
>
>   (G) unused objects are sooner or later discarded. (Garbage
>   collection.)
>
>Neither requirement is compatible with cycles in the directory
>structure:
>
> - from (R) it follows that an object can be discarded only if it is empty (as
> a directory). All nodes in a cycle are not empty (because each of them
> contains at least a reference to the next one), and hence none of them
> can be ever removed;
>
> - if garbage collection is implemented through the reference counting
> (which is the only known way tractable for a file system), then cycles
> are never collected.
>
>Unless you are talking about a two-level naming scheme, where "One True
>Names" are visible to the user. In that case reachability problem
>evaporates, because manipulations with normal directory structure never
>make node unreachable---it is always accessible through its True
>Name.
>
>But the garbage collection problem is still there. You are more than
>welcome to solve it by implementing generation mark-and-sweep GC on file
>system scale. :-)
>
>Nikita.
>
>
>  
>



* Re: File as a directory - VFS Changes
  2005-05-31 18:27                       ` Hans Reiser
@ 2005-05-31 21:01                         ` Jonathan Briggs
  2005-05-31 21:08                           ` Jonathan Briggs
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Briggs @ 2005-05-31 21:01 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Valdis.Kletnieks, Nikita Danilov, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller


I should create an example.

Wherever I used True Name previously, use OID instead.  True Name was
simply another term for a unique object identifier.

Three files with OIDs of 1001, 1002, and 1003.
Object 1001:
name: /tmp/A/file1
name: /tmp/A/B/file1
name: /tmp/A/B/C/file1

Object 1002:
name: /tmp/A/file2

Object 1003:
name: /tmp/A/B/file3

Three query objects (directories) with OIDs of 1, 2, and 3.
Object 1:
name: /tmp/A
name: /tmp/A/B/C/A
query: name begins with /tmp/A/
query result cache: B->2, file1->1001, file2->1002

Object 2:
name: /tmp/A/B
query: name begins with /tmp/A/B/
query result cache: C->3, file1->1001, file3->1003

Object 3:
name: /tmp/A/B/C
query: name begins with /tmp/A/B/C/
query result cache: A->1, file1->1001

Now there is an A -> B -> C -> A directory loop.  But removing
name: /tmp/A/B/C/A from Object 1 fixes the loop.  Deleting Object 1 also
fixes the loop.  Deleting any of Object 1, 2 or 3 does not affect any
other object, because in this scheme, directory objects do not need to
actually exist: they are just queries that return objects with certain
names.
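
The example above can be sketched in code as follows.  The struct object
layout and the function names are assumptions for illustration, and the
query is taken to mean "name begins with the prefix and has no further
slash", since the result caches above list only direct children.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NAMES 4

struct object {
        int         oid;
        const char *names[MAX_NAMES];   /* full path names held as properties */
        int         nnames;
};

/* Does name lie directly inside the directory with this query prefix?
 * The prefix is expected to end with '/'. */
static bool direct_child(const char *name, const char *prefix)
{
        size_t plen = strlen(prefix);

        return strncmp(name, prefix, plen) == 0 &&
               name[plen] != '\0' &&
               strchr(name + plen, '/') == NULL;
}

/* Run a directory query: store the OIDs of up to max matching objects
 * in out and return the total number of matches. */
int query_dir(const struct object *objs, int nobjs, const char *prefix,
              int *out, int max)
{
        int i, k, n = 0;

        assert(prefix[0] != '\0' && prefix[strlen(prefix) - 1] == '/');
        for (i = 0; i < nobjs; ++i)
                for (k = 0; k < objs[i].nnames; ++k)
                        if (direct_child(objs[i].names[k], prefix)) {
                                if (n < max)
                                        out[n] = objs[i].oid;
                                ++n;
                                break;  /* report each object once */
                        }
        return n;
}
```

Loaded with the six objects listed above, the query for /tmp/A/ yields
OIDs 1001, 1002 and 2, and the query for /tmp/A/B/C/ yields 1001 and 1,
which is where the loop back to A shows up.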

One problem I already see with it is that there is no way to enforce the
Unix "x" permission without real directory traversal.  But I never liked
that anyway. :)

Are there other problems with it?  Did I explain it clearly?

On Tue, 2005-05-31 at 11:27 -0700, Hans Reiser wrote:
> Well, if you allow multiple true names, then you start to resemble
> something I suggested a few years ago, in which I outlined a taxonomy of
> links, and suggested that some links would count towards the reference
> count and some would not.
> 
> Of course, that does nothing for the cycle problem......
> 
> How are cycles handled for symlinks currently?
> 
> Hans
> 
> Jonathan Briggs wrote:
> 
> >Either that isn't allowed, or it immediately vanishes from all
> >directories.
> >
> >If deleting by OID isn't allowed, then every name property must be
> >removed in order to delete the file.
> >
> >Personally, I would allow deleting the OID.  It would be a convenient
> >way to be sure every instance of a file was deleted.
> >
> >On Tue, 2005-05-31 at 09:59 -0700, Hans Reiser wrote:
> >  
> >
> >>What happens when you unlink the True Name?
> >>
> >>Hans
> >>
> >>Jonathan Briggs wrote:
> >>
> >>    
> >>
> >>>You can avoid cycles by redefining the problem.
> >>>
> >>>Every file or "data object" has one single True Name which is their
> >>>inode or OID.  Each data object then has one or more "names" as
> >>>properties.  Names are either single strings with slash separators for
> >>>directories, or each directory element is a unique object in an object
> >>>list.  Directories then become queries that return the set of objects
> >>>holding that directory name.  The query results are of course cached and
> >>>updated whenever a name property changes.
> >>>
> >>>Now there are no cycles, although a naive Unix "find" program could get
> >>>stuck in a loop.
> >>> 
> >>>
> >>>      
> >>>
> 
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.



* Re: File as a directory - VFS Changes
  2005-05-31 21:01                         ` Jonathan Briggs
@ 2005-05-31 21:08                           ` Jonathan Briggs
  2005-05-31 22:36                             ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Briggs @ 2005-05-31 21:08 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Valdis.Kletnieks, Nikita Danilov, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller


On Tue, 2005-05-31 at 15:01 -0600, Jonathan Briggs wrote:
> I should create an example.
> 
> Wherever I used True Name previously, use OID instead.  True Name was
> simply another term for a unique object identifier.
> 
> Three files with OIDs of 1001, 1002, and 1003.
> Object 1001:
> name: /tmp/A/file1
> name: /tmp/A/B/file1
> name: /tmp/A/B/C/file1
> 
> Object 1002:
> name: /tmp/A/file2
> 
> Object 1003:
> name: /tmp/A/B/file3
> 
> Three query objects (directories) with OIDs of 1, 2, and 3.
> Object 1:
> name: /tmp/A
> name: /tmp/A/B/C/A
> query: name begins with /tmp/A/
> query result cache: B->2, file1->1001, file2->1002
> 
> Object 2:
> name: /tmp/A/B
> query: name begins with /tmp/A/B/
> query result cache: C->3, file1->1001, file3->1003
> 
> Object 3:
> name: /tmp/A/B/C
> query: name begins with /tmp/A/B/C/
> query result cache: A->1, file1->1001
> 
> Now there is a A -> B -> C -> A directory loop.  But removing
> name: /tmp/A/B/C/A from Object 1 fixes the loop.  Deleting Object 1 also
> fixes the loop.  Deleting any of Object 1, 2 or 3 does not affect any
> other object, because in this scheme, directory objects do not need to
> actually exist: they are just queries that return objects with certain
> names.

I forgot to address Nikita's point about reclaiming lost cycles.  In
this case, let me create Object 4 for /tmp
Object 4:
name: /tmp
query: name begins with /tmp/
query result cache: A->1

Now, if we delete Object 4, are Objects 1,2,3 lost?  I would say not
because they still have names.  When the shell calls chdir("/tmp") a new
query object (directory) must be created dynamically, and Objects
1001,1002,1003 still have their names that start with /tmp and so they
immediately appear again.  Their names still start with /, so the top
level query will still find them and /tmp as well.

Therefore, the cycle is never detached and lost.
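
A minimal sketch of the scheme above, assuming names are plain
slash-separated strings stored on each object and that a directory
listing is computed by prefix match.  The `objects` table and the
`list_dir` helper are hypothetical, made up purely for illustration:

```python
# Hypothetical model: every object carries its names as properties, and a
# directory is just a prefix query over those names.
objects = {
    1001: ["/tmp/A/file1", "/tmp/A/B/file1", "/tmp/A/B/C/file1"],
    1002: ["/tmp/A/file2"],
    1003: ["/tmp/A/B/file3"],
    1: ["/tmp/A", "/tmp/A/B/C/A"],
    2: ["/tmp/A/B"],
    3: ["/tmp/A/B/C"],
    4: ["/tmp"],
}


def list_dir(objects, directory):
    """Map each entry under `directory` to the OIDs named exactly that.

    An entry with an empty set is a purely virtual directory: some name
    passes through it, but no stored object carries that name itself.
    """
    prefix = directory.rstrip("/") + "/"
    entries = {}
    for oid, names in objects.items():
        for name in names:
            if not name.startswith(prefix):
                continue
            rest = name[len(prefix):]
            component = rest.split("/", 1)[0]
            owners = entries.setdefault(component, set())
            if rest == component:
                owners.add(oid)
    return entries


print(list_dir(objects, "/tmp/A"))

# Delete the /tmp query object; nothing below it is lost.
remaining = {oid: names for oid, names in objects.items() if oid != 4}
print(list_dir(remaining, "/"))
print(list_dir(remaining, "/tmp"))
```

In this sketch, deleting Object 4 removes only the stored directory
object.  The listing for / still yields tmp, because other objects
carry names that pass through it.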
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.


* Re: File as a directory - VFS Changes
  2005-05-31 21:08                           ` Jonathan Briggs
@ 2005-05-31 22:36                             ` Nikita Danilov
  2005-05-31 23:01                               ` Jonathan Briggs
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-05-31 22:36 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Jonathan Briggs writes:
 > On Tue, 2005-05-31 at 15:01 -0600, Jonathan Briggs wrote:
 > > I should create an example.
 > > 
 > > Wherever I used True Name previously, use OID instead.  True Name was
 > > simply another term for a unique object identifier.
 > > 
 > > Three files with OIDs of 1001, 1002, and 1003.
 > > Object 1001:
 > > name: /tmp/A/file1
 > > name: /tmp/A/B/file1
 > > name: /tmp/A/B/C/file1
 > > 
 > > Object 1002:
 > > name: /tmp/A/file2
 > > 
 > > Object 1003:
 > > name: /tmp/A/B/file3
 > > 
 > > Three query objects (directories) with OIDs of 1, 2, and 3.
 > > Object 1:
 > > name: /tmp/A
 > > name: /tmp/A/B/C/A
 > > query: name begins with /tmp/A/
 > > query result cache: B->2, file1->1001, file2->1002
 > > 
 > > Object 2:
 > > name: /tmp/A/B
 > > query: name begins with /tmp/A/B/
 > > query result cache: C->3, file1->1001, file3->1003
 > > 
 > > Object 3:
 > > name: /tmp/A/B/C
 > > query: name begins with /tmp/A/B/C/
 > > query result cache: A->1, file1->1001
 > > 
 > > Now there is a A -> B -> C -> A directory loop.  But removing
 > > name: /tmp/A/B/C/A from Object 1 fixes the loop.  Deleting Object 1 also
 > > fixes the loop.  Deleting any of Object 1, 2 or 3 does not affect any
 > > other object, because in this scheme, directory objects do not need to
 > > actually exist: they are just queries that return objects with certain
 > > names.

One problem with the above is that directory structure is inconsistent
with lists of names associated with objects. For example, file1 is a
child of /tmp/A/B/C/A, but Object 1001 doesn't list /tmp/A/B/C/A/file1
among its names.
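
The inconsistency can be shown with a small sketch, assuming path
lookup simply walks the cached query results one component at a time.
The `cache` table and the `resolve` helper are hypothetical; the data
is copied from the example quoted above:

```python
# Names stored on Object 1001, copied from the example above.
names_1001 = {"/tmp/A/file1", "/tmp/A/B/file1", "/tmp/A/B/C/file1"}

# Query result caches of the three directory objects, keyed by OID.
cache = {
    1: {"B": 2, "file1": 1001, "file2": 1002},
    2: {"C": 3, "file1": 1001, "file3": 1003},
    3: {"A": 1, "file1": 1001},
}


def resolve(path, start="/tmp/A", start_oid=1):
    """Walk `path` component by component through the cached results."""
    if path != start and not path.startswith(start + "/"):
        raise ValueError("path is outside " + start)
    oid = start_oid
    for component in path[len(start):].split("/"):
        if component:
            oid = cache[oid][component]
    return oid


path = "/tmp/A/B/C/A/file1"
print(resolve(path))        # OID reached by walking the path
print(path in names_1001)   # whether that path is a stored name
```

The walk reaches Object 1001 through the loop, although the path it
followed is not among the names stored on that object.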

 > 
 > I forgot to address Nikita's point about reclaiming lost cycles.  In
 > this case, let me create Object 4 for /tmp
 > Object 4:
 > name: /tmp
 > query: name begins with /tmp/
 > query result cache: A->1
 > 
 > Now, if we delete Object 4, are Objects 1,2,3 lost?  I would say not
 > because they still have names.  When the shell calls chdir("/tmp") a new
 > query object (directory) must be created dynamically, and Objects
 > 1001,1002,1003 still have their names that start with /tmp and so they
 > immediately appear again.  Their names still start with /, so the top
 > level query will still find them and /tmp as well.

Object 4 is "/tmp". Once it is removed, what does it _mean_ for, say,
Object 1003 to have a name "/tmp/A/B/file3"? What is the "/tmp" bit there?
Just a string? If so, and your directories are but queries, what does it
mean for a directory to be removed? How is mv /tmp/A /tmp/A1 implemented?
By scanning the whole file system and updating leaf name-lists?

It seems that what you are proposing is a radical departure from file
system namespace as we know it. :-) In your scheme all structural
information is encoded in leaves _only_, and directories just do some
kind of pattern matching. This is closer to a relational database than
to the current file-systems where directories are the only source of
the structural inform


* Re: File as a directory - VFS Changes
  2005-05-31 22:36                             ` Nikita Danilov
@ 2005-05-31 23:01                               ` Jonathan Briggs
  2005-06-01 10:39                                 ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Briggs @ 2005-05-31 23:01 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller


On Wed, 2005-06-01 at 02:36 +0400, Nikita Danilov wrote:
> Jonathan Briggs writes:
>  > On Tue, 2005-05-31 at 15:01 -0600, Jonathan Briggs wrote:
>  > > I should create an example.
>  > > 
>  > > Wherever I used True Name previously, use OID instead.  True Name was
>  > > simply another term for a unique object identifier.
>  > > 
>  > > Three files with OIDs of 1001, 1002, and 1003.
>  > > Object 1001:
>  > > name: /tmp/A/file1
>  > > name: /tmp/A/B/file1
>  > > name: /tmp/A/B/C/file1
>  > > 
>  > > Object 1002:
>  > > name: /tmp/A/file2
>  > > 
>  > > Object 1003:
>  > > name: /tmp/A/B/file3
>  > > 
>  > > Three query objects (directories) with OIDs of 1, 2, and 3.
>  > > Object 1:
>  > > name: /tmp/A
>  > > name: /tmp/A/B/C/A
>  > > query: name begins with /tmp/A/
>  > > query result cache: B->2, file1->1001, file2->1002
>  > > 
>  > > Object 2:
>  > > name: /tmp/A/B
>  > > query: name begins with /tmp/A/B/
>  > > query result cache: C->3, file1->1001, file3->1003
>  > > 
>  > > Object 3:
>  > > name: /tmp/A/B/C
>  > > query: name begins with /tmp/A/B/C/
>  > > query result cache: A->1, file1->1001
>  > > 
>  > > Now there is a A -> B -> C -> A directory loop.  But removing
>  > > name: /tmp/A/B/C/A from Object 1 fixes the loop.  Deleting Object 1 also
>  > > fixes the loop.  Deleting any of Object 1, 2 or 3 does not affect any
>  > > other object, because in this scheme, directory objects do not need to
>  > > actually exist: they are just queries that return objects with certain
>  > > names.
> 
> One problem with the above is that directory structure is inconsistent
> with lists of names associated with objects. For example, file1 is a
> child of /tmp/A/B/C/A, but Object 1001 doesn't list /tmp/A/B/C/A/file1
> among its names.

file1 *appears* to be a child because A is a query for /tmp/A/, so file1
is actually returned as the query result for its name of /tmp/A/file1.
If the shell were smart enough to normalize its path by asking the
directory for its name, it would know that /tmp/A/B/C/A was /tmp/A.
But yes, a stupid program could be confused by the difference between
names.

> 
>  > 
>  > I forgot to address Nikita's point about reclaiming lost cycles.  In
>  > this case, let me create Object 4 for /tmp
>  > Object 4:
>  > name: /tmp
>  > query: name begins with /tmp/
>  > query result cache: A->1
>  > 
>  > Now, if we delete Object 4, are Objects 1,2,3 lost?  I would say not
>  > because they still have names.  When the shell calls chdir("/tmp") a new
>  > query object (directory) must be created dynamically, and Objects
>  > 1001,1002,1003 still have their names that start with /tmp and so they
>  > immediately appear again.  Their names still start with /, so the top
>  > level query will still find them and /tmp as well.
> 
> Object 4 is "/tmp". Once it was removed what does it _mean_ for, say,
> Object 1003 to have a name "/tmp/A/B/file3"? What is "/tmp" bit there?
> Just a string? If so, and your directories are but queries, what does it
> mean for directory to be removed? How mv /tmp/A /tmp/A1 is implemented?
> By scanning whole file system and updating leaf name-lists?

Well, the name doesn't mean anything. :-)  It is just a convenient
piece of metadata for describing where to find the file in a hierarchy,
and for Unix compatibility.

If a directory was removed by a standard rm -rf, it would work as
expected because it would descend the tree removing names (unlink) from
each object it found.

Moving an object with "mv" would change its name.  Moving a top-level
directory like /usr would require visiting every object starting
with /usr and doing an edit.  A compression scheme could be used where
the most-used top-level directory names were replaced with lookup
tables, then /usr could be renamed just once in the table.
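
A rough sketch of the two rename costs, assuming names are stored
either as flat strings or as (prefix id, remainder) pairs.  Both
layouts and the helper names `rename_flat` and `full_names` are
hypothetical:

```python
def rename_flat(names, old, new):
    """Rewrite every stored name under `old`; return how many were edited."""
    edited = 0
    for paths in names.values():
        for i, path in enumerate(paths):
            if path == old or path.startswith(old + "/"):
                paths[i] = new + path[len(old):]
                edited += 1
    return edited


def full_names(prefix_table, compressed, oid):
    """Expand (prefix id, remainder) pairs back into full path strings."""
    return [prefix_table[pid] + "/" + rest for pid, rest in compressed[oid]]


# Flat storage: the rename touches every object under /usr.
flat = {1: ["/usr/bin/ls"], 2: ["/usr/lib/libc.so"], 3: ["/etc/passwd"]}
print(rename_flat(flat, "/usr", "/usr2"), flat)

# Table storage: the same rename is a single edit to the table.
prefix_table = {0: "/usr", 1: "/etc"}
compressed = {1: [(0, "bin/ls")], 2: [(0, "lib/libc.so")], 3: [(1, "passwd")]}
prefix_table[0] = "/usr2"
print(full_names(prefix_table, compressed, 1))
```

With the table, the cost of the rename no longer depends on how many
objects live under the renamed prefix.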

> It seems that what you are proposing is a radical departure from file
> system namespace as we know it. :-) In your scheme all structural
> information is encoded in leaves _only_, and directories just do some
> kind of pattern matching. This is closer to a relational database than
> to the current file-systems where directories are the only source of
> the structural inform

Yes. :-)  It is radical, and the idea is taken from databases.  I
thought that seemed to be the direction Reiser filesystems were moving.
In this scheme a name is just another bit of metadata and not
first-class important information.  The name-query directories would be
there for traditional filesystem users and Unix compatibility.  They
would probably be virtual and dynamic, only being created when needed
and only being persistent if assigned meta-data (extra names (links),
non-default permission bits, etc) or for performance reasons (faster to
load from cache than searching every file).
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.


* Re: File as a directory - VFS Changes
  2005-05-31  9:34           ` Nikita Danilov
  2005-05-31 15:04             ` Hans Reiser
@ 2005-06-01  2:11             ` Alexander G. M. Smith
  2005-06-01 10:58               ` Nikita Danilov
  1 sibling, 1 reply; 51+ messages in thread
From: Alexander G. M. Smith @ 2005-06-01  2:11 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: leocomerford, reiserfs-list, ninja

Nikita Danilov wrote on Tue, 31 May 2005 13:34:55 +0400:
> Cycle may consists of more graph nodes than fits into memory. Cycle
> detection is crucial for rename semantics, and if
> cycle-just-about-to-be-formed doesn't fit into memory it's not clear how
> to detect it, because tree has to be locked while checked for cycles, and
> one definitely doesn't want to keep such a lock over IO.

Sometimes you'll just have to return an error code if the rename operation
is too complex to be done.  The user will have to then delete individual
leaf files to make the situation simpler.  I hope this won't happen very
often.

On the plus side, the detection of all the files that may be affected
means you can now delete a directory directly, contents and all, if all
the related inodes fit into memory.

- Alex


* Re: File as a directory - VFS Changes
  2005-05-31 23:01                               ` Jonathan Briggs
@ 2005-06-01 10:39                                 ` Nikita Danilov
  2005-06-01 10:43                                   ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-01 10:39 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Jonathan Briggs writes:
 > On Wed, 2005-06-01 at 02:36 +0400, Nikita Danilov wrote:

[...]

 > > 
 > > One problem with the above is that directory structure is inconsistent
 > > with lists of names associated with objects. For example, file1 is a
 > > child of /tmp/A/B/C/A, but Object 1001 doesn't list /tmp/A/B/C/A/file1
 > > among its names.
 > 
 > file1 *appears* to be a child because it is actually returned as the
 > query result for its name of /tmp/A/file1 because A is a query

I beg your pardon, but this is confusing. Objects have "real" names that
are strings attached to them. The user, on the other hand, accesses objects
through paths in the directory hierarchy, which is just a way to execute
queries on real-names. But some paths do correspond to real-names and
some do not? I, personally, would be very wary of using such behavior as
a fundamental model of a file system.

Also, if directories are just queries, it is not clear why they have
real-names on their own. For example, what does it mean, for object O1
(a directory) to have a real-name "/a/b", and to return ("c" -> O2) as a
part of query result, where O2 has only one name, viz. "/d/e"?

Basically, without some extra restrictions, your model doesn't provide
consistency between user-visible paths and hidden real-names, which
makes it not very useful in practice, I am afraid.

 > for /tmp/A/.  If the shell was smart enough to normalize its path by
 > asking the directory for its name, it would know that /tmp/A/B/C/A
 > was /tmp/A.   

/tmp/A/B/C/A may have other names beyond /tmp/A, which one to choose?

 >               But yes, a stupid program could be confused by the
 > difference between names.

A _user_ will most definitely be confused, which is much more important.

[...]

 > 
 > Moving an object with "mv" would change its name.  Moving a top-level
 > directory like /usr would require visiting every object starting
 > with /usr and doing an edit.  A compression scheme could be used where
 > the most-used top-level directory names were replaced with lookup
 > tables, then /usr could be renamed just once in the table.

Heh, you just invented good old directories, by the way.

[...]

 > 
 > Yes. :-)  It is radical, and the idea is taken from databases.  I
 > thought that seemed to be the direction Reiser filesystems were moving.
 > In this scheme a name is just another bit of metadata and not
 > first-class important information.  The name-query directories would be
 > there for traditional filesystem users and Unix compatibility.  They
 > would probably be virtual and dynamic, only being created when needed
 > and only being persistent if assigned meta-data (extra names (links),
 > non-default permission bits, etc) or for performance reasons (faster to
 > load from cache than searching every file).

That latter bit, about making them persistent, is where the tr


* Re: File as a directory - VFS Changes
  2005-06-01 10:39                                 ` Nikita Danilov
@ 2005-06-01 10:43                                   ` Nikita Danilov
  2005-06-01 14:06                                     ` Jonathan Briggs
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-01 10:43 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Nikita Danilov writes:

[...]

 > 
 >  > 
 >  > Yes. :-)  It is radical, and the idea is taken from databases.  I
 >  > thought that seemed to be the direction Reiser filesystems were moving.
 >  > In this scheme a name is just another bit of metadata and not
 >  > first-class important information.  The name-query directories would be
 >  > there for traditional filesystem users and Unix compatibility.  They
 >  > would probably be virtual and dynamic, only being created when needed
 >  > and only being persistent if assigned meta-data (extra names (links),
 >  > non-default permission bits, etc) or for performance reasons (faster to
 >  > load from cache than searching every file).
 > 
 > That latter bit, about making them persistent, is where the tr
 > 

[Hmm... grue ate my message.]

That latter bit, about making them persistent, is where the trouble
begins: once queries acquire identity and a place in the file system
name-space, they logically become part of that very name-space they are
querying! This leads to various complications, and you are trying to work
around them by claiming that queries are not _always_ part of the name-space
("file1 [only] **appears** to be a child..."). This non-uniform behavior
is a big disadvantage.

Nikita.


* Re: File as a directory - VFS Changes
  2005-06-01  2:11             ` Alexander G. M. Smith
@ 2005-06-01 10:58               ` Nikita Danilov
  2005-06-02  1:58                 ` Alexander G. M. Smith
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-01 10:58 UTC (permalink / raw)
  To: Alexander G. M. Smith; +Cc: leocomerford, reiserfs-list, ninja

Alexander G. M. Smith writes:
 > Nikita Danilov wrote on Tue, 31 May 2005 13:34:55 +0400:
 > > Cycle may consists of more graph nodes than fits into memory. Cycle
 > > detection is crucial for rename semantics, and if
 > > cycle-just-about-to-be-formed doesn't fit into memory it's not clear how
 > > to detect it, because tree has to be locked while checked for cycles, and
 > > one definitely doesn't want to keep such a lock over IO.
 > 
 > Sometimes you'll just have to return an error code if the rename operation
 > is too complex to be done.  The user will have to then delete individual
 > leaf files to make the situation simpler.  I hope this won't happen very
 > often.

Huh? How are you planning to check that adding a new edge to the graph
does not introduce a cycle without inspecting all nodes of that graph at
least once (note: you don't know what other objects contain references
to the given one, only out-going edges are known)?

For example: mv /d0 /d1

To check that this doesn't introduce a cycle one has to load each child
of /d0 (of which there may be millions) and recursively check that /d1 is
reachable from none of them. This has to be done on each rename. I believe
this is an unacceptable overhead.
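
The check described above can be sketched as a reachability search,
assuming only out-going edges are stored.  The `children` mapping and
the `would_create_cycle` helper are hypothetical, and everything here
lives in memory, which is exactly what cannot be assumed for a real
file system tree held under a lock:

```python
def would_create_cycle(children, new_parent, moved):
    """Return True if linking `moved` under `new_parent` closes a cycle.

    That is the case exactly when `new_parent` is reachable from `moved`
    by following out-going edges.  In the worst case every descendant of
    `moved` is visited once.
    """
    stack = [moved]
    seen = set()
    while stack:
        node = stack.pop()
        if node == new_parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


# /d1 is already reachable from /d0, so mv /d0 /d1 would close a loop.
looped = {"d0": ["a", "b"], "a": ["c"], "b": [], "c": ["d1"], "d1": []}
print(would_create_cycle(looped, "d1", "d0"))

# Here /d1 is unrelated to /d0, so the move is safe.
plain = {"d0": ["a"], "a": [], "d1": []}
print(would_create_cycle(plain, "d1", "d0"))
```

The search is linear in the size of the subgraph under /d0, which is
the per-rename cost being objected to.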

 > 
 > On the plus side, the detection of all the files that may be affected
 > means you can now delete a directory directly, contents and all, if all
 > the related inodes fit into memory.
 > 
 > - Alex
 > 

Nikita.


* Re: File as a directory - VFS Changes
  2005-06-01 10:43                                   ` Nikita Danilov
@ 2005-06-01 14:06                                     ` Jonathan Briggs
  2005-06-01 14:42                                       ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Briggs @ 2005-06-01 14:06 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller


On Wed, 2005-06-01 at 14:43 +0400, Nikita Danilov wrote:
> Nikita Danilov writes:
> 
> [...]
> 
>  > 
>  >  > 
>  >  > Yes. :-)  It is radical, and the idea is taken from databases.  I
>  >  > thought that seemed to be the direction Reiser filesystems were moving.
>  >  > In this scheme a name is just another bit of metadata and not
>  >  > first-class important information.  The name-query directories would be
>  >  > there for traditional filesystem users and Unix compatibility.  They
>  >  > would probably be virtual and dynamic, only being created when needed
>  >  > and only being persistent if assigned meta-data (extra names (links),
>  >  > non-default permission bits, etc) or for performance reasons (faster to
>  >  > load from cache than searching every file).
>  > 
>  > That latter bit, about making them persistent, is where the tr
>  > 
> 
> [Hmm... grue ate my message.]
> 
> That latter bit, about making them persistent, is where the trouble
> begins: once queries acquire identity and a place in the file system
> name-space, they logically become part of that very name-space they are
> querying! This leads to various complication, and you are trying to work
> around them by claiming that queries are not _always_ part of name-space
> ("file1 [only] **appears** to be a child..."). This non-uniform behavior
> is a big disadvantage.

In this scheme, query objects were always part of the name-space.

None of the objects are really children of any of the others. They only
appear to be children when viewed through a set of name-query
directories.  In reality every object would be an equal in the true OID
name-space.  Only meta-data objects are children of their data objects.

You could also create a confusing query named /tmp/G that returned
results for /usr/lib/.  This is the same sort of abuse that creates
A->B->C->A loops: the query was deliberately set to have a misleading
name/name-query relationship.

The user is responsible for sensible naming.   Under normal use, a user
would hardly notice the difference between traditional directories and
this name-query system.  

With persistent disk cache of queries and lookup tables for common
names, it does start to look like regular directory structures, but it
is still coming at the problem from the opposite direction.  Traditional
directories store information about a file (its name) outside the file,
and this system would store everything about a file with the file
itself.
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.


* Re: File as a directory - VFS Changes
  2005-06-01 14:06                                     ` Jonathan Briggs
@ 2005-06-01 14:42                                       ` Nikita Danilov
  2005-06-01 15:40                                         ` Jonathan Briggs
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-01 14:42 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Jonathan Briggs writes:
 > On Wed, 2005-06-01 at 14:43 +0400, Nikita Danilov wrote:
 > > Nikita Danilov writes:

[...]

 > > 
 > > That latter bit, about making them persistent, is where the trouble
 > > begins: once queries acquire identity and a place in the file system
 > > name-space, they logically become part of that very name-space they are
 > > querying! This leads to various complication, and you are trying to work
 > > around them by claiming that queries are not _always_ part of name-space
 > > ("file1 [only] **appears** to be a child..."). This non-uniform behavior
 > > is a big disadvantage.
 > 
 > In this scheme, query objects were always part of the name-space.

Then, paths visible through queries are inconsistent with names of
underlying objects. Your querying system returns fake results
("/tmp/A/B/C/A/file1") that are not present in the database the queries are
run against. This is *wrong*. Nobody is going to tolerate a DBMS that
sometimes returns extra rows in a SELECT statement, right?

[...]

 > 
 > The user is responsible for sensible naming.   Under normal use, a user
 > would hardly notice the difference between traditional directories and
 > this name-query system.  

Heh, this assumes that users will continue to use the new namespace as they
use the old one. Which is not true. Usage is determined by the features
provided. This is, by the way, one of the driving forces behind reiserfs
support for small files and large directories.

If a file system provides the ability to create namespaces in the form of
arbitrary graphs, this will be used.

Nikita.


* Re: File as a directory - VFS Changes
  2005-06-01 14:42                                       ` Nikita Danilov
@ 2005-06-01 15:40                                         ` Jonathan Briggs
  2005-06-01 17:27                                           ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Briggs @ 2005-06-01 15:40 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller


On Wed, 2005-06-01 at 18:42 +0400, Nikita Danilov wrote:
> Jonathan Briggs writes:
>  > On Wed, 2005-06-01 at 14:43 +0400, Nikita Danilov wrote:
>  > > Nikita Danilov writes:
> 
> [...]
> 
>  > > 
>  > > That latter bit, about making them persistent, is where the trouble
>  > > begins: once queries acquire identity and a place in the file system
>  > > name-space, they logically become part of that very name-space they are
>  > > querying! This leads to various complication, and you are trying to work
>  > > around them by claiming that queries are not _always_ part of name-space
>  > > ("file1 [only] **appears** to be a child..."). This non-uniform behavior
>  > > is a big disadvantage.
>  > 
>  > In this scheme, query objects were always part of the name-space.
> 
> Then, paths visible through queries are inconsistent with names of
> underlying objects. You querying system returns fake results
> ("/tmp/A/B/C/A/file1") that are not present in the database queries are
> ran against. This is *wrong*. Nobody is going to tolerate DBMS that
> sometimes returns extra rows in SELECT statement, right?

If you wished to enforce name-query directories always having a single
name and their query always being identical to their name, then that
wouldn't happen.

However, query directories (or "smart folders") will have this namespace
problem in every case and there is no avoiding it.  If the query is for
every file modified in the past day, the file path through the query
directory is not going to match any given name of the file.  Same for
keyword queries, ownership queries, or whatever.
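
As a small illustration, assuming a "smart folder" is a query over
modification time, the path through the folder is never one of the
file's stored names.  The `files` table, the /recent folder name and
the `smart_folder` helper are all hypothetical:

```python
from datetime import datetime, timedelta

files = {
    1001: {"names": ["/tmp/A/file1"], "mtime": datetime(2005, 6, 1, 9, 0)},
    1002: {"names": ["/tmp/A/file2"], "mtime": datetime(2005, 5, 20, 9, 0)},
}


def smart_folder(files, folder, cutoff):
    """Return {path through the folder: OID} for files modified since cutoff."""
    view = {}
    for oid, meta in files.items():
        if meta["mtime"] >= cutoff:
            for name in meta["names"]:
                view[folder + "/" + name.rsplit("/", 1)[1]] = oid
    return view


now = datetime(2005, 6, 1, 15, 40)
view = smart_folder(files, "/recent", now - timedelta(days=1))
for path, oid in view.items():
    # Show each view path, its OID, and whether it is a stored name.
    print(path, oid, path in files[oid]["names"])
```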

In the traditional directory system, a file doesn't have an official
name, just links to it from directory entries.  Perhaps if you think of
the proposed "name" meta-data as a "preferred name" the idea would work
better for you?
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.


* Re: File as a directory - VFS Changes
  2005-06-01 15:40                                         ` Jonathan Briggs
@ 2005-06-01 17:27                                           ` Nikita Danilov
  2005-06-01 19:03                                             ` Jonathan Briggs
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-01 17:27 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Jonathan Briggs writes:

[...]

 > 
 > However, query directories (or "smart folders") will have this namespace
 > problem in every case and there is no avoiding it.  If the query is for
 > every file modified in the past day, the file path through the query
 > directory is not going to match any given name of the file.  Same for
 > keyword queries, ownership queries, or whatever.

Which I think points exactly to one fundamental problem with the idea
that names are attributes of objects: this idea is incompatible with the
notion of dynamically created "views" that in effect add new paths
through which objects are reachable. These paths _are_ names as far as
the user is concerned (after all, names exist to reach objects), but they are
not in the name-as-attribute model.

 > 
 > In the traditional directory system, a file doesn't have an official
 > name, just links to it from directory entries.  Perhaps if you think of
 > the proposed "name" meta-data as a "preferred name" the idea would work
 > better for you?

Frankly speaking, I suspect that name-as-attribute is going to limit
the usability of the file system significantly.

Note that in the "real world", only names from a quite limited class are
attributes of objects, viz. /proper names/ like "France", or "Jonathan
Briggs". Communication wouldn't get very far if only proper names were
allowed.

Nikita.


* Re: File as a directory - VFS Changes
  2005-06-01 17:27                                           ` Nikita Danilov
@ 2005-06-01 19:03                                             ` Jonathan Briggs
  2005-06-02 10:38                                               ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Jonathan Briggs @ 2005-06-01 19:03 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller


On Wed, 2005-06-01 at 21:27 +0400, Nikita Danilov wrote:
[snip]
> Frankly speaking, I suspect that name-as-attribute is going to limit
> usability of file system significantly.
> 
> Note, that in the "real world", only names from quite limited class are
> attributes of objects, viz. /proper names/ like "France", or "Jonathan
> Briggs". Communication wouldn't get any far if only proper names were
> allowed.
> 
> Nikita.

Bringing up /proper names/ from the real world agrees with my idea
though! :-)

As a person, you have a list of "proper names" that you answer to and
that you prefer.  However, in some cases you will also answer to "Hey,
you over there!" or "Someone who left a white Honda in the parking lot,
please turn your lights off."

So a file could have a list of proper names, but it can also be referred
to in any other way and by any other name.  Proper names would be
preferred, though.
-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.


* Re: File as a directory - VFS Changes
  2005-05-31 18:32                     ` Hans Reiser
@ 2005-06-02  1:27                       ` Alexander G. M. Smith
  2005-06-02  7:46                         ` Hans Reiser
  2005-06-02  9:11                       ` Nikita Danilov
  1 sibling, 1 reply; 51+ messages in thread
From: Alexander G. M. Smith @ 2005-06-02  1:27 UTC (permalink / raw)
  To: Hans Reiser
  Cc: jbriggs, Valdis.Kletnieks, leocomerford, reiserfs-list, ninja,
	nikita

Hans Reiser wrote on Tue, 31 May 2005 11:32:04 -0700:
> What about if we have it that only the first name a directory is created
> with counts towards its reference count, and that if the directory is
> moved if it is moved from its first name, the new name becomes the one
> that counts towards the reference count?   A bit of a hack, but would work.

Sounds a lot like what I did earlier.  Files got really deleted when the
true name was the only name for a file (only one parent in other words).
But I also had a large cycle finding pause when any file movement happened.
I'm not sure if it would still be needed.

Nikita Danilov wrote:
> - if garbage collection is implemented through the reference counting
> (which is the only known way tractable for a file system), then cycles
> are never collected.
> [...]
> But the garbage collection problem is still there. You are more than
> welcome to solve it by implementing generation mark-and-sweep GC on file
> system scale. :-)

There are at least two choices:

Bite the bullet and have a file system that is occasionally slow due to
cycle checking, but only when the user somehow makes a huge cycle.  Keep
in mind that this only happens when you use the new functionality; if you
only create files with one parent, it should be as fast as regular file
systems.  I see its features being useful for desktop use, not servers,
so the occasional speed hit is less of an annoyance than the lack of
features (the ability to file your files in several places).

Another way is to not delete the files when they get unlinked.  Similar
to some other allocation management systems, have a background thread
doing the garbage collection and cycle tracing.  The drawback is that
you might run out of disc space if you're creating files faster than
the collector is cleaning up.
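
To make the background-collector idea concrete, here is a minimal Python
sketch of a tracing pass.  It assumes the whole namespace fits in a dict
that maps each object to the objects it links to; the names `collect`,
`objects` and `root` are made up for illustration and are not from any
real file system.  It only shows why tracing catches cycles that a
reference count would keep alive forever.

```python
def collect(objects, root):
    """Mark everything reachable from root, then return what can be freed."""
    marked, stack = set(), [root]
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(objects[obj])
    return set(objects) - marked

# Hypothetical namespace: x and y link to each other, so each has a
# reference count of 1, but nothing reachable from root points at them.
objects = {
    "root": ["docs"],
    "docs": ["a"],
    "a": [],
    "x": ["y"],
    "y": ["x"],
}
garbage = collect(objects, "root")
```

Here `garbage` ends up holding only the unreachable pair, even though
neither member has a zero reference count.  The cost is a walk over
everything reachable, which is why it would have to run in the
background.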

I wonder if you can combine a wandering journal (or whatever it is called,
where the journalled data blocks become the file's current contents) with
the copy type garbage collection (is that the same as a 2 generation mark
and sweep?).  Copy type collection copies all known reachable objects to
an empty half of the disk.  When that's done, the original half is marked
empty and the next pass copies in the other direction.  Could work nicely
if you have two disk drives.  Yet another PhD topic on garbage collection
for someone to research :-)

There are lots of other garbage collection schemes that might be
applicable to file systems with cycles.  It could work, maybe with
decent speed too!

- Alex

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-06-01 10:58               ` Nikita Danilov
@ 2005-06-02  1:58                 ` Alexander G. M. Smith
  2005-06-02 10:03                   ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Alexander G. M. Smith @ 2005-06-02  1:58 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: reiserfs-list

Nikita Danilov wrote on Wed, 1 Jun 2005 14:58:47 +0400:
> For example: mv /d0 /d1
> 
> To check that this doesn't introduce a cycle one has to load each child
> of /d0 (which may be millions) and recursively check that from none of
> them /d1 is reachable. This has to be done on each rename. I believe
> this is unacceptable overhead.

That's where we differ.  I think it is an acceptable overhead.  It also
only happens on rename and delete operations for objects with multiple
parents or descendants.  If you just move or delete an ordinary file
that's got just one parent directory and no children, the cost is
ordinary too.
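
For reference, the check being argued about can be sketched in a few
lines of Python.  This is only an illustration under the assumption that
child links are available as an in-memory dict; `reachable`,
`would_create_cycle` and the sample names are hypothetical.  The cost is
proportional to the number of descendants of the moved object, which is
the whole disagreement.

```python
def reachable(children, start, target):
    """Return True if target can be reached from start via child links."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False

def would_create_cycle(children, moved, new_parent):
    # Linking `moved` under `new_parent` closes a loop exactly when
    # `new_parent` is `moved` itself or one of its descendants.
    return reachable(children, moved, new_parent)

# Hypothetical namespace used only for this example.
children = {
    "root": ["d0", "d1"],
    "d0": ["photo"],
    "photo": ["thumbnail"],
}
```

Moving `d0` under `d1` passes the check after visiting three objects,
while moving `d0` under its own `thumbnail` is rejected.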

If it's a fildirute object with a dozen attribute type things as
children, then it will need to traverse those dozen children.  Not
a big deal.  Consider this example:

The typical worst case operation will be deleting a link to your photo
from a directory you decided didn't classify it properly.  The photo may
be in several directories, such as Cottage, Aunt and Bottles if it is
a picture of a champagne bottle you polished off at your aunt's cottage.
You decide that it shouldn't really be in the Aunt folder, so you delete
it (or rather the link) from there.

The traversal starts with recursively finding all the children of the
deleted object, which will include the photo and all attributish
subobjects (thumbnail, description, ...).  Not too bad, maybe a
dozen objects.  Then reconnect those children to objects which have
a known good path to the root, reached through whatever parents remain.
That path through the new link becomes their true path name.  The photo
goes first, finding one of the alternative parent directories, say
Cottage as its new main parent.  Then the other children find the Photo
as their main parent.

In other words, the cycle checker has to find all the children of the
deleted object(s).  In most cases there aren't very many of them.
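
A rough Python sketch of the re-parenting step, for the photo example
only.  It assumes each object records its list of parents plus one "true
parent" whose chain leads to the root; the dict layout and the names
`unlink` and `reaches_root` are invented for this illustration.

```python
def reaches_root(true_parent, node, avoid):
    """Follow true-parent links upward without passing through `avoid`."""
    seen = set()
    while node is not None and node not in seen:
        if node == avoid:
            return False
        if node == "root":
            return True
        seen.add(node)
        node = true_parent.get(node)
    return False

def unlink(parents, true_parent, parent, child):
    """Remove one link and pick a new true parent if the old one is gone."""
    parents[child].remove(parent)
    if true_parent.get(child) != parent:
        return "kept"            # the true path was not touched
    for candidate in parents[child]:
        if reaches_root(true_parent, candidate, avoid=child):
            true_parent[child] = candidate
            return "reparented"
    true_parent[child] = None    # no path to the root is left
    return "orphaned"

parents = {"photo": ["Aunt", "Cottage", "Bottles"], "thumbnail": ["photo"]}
true_parent = {"Aunt": "root", "Cottage": "root", "Bottles": "root",
               "photo": "Aunt", "thumbnail": "photo"}
result = unlink(parents, true_parent, "Aunt", "photo")
```

In this sketch the thumbnail needs no update because its true path
already runs through the photo; an object reported as "orphaned" is the
one that would really be deleted.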

Now if you move the directory containing millions of files, then it's
going to take a while.  And if it has a hard link down to another
directory, that gets traversed too.  But that won't happen too often,
only around spring time when you're reorganizing your mail archives.

- Alex

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-06-02  1:27                       ` Alexander G. M. Smith
@ 2005-06-02  7:46                         ` Hans Reiser
  0 siblings, 0 replies; 51+ messages in thread
From: Hans Reiser @ 2005-06-02  7:46 UTC (permalink / raw)
  To: Alexander G. M. Smith
  Cc: jbriggs, Valdis.Kletnieks, leocomerford, reiserfs-list, ninja,
	nikita

Alexander G. M. Smith wrote:

>Hans Reiser wrote on Tue, 31 May 2005 11:32:04 -0700:
>>What about if we have it that only the first name a directory is created
>>with counts towards its reference count, and that if the directory is
>>moved if it is moved from its first name, the new name becomes the one
>>that counts towards the reference count?   A bit of a hack, but would work.
>
>Sounds a lot like what I did earlier.  Files got really deleted when the
>true name was the only name for a file (only one parent in other words).
>But I also had a large cycle finding pause when any file movement happened.
>I'm not sure if it would still be needed.
>
>Nikita Danilov wrote:
>>- if garbage collection is implemented through the reference counting
>>(which is the only known way tractable for a file system), then cycles
>>are never collected.
>>[...]
>>But the garbage collection problem is still there. You are more than
>>welcome to solve it by implementing generation mark-and-sweep GC on file
>>system scale. :-)
>
>There are at least two choices:
>
>Bite the bullet and have a file system that is occasionally slow due to
>cycle checking, but only when the user somehow makes a huge cycle.  Keep
>in mind that this only happens when you use the new functionality, if you
>only create files with one parent, it should be as fast as regular file
>systems.  I see its features being useful for desktop use, not servers,
>so the occasional speed hit is less annoyance than the lack of features
>(the ability to file your files in several places).
>
I prefer the above to the below.

>Another way is to not delete the files when they get unlinked.  Similar
>to some other allocation management systems, have a background thread
>doing the garbage collection and cycle tracing.  The drawback is that
>you might run out of disc space if you're creating files faster than
>the collector is cleaning up.
>
>I wonder if you can combine a wandering journal (or whatever it is called,
>where the journalled data blocks become the file's current contents) with
>the copy type garbage collection (is that the same as a 2 generation mark
>and sweep?).  Copy type collection copies all known reachable objects to
>an empty half of the disk.  When that's done, the original half is marked
>empty and the next pass copies in the other direction.  Could work nicely
>if you have two disk drives.  Yet another PhD topic on garbage collection
>for someone to research :-)
>
>There are lots of other garbage collection schemes that might be
>applicable to file systems with cycles.  It could work, maybe with
>decent speed too!
>
>- Alex

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-05-31 18:32                     ` Hans Reiser
  2005-06-02  1:27                       ` Alexander G. M. Smith
@ 2005-06-02  9:11                       ` Nikita Danilov
  2005-06-02 17:23                         ` Hubert Chan
  1 sibling, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-02  9:11 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Jonathan Briggs, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja

Hans Reiser writes:
 > What about if we have it that only the first name a directory is created
 > with counts towards its reference count, and that if the directory is
 > moved if it is moved from its first name, the new name becomes the one
 > that counts towards the reference count?   A bit of a hack, but would work.

This means that a list of names has to be kept together with every object
(to find out where the "true" reference has to be moved). And this makes
renaming a directory problematic, as the name lists of all the
directory's children have to be updated.

 > 
 > Hans

Nikita.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-06-02  1:58                 ` Alexander G. M. Smith
@ 2005-06-02 10:03                   ` Nikita Danilov
  2005-06-03  3:35                     ` Performance Impacts of Graph Cycles due to Multiple Parents Alexander G. M. Smith
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-02 10:03 UTC (permalink / raw)
  To: Alexander G. M. Smith; +Cc: reiserfs-list

Alexander G. M. Smith writes:

[...]

 > 
 > The typical worst case operation will be deleting a link to your photo
 > from a directory you decided didn't classify it properly.  The photo may
 > be in several directories, such as Cottage, Aunt and Bottles if it is
 > a picture of a champaign bottle you polished off at your aunt's cottage.
 > You decide that it shouldn't really be in the Aunt folder, so you delete
 > it (or rather the link) from there.

This is a typical operation for desktop usage, I agree. But the desktop
is not interesting. It doesn't pose any technical difficulty to implement
whatever indexing structure when your dataset is but a few dozen
thousand objects [1]. What _is_ interesting is to make the file system
scalable. A solution that fails to move a directory simply because the
sub-tree rooted at it is large is not scalable.

 > 
 > The traversal starts with recursively finding all the children of the
 > deleted object, which will include the photo and all attributish
 > subobjects (thumbnail, description, ...).  Not too bad, maybe a
 > dozen objects.  Then reconnect those children to objects which have
 > a known good path to the root, reached through whatever parents remain.

And at that moment user hits ^C...

That is, how will the atomicity guarantees of rename be preserved? Note
that many applications, like some mail servers, crucially depend on
rename atomicity to implement their transaction mini-engines.

And concurrency issues also don't look bright: what if while

        mv /d0/d1/d2/d2 /b0/b1/b2

is performed and thread is in the middle of scanning descendants of
/d0/d1/d2/d2 recursively, another thread does

        mv /d0/d1 /c0/c1/c2

? Obviously scanning cannot take locks on individual files as it sees
them (because, the namespace being an arbitrary graph, this will
deadlock). The only remaining solution is to take a whole-fs lock during
every rename/link/unlink operation. Which is another nail in the
scalability coffin.

[...]

 > 
 > Now if you move the directory containing millions of files, then it's
 > going to take a while.  And if it has a hard link down to another
 > directory, that gets traversed too.  But that won't happen too often,
 > only around spring time when you're reorganizing your mail archives.

It happens all the time on my workstation, when I move Linux source
trees around.

 > 
 > - Alex

Nikita.

Footnotes: 
[1]  Implementing things like Spotlight does not require
any innovation at the file system layer (and not coincidentally,
Spotlight is based on almost 20 years old BSDLite kernel code).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-06-01 19:03                                             ` Jonathan Briggs
@ 2005-06-02 10:38                                               ` Nikita Danilov
  2005-06-02 18:35                                                 ` Jonathan Briggs
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-02 10:38 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Jonathan Briggs writes:
 > On Wed, 2005-06-01 at 21:27 +0400, Nikita Danilov wrote:
 > [snip]
 > > Frankly speaking, I suspect that name-as-attribute is going to limit
 > > usability of file system significantly.
 > > 
 > > Note, that in the "real world", only names from quite limited class are
 > > attributes of objects, viz. /proper names/ like "France", or "Jonathan
 > > Briggs". Communication wouldn't get any far if only proper names were
 > > allowed.
 > > 
 > > Nikita.
 > 
 > Bringing up /proper names/ from the real world agrees with my idea
 > though! :-)

I don't understand why, if you are at liberty to design a new namespace
model from scratch (it seems POSIX semantics are not binding in our
case), you are going to faithfully replicate the deficiencies of natural
languages.

It is a common trait in both science and engineering that when two
flavors of the same functionality (real names vs. indices) arise, an
attempt is made to reduce one of them to the other, simplifying the
system as a result.

In our case, motivation to reduce one type of names to another is even
more pressing, as these types are incompatible: in the presence of
cycles or dynamic queries, namespace visible through the directory
hierarchy is different from the namespace of real names.

Indices cannot be reduced to real names (as rename is impossible to
implement efficiently), but real names can very well be reduced to
indices as exemplified by each and every UNIX file system out there.

So, the question is: what real names buy one, that indices do not?

[...]

 > -- 
 > Jonathan Briggs <jbriggs@esoft.com>
 > eSoft, Inc.

Nikita.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: File as a directory - VFS Changes
@ 2005-06-02 14:46 Faraz Ahmed
  0 siblings, 0 replies; 51+ messages in thread
From: Faraz Ahmed @ 2005-06-02 14:46 UTC (permalink / raw)
  To: reiserfs-list

Hi Nikita,

The problem of files not matching the query of the smart folder they
were created in is a serious one. We implemented this same thing for our
semantic filesystem.

For example, if we create an MP3 file in a JPEG folder, it won't ever get
listed. This fundamentally changes the way users see your filesystem:
users expect to see files in the folder they created them in. That in
itself should be a default search criterion.

We almost solved this by making "parentdirectory" an attribute of the
file. All smart folders have their query transparently modified to
"where type=jpg Or parentdirectory=thisdirectory". This makes the virtual
folder stuff work as an EXTENSION to the standard file/directory
relationship rather than as a REPLACEMENT.
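
A small Python sketch of that rewritten query, just to show the effect.
The record layout, the paths and the function name
`smart_folder_listing` are assumptions made up for this example, not our
actual implementation.

```python
def smart_folder_listing(files, folder, wanted_type):
    """List files that match the folder's query OR were created in it."""
    return sorted(
        f["name"] for f in files
        if f["type"] == wanted_type or f["parentdirectory"] == folder
    )

# Hypothetical file records: an MP3 created inside the JPEG smart folder.
files = [
    {"name": "beach.jpg", "type": "jpg", "parentdirectory": "/home/u/trip"},
    {"name": "song.mp3", "type": "mp3", "parentdirectory": "/smart/jpegs"},
    {"name": "notes.txt", "type": "txt", "parentdirectory": "/home/u"},
]
listing = smart_folder_listing(files, "/smart/jpegs", "jpg")
```

The MP3 shows up next to the JPEG because of the parentdirectory clause,
while the unrelated text file stays out.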

Personal experience says that users don't digest any change to the UNIX
filesystem model. Anything extra is OK, but replacements are BAD. Think of
it: you create a C file in a virtual folder for "h" files, and the file
won't get listed (although it will exist). THEN WHAT??? The user has to
search for it, which is BAD; your whole fancy virtual directory USE CASE
is lost and eventually we end up solving nothing.

Other issues include this display name stuff etc. They are bad. What if
two files with the same display name get listed in the same virtual
directory? No point in creating a problem and then solving it. Good work
though; we don't want to get bogged down once WinFS is released.

Regards
Faraz.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-06-02  9:11                       ` Nikita Danilov
@ 2005-06-02 17:23                         ` Hubert Chan
  0 siblings, 0 replies; 51+ messages in thread
From: Hubert Chan @ 2005-06-02 17:23 UTC (permalink / raw)
  To: reiserfs-list

On Thu, 2 Jun 2005 13:11:05 +0400, Nikita Danilov <nikita@clusterfs.com> said:

> Hans Reiser writes:
>> What about if we have it that only the first name a directory is
>> created with counts towards its reference count, and that if the
>> directory is moved if it is moved from its first name, the new name
>> becomes the one that counts towards the reference count?  A bit of a
>> hack, but would work.

> This means that list of names has to be kept together with every
> object (to find out where "true" reference has to be moved). And this
> makes rename of directory problematic, as lists of names of all
> directory children have to be updated.

Don't you just need to keep a pointer (inode number) to the parent
directory?  When you move a file, check if the parent inode number is
equal to the file's 'true parent' inode number, and if so, update the
'true parent' pointer.  And do a similar thing when you delete.
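
In rough Python, and only as a sketch of what I mean (the class and
function names are made up, and inode numbers are plain integers here):

```python
class Obj:
    def __init__(self, ino, true_parent):
        self.ino = ino
        self.true_parent = true_parent   # inode number of the counted parent

def on_move(obj, old_parent_ino, new_parent_ino):
    # Only a move away from the true parent changes which name counts.
    if obj.true_parent == old_parent_ino:
        obj.true_parent = new_parent_ino

def on_unlink(obj, parent_ino):
    """Return True if the name being removed was the counted one."""
    return obj.true_parent == parent_ino

d = Obj(ino=42, true_parent=2)
on_move(d, old_parent_ino=7, new_parent_ino=9)   # some other link moved
on_move(d, old_parent_ino=2, new_parent_ino=5)   # the true name moved
```

After these two moves `d.true_parent` is 5, and no per-object list of
names was needed to get there.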

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-06-02 10:38                                               ` Nikita Danilov
@ 2005-06-02 18:35                                                 ` Jonathan Briggs
  2005-06-02 23:54                                                   ` Nikita Danilov
  2005-06-03  6:44                                                   ` Faraz Ahmed
  0 siblings, 2 replies; 51+ messages in thread
From: Jonathan Briggs @ 2005-06-02 18:35 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

[-- Attachment #1: Type: text/plain, Size: 3043 bytes --]

On Thu, 2005-06-02 at 14:38 +0400, Nikita Danilov wrote:
> Jonathan Briggs writes:
>  > On Wed, 2005-06-01 at 21:27 +0400, Nikita Danilov wrote:
>  > [snip]
>  > > Frankly speaking, I suspect that name-as-attribute is going to limit
>  > > usability of file system significantly.

Usability as in features?  Or usability as in performance?

>  > > 
>  > > Note, that in the "real world", only names from quite limited class are
>  > > attributes of objects, viz. /proper names/ like "France", or "Jonathan
>  > > Briggs". Communication wouldn't get any far if only proper names were
>  > > allowed.
>  > > 
>  > > Nikita.
>  > 
>  > Bringing up /proper names/ from the real world agrees with my idea
>  > though! :-)
> 
> I don't understand why if you are liberty to design new namespace model
> from scratch (it seems POSIX semantics are not binding in our case), you
> are going to faithfully replicate deficiencies of natural languages.
> 
> It is common trait in both science and engineering that when two flavors
> of the same functionality (real names vs. indices) arise, an attempt is
> made to reduce one of them to another, simplifying the system as a
> result.

An index is an arrangement of information about the indexed items.  The
index contents *belong* to the items.  An index by name?  That name
belongs to the item.  An index by date?  Those dates are properties of
the item.  Anything that can be indexed about an item can be described
as a property of the item.

Only for efficiency reasons are index data not included with the item
data.

> 
> In our case, motivation to reduce one type of names to another is even
> more pressing, as these types are incompatible: in the presence of
> cycles or dynamic queries, namespace visible through the directory
> hierarchy is different from the namespace of real names.

Queries create indexes based on properties of the items.  This is no
different from directories, which are indexes based on names of the
items.

In the same way that you can descend a directory tree and copy the names
found into each item, you can check each item and copy the names found
into a directory tree.

> 
> Indices cannot be reduced to real names (as rename is impossible to
> implement efficiently), but real names can very well be reduced to
> indices as exemplified by each and every UNIX file system out there.
> 
> So, the question is: what real names buy one, that indices do not?

By storing the names in the items, cycles become solvable because you
can always look at the current directory's name(s) to see where you
really are.  Every name becomes absolutely connected to the top of the
namespace instead of depending on a parent pointer that may not ever
connect to the top.

If speeding up rename were very important, you could replace every
pathname component with an indirect reference instead of using simple
strings.  Changing directory levels is still difficult.

-- 
Jonathan Briggs <jbriggs@esoft.com>
eSoft, Inc.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-06-02 18:35                                                 ` Jonathan Briggs
@ 2005-06-02 23:54                                                   ` Nikita Danilov
  2005-06-03 17:57                                                     ` Hans Reiser
  2005-06-03  6:44                                                   ` Faraz Ahmed
  1 sibling, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-02 23:54 UTC (permalink / raw)
  To: Jonathan Briggs
  Cc: Hans Reiser, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Jonathan Briggs writes:
 > On Thu, 2005-06-02 at 14:38 +0400, Nikita Danilov wrote:
 > > Jonathan Briggs writes:
 > >  > On Wed, 2005-06-01 at 21:27 +0400, Nikita Danilov wrote:
 > >  > [snip]
 > >  > > Frankly speaking, I suspect that name-as-attribute is going to limit
 > >  > > usability of file system significantly.
 > 
 > Usability as in features?  Or usability as in performance?

Usability as in ease of use.

[...]

 > 
 > A index is an arrangement of information about the indexed items.  The
 > index contents *belong* to the items.  An index by name?  That name
 > belongs to the item.  An index by date?  Those dates are properties of

In the flat world of relational databases, maybe. But almost nowhere else
is an improper name an attribute of what it signifies: a variable is not
an attribute of the object it points to, a URL is not an attribute of the
web page, a block number is not an attribute of the data stored in that
block on the disk, etc.

[...]

 > 
 > In the same way that you can descend a directory tree and copy the names
 > found into each item, you can check each item and copy the names found
 > into a directory tree.

Except that, as was already discussed, the resulting directory tree is
_bound_ to be inconsistent with "real names".

 > 
 > > 
 > > Indices cannot be reduced to real names (as rename is impossible to
 > > implement efficiently), but real names can very well be reduced to
 > > indices as exemplified by each and every UNIX file system out there.
 > > 
 > > So, the question is: what real names buy one, that indices do not?
 > 
 > By storing the names in the items, cycles become solvable because you
 > can always look at the current directory's name(s) to see where you
 > really are.  Every name becomes absolutely connected to the top of the
 > namespace instead of depending on a parent pointer that may not ever
 > connect to the top.

But cycles are "solvable" in current file systems too: they simply do
not exist there.

 > 
 > If speeding up rename was very important, you can replace every pathname
 > component with a indirect reference instead of using simple strings.
 > Changing directory levels is still difficult.

It is not only speed that will be extremely hard to achieve in that
design; atomicity (in the face of possible crash during rename), and
concurrency control look problematic too.

 > 
 > -- 
 > Jonathan Briggs <jbriggs@esoft.com>
 > eSoft, Inc.

Nikita.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Performance Impacts of Graph Cycles due to Multiple Parents
  2005-06-02 10:03                   ` Nikita Danilov
@ 2005-06-03  3:35                     ` Alexander G. M. Smith
  2005-06-03 11:15                       ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Alexander G. M. Smith @ 2005-06-03  3:35 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: reiserfs-list

Nikita Danilov wrote on Thu, 2 Jun 2005 14:03:54 +0400 in the
"Re: File as a directory - VFS Changes" thread:
> This is typical operation for a desktop usage, I agree. But desktop is
> not interesting. It doesn't pose technical difficulty to implement
> whatever indexing structure when your dataset is but a few dozen
> thousand objects [1].

Getting people to use something different does pose an interesting
social engineering problem :-).  I wonder how tough it was to move from
block records (descended from 80 byte punched cards) to streams of bytes.
I'd expect people complained of the inefficiencies of reading things
byte by byte, the uncertainty of where a record boundary was, and
possibly other limitations.  Did the complexity of having to put things
into directories, rather than just unassociated card decks, worry them?

> What _is_ interesting, is to make file system scalable. Solution
> that fails to move directory simply because sub-tree rooted at it
> is large is not scalable.

But that scalability isn't all that important; I don't see large systems
making use of a chaotic collection of cross links between items.  Since
they're so large, they're usually uniform and thus simple collections of
items, such as a series of scientific experiment observations over time.
Well, unless someone tries to do an AI knowledge representation as linked
files or something similarly weird.  Then it gets challenging.

There is some scalability in that operations going on in one subgraph
of the file system don't depend on things happening in the rest of it.

> That is, how atomicity guarantees of rename will be preserved? Note that
> many applications, like some mail servers crucially depend on rename
> atomicity to implement their transaction mini-engines.

Same as before.  Grab locks on all the children affected before doing
any work.  If there's a deadlock, or memory shortage, just report an
error back to the caller and don't change anything.  For the typical
mail server use, don't they just rename one file at a time?  If they
are moving whole large directories around, then it would be a problem.
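
A minimal Python sketch of the "grab everything or give up" idea.  It
uses non-blocking lock attempts as a conservative stand-in for real
deadlock detection, and the names `lock_all_or_fail`, `rename_subtree`
and the "EBUSY" return value are just assumptions for the example.  It
says nothing about surviving a crash in the middle of the rename.

```python
import threading

def lock_all_or_fail(locks):
    """Try every lock without blocking; on failure release what was taken."""
    taken = []
    for lock in locks:
        if lock.acquire(blocking=False):
            taken.append(lock)
        else:
            for held in reversed(taken):
                held.release()
            return False
    return True

def rename_subtree(locks, do_rename):
    if not lock_all_or_fail(locks):
        return "EBUSY"   # hypothetical error code; nothing was changed
    try:
        do_rename()
        return "ok"
    finally:
        for lock in reversed(locks):
            lock.release()
```

If any child is busy the caller sees an error straight away and can
retry, so two renames can never wait on each other forever.  The price
is spurious failures when the locks are merely contended.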

Or require that the user move all the child files individually to avoid
dealing with deadlocks and long operations.  Simplistic you say?  As
simple as the kludge of doing "rm -r DirectoryName" to delete a
directory and all its contents just because a classical file system
can't handle the locking of large things.

> It happens all the time on my workstation, when I move Linux source
> trees around.

Good point.  Traversing all the children when you're moving the directory
would be expensive, and isn't needed if the children don't cause cycles.
A simple optimization for ordinary files (not cross linked) is to have a
flag saying "I am a tree" for every file system object.  Then when doing
a move or delete, the children of an object marked with that flag don't
need to be examined.  So for ordinary files, we can go back to getting
classical performance.  Maintaining the flag doesn't cost much if it
isn't changing, even less if directories (which means everything in a
directory-is-file system) keep a count of the number of tree children
they have.  But when something does acquire or lose an extra parent, all its
parent directories have to be updated, possibly bubbling the change up
to the root.
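
Here is a rough Python sketch of the flag maintenance, covering only the
case of gaining an extra parent.  It assumes an in-memory `Node` class
invented for this example, and it counts non-tree children rather than
tree children, which carries the same information.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parents = [parent] if parent is not None else []
        self.nontree_children = 0   # children that are not plain trees
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_tree(self):
        return len(self.parents) <= 1 and self.nontree_children == 0

def _mark_nontree_child(parent):
    was_tree = parent.is_tree()
    parent.nontree_children += 1
    if was_tree:                    # parent just flipped, so bubble upward
        for grand in parent.parents:
            _mark_nontree_child(grand)

def add_parent(node, new_parent):
    was_tree = node.is_tree()
    node.parents.append(new_parent)
    new_parent.children.append(node)
    if node.is_tree():
        return                      # it only had this one parent after all
    # A node that just stopped being a tree must tell all its parents;
    # one that already was non-tree only needs to tell the new parent.
    for p in (list(node.parents) if was_tree else [new_parent]):
        _mark_nontree_child(p)

root = Node("root")
mail = Node("mail", root)
inbox = Node("inbox", mail)
msg = Node("msg", inbox)
photos = Node("photos", root)
src = Node("src", root)
kernel = Node("kernel", src)
add_parent(msg, photos)             # msg now lives in two places
```

After the cross link, `src` is still marked as a tree, so moving it
would need no traversal, while `mail` and `root` have lost the flag.
The bubbling stops at any ancestor that was already non-tree.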

- Alex

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: File as a directory - VFS Changes
  2005-06-02 18:35                                                 ` Jonathan Briggs
  2005-06-02 23:54                                                   ` Nikita Danilov
@ 2005-06-03  6:44                                                   ` Faraz Ahmed
  1 sibling, 0 replies; 51+ messages in thread
From: Faraz Ahmed @ 2005-06-03  6:44 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: reiserfs-list, jbriggs

Hi,

Why is this discussion revolving around relational databases? The
attributes of files, and the files themselves, would really s**k if they
were modelled for querying in a relational database.  The attribute info
is neither structured nor unstructured; it's SEMI-STRUCTURED. Executing
Structured Query Language (SQL) over semi-structured data would result in
-> Harder modelling (almost a waste of effort),
-> Complex querying (an elegant system is of no use because of the number
   of joins each query would need, if you somehow model semi-structured
   data in some structured data model).

The best option to start with would be the best COT. I feel we should look
at Loreal, a Stanford project, for hints about modelling our "whatever".

Regards
Faraz :)


----- Original Message ----- 
From: "Nikita Danilov" <nikita@clusterfs.com>
To: "Jonathan Briggs" <jbriggs@esoft.com>
Cc: "Hans Reiser" <reiser@namesys.com>; <Valdis.Kletnieks@vt.edu>;
"Alexander G. M. Smith" <agmsmith@rogers.com>; <leocomerford@gmail.com>;
<reiserfs-list@namesys.com>; <ninja@slaphack.com>; "Nate Diller"
<ndiller@namesys.com>
Sent: Thursday, June 02, 2005 4:54 PM
Subject: Re: File as a directory - VFS Changes


> Jonathan Briggs writes:
>  > On Thu, 2005-06-02 at 14:38 +0400, Nikita Danilov wrote:
>  > > Jonathan Briggs writes:
>  > >  > On Wed, 2005-06-01 at 21:27 +0400, Nikita Danilov wrote:
>  > >  > [snip]
>  > >  > > Frankly speaking, I suspect that name-as-attribute is going to
>  > >  > > limit usability of file system significantly.
>  >
>  > Usability as in features?  Or usability as in performance?
>
> Usability as in ease of use.
>
> [...]
>
>  >
>  > A index is an arrangement of information about the indexed items.  The
>  > index contents *belong* to the items.  An index by name?  That name
>  > belongs to the item.  An index by date?  Those dates are properties of
>
> In the flat world of relation databases, maybe. But almost nowhere else
> improper name is an attribute of its signified: variable is not an
> attribute of object it points to, URL is not an attribute of the web
> page, block number is not an attribute of data stored in that block on
> the disk, etc.
>
> [...]
>
>  >
>  > In the same way that you can descend a directory tree and copy the
>  > names found into each item, you can check each item and copy the
>  > names found into a directory tree.
>
> Except that as was already discussed resulting directory tree is _bound_
> to be inconsistent with "real names".
>
>  >
>  > >
>  > > Indices cannot be reduced to real names (as rename is impossible to
>  > > implement efficiently), but real names can very well be reduced to
>  > > indices as exemplified by each and every UNIX file system out there.
>  > >
>  > > So, the question is: what real names buy one, that indices do not?
>  >
>  > By storing the names in the items, cycles become solvable because you
>  > can always look at the current directory's name(s) to see where you
>  > really are.  Every name becomes absolutely connected to the top of the
>  > namespace instead of depending on a parent pointer that may not ever
>  > connect to the top.
>
> But cycles are "solvable" in current file systems too: they simply do
> not exist there.
>
>  >
>  > If speeding up rename were very important, you could replace every
>  > pathname component with an indirect reference instead of using simple
>  > strings.  Changing directory levels is still difficult.
>
> It is not only speed that will be extremely hard to achieve in that
> design; atomicity (in the face of possible crash during rename), and
> concurrency control look problematic too.
>
>  >
>  > -- 
>  > Jonathan Briggs <jbriggs@esoft.com>
>  > eSoft, Inc.
>
> Nikita.
>



* Re: Performance Impacts of Graph Cycles due to Multiple Parents
  2005-06-03  3:35                     ` Performance Impacts of Graph Cycles due to Multiple Parents Alexander G. M. Smith
@ 2005-06-03 11:15                       ` Nikita Danilov
  2005-06-07  2:04                         ` Alexander G. M. Smith
  0 siblings, 1 reply; 51+ messages in thread
From: Nikita Danilov @ 2005-06-03 11:15 UTC (permalink / raw)
  To: Alexander G. M. Smith; +Cc: reiserfs-list

Alexander G. M. Smith writes:
 > Nikita Danilov wrote on Thu, 2 Jun 2005 14:03:54 +0400 in the

[...]

 > > That is, how atomicity guarantees of rename will be preserved? Note that
 > > many applications, like some mail servers crucially depend on rename
 > > atomicity to implement their transaction mini-engines.
 > 
 > Same as before.  Grab locks on all the children affected before doing
 > any work.  If there's a deadlock, or memory shortage, just report an

And how do you find that a deadlock happened? By... wait, by finding a
cycle in the graph, right? Deadlock detection requires exponential time,
so even with a few thousand locks, it's intractable within the timing
constraints of interactive file system usage.
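
For reference, the check being alluded to is a cycle search in a wait-for
graph. The following is only a sketch: the `waits_for` mapping and the
function name are made up for illustration, and it works on a frozen
snapshot of who is blocked on whom, not on a live lock table.

```python
def find_deadlock(waits_for):
    """Return the lock holders forming a wait cycle, or None.

    waits_for is a hypothetical snapshot: it maps each holder to the
    holders it is currently blocked on.
    """
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited / on current path / finished
    colour = {}
    for start in waits_for:
        if colour.get(start, WHITE) != WHITE:
            continue
        colour[start] = GREY
        path = [start]                 # holders on the current DFS path
        stack = [(start, iter(waits_for.get(start, ())))]
        while stack:
            node, blocked_on = stack[-1]
            for nxt in blocked_on:
                state = colour.get(nxt, WHITE)
                if state == GREY:      # back edge: nxt is already on the path
                    return path[path.index(nxt):]
                if state == WHITE:     # descend into an unvisited holder
                    colour[nxt] = GREY
                    path.append(nxt)
                    stack.append((nxt, iter(waits_for.get(nxt, ()))))
                    break
            else:                      # every edge of node has been explored
                colour[node] = BLACK
                stack.pop()
                path.pop()
    return None
```

On a fixed snapshot this visits each holder and each wait edge at most once.
It does not address the harder part in a running file system, which is
getting a consistent snapshot while thousands of locks are being taken and
released concurrently.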

 > error back to the caller and don't change anything.  For the typical
 > mail server use, don't they just rename one file at a time?  If they
 > are moving whole large directories around, then it would be a problem.

This is exactly what some applications do. Here is how transactions can
be implemented in the POSIX file system:

 - you have a symlink ./d.current pointing to the "current" directory
 under which some sub-tree of interest is located;

 - to start new transaction create directory ./d.new and populate it
 with hard-links to ./d.current content exactly replicating its
 structure;

 - perform in ./d.new compound operation that you want to be atomic:
 when file in ./d.new is to be modified, hard link is broken, and new
 file created;

 - mv d.new d.committed.$(date +%s.%N);

 - when system is initialized (possibly after a crash), re-target
 ./d.current to the latest ./d.committed.*, remove uncommitted ./d.new
 if any.

This mechanism, known as "phase trees", obviously depends on rename(2)
atomicity. (While this is not relevant to our discussion, a by-product
advantage of phase-trees is that they also provide some form of
isolation for free: read-only queries run through ./d.current and see
only committed data.) Note that I-AM-A-TREE optimization you proposed
doesn't work here.
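
The steps above can be sketched roughly as follows. This is an illustration
only, not code from any of the mail servers mentioned: the function names
are invented, the symlink is called d.current, and the committed directory
gets a nanosecond timestamp suffix instead of the `date +%s.%N` form. It
assumes a file system with hard links and a single writer.

```python
import os
import shutil
import time

def retarget(root):
    """Point root/d.current at the newest d.committed.* directory."""
    names = [n for n in os.listdir(root) if n.startswith("d.committed.")]
    latest = max(names, key=lambda n: int(n.rsplit(".", 1)[1]))
    tmp = os.path.join(root, "d.current.tmp")
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(latest, tmp)                        # relative link inside root
    os.replace(tmp, os.path.join(root, "d.current"))  # rename(2) swaps it in

def begin(root):
    """Create root/d.new as a hard-linked replica of the current tree."""
    src = os.path.realpath(os.path.join(root, "d.current"))
    new = os.path.join(root, "d.new")
    os.mkdir(new)
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        dst = os.path.normpath(os.path.join(new, rel))
        os.makedirs(dst, exist_ok=True)
        for name in filenames:
            os.link(os.path.join(dirpath, name), os.path.join(dst, name))
    return new

def modify(new, relpath, data):
    """Break the hard link: write a fresh file and rename it into place."""
    target = os.path.join(new, relpath)
    tmp = target + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, target)

def commit(root):
    """The single rename below is the commit point."""
    committed = os.path.join(root, "d.committed.%d" % time.time_ns())
    os.rename(os.path.join(root, "d.new"), committed)
    retarget(root)
    return committed

def recover(root):
    """After a crash: drop any uncommitted d.new, re-point d.current."""
    new = os.path.join(root, "d.new")
    if os.path.isdir(new):
        shutil.rmtree(new)
    retarget(root)
```

Everything hinges on the one `os.rename` in `commit`: if renaming a directory
stopped being atomic, a crash could leave a half-moved d.new that `recover`
cannot tell apart from a committed tree.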

Nikita.


* Re: File as a directory - VFS Changes
  2005-06-02 23:54                                                   ` Nikita Danilov
@ 2005-06-03 17:57                                                     ` Hans Reiser
  2005-06-04 19:45                                                       ` Nikita Danilov
  0 siblings, 1 reply; 51+ messages in thread
From: Hans Reiser @ 2005-06-03 17:57 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Jonathan Briggs, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Nikita Danilov wrote:

>
>But cycles are "solvable" in current file systems too: they simply do
>not exist there.
>  
>
Yes, but Nikita, cycles represent semantic functionality that has value
because being able to embody more expressions means more power of
expression.  If some way can be found to allow them, then functionality
is increased. Separating links that increase reference count from links
that merely point (a la hard vs. sym links) is one approach.  If there
were cycle detection effective enough for real-world use, that would be
better.

> > 
> > If speeding up rename was very important, you can replace every pathname
> > component with an indirect reference instead of using simple strings.
> > Changing directory levels is still difficult.
>
>It is not only speed that will be extremely hard to achieve in that
>design; atomicity (in the face of possible crash during rename), and
>concurrency control look problematic too.
>
> > 
> > -- 
> > Jonathan Briggs <jbriggs@esoft.com>
> > eSoft, Inc.
>
>Nikita.
>
>
>  
>



* Re: File as a directory - VFS Changes
  2005-06-03 17:57                                                     ` Hans Reiser
@ 2005-06-04 19:45                                                       ` Nikita Danilov
  2005-06-04 20:13                                                         ` David Masover
  2005-06-07  5:08                                                         ` Hans Reiser
  0 siblings, 2 replies; 51+ messages in thread
From: Nikita Danilov @ 2005-06-04 19:45 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Jonathan Briggs, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Hans Reiser writes:
 > Nikita Danilov wrote:
 > 
 > >
 > >But cycles are "solvable" in current file systems too: they simply do
 > >not exist there.
 > >  
 > >
 > Yes, but Nikita, cycles represent semantic functionality that has value
 > because being able to embody more expressions means more power of

If you mean that multiple parents have some value, I agree. Problem is
that solutions proposed so far have severe limitations:

 - they add support for cycle detection that is necessary to support
 multiple parents, but that support is only efficient for "small"
 datasets: when total number of objects is not very big, and "average"
 object has only one parent.

 - even when there are no multiple parents, system is not efficient for
 large number of files.
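
To make the cost concrete, here is a sketch of the check needed before
linking an object under an additional parent. The `parents` mapping and the
function name are assumptions for illustration, not part of any proposed
design in this thread.

```python
def would_create_cycle(parents, child, new_parent):
    """True if linking child under new_parent would close a cycle.

    parents is a hypothetical map: object -> collection of its parents.
    A cycle appears exactly when child is new_parent itself or one of
    new_parent's ancestors.
    """
    seen = set()
    todo = [new_parent]
    while todo:
        node = todo.pop()
        if node == child:
            return True
        if node in seen:
            continue
        seen.add(node)
        # With one parent per object this walk is bounded by tree depth;
        # with many parents it can fan out over much of the graph.
        todo.extend(parents.get(node, ()))
    return False
```

When every object has a single parent the loop follows one chain to the
root. Each extra parent adds a branch to explore, which is why the check
stays cheap only while the "average" object has one parent.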

 > expression.  If some way can be found to allow them, then functionality
 > is increased. Separating links that increase reference count from links
 > that merely point (a la hard vs. sym links) is one approach.  If there
 > were cycle detection effective enough for real-world use, that would be
 > better.

It seems to me that in the domains where proposed designs are
applicable, symlinks already provide viable solution.

Nikita.


* Re: File as a directory - VFS Changes
  2005-06-04 19:45                                                       ` Nikita Danilov
@ 2005-06-04 20:13                                                         ` David Masover
  2005-06-07  5:08                                                         ` Hans Reiser
  1 sibling, 0 replies; 51+ messages in thread
From: David Masover @ 2005-06-04 20:13 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Hans Reiser, Jonathan Briggs, Valdis.Kletnieks,
	Alexander G. M. Smith, leocomerford, reiserfs-list, Nate Diller

Nikita Danilov wrote:
[...]
> It seems to me that in the domains where proposed designs are
> applicable, symlinks already provide viable solution.

It's a little late in the game for me to jump in, but can someone else
comment on this?  Is this a "viable" solution in the same way that gnome
VFS is a "viable" solution to the problem of transparently handling
zipfiles and such?


* Re: Performance Impacts of Graph Cycles due to Multiple Parents
  2005-06-03 11:15                       ` Nikita Danilov
@ 2005-06-07  2:04                         ` Alexander G. M. Smith
  0 siblings, 0 replies; 51+ messages in thread
From: Alexander G. M. Smith @ 2005-06-07  2:04 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: reiserfs-list

Nikita Danilov wrote on Fri, 3 Jun 2005 15:15:08 +0400:
> This is exactly what some applications do. Here is how transactions can
> be implemented in the POSIX file system:
> 
>  - you have a symlink ./d.current pointing to the "current" directory
>  under which some sub-tree of interest is located;
> 
>  - to start new transaction create directory ./d.new and populate it
>  with hard-links to ./d.current content exactly replicating its
>  structure;
> 
>  - perform in ./d.new compound operation that you want to be atomic:
>  when file in ./d.new is to be modified, hard link is broken, and new
>  file created;
> 
>  - mv d.new d.committed.$(date +%s.%N);
> 
>  - when system is initialized (possibly after a crash), re-target
>  ./d.current to the latest ./d.committed.*, remove uncommitted ./d.new
>  if any.
> 
> This mechanism, known as "phase trees", obviously depends on rename(2)
> atomicity. (While this is not relevant to our discussion, a by-product
> advantage of phase-trees is that they also provide some form of
> isolation for free: read-only queries run through ./d.current and see
> only committed data.) Note that I-AM-A-TREE optimization you proposed
> doesn't work here.

Yes, with the hard links, the multi-parent file system would need to do
slightly more checking, well, only slightly more since the hard links
are to files that have no children.  Still, it would have to lock all
the children of the moving directories.  Actually, it would be one step
better in another sense - you could move the whole directory over your
current directory and have it replace "current" and all its children with
the new versions, since it can understand deleting children of a directory
that's become nonexistent, while keeping the ones which are linked elsewhere.
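
A rough sketch of what I mean by locking all the children first and backing
out on trouble. The names are made up, and it uses try-locks so that a
conflict shows up as a failure to acquire, which is my shortcut here and
not a full deadlock detector.

```python
import threading

def lock_all_or_none(locks):
    """Try to take every lock without blocking.

    Returns True with all locks held, or False with none held, so the
    caller can report an error and leave the tree unchanged.
    """
    taken = []
    for lock in locks:
        if lock.acquire(blocking=False):
            taken.append(lock)
        else:
            # Someone else holds this one: undo what we took and give up.
            for held in reversed(taken):
                held.release()
            return False
    return True

def release_all(locks):
    """Release locks taken by a successful lock_all_or_none call."""
    for lock in reversed(list(locks)):
        lock.release()
```

The rename would only proceed when `lock_all_or_none` returns True. The
number of locks grows with the size of the subtree being moved, so this is
cheap for one file and expensive for a large directory.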

Also, just double checking my understanding, but is that single
threaded?  I'd expect it to break if you had more than one ./d.new
directory.

- Alex


* Re: File as a directory - VFS Changes
  2005-06-04 19:45                                                       ` Nikita Danilov
  2005-06-04 20:13                                                         ` David Masover
@ 2005-06-07  5:08                                                         ` Hans Reiser
  1 sibling, 0 replies; 51+ messages in thread
From: Hans Reiser @ 2005-06-07  5:08 UTC (permalink / raw)
  To: Nikita Danilov
  Cc: Jonathan Briggs, Valdis.Kletnieks, Alexander G. M. Smith,
	leocomerford, reiserfs-list, ninja, Nate Diller

Nikita Danilov wrote:

>Hans Reiser writes:
> > Nikita Danilov wrote:
> > 
> > >
> > >But cycles are "solvable" in current file systems too: they simply do
> > >not exist there.
> > >  
> > >
> > Yes, but Nikita, cycles represent semantic functionality that has value
> > because being able to embody more expressions means more power of
>
>If you mean that multiple parents have some value, I agree. Problem is
>that solutions proposed so far have severe limitations:
>
> - they add support for cycle detection that is necessary to support
> multiple parents, but that support is only efficient for "small"
> datasets: when total number of objects is not very big, and "average"
> object has only one parent.
>
> - even when there are no multiple parents, system is not efficient for
> large number of files.
>  
>
Can you say this in more detail?

> > expression.  If some way can be found to allow them, then functionality
> > is increased. Separating links that increase reference count from links
> > that merely point (a la hard vs. sym links) is one approach.  If there
> > were cycle detection effective enough for real-world use, that would be
> > better.
>
>It seems to me that in the domains where proposed designs are
>applicable, symlinks already provide viable solution.
>  
>
I have been thinking that disabling hard links for filedirectories might
be an acceptable solution for reiser4 if cycles are a deeper problem
than I currently appreciate.  We can then allow people to turn off one
of either filedirectories or hard links.  I would prefer solving the
cycles problem though....

>Nikita.
>
>
>  
>



end of thread, other threads:[~2005-06-07  5:08 UTC | newest]

Thread overview: 51+ messages
2005-05-28  0:46 File as a directory - Ordered Relations Alexander G. M. Smith
2005-05-28  4:56 ` David Masover
2005-05-28 19:42   ` Valdis.Kletnieks
2005-05-29 17:58     ` File as a directory - VFS Changes Alexander G. M. Smith
2005-05-30  8:25       ` Hans Reiser
2005-05-30 11:00       ` Nikita Danilov
2005-05-31  0:20         ` Alexander G. M. Smith
2005-05-31  9:34           ` Nikita Danilov
2005-05-31 15:04             ` Hans Reiser
2005-05-31 16:00               ` Nikita Danilov
2005-05-31 16:30               ` Valdis.Kletnieks
2005-05-31 16:55                 ` Jonathan Briggs
2005-05-31 16:59                   ` Hans Reiser
2005-05-31 17:13                     ` Jonathan Briggs
2005-05-31 18:27                       ` Hans Reiser
2005-05-31 21:01                         ` Jonathan Briggs
2005-05-31 21:08                           ` Jonathan Briggs
2005-05-31 22:36                             ` Nikita Danilov
2005-05-31 23:01                               ` Jonathan Briggs
2005-06-01 10:39                                 ` Nikita Danilov
2005-06-01 10:43                                   ` Nikita Danilov
2005-06-01 14:06                                     ` Jonathan Briggs
2005-06-01 14:42                                       ` Nikita Danilov
2005-06-01 15:40                                         ` Jonathan Briggs
2005-06-01 17:27                                           ` Nikita Danilov
2005-06-01 19:03                                             ` Jonathan Briggs
2005-06-02 10:38                                               ` Nikita Danilov
2005-06-02 18:35                                                 ` Jonathan Briggs
2005-06-02 23:54                                                   ` Nikita Danilov
2005-06-03 17:57                                                     ` Hans Reiser
2005-06-04 19:45                                                       ` Nikita Danilov
2005-06-04 20:13                                                         ` David Masover
2005-06-07  5:08                                                         ` Hans Reiser
2005-06-03  6:44                                                   ` Faraz Ahmed
2005-05-31 18:23                   ` Nikita Danilov
2005-05-31 18:32                     ` Hans Reiser
2005-06-02  1:27                       ` Alexander G. M. Smith
2005-06-02  7:46                         ` Hans Reiser
2005-06-02  9:11                       ` Nikita Danilov
2005-06-02 17:23                         ` Hubert Chan
2005-06-01  2:11             ` Alexander G. M. Smith
2005-06-01 10:58               ` Nikita Danilov
2005-06-02  1:58                 ` Alexander G. M. Smith
2005-06-02 10:03                   ` Nikita Danilov
2005-06-03  3:35                     ` Performance Impacts of Graph Cycles due to Multiple Parents Alexander G. M. Smith
2005-06-03 11:15                       ` Nikita Danilov
2005-06-07  2:04                         ` Alexander G. M. Smith
2005-05-30  8:19     ` File as a directory - Ordered Relations Hans Reiser
2005-05-31 16:46       ` Jonathan Briggs
2005-05-31 17:07         ` Hans Reiser
  -- strict thread matches above, loose matches on Subject: below --
2005-06-02 14:46 File as a directory - VFS Changes Faraz Ahmed
