From mboxrd@z Thu Jan 1 00:00:00 1970
From: "David Dabbs"
Subject: RE: Fibration questions
Date: Mon, 19 Jul 2004 02:21:08 -0500
Message-ID: <20040719072026.1959F15D1B@mail03.powweb.com>
References: <40FB4CC4.80300@slaphack.com>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
list-help:
list-unsubscribe:
list-post:
Errors-To: flx@namesys.com
In-Reply-To: <40FB4CC4.80300@slaphack.com>
List-Id:
Content-Type: text/plain; charset="us-ascii"
To: 'David Masover', 'Hans Reiser'
Cc: reiserfs-list@namesys.com

> -----Original Message-----
> From: David Masover [mailto:ninja@slaphack.com]
> Sent: Sunday, July 18, 2004 11:24 PM
> To: Hans Reiser
> Cc: David Dabbs; reiserfs-list@namesys.com
>
> Hans Reiser wrote:
> [...]
> | If FS naming was better designed, filenames would not have extensions.
> | I prefer to first better design naming, and then not need to optimize
> | the API for extensions.
>
> Still, if we're going to fibrate by file type and want to find a file by
> file type, there needs to be -- surprise! -- a standard way to determine
> file type.
>

There be dragons. Despite the fact that I advocated applying fibration data to filesystem queries, the two (fibrating by file type [extension] and 'finding a file by file type') are quite different. The former is simply a way to bunch/glom/group particular filesystem objects together in the tree. The latter requires metadata beyond that provided by the filesystem objects themselves.

Leaving aside fibration for a moment, there is a (big) difference between the filesystem's ability to answer an (objective) query regarding some aspect of an object's human-readable name (search by extension) and what it seems you are seeking (search by file _type_). In your use case, the filesystem would be required to 'deduce,' via suitable, consistent and human-maintained metadata, which objects are subclasses of some 'type' in a human-maintained 'FileObject' ontology.
This is the kind of thing for which the W3C's Semantic Web activity might advocate OWL/RDF. Possible means aside, the following are among the questions the community would need to address:

1. What is the range of 'file types'?
2. What is the range of known 'file type aliases' (extensions)?
3. How should applications interpret and buy into this consensus?
4. At what level is this ontology managed? The OS, VFS, particular filesystems?
5. What is a portable metadata storage format that is easily maintained (and shared) by humans and parsed/employed by applications?

> Extensions are universal, and aren't going away soon. What do we want

Extensions are a convention humans share that is tenuously/inconsistently 'understood' by the computers humans use. Under Windows, an installed application also installs a 'rule' that associates the application with filesystem objects that exhibit certain attributes, e.g. that their names end in '.foo'.

> to replace them with? Only thing I particularly care about here is that
> a file can appear as more than one type -- a script is both a program
> and a text file, for example. Of course, you'd only be able to fibrate
> by one type (right?), so there'd have to be a "primary" file type.
>
> This is helpful because it's the right thing to do, but also because it
> may have implications right now. For example, Gedit and AbiWord should
> both show me uncompressed AbiWord docs in their "open" dialogs, by
> default. Using file extensions, that isn't possible unless you hack
> Gedit to "know" every possible text file extension, or check the magic
> on each file in the directory.
>
> The "propor" way to implement it would be to ask the filesystem
> something like "List all the text files in this directory". And not,
> btw, "List all the *.txt files in this directory".
>
>

I believe the proper thing to do is to leave this service to the operating system (prob. the VFS) and to application programmers.
The filesystem can be very good/fast at finding objects whose names end in some 'extension,' but any deeper understanding of those objects should be handled 'above' any particular filesystem.
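
To make the distinction concrete, here is a rough Python sketch of the two kinds of query. It is only an illustration under stated assumptions: the function names and the small TYPE_ONTOLOGY table are hypothetical, invented for this example, and are not part of reiserfs, the VFS, or any existing API. The extension query needs nothing but the object names; the type query cannot be answered without the human-maintained table, which is why I would keep it above the filesystem.

```python
import os

# Hypothetical, human-maintained ontology mapping an extension to the
# set of 'types' it belongs to. The entries are illustrative only and
# are not a proposed standard.
TYPE_ONTOLOGY = {
    ".txt": {"text"},
    ".sh": {"text", "program"},   # a script is both a program and text
    ".abw": {"text", "document"},  # uncompressed AbiWord document
    ".png": {"image"},
}


def find_by_extension(names, ext):
    """Objective query: answerable from the object names alone."""
    return [n for n in names if n.endswith(ext)]


def find_by_type(names, wanted, ontology=TYPE_ONTOLOGY):
    """Type query: depends on the ontology, so it lives above the filesystem."""
    matches = []
    for n in names:
        _, ext = os.path.splitext(n)
        # Names with an unknown (or no) extension match no type at all.
        if wanted in ontology.get(ext.lower(), set()):
            matches.append(n)
    return matches


if __name__ == "__main__":
    names = ["notes.txt", "backup.sh", "letter.abw", "logo.png", "README"]
    print(find_by_extension(names, ".txt"))
    print(find_by_type(names, "text"))
```

Note that 'README' is never reported as text here, even though a human would say it is one. With no extension there is nothing for the table to key on, which is exactly the gap that magic-number checks or richer metadata would have to fill.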