* Fibration questions
@ 2004-07-16 10:35 David Dabbs
2004-07-16 11:04 ` Nikita Danilov
2004-07-18 7:11 ` Hans Reiser
0 siblings, 2 replies; 46+ messages in thread
From: David Dabbs @ 2004-07-16 10:35 UTC (permalink / raw)
To: reiserfs-list
I'm curious why the fibration function prototypes take an inode* (that is
unused). I looked at struct inode and I can't see anything that looks
helpful for calculating a fibre.
/* fibration.c: sample fibration_plugin.fibre() function proto */
static __u64 fibre_dot_o(const struct inode *dir, const char *name, int len);
Since fibre_dot_o is the default fibration plugin, could it be safely
shortened from:
    if (len > 2 && name[len - 1] == 'o' && name[len - 2] == '.')
        return FIBRE_NO(1);
    else
        return FIBRE_NO(0);
to
    return FIBRE_NO((len > 2 && name[len - 1] == 'o' && name[len - 2] == '.'));
since the boolean expression always evaluates to 1 or 0?
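On the language question alone: in C, the && and == operators yield an int
that is exactly 0 or 1, so the two forms compute the same value. A minimal
sketch, assuming a stand-in FIBRE_NO() that simply returns its argument (the
real macro lives in reiser4's fibration.c and is not reproduced here);
fibre_long() and fibre_short() are hypothetical names used only for the
comparison:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in only: the real FIBRE_NO() is defined in reiser4's fibration.c. */
#define FIBRE_NO(n) ((unsigned long long)(n))

/* The if/else form from the message. */
static unsigned long long fibre_long(const char *name, int len)
{
    assert(name != NULL);
    if (len > 2 && name[len - 1] == 'o' && name[len - 2] == '.')
        return FIBRE_NO(1);
    else
        return FIBRE_NO(0);
}

/* The shortened form: the && expression itself is the fibre number. */
static unsigned long long fibre_short(const char *name, int len)
{
    assert(name != NULL);
    return FIBRE_NO(len > 2 && name[len - 1] == 'o' && name[len - 2] == '.');
}
```

Both functions return 1 for a name such as "foo.o" and 0 otherwise, so the
choice between them is about readability rather than behaviour.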
In Nikita's explanation of fibration [http://kerneltrap.org/node/view/2761]
(keeping .o files separate from all other files to speed compilation), would
there be any advantage to the following function?
    unsigned char no = 1;

    if (len > 2 && name[len - 2] == '.') {
        switch (name[len - 1]) {
        case 'c':
        case 'h':
            no = 0;
            break;
        case 'o':
            no = 2;
            break;
        }
    }
    return FIBRE_NO(no);
Using the fibre for fast extension testing
A common filesystem client query is for '*.xxx'. If there were (is?) an
alternate readdir[64]() interface that allowed the caller to pass an
'extension mask/filter,' then the fibre could be leveraged to quickly test
whether a given file matches the mask without calling unpack_string(), etc.
Kind of like saying to the filesystem:
SELECT filename WHERE dir = '/home' AND filename LIKE '%.xxx'
A 'fibration map' would need to be specified in an array, say
const char * fibre_map[] = {"c", "h", "o"...};
or using a structure like
    struct fibre_map_ent {
        unsigned char fibrno;
        const char *exten;
    };

    #define FIBR_MAX 127
    #define FIBR_DEFAULT (FIBR_MAX - 1)

    /* should be ordered by exten for searching */
    const struct fibre_map_ent map[] = {
        {.fibrno = 6, .exten = "bar"},
        {.fibrno = 0, .exten = "c"},
        {.fibrno = 1, .exten = "h"},
        {.fibrno = 3, .exten = "htm"},
        {.fibrno = 3, .exten = "html"},
        {.fibrno = 2, .exten = "java"},
        {.fibrno = FIBR_MAX, .exten = "o"},
        {.fibrno = 4, .exten = "pl"},
        {.fibrno = 5, .exten = "py"},
        {.fibrno = 7, .exten = "xml"}
    };

    /*
     * failure to find a match returns
     *     FIBRE_NO(FIBR_DEFAULT);
     * else
     *     FIBRE_NO(map[i].fibrno);
     */
The enhanced readdir would look up the mask. If found, it would use the
fibrno to select appropriate directory entries for the readdir stream.
Otherwise, it would execute the default readdir code. Of course, this map
would need to be the global default for the filesystem, and per-directory
fibration override would be prohibited; otherwise readdir would return
incorrect results.
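The lookup described above can be sketched with the standard bsearch(3)
over a table sorted by extension. This is only an illustration under stated
assumptions: names are taken to be NUL-terminated, fibre_for_name() and
fibre_for_mask() are hypothetical helpers, and nothing here is reiser4 code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FIBR_MAX     127
#define FIBR_DEFAULT (FIBR_MAX - 1)

struct fibre_map_ent {
    unsigned char fibrno;
    const char *exten;
};

/* Must stay sorted by exten (strcmp order) for bsearch(). */
static const struct fibre_map_ent fibre_map[] = {
    { .fibrno = 6,        .exten = "bar"  },
    { .fibrno = 0,        .exten = "c"    },
    { .fibrno = 1,        .exten = "h"    },
    { .fibrno = 3,        .exten = "htm"  },
    { .fibrno = 3,        .exten = "html" },
    { .fibrno = 2,        .exten = "java" },
    { .fibrno = FIBR_MAX, .exten = "o"    },
    { .fibrno = 4,        .exten = "pl"   },
    { .fibrno = 5,        .exten = "py"   },
    { .fibrno = 7,        .exten = "xml"  },
};

#define FIBRE_MAP_LEN (sizeof fibre_map / sizeof fibre_map[0])

/* bsearch() comparator: key is an extension string, elem a map entry. */
static int cmp_exten(const void *key, const void *elem)
{
    const struct fibre_map_ent *ent = elem;

    return strcmp((const char *)key, ent->exten);
}

/* Fibre number a readdir mask "*.<ext>" maps to, or -1 if unmapped. */
static int fibre_for_mask(const char *ext)
{
    const struct fibre_map_ent *hit;

    assert(ext != NULL);
    hit = bsearch(ext, fibre_map, FIBRE_MAP_LEN, sizeof fibre_map[0],
                  cmp_exten);
    return hit ? hit->fibrno : -1;
}

/* Fibre number for a file name; names without a mapped extension
 * (or without any extension) fall into FIBR_DEFAULT. */
static int fibre_for_name(const char *name)
{
    const char *dot;
    int fibrno;

    assert(name != NULL);
    dot = strrchr(name, '.');
    if (dot == NULL || dot == name || dot[1] == '\0')
        return FIBR_DEFAULT;
    fibrno = fibre_for_mask(dot + 1);
    return fibrno < 0 ? FIBR_DEFAULT : fibrno;
}
```

With this table, "htm" and "html" share fibre 3 and every unmapped extension
shares FIBR_DEFAULT, so a fibre match on its own would narrow the candidate
entries but would not always prove that a name ends in the requested
extension.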
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-16 10:35 Fibration questions David Dabbs
@ 2004-07-16 11:04 ` Nikita Danilov
2004-07-16 17:45 ` David Dabbs
2004-07-18 7:11 ` Hans Reiser
1 sibling, 1 reply; 46+ messages in thread
From: Nikita Danilov @ 2004-07-16 11:04 UTC (permalink / raw)
To: David Dabbs; +Cc: reiserfs-list

David Dabbs writes:
> I'm curious why the fibration function prototypes take an inode* (that is
> unused). I looked at struct inode and I can't see anything that looks
> helpful for calculating a fibre.

This is for more advanced fibration plugins that keep some per-directory
state (none at the moment).

> /* fibration.c: sample fibration_plugin.fibre() function proto */
> static __u64 fibre_dot_o(const struct inode *dir, const char *name, int len);
>
> Since fibre_dot_o is the default fibration plugin, could it be safely
> shortened from:
>
>     if (len > 2 && name[len - 1] == 'o' && name[len - 2] == '.')
>         return FIBRE_NO(1);
>     else
>         return FIBRE_NO(0);
>
> to
>
>     return FIBRE_NO((len > 2 && name[len - 1] == 'o' && name[len - 2] == '.'));
>
> since the boolean expression will always return 1 or 0?

Well, what would this achieve (besides code obfuscation)? I highly doubt
that one can present a workload where fibration calls will be CPU
bottlenecks. Premature optimization is a root of all evil, as they say.

> In Nikita's explanation of fibration [http://kerneltrap.org/node/view/2761]
> (keeping .o files separate from all other files to speed compilation), would
> there be any advantage to the following function?
>
>     unsigned char no = 1;
>
>     if (len > 2 && name[len - 2] == '.') {
>         switch (name[len - 1]) {
>         case 'c':
>         case 'h':
>             no = 0;
>             break;
>         case 'o':
>             no = 2;
>             break;
>         }
>     }
>     return FIBRE_NO(no);

The whole point of having plugin infrastructure is that one is able to play
with new file system policies easily. Just try it!

> Using the fibre for fast extension testing
> A common filesystem client query is for '*.xxx'. If there were (is?) an
> alternate readdir[64]() interface that allowed the caller to pass an
> 'extension mask/filter,' then the fibre could be leveraged to quickly test
> whether a given file matches the mask without calling unpack_string(), etc.
> Kind of like saying to the filesystem:
>
>     SELECT filename WHERE dir = '/home' AND filename LIKE '%.xxx'

There is no such system call (there are user-level functions with similar
functionality: fts(3) and scandir(3)).

The sys_reiser4() system call is targeted at similar database-like access
paths.

> A 'fibration map' would need to be specified in an array, say
>
>     const char * fibre_map[] = {"c", "h", "o"...};
>
> or using a structure like
>
>     struct fibre_map_ent {
>         unsigned char fibrno;
>         const char *exten;
>     };
>
>     #define FIBR_MAX 127
>     #define FIBR_DEFAULT (FIBR_MAX - 1)
>
>     /* should be ordered by exten for searching */
>     const struct fibre_map_ent map[] = {
>         {.fibrno = 6, .exten = "bar"},
>         {.fibrno = 0, .exten = "c"},
>         {.fibrno = 1, .exten = "h"},
>         {.fibrno = 3, .exten = "htm"},
>         {.fibrno = 3, .exten = "html"},
>         {.fibrno = 2, .exten = "java"},
>         {.fibrno = FIBR_MAX, .exten = "o"},
>         {.fibrno = 4, .exten = "pl"},
>         {.fibrno = 5, .exten = "py"},
>         {.fibrno = 7, .exten = "xml"}
>     };
>
>     /*
>      * failure to find a match returns
>      *     FIBRE_NO(FIBR_DEFAULT);
>      * else
>      *     FIBRE_NO(map[i].fibrno);
>      */
>
> The enhanced readdir would lookup the mask. If found, it would use the
> fibrno to select appropriate directory entries for the readdir stream.
> Otherwise, it would execute the default readdir code. Of course, this map
> would need to be the global default for the filesystem and per-directory
> fibration override would be prohibited
^ permalink raw reply [flat|nested] 46+ messages in thread
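For comparison, the user-level route via scandir(3) mentioned in the reply
above can be sketched as follows. It filters by extension entirely in user
space, after the names have been read, and makes no use of fibres.
name_has_ext() and list_by_extension() are hypothetical helper names; the
file-scope variable is there only because scandir() filters take no user
argument.

```c
#define _DEFAULT_SOURCE
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Extension currently being matched; set by list_by_extension(). */
static const char *wanted_ext = "";

/* Non-zero if name ends in ".<ext>" and has a non-empty stem. */
static int name_has_ext(const char *name, const char *ext)
{
    const char *dot = strrchr(name, '.');

    return dot != NULL && dot != name && strcmp(dot + 1, ext) == 0;
}

/* scandir() filter callback: keep only entries with the wanted extension. */
static int ext_filter(const struct dirent *de)
{
    return name_has_ext(de->d_name, wanted_ext);
}

/* Print the entries of dir matching "*.<ext>" in sorted order.
 * Returns the number of matches, or -1 if the directory cannot be read. */
static int list_by_extension(const char *dir, const char *ext)
{
    struct dirent **names;
    int i, n;

    wanted_ext = ext;
    n = scandir(dir, &names, ext_filter, alphasort);
    if (n < 0)
        return -1;
    for (i = 0; i < n; i++) {
        puts(names[i]->d_name);
        free(names[i]);
    }
    free(names);
    return n;
}
```

A call such as list_by_extension("/home", "xxx") corresponds to the
SELECT-style query quoted above, with the whole directory still being read
before filtering.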
* RE: Fibration questions
2004-07-16 11:04 ` Nikita Danilov
@ 2004-07-16 17:45 ` David Dabbs
0 siblings, 0 replies; 46+ messages in thread
From: David Dabbs @ 2004-07-16 17:45 UTC (permalink / raw)
To: 'Nikita Danilov'; +Cc: reiserfs-list

> Nikita Danilov wrote:
>
> Well, what would this achieve (besides code obfuscation)? I highly
> doubt that one can present a work-load where fibration calls will be
> CPU bottlenecks. Premature optimization is a root of all evil, as they
> say.

Yeah, you're right Prof Knuth. :) I thought better of that after I sent it.

> The whole point of having plugin infrastructure is that one is able to
> play with new file system policies easily. Just try it!

I'll do that. Recalling your kerneltrap post, I figured that if it was
desirable to separate the .o files from the sources that it would be good to
keep c & h together. However, if you or another r4 developer said, 'we
profiled this and it is of marginal benefit,' then why bother?

> > Using the fibre for fast extension testing
> > A common filesystem client query is for '*.xxx'. If there were (is?) an
> > alternate readdir[64]() interface that allowed the caller to pass an
> > 'extension mask/filter,' then the fibre could be leveraged to quickly
> > test whether a given file matches the mask without calling
> > unpack_string(), etc.
> > Kind of like saying to the filesystem:
> >
> >     SELECT filename WHERE dir = '/home' AND filename LIKE '%.xxx'
>
> There is no such system call (there are user-level functions with the
> similar functionality, fts(3), and scandir(3)).
>
> sys_reiser4() system call is targeted to similar data-base like kind
> of access paths.

I saw something about the syscall syntax somewhere, perhaps in the parser
code. Should go back and investigate further. Any pointers to the most
up-to-date docs would be appreciated.

Thanks Nikita.

David
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-16 10:35 Fibration questions David Dabbs
2004-07-16 11:04 ` Nikita Danilov
@ 2004-07-18 7:11 ` Hans Reiser
2004-07-18 7:47 ` David Dabbs
2004-07-19 4:23 ` David Masover
1 sibling, 2 replies; 46+ messages in thread
From: Hans Reiser @ 2004-07-18 7:11 UTC (permalink / raw)
To: David Dabbs; +Cc: reiserfs-list

David Dabbs wrote:

>Using the fibre for fast extension testing
>A common filesystem client query is for '*.xxx'. If there were (is?) an
>alternate readdir[64]() interface that allowed the caller to pass an
>'extension mask/filter,' then the fibre could be leveraged to quickly test
>whether a given file matches the mask without calling unpack_string(), etc.
>Kind of like saying to the filesystem:
>
>    SELECT filename WHERE dir = '/home' AND filename LIKE '%.xxx'
>
>A 'fibration map' would need to be specified in an array, say
>
>    const char * fibre_map[] = {"c", "h", "o"...};
>
>or using a structure like
>
>    struct fibre_map_ent {
>        unsigned char fibrno;
>        const char *exten;
>    };
>
>    #define FIBR_MAX 127
>    #define FIBR_DEFAULT (FIBR_MAX - 1)
>
>    /* should be ordered by exten for searching */
>    const struct fibre_map_ent map[] = {
>        {.fibrno = 6, .exten = "bar"},
>        {.fibrno = 0, .exten = "c"},
>        {.fibrno = 1, .exten = "h"},
>        {.fibrno = 3, .exten = "htm"},
>        {.fibrno = 3, .exten = "html"},
>        {.fibrno = 2, .exten = "java"},
>        {.fibrno = FIBR_MAX, .exten = "o"},
>        {.fibrno = 4, .exten = "pl"},
>        {.fibrno = 5, .exten = "py"},
>        {.fibrno = 7, .exten = "xml"}
>    };
>
>    /*
>     * failure to find a match returns
>     *     FIBRE_NO(FIBR_DEFAULT);
>     * else
>     *     FIBRE_NO(map[i].fibrno);
>     */
>
>The enhanced readdir would lookup the mask. If found, it would use the
>fibrno to select appropriate directory entries for the readdir stream.
>Otherwise, it would execute the default readdir code. Of course, this map
>would need to be the global default for the filesystem and per-directory
>fibration override would be prohibited, otherwise readdir would return
>incorrect results.

I think this is too implementation-dependent an optimization to be
allowed to influence API design.

There is a need for support for * in filename queries. I would support
adding that to sys_reiser4.

If FS naming was better designed, filenames would not have extensions.
I prefer to first better design naming, and then not need to optimize
the API for extensions.
^ permalink raw reply [flat|nested] 46+ messages in thread
* RE: Fibration questions
2004-07-18 7:11 ` Hans Reiser
@ 2004-07-18 7:47 ` David Dabbs
2004-07-19 4:23 ` David Masover
1 sibling, 0 replies; 46+ messages in thread
From: David Dabbs @ 2004-07-18 7:47 UTC (permalink / raw)
To: 'Hans Reiser'; +Cc: reiserfs-list

I see I'm not the only one up late.

> -----Original Message-----
> From: Hans Reiser [mailto:reiser@namesys.com]
> Sent: Sunday, July 18, 2004 2:12 AM
> To: David Dabbs
> Cc: reiserfs-list@namesys.com
>
> I think this is too implementation dependent of an optimization to be
> allowed to influence API design.

Understood.

> There is a need for support for * in filename queries. I would support
> adding that to sys_reiser4.

When the time comes for investigating such support, would it mean a change
to the 'sys_reiser4 syntax' (therefore the parser)? Other than a brief gloss
of a web page that deals with it I haven't investigated the syscall
syntax/capabilities, so pardon if this is misplaced.

> If FS naming was better designed, filenames would not have extensions.
> I prefer to first better design naming, and then not need to optimize
> the API for extensions.

While the subject of priorities is on the table, is there a 'reiser4
janitors' list' similar to the kernel janitors' list? Reading through the
code I've seen a number of XXX-FIXME-HANS, etc. Would these TODOs, assuming
the comments remain relevant, be (one of) the places for those wishing to
contribute to investigate?

David
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-18 7:11 ` Hans Reiser
2004-07-18 7:47 ` David Dabbs
@ 2004-07-19 4:23 ` David Masover
2004-07-19 7:21 ` David Dabbs
1 sibling, 1 reply; 46+ messages in thread
From: David Masover @ 2004-07-19 4:23 UTC (permalink / raw)
To: Hans Reiser; +Cc: David Dabbs, reiserfs-list

Hans Reiser wrote:
[...]
| If FS naming was better designed, filenames would not have extensions.
| I prefer to first better design naming, and then not need to optimize
| the API for extensions.

Still, if we're going to fibrate by file type and want to find a file by
file type, there needs to be -- surprise! -- a standard way to determine
file type.

Extensions are universal, and aren't going away soon. What do we want
to replace them with? Only thing I particularly care about here is that
a file can appear as more than one type -- a script is both a program
and a text file, for example. Of course, you'd only be able to fibrate
by one type (right?), so there'd have to be a "primary" file type.

This is helpful because it's the right thing to do, but also because it
may have implications right now. For example, Gedit and AbiWord should
both show me uncompressed AbiWord docs in their "open" dialogs, by
default. Using file extensions, that isn't possible unless you hack
Gedit to "know" every possible text file extension, or check the magic
on each file in the directory.

The "proper" way to implement it would be to ask the filesystem
something like "List all the text files in this directory". And not,
btw, "List all the *.txt files in this directory".
^ permalink raw reply [flat|nested] 46+ messages in thread
* RE: Fibration questions
2004-07-19 4:23 ` David Masover
@ 2004-07-19 7:21 ` David Dabbs
2004-07-19 21:34 ` David Masover
0 siblings, 1 reply; 46+ messages in thread
From: David Dabbs @ 2004-07-19 7:21 UTC (permalink / raw)
To: 'David Masover', 'Hans Reiser'; +Cc: reiserfs-list

> -----Original Message-----
> From: David Masover [mailto:ninja@slaphack.com]
> Sent: Sunday, July 18, 2004 11:24 PM
> To: Hans Reiser
> Cc: David Dabbs; reiserfs-list@namesys.com
>
> Hans Reiser wrote:
> [...]
> | If FS naming was better designed, filenames would not have extensions.
> | I prefer to first better design naming, and then not need to optimize
> | the API for extensions.
>
> Still, if we're going to fibrate by file type and want to find a file by
> file type, there needs to be -- surprise! -- a standard way to determine
> file type.

There be dragons. Despite the fact that I advocated applying fibration data
to filesystem queries, the two (fibrating by file type [extension] and
'finding a file by file type') are quite different. The former is simply a
way to bunch/glom/group particular filesystem objects together in the tree.
The latter requires metadata beyond that provided by the filesystem objects
themselves.

Leaving aside fibration for a moment, there is a (big) difference between
the filesystem's ability to answer an (objective) query regarding some
aspect of an object's human-readable name (search by extension) and what it
seems you are seeking (search by file _type_). In your use case, the
filesystem would be required to 'deduce,' via suitable, consistent and
human-maintained metadata, which objects are subclasses of some 'type' in a
human-maintained 'FileObject' ontology.

This is the kind of thing for which the W3C's SemanticWeb activity might
advocate OWL/RDF. Possible means aside, the following are among the
questions the community would need to address:

1. What is the range of 'file types'?
2. The range of known 'file type aliases' (extensions)?
3. How should applications interpret and buy into this consensus?
4. At what level is this ontology managed? The OS, VFS, particular
   filesystems?
5. What is a portable metadata storage format that is easily maintained (and
   shared) by humans and parsed/employed by applications?

> Extensions are universal, and aren't going away soon. What do we want

Extensions are a convention humans share that are tenuously/inconsistently
'understood' by the computers humans use. Under Windows, an installed
application also installs a 'rule' that associates the application with
filesystem objects that exhibit certain attributes, e.g. that they end in
'.foo.'

> to replace them with? Only thing I particularly care about here is that
> a file can appear as more than one type -- a script is both a program
> and a text file, for example. Of course, you'd only be able to fibrate
> by one type (right?), so there'd have to be a "primary" file type.
>
> This is helpful because it's the right thing to do, but also because it
> may have implications right now. For example, Gedit and AbiWord should
> both show me uncompressed AbiWord docs in their "open" dialogs, by
> default. Using file extensions, that isn't possible unless you hack
> Gedit to "know" every possible text file extension, or check the magic
> on each file in the directory.
>
> The "proper" way to implement it would be to ask the filesystem
> something like "List all the text files in this directory". And not,
> btw, "List all the *.txt files in this directory".

I believe the proper thing to do is to leave this service to the operating
system (prob. the VFS) and to application programmers. The filesystem can be
very good/fast at finding objects that end in some 'extension,' but any more
understanding about objects should be handled 'above' the (a particular)
filesystem.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-19 7:21 ` David Dabbs
@ 2004-07-19 21:34 ` David Masover
2004-07-19 22:06 ` Valdis.Kletnieks ` (2 more replies)
0 siblings, 3 replies; 46+ messages in thread
From: David Masover @ 2004-07-19 21:34 UTC (permalink / raw)
To: David Dabbs; +Cc: 'Hans Reiser', reiserfs-list

David Dabbs wrote:
|>-----Original Message-----
|>From: David Masover [mailto:ninja@slaphack.com]
|>Sent: Sunday, July 18, 2004 11:24 PM
|>To: Hans Reiser
|>Cc: David Dabbs; reiserfs-list@namesys.com
|>
|>Hans Reiser wrote:
|>[...]
|>| If FS naming was better designed, filenames would not have extensions.
|>| I prefer to first better design naming, and then not need to optimize
|>| the API for extensions.
|>
|>Still, if we're going to fibrate by file type and want to find a file by
|>file type, there needs to be -- surprise! -- a standard way to determine
|>file type.
|
| There be dragons. Despite the fact that I advocated applying fibration data
| to filesystem queries, the two (fibrating by file type [extension] and
| 'finding a file by file type') are quite different. The former is simply a
| way to bunch/glom/group particular filesystem objects together in the tree.
| The latter requires metadata beyond that provided by the filesystem objects
| themselves.

Why beyond? Ask each fs object (without knowing its name), "What is
your primary type?" Put like-typed objects together. Simple.

How do the file objects know what type they are? After the first atom
is committed, they default to a type based on their magic. That is, a
file that begins with "#!/usr/bin/perl" is a Perl, a Text file, a
Script, and a Program. Primarily Perl, so it gets fibrated that way.

This can be optimized -- a file that begins with "#!" is a script, we
know this because the OS does. If the file doesn't begin with "#!", we
don't need to look at the rest of the line. And for things which aren't
perl, that's already a simpler check than "does the file end in '.pl'?"

On top of that, we only have to assign the file type once -- at
creation. For the rest of the file's lifetime, until someone decides to
change its type, the type is a bit of static metadata, as optimized
(fast/small) as file permissions, much faster and smaller than file
extensions.

| This is the kind of thing for which the W3C's SemanticWeb activity might
| advocate OWL/RDF. Possible means aside, the following are among the
| questions the community would need to address:
|
| 1. What is the range of 'file types'?

How many "file types" are there on Windows? That might be a good place
to start. They'd just be implemented in a more flexible way.

| 2. The range of known 'file type aliases' (extensions)?

No extensions. Just file types. You could name an mp3 file ".doc" and
not fool the system. The tooltip in GNOME would say "foo.doc -- mpeg
music file" or something similar.

I'm thinking something like MIME, more or less.

| 3. How should applications interpret and buy into this consensus?

The app defines what file types it can deal with, and then only shows
the user files of that type. It finds the type by looking at
..metas/type.

| 4. At what level is this ontology managed? The OS, VFS, particular
| filesystems?

Reiser4 plugin, at first. VFS (as in GNOME VFS) would probably be the
next layer up.

| 5. What is a portable metadata storage format that is easily maintained (and
| shared) by humans and parsed/employed by applications?

Reiser4 metadata. Possibly a default is set using file magic. Users
who don't know how to directly access such metadata probably don't
understand extensions anyway -- note that Windows "hides file extensions
by default". You know it's a word document because the icon is of a
word document and when you go to Word's open dialog, it shows up.
That's the level at which the user understands "file types".

Portable? I'm hoping that other filesystems start supporting metadata
in a similar way. Otherwise, this just becomes yet another enhancement
for reiser4-based systems.

In fact, if this is supported in some library (say, at the GNOME VFS
level), it is entirely portable, because it can fall back on extensions
if the metadata isn't supported, and we can fall back on asking for
*.foo if the fs doesn't support a query for "files of type foo".

| Extensions are a convention humans share that are tenuously/inconsistently
| 'understood' by the computers humans use. Under Windows, an installed
| application also installs a 'rule' that associates the application with
| filesystem objects that exhibit certain attributes, e.g. that they end in
| '.foo.'

Under Windows, when I open notepad and go to File->Open, it shows me, by
default, files that end in txt. When on Windows, I'd use notepad for a
lot more -- editing html files, batch files, and so on. So I basically
have to use the dropdown menu to select "all files", which means I might
accidentally open an mp3 file in Notepad -- I'll certainly have to sort
through mp3 files to get to the .m3u file I wanted.

The main drawback of extensions is that you can't have a file with two
extensions. Witness things like .tar.bz2 and .tbz2. You now have an
exception for files that end in .bz2 -- check if the preceding
characters are .tar, and if so, treat it as a .tbz2. Or, if a file ends
in .tbz2, and we're looking for things we can extract with bzcat (maybe
using tab-completion in Bash), we have to support .tbz2, not to mention
.bzabw -- and easily a dozen more really obscure ones that we don't know
about.

| I believe the proper thing to do is to leave this service to the operating
| system (prob. the VFS) and to application programmers. The filesystem can be

You don't think a file type is metadata? And I bet it'd be nice to be
good/fast at finding objects which have a certain property. Say, a
permission set. rwx=some_value -- type=some_value -- what's the diff?
^ permalink raw reply [flat|nested] 46+ messages in thread
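The "default the type from magic, then store it once" idea in the message
above can be sketched in user-space C. Everything here is an illustrative
assumption rather than reiser4 code: the type ids, the struct fobj holder for
the stored type, and the rule that an interpreter line containing "perl"
means Perl.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type ids; a real scheme might use MIME-like names. */
enum ftype { FT_UNKNOWN, FT_SCRIPT, FT_PERL };

/* Hypothetical per-file metadata: the type is assigned once, then reused. */
struct fobj {
    int has_type;
    enum ftype type;
};

/* Guess a primary type from the first bytes of a file. */
static enum ftype type_from_magic(const char *buf, size_t len)
{
    const char *eol;
    size_t linelen, i;

    assert(buf != NULL);
    /* No "#!": no need to look at the rest of the line. */
    if (len < 2 || buf[0] != '#' || buf[1] != '!')
        return FT_UNKNOWN;

    /* Only the interpreter line is inspected. */
    eol = memchr(buf, '\n', len);
    linelen = eol ? (size_t)(eol - buf) : len;
    for (i = 2; i + 4 <= linelen; i++)
        if (memcmp(buf + i, "perl", 4) == 0)
            return FT_PERL;
    return FT_SCRIPT;
}

/* Return the stored type, probing the contents only on first use. */
static enum ftype primary_type(struct fobj *obj, const char *buf, size_t len)
{
    if (!obj->has_type) {
        obj->type = type_from_magic(buf, len);
        obj->has_type = 1;
    }
    return obj->type;
}
```

Because primary_type() probes only once, a file first saved without a "#!"
line keeps FT_UNKNOWN even after such a line is added, unless something
explicitly resets the stored type.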
* Re: Fibration questions
2004-07-19 21:34 ` David Masover
@ 2004-07-19 22:06 ` Valdis.Kletnieks
2004-07-19 22:32 ` David Dabbs
2004-07-20 5:30 ` Hans Reiser
2 siblings, 0 replies; 46+ messages in thread
From: Valdis.Kletnieks @ 2004-07-19 22:06 UTC (permalink / raw)
To: David Masover; +Cc: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 2187 bytes --]

On Mon, 19 Jul 2004 16:34:14 CDT, David Masover said:

> Why beyond? Ask each fs object (without knowing its name), "What is
> your primary type?" Put like-typed objects together. Simple.

Here there be dragons. And you just met one. :)

> On top of that, we only have to assign the file type once -- at
> creation. For the rest of the file's lifetime, until someone decides to
> change its type, the type is a bit of static metadata, as optimized
> (fast/small) as file permissions, much faster and smaller than file
> extensions.

Congrats - combining these two means that (for example) if you slice-n-mice
a script fragment out of a file and save it, and then go back and add that
little '#!/bin/bash' line after the fact, your file now lives WAAAY over
there with all the other *.txt files, rather than over HERE with the other
shell scripts.

You think I'm kidding? Consider the following file found in the
syslog-ng-1.6.4 tarball:

[~/src/syslog-ng-1.6.4] head contrib/init.d.RedHat-7.3
################################################################################
#
# Program: syslog-ng init script for Red Hat
#
################################################################################
# the following information is for use by chkconfig
# if you are want to manage this through chkconfig (as you should), you must
# first must add syslog-ng to chkconfig's list of startup scripts it
# manages by typing:
#

So you copy that to /etc/init.d - what type is it? Could be Perl, could be
Bash, and the 'file' command thinks neither:

[~/src/syslog-ng-1.6.4] file contrib/init.d.RedHat-7.3
contrib/init.d.RedHat-7.3: ASCII English text

Wander over to SELinux and see what sort of fun and games you have to play
to do file labelling - and SELinux is able to make simplifying assumptions
based on rules like "All files under this directory are labelled foo_t, and
all files in that directory inherit a label based on their context"
(somewhat similar to how files inherit their group-id on most Unixoids)...

The file_contexts in Fedora is fast approaching 1,600 different regular
expressions for files to assign a label to them.

Large and nasty dragons, indeed....

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]
^ permalink raw reply [flat|nested] 46+ messages in thread
* RE: Fibration questions
2004-07-19 21:34 ` David Masover
2004-07-19 22:06 ` Valdis.Kletnieks
@ 2004-07-19 22:32 ` David Dabbs
2004-07-20 6:03 ` Hans Reiser
2004-07-20 7:03 ` David Masover
2004-07-20 5:30 ` Hans Reiser
2 siblings, 2 replies; 46+ messages in thread
From: David Dabbs @ 2004-07-19 22:32 UTC (permalink / raw)
To: 'David Masover'; +Cc: reiserfs-list, Hans Reiser

> David Dabbs wrote:
> |>-----Original Message-----
> |>From: David Masover [mailto:ninja@slaphack.com]
> |>Sent: Sunday, July 18, 2004 11:24 PM
> |>To: Hans Reiser
> |>Cc: David Dabbs; reiserfs-list@namesys.com
> |>
> |>Hans Reiser wrote:
> |>[...]
> |>| If FS naming was better designed, filenames would not have extensions.
> |>| I prefer to first better design naming, and then not need to optimize
> |>| the API for extensions.
> |>
> |>Still, if we're going to fibrate by file type and want to find a file by
> |>file type, there needs to be -- surprise! -- a standard way to determine
> |>file type.
> |
> | There be dragons. Despite the fact that I advocated applying fibration
> | data to filesystem queries, the two (fibrating by file type [extension]
> | and 'finding a file by file type') are quite different. The former is
> | simply a way to bunch/glom/group particular filesystem objects together
> | in the tree. The latter requires metadata beyond that provided by the
> | filesystem objects themselves.
>
> Why beyond? Ask each fs object (without knowing its name), "What is
> your primary type?" Put like-typed objects together. Simple.
>
> How do the file objects know what type they are? After the first atom
> is committed, they default to a type based on their magic. That is, a
> file that begins with "#!/usr/bin/perl" is a Perl, a Text file, a
> Script, and a Program. Primarily Perl, so it gets fibrated that way.

[David Dabbs]
The files don't really know their type. The filesystem/OS is deducing this,
yes?

> This can be optimized -- a file that begins with "#!" is a script, we
> know this because the OS does. If the file doesn't begin with "#!", we
> don't need to look at the rest of the line. And for things which aren't
> perl, that's already a simpler check than "does the file end in '.pl'?"
>
> On top of that, we only have to assign the file type once -- at
> creation. For the rest of the file's lifetime, until someone decides to
> change its type, the type is a bit of static metadata, as optimized
> (fast/small) as file permissions, much faster and smaller than file
> extensions.

[David Dabbs]
True, but this would need to be recomputed when some process changes the
file contents that contributed to the initial type signature.

> | This is the kind of thing for which the W3C's SemanticWeb activity might
> | advocate OWL/RDF. Possible means aside, the following are among the
> | questions the community would need to address:
> |
> | 1. What is the range of 'file types'?
>
> How many "file types" are there on Windows? That might be a good place
> to start. They'd just be implemented in a more flexible way.

[David Dabbs]
...and Unix, etc. Anyway, when you get down to it, and leaving out
encodings, there are really only two essential file types: text and binary.
From there, you move into 'abstract' types based upon these. Using text as
an example, you might have an abstract type such as 'XML,' which would be
any text/* or application/* (using MIME) that is known to be based on an XML
format. After that, you get to application-specific types.

> | 2. The range of known 'file type aliases' (extensions)?
>
> No extensions. Just file types. You could name an mp3 file ".doc" and
> not fool the system. The tooltip in GNOME would say "foo.doc -- mpeg
> music file" or something similar.
>
> I'm thinking something like MIME, more or less.
>
> | 3. How should applications interpret and buy into this consensus?
>
> The app defines what file types it can deal with, and then only shows
> the user files of that type. It finds the type by looking at
> ..metas/type.

[David Dabbs]
True. But in today's extension-based 'consensus,' there is no coordination
required between _anyone_ if some enterprising developer creates a great new
file format for, say, music files. Applications that decide to consume these
files simply add *.foo to the list of files presented to users. Using
metas/type, file type creators and application developers would need to
share and maintain a consistent type ID/signature namespace. I'm not against
what you're proposing, just trying to consider possible issues in
implementing it.

> | 4. At what level is this ontology managed? The OS, VFS, particular
> | filesystems?
>
> Reiser4 plugin, at first. VFS (as in GNOME VFS) would probably be the
> next layer up.

[David Dabbs]
While I'm a reiser4 'true believer,' other VFS filesystems do and will
continue to exist. Might an application developer's job be complicated if
not every filesystem for which it presents a file list supports metas or
some means to query file objects' type?

> | 5. What is a portable metadata storage format that is easily maintained
> | (and shared) by humans and parsed/employed by applications?
>
> Reiser4 metadata. Possibly a default is set using file magic. Users
> who don't know how to directly access such metadata probably don't
> understand extensions anyway -- note that Windows "hides file extensions
> by default". You know it's a word document because the icon is of a
> word document and when you go to Word's open dialog, it shows up.
> That's the level at which the user understands "file types".
>
> Portable? I'm hoping that other filesystems start supporting metadata
> in a similar way. Otherwise, this just becomes yet another enhancement
> for reiser4-based systems.
>
> In fact, if this is supported in some library (say, at the GNOME VFS
> level), it is entirely portable, because it can fall back on extensions
> if the metadata isn't supported, and we can fall back on asking for
> *.foo if the fs doesn't support a query for "files of type foo".
>
> | Extensions are a convention humans share that are
> | tenuously/inconsistently 'understood' by the computers humans use. Under
> | Windows, an installed application also installs a 'rule' that associates
> | the application with filesystem objects that exhibit certain attributes,
> | e.g. that they end in '.foo.'
>
> Under Windows, when I open notepad and go to File->Open, it shows me, by
> default, files that end in txt. When on Windows, I'd use notepad for a
> lot more -- editing html files, batch files, and so on. So I basically
> have to use the dropdown menu to select "all files", which means I might
> accidentally open an mp3 file in Notepad -- I'll certainly have to sort
> through mp3 files to get to the .m3u file I wanted.
>
> The main drawback of extensions is that you can't have a file with two
> extensions. Witness things like .tar.bz2 and .tbz2. You now have an
> exception for files that end in .bz2 -- check if the preceding
> characters are .tar, and if so, treat it as a .tbz2. Or, if a file ends
> in .tbz2, and we're looking for things we can extract with bzcat (maybe
> using tab-completion in Bash), we have to support .tbz2, not to mention
> .bzabw -- and easily a dozen more really obscure ones that we don't know
> about.
>
> | I believe the proper thing to do is to leave this service to the
> | operating system (prob. the VFS) and to application programmers. The
> | filesystem can be
>
> You don't think a file type is metadata? And I bet it'd be nice to be
> good/fast at finding objects which have a certain property. Say, a
> permission set. rwx=some_value -- type=some_value -- what's the diff?

[David Dabbs]
I do think a file type is metadata.
And it would certainly be nice to search by and (quickly) find a file by its type. But I think the APIs, etc. above the filesystem(s) will first need to incorporate a notion of type. Until applications/users start screaming for filesystem type attributes/queries, the fs overhead and effort involved to figure it out doesn't really seem worth it. Going back to your original response to Hans's comment: > |>Still, if we're going to fibrate by file type and want to find a file by > |>file type, there needs to be -- surprise! -- a standard way to determine > |>file type. What I (we) originally started out exploring was using fibration plugin flexibility to group files beyond one character of the file name, which is unfortunately the best, shared means for file typing we have today. If you're interested in a more robust type system and its use in fibration, then 'Just try it!' That's what one of the reiser4 developers suggested to me. One thing to note when coming up with a fibration-compatible type signature is that r4's key structure only provides 7 bits with which to work. I'd bet that there are many more than 7 bits worth of distinct file types out there. Cheers, David ^ permalink raw reply [flat|nested] 46+ messages in thread
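As a rough illustration of the two constraints discussed in this message (a default type deduced from file magic, and a fibre field only 7 bits wide), here is a minimal user-space sketch in C. Everything in it is an assumption made for illustration: the `coarse_type` enum, `guess_coarse_type()` and the NUL-byte text/binary heuristic are hypothetical and are not part of reiser4. Only the "#!" check and the 7-bit limit come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical coarse type classes. Whatever list is chosen, every value
 * has to fit in the 7 fibre bits, i.e. 0..127.
 */
enum coarse_type {
	TYPE_UNKNOWN = 0,
	TYPE_SCRIPT = 1,
	TYPE_TEXT = 2,
	TYPE_BINARY = 3,
	TYPE_LIMIT = 128	/* first value that no longer fits in 7 bits */
};

/*
 * Guess a default type from the first bytes of a file. The "#!" test is
 * the one mentioned in the thread; treating any NUL byte as "binary" is
 * an assumption of this sketch.
 */
static enum coarse_type guess_coarse_type(const unsigned char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL || len == 0);
	if (len == 0)
		return TYPE_UNKNOWN;
	if (len >= 2 && buf[0] == '#' && buf[1] == '!')
		return TYPE_SCRIPT;
	for (i = 0; i < len; i++)
		if (buf[i] == 0)
			return TYPE_BINARY;
	return TYPE_TEXT;
}
```

A handful of coarse classes like these fits easily in 7 bits, whereas a fine-grained, MIME-like list would not. Note also that the function needs the file contents, not the name, as input.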
* Re: Fibration questions
2004-07-19 22:32 ` David Dabbs
@ 2004-07-20 6:03 ` Hans Reiser
2004-07-20 7:03 ` David Masover
1 sibling, 0 replies; 46+ messages in thread
From: Hans Reiser @ 2004-07-20 6:03 UTC (permalink / raw)
To: David Dabbs; +Cc: 'David Masover', reiserfs-list

David Dabbs wrote:

>
>While I'm a reiser4 'true believer,' other VFS filesystems do and will
>continue to exist. Might an application developer's job be complicated if
>not every filesystem for which it presents a file list supports metas or
>some means to query file objects' type?
>

Maybe I should start describing reiser4 as a VFS layer enhancement, and
indicate that the plugins are the FS, and then reiser4 can be a linux
standard. ;-)

Hans
* Re: Fibration questions
2004-07-19 22:32 ` David Dabbs
2004-07-20 6:03 ` Hans Reiser
@ 2004-07-20 7:03 ` David Masover
1 sibling, 0 replies; 46+ messages in thread
From: David Masover @ 2004-07-20 7:03 UTC (permalink / raw)
To: David Dabbs; +Cc: reiserfs-list, Hans Reiser

David Dabbs wrote:
|>David Dabbs wrote:

[...]

| [David Dabbs]
| The files don't really know their type. The filesystem/OS is deducing this,
| yes?

Yes and no. Yes, at first, to support apps which don't set a type on
files they create. No, because after the type has been set to some
default, it only changes "manually" -- if the user or an app decides the
type should change.

|>This can be optimized -- a file that begins with "#!" is a script, we
|>know this because the OS does. If the file doesn't begin with "#!", we
|>don't need to look at the rest of the line. And for things which aren't
|>perl, that's already a simpler check than "does the file end in '.pl'?"
|>
|>On top of that, we only have to assign the file type once -- at
|>creation. For the rest of the file's lifetime, until someone decides to
|>change its type, the type is a bit of static metadata, as optimized
|>(fast/small) as file permissions, much faster and smaller than file
|>extensions.
|>
|
| [David Dabbs]
| True, but this would need to be recomputed when some process changes the
| file contents that contributed to the initial type signature.

Not really. I mean, yes, to be perfectly consistent. But no, because
if I named a file '.pl', and then changed its contents to a shell
script, I have to rename it to a '.sh' anyway. So I'd probably change
its type somewhere in there anyway.

It would probably be an acceptable performance hit for vim to request
that the fs try and re-type (scan the magic) at every write. Just not
for, say, mysql, for which the file type is static but there are tons of
writes.

| [David Dabbs]
| True. But in today's extension-based 'consensus,' there is no coordination
| required between _anyone_ if some enterprising developer creates a great new
| file format for, say, music files. Applications that decide to consume these

True, but it's clumsy. Read on:

| files simply add *.foo to the list of files presented to users. Using

So if I made a text format with extension foo, and Notepad decided to
consume it, and then someone made a video format with extension foo,
then Notepad now shows me files that it doesn't know what to do with.

Also, file types are "associated" in Windows, and users expect to be
able to click on a file from the Windows Explorer and have it open in
the right app.

| metas/type, file type creators and application developers would need to
| share and maintain consistent type IDs/signatures namespace. I'm not against

No, not any more than above. If such things could be altered during
runtime, an app really only has to register on the local machine.

Yeah, you'd want to share and maintain it, but it'd be possible to write
type names and signatures that are much less ambiguous than three-letter
extensions -- especially because, since a file can have multiple types
but only one primary, you can always name the primary file type
'<yourname>-<appname>-<description>' or something. Behind the scenes,
the names all get mapped to a number anyway, so there's no performance hit.

| [David Dabbs]
| While I'm a reiser4 'true believer,' other VFS filesystems do and will
| continue to exist. Might an application developer's job be complicated if
| not every filesystem for which it presents a file list supports metas or
| some means to query file objects' type?

Application developer, no. VFS developer, yes. Unless Reiser4 becomes
a standard, the easiest way to go is to use an existing VFS or some kind
of library to get the file type. So app X could use Gnome VFS, and
Gnome VFS could use file extensions if reiser4 wasn't available.

| [David Dabbs]
| I do think a file type is metadata. And it would certainly be nice to search
| by and (quickly) find a file by its type. But I think the APIs, etc. above
| the filesystem(s) will first need to incorporate a notion of type. Until
| applications/users start screaming for filesystem type attributes/queries,
| the fs overhead and effort involved to figure it out doesn't really seem

Not much fs overhead. Effort involved to figure it out seems to mostly
be me trying to express myself ;) although I acknowledge that it may be
non-trivial to implement.

| worth it.

Is it really worth it to have a ..metas dir when we already have xattrs?
Not yet -- I don't use either. Eventually, when apps start to support
..metas, it will be very much worth it.

Also, until the FS supports a notion of type, either we're building a
layer on top of it (VFS) or using a crude hack (extensions). As long as
the hack is available, btw, no one will try to solve the problem the
right way.

And I don't know about Gnome VFS, but Nautilus already supports a notion
of type. It's just very centralized and based on extensions. Wouldn't
be too hard to port to fs-based types.

| One thing to note when coming up with a fibration-compatible type signature
| is that r4's key structure only provides 7 bits with which to work. I'd bet
| that there are many more than 7 bits worth of distinct file types out there.

Maybe this has moved beyond fibration :P

Or maybe fibration would be better handled by the grosser types --
"text", "script", "video", "audio", "binary", "library"...

Of course you'd need a central list. Which is what we've got right now,
only it uses a handful of very common extensions.
* Re: Fibration questions
2004-07-19 21:34 ` David Masover
2004-07-19 22:06 ` Valdis.Kletnieks
2004-07-19 22:32 ` David Dabbs
@ 2004-07-20 5:30 ` Hans Reiser
2004-07-20 7:07 ` David Masover
2 siblings, 1 reply; 46+ messages in thread
From: Hans Reiser @ 2004-07-20 5:30 UTC (permalink / raw)
To: David Masover; +Cc: David Dabbs, reiserfs-list

David Masover wrote:

>
> Why beyond? Ask each fs object (without knowing its name), "What is
> your primary type?" Put like-typed objects together. Simple.

Except that at look up time all you know is the name, and if the type is
not in the name then you cannot fibrate by it.

Hans
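Hans's objection can be made concrete with a small user-space sketch. It is a simplification under stated assumptions, not the reiser4 code: `fibre_of_name()` reuses the dot-o rule quoted at the start of the thread, while `second_key_element()` and its layout (7 fibre bits on top, then the first seven name bytes) are a hypothetical stand-in for the real key builder. The point is only that the sole input is the name, which is all that create and lookup have in common.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FIBRE_SHIFT 57	/* fibre occupies the 7 highest bits of the element */

/* The dot-o rule from the thread: ".o" names go to fibre 1, the rest to 0. */
static uint64_t fibre_of_name(const char *name, size_t len)
{
	if (len > 2 && name[len - 1] == 'o' && name[len - 2] == '.')
		return (uint64_t)1 << FIBRE_SHIFT;
	return 0;
}

/*
 * Simplified second key element: fibre bits on top, then the first seven
 * name bytes (zero padded). The H bit is left at 0 in this sketch.
 * Create and lookup must both be able to call this with nothing but the
 * name, otherwise they would compute different keys for the same entry.
 */
static uint64_t second_key_element(const char *name)
{
	size_t len, i;
	uint64_t prefix = 0;

	assert(name != NULL);
	len = strlen(name);
	for (i = 0; i < 7; i++)
		prefix = (prefix << 8) | (i < len ? (unsigned char)name[i] : 0);
	return fibre_of_name(name, len) | prefix;
}
```

Because the fibre sits in the highest bits, every ".o" name sorts after every other name in the directory regardless of its prefix. A fibre derived from file contents or a type attribute would need an extra input that a lookup by name does not have.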
* Re: Fibration questions
2004-07-20 5:30 ` Hans Reiser
@ 2004-07-20 7:07 ` David Masover
2004-07-20 8:31 ` David Dabbs
0 siblings, 1 reply; 46+ messages in thread
From: David Masover @ 2004-07-20 7:07 UTC (permalink / raw)
To: Hans Reiser; +Cc: David Dabbs, reiserfs-list

Hans Reiser wrote:
| David Masover wrote:
|
|> Why beyond? Ask each fs object (without knowing its name), "What is
|> your primary type?" Put like-typed objects together. Simple.
|
| Except that at look up time all you know is the name, and if the type is
| not in the name then you cannot fibrate by it.

I must not understand fibration. Do you have to know the fibration of
an object to find it?
* RE: Fibration questions
2004-07-20 7:07 ` David Masover
@ 2004-07-20 8:31 ` David Dabbs
2004-07-21 5:13 ` David Masover
0 siblings, 1 reply; 46+ messages in thread
From: David Dabbs @ 2004-07-20 8:31 UTC (permalink / raw)
To: 'David Masover'; +Cc: reiserfs-list

>
> Hans Reiser wrote:
> | David Masover wrote:
> |
> |> Why beyond? Ask each fs object (without knowing its name), "What is
> |> your primary type?" Put like-typed objects together. Simple.
> |
> | Except that at look up time all you know is the name, and if the type is
> | not in the name then you cannot fibrate by it.
>
> I must not understand fibration. Do you have to know the fibration of
> an object to find it?
>

Fibration is simply a means to physically group together filesystem objects
you want grouped together, or perhaps more likely, separated from some other
group. I muddied the waters by proposing that the fibration bits might be
gainfully employed as a quick and dirty way to test whether an object had a
particular extension.

Here is the schematic for directory item keys with which you are probably
familiar. The file /etc/foo will have the dirid for /etc as key.el[0], the
first 7 characters of the file name in key.el[1], etc. Fibration, if
enabled, assigns a 'fibre' as the 7 high bits of the second key element,
thus acting like 'GROUP BY' for those file names that end up generating the
same fibre bits using the configured fibration plugin. See below.

David

/*
 * KEY ASSIGNMENT: PLAN A, LONG KEYS.
 *
 * DIRECTORY ITEMS
 *
 * |      60      | 4 | 7 |1|     56      |        64        |       64       |
 * +--------------+---+---+-+-------------+------------------+----------------+
 * |    dirid     | 0 | F |H|  prefix-1   |     prefix-2     | prefix-3/hash  |
 * +--------------+---+---+-+-------------+------------------+----------------+
 * |                  |                   |                  |                |
 * |     8 bytes      |      8 bytes      |     8 bytes      |    8 bytes     |
 *
 * dirid     objectid of directory this item is for
 *
 * F         fibration, see fs/reiser4/plugin/fibration.[ch] (BELOW)
 *
 * H         1 if last 8 bytes of the key contain hash,
 *           0 if last 8 bytes of the key contain prefix-3
 *
 * prefix-1  first 7 characters of file name.
 *           Padded by zeroes if name is not long enough.
 *
 * prefix-2  next 8 characters of the file name.
 *
 * prefix-3  next 8 characters of the file name.
 *
 * hash      hash of the rest of file name (i.e., portion of file
 *           name not included into prefix-1 and prefix-2).
 *
 * File names shorter than 23 (== 7 + 8 + 8) characters are completely
 * encoded in the key. Such file names are called "short". They are
 * distinguished by H bit set in the key.
 *
 * Other file names are "long". For long name, H bit is 0, and first 15
 * (== 7 + 8) characters are encoded in prefix-1 and prefix-2 portions of
 * the key. Last 8 bytes of the key are occupied by hash of the remaining
 * characters of the name.
 *
 * This key assignment reaches following important goals:
 *
 * (1) directory entries are sorted in approximately lexicographical
 *     order.
 *
 * (2) collisions (when multiple directory items have the same key), while
 *     principally unavoidable in a tree with fixed length keys, are rare.
 */

/***************************** fibration.c ****************************/
/* Copyright 2004 by Hans Reiser, licensing governed by
 * reiser4/README */

/*
 * Suppose we have a directory tree with sources of some project. During
 * compilation .o files are created within this tree. This makes access
 * to the original source files less efficient, because source files are
 * now "diluted" by object files: default directory plugin uses prefix
 * of a file name as a part of the key for directory entry (and this
 * part is also inherited by the key of file body). This means that
 * foo.o will be located close to foo.c and foo.h in the tree.
 *
 * To avoid this effect directory plugin fills highest 7 (unused
 * originally) bits of the second component of the directory entry key
 * by bit-pattern depending on the file name (see
 * fs/reiser4/kassign.c:build_entry_key_common()). These bits are called
 * "fibre". Fibre of the file name key is inherited by key of stat data
 * and keys of file body (in the case of REISER4_LARGE_KEY).
 *
 * Fibre for a given file is chosen by per-directory fibration
 * plugin. Names within given fibre are ordered lexicographically.
 */

static const int fibre_shift = 57;

#define FIBRE_NO(n) (((__u64)(n)) << fibre_shift)

/*
 * Trivial fibration: all files of directory are just ordered
 * lexicographically.
 */
static __u64 fibre_trivial(const struct inode *dir, const char *name, int len)
{
	return FIBRE_NO(0);
}

/*
 * dot-o fibration: place .o files after all others.
 */
static __u64 fibre_dot_o(const struct inode *dir, const char *name, int len)
{
	/* special treatment for .*\.o */
	if (len > 2 && name[len - 1] == 'o' && name[len - 2] == '.')
		return FIBRE_NO(1);
	else
		return FIBRE_NO(0);
}

/*
 * ext.1 fibration: subdivide directory into 128 fibrations one for each
 * 7bit extension character (file "foo.h" goes into fibre "h"), plus
 * default fibre for the rest.
 */
static __u64 fibre_ext_1(const struct inode *dir, const char *name, int len)
{
	if (len > 2 && name[len - 2] == '.')
		return FIBRE_NO(name[len - 1]);
	else
		return FIBRE_NO(0);
}

/*
 * ext.3 fibration: try to separate files with different 3-character
 * extensions from each other.
 */
static __u64 fibre_ext_3(const struct inode *dir, const char *name, int len)
{
	if (len > 4 && name[len - 4] == '.')
		return FIBRE_NO(name[len - 3] + name[len - 2] + name[len - 1]);
	else
		return FIBRE_NO(0);
}
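To see what fibre numbers the ext.1 and ext.3 rules above actually produce, the following is a user-space port that can be compiled outside the kernel. It is a sketch under assumptions, not the kernel code: `uint64_t` stands in for `__u64`, the unused `inode` argument is dropped, characters are cast to `unsigned char`, and `fibre_index()` is a hypothetical helper added only to read the 7-bit number back out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIBRE_SHIFT 57
#define FIBRE_NO(n) (((uint64_t)(n)) << FIBRE_SHIFT)

/* Recover the 7-bit fibre number from a shifted fibre value. */
static unsigned fibre_index(uint64_t fibre)
{
	return (unsigned)(fibre >> FIBRE_SHIFT);
}

/* One fibre per single-character extension, fibre 0 for everything else. */
static uint64_t fibre_ext_1(const char *name, size_t len)
{
	assert(name != NULL);
	if (len > 2 && name[len - 2] == '.')
		return FIBRE_NO((unsigned char)name[len - 1]);
	return FIBRE_NO(0);
}

/*
 * Fibre from the sum of the three extension characters. The sum can need
 * more than 7 bits; shifting by 57 in a 64-bit value keeps only the low 7.
 */
static uint64_t fibre_ext_3(const char *name, size_t len)
{
	assert(name != NULL);
	if (len > 4 && name[len - 4] == '.')
		return FIBRE_NO((unsigned char)name[len - 3] +
				(unsigned char)name[len - 2] +
				(unsigned char)name[len - 1]);
	return FIBRE_NO(0);
}
```

For example, "a.mp3" sums to 'm' + 'p' + '3' = 272, of which only the low 7 bits survive, giving fibre 16. Extensions that are anagrams of each other (such as the made-up "abc" and "cba") have the same sum and so share a fibre, which is consistent with the comment's wording that ext.3 only tries to separate extensions.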
* Re: Fibration questions
2004-07-20 8:31 ` David Dabbs
@ 2004-07-21 5:13 ` David Masover
2004-07-21 5:44 ` David Dabbs
0 siblings, 1 reply; 46+ messages in thread
From: David Masover @ 2004-07-21 5:13 UTC (permalink / raw)
To: David Dabbs; +Cc: reiserfs-list

David Dabbs wrote:
|>Hans Reiser wrote:
|>| David Masover wrote:
|>|
|>|> Why beyond? Ask each fs object (without knowing its name), "What is
|>|> your primary type?" Put like-typed objects together. Simple.
|>|
|>| Except that at look up time all you know is the name, and if the type is
|>| not in the name then you cannot fibrate by it.
|>
|>I must not understand fibration. Do you have to know the fibration of
|>an object to find it?
|>
|
| Fibration is simply a means to physically group together filesystem objects

=>> MEGA SNIP <<=

So, what you're trying to say is, yes, because it's part of the key?
* RE: Fibration questions
2004-07-21 5:13 ` David Masover
@ 2004-07-21 5:44 ` David Dabbs
2004-07-21 6:20 ` David Masover
0 siblings, 1 reply; 46+ messages in thread
From: David Dabbs @ 2004-07-21 5:44 UTC (permalink / raw)
To: 'David Masover'; +Cc: reiserfs-list

> |>
> |>I must not understand fibration. Do you have to know the fibration of
> |>an object to find it?
> |>
> |
> | Fibration is simply a means to physically group together filesystem objects
> =>> MEGA SNIP <<=
>
> So, what you're trying to say is, yes, because it's part of the key?
>

No, not really, at least you (as a filesystem client) don't specify the
fibration when searching for an object. Yes, when the key is generated, of
course the fibration bits matter, but they simply come from a blackbox
plugin function that simply operates on the name and which may differ per
directory. As Hans pointed out, there may be an opportunity to offer some
explicit support for * via the syscall interface -- as to whether or not the
implementation would even involve fibration is open for discussion.

It seems like we are violently agreeing, as my father sometimes says. I too
think it would be great to have an enhanced or more structured typing
system. And it would be interesting to work on at some point, but not just
yet, for me at least. There's still so much to learn about what's here.

Anyway, I've kind of lost track of what it is you were looking to accomplish
in the thread.

Best,

David
* Re: Fibration questions
2004-07-21 5:44 ` David Dabbs
@ 2004-07-21 6:20 ` David Masover
2004-07-21 6:36 ` David Dabbs
2004-07-22 8:03 ` Hans Reiser
0 siblings, 2 replies; 46+ messages in thread
From: David Masover @ 2004-07-21 6:20 UTC (permalink / raw)
To: David Dabbs; +Cc: reiserfs-list

David Dabbs wrote:
|>|>
|>|>I must not understand fibration. Do you have to know the fibration of
|>|>an object to find it?
|>|>
|>|
|>| Fibration is simply a means to physically group together filesystem objects
|>=>> MEGA SNIP <<=
|>
|>So, what you're trying to say is, yes, because it's part of the key?
|>
|
| No, not really, at least you (as a filesystem client) don't specify the
| fibration when searching for an object. Yes, when the key is generated, of
| course the fibration bits matter, but they simply come from a blackbox
| plugin function that simply operates on the name and which may differ per

Thus, a fibration plugin must rely on the name to decide how to fibrate
something, because at lookup time, this plugin is asked "I'm looking for
a file named foo, how is that fibrated?"

| directory. As Hans pointed out, there may be an opportunity to offer some
| explicit support for * via the syscall interface -- as to whether or not the
| implementation would even involve fibration is open for discussion.

It might. If I'm looking for *.foo and fibration is by last character
in filename, then the system knows to only look in the *o files.

So it's an optimization for it to involve fibration. But whether it
could be done elegantly, I don't pretend to know.

| It seems like we are violently agreeing, as my father sometimes says. I too
| think it would be great to have an enhanced or more structured typing
| system. And it would be interesting to work on at some point, but not just
| yet, for me at least. There's still so much to learn about what's here.

I agree. I'd much rather get the practical things done -- a stable
release, inclusion in distros, patches to common apps.

Also, I think I want to reimplement (a subset of) Lustre using reiser4
as a cache. If I do any coding, it'll be on that.

| Anyway, I've kind of lost track of what it is you were looking to accomplish
| in the thread.

So far, I'm experimenting with ideas. My real goal is to learn, because
in the months that I've been on this list, I've discovered two things:

It's usually fine the way it is.
The best way to learn more about it is to suggest a change.

If I stumble on a *good* change idea, so much the better!
* RE: Fibration questions
2004-07-21 6:20 ` David Masover
@ 2004-07-21 6:36 ` David Dabbs
2004-07-21 8:32 ` mjt
2004-07-22 8:03 ` Hans Reiser
1 sibling, 1 reply; 46+ messages in thread
From: David Dabbs @ 2004-07-21 6:36 UTC (permalink / raw)
To: 'David Masover'; +Cc: reiserfs-list

> |
> | No, not really, at least you (as a filesystem client) don't specify the
> | fibration when searching for an object. Yes, when the key is generated, of
> | course the fibration bits matter, but they simply come from a blackbox
> | plugin function that simply operates on the name and which may differ per
>
> Thus, a fibration plugin must rely on the name to decide how to fibrate
> something, because at lookup time, this plugin is asked "I'm looking for
> a file named foo, how is that fibrated?"
>
> | directory. As Hans pointed out, there may be an opportunity to offer some
> | explicit support for * via the syscall interface -- as to whether or not the
> | implementation would even involve fibration is open for discussion.
>
> It might. If I'm looking for *.foo and fibration is by last character
> in filename, then the system knows to only look in the *o files.
>

Yes, which is where I started off. BTW, the default fibration plugin uses
the last filename character when period is the preceding character, though
that's the joy of plugins -- one can do whatever one wants!

> | yet, for me at least. There's still so much to learn about what's here.
>
> I agree. I'd much rather get the practical things done -- a stable
> release, inclusion in distros, patches to common apps.
>
> Also, I think I want to reimplement (a subset of) Lustre using reiser4
> as a cache. If I do any coding, it'll be on that.
>

I'll have to look into that more, too. So many things to do, so little time
to have fun.

David
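The "only look in the *o files" idea can be sketched as a fibre pre-filter. This is a user-space illustration under assumptions: `struct entry_stub` and `count_ext()` are hypothetical, no such readdir interface exists in the thread, and the fibration rule is the one-character `fibre_ext_1()` quoted earlier. Note that, as quoted, that rule only looks for a dot in the second-to-last position, so a name like "x.foo" lands in fibre 0 and the filter below only helps for one-character extensions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FIBRE_SHIFT 57

/* Hypothetical stand-in for a directory entry: a name plus its stored fibre. */
struct entry_stub {
	const char *name;
	uint64_t fibre;		/* already shifted, as it would sit in the key */
};

/* One-character-extension fibration, as in the quoted fibre_ext_1(). */
static uint64_t fibre_ext_1(const char *name, size_t len)
{
	if (len > 2 && name[len - 2] == '.')
		return (uint64_t)(unsigned char)name[len - 1] << FIBRE_SHIFT;
	return 0;
}

/*
 * Count entries named "<something>.<ext>". Entries in other fibres are
 * skipped without looking at their names; *examined reports how many
 * names actually had to be inspected.
 */
static size_t count_ext(const struct entry_stub *ents, size_t n, char ext,
			size_t *examined)
{
	uint64_t want = (uint64_t)(unsigned char)ext << FIBRE_SHIFT;
	size_t i, len, hits = 0;

	assert(ents != NULL && examined != NULL);
	*examined = 0;
	for (i = 0; i < n; i++) {
		if (ents[i].fibre != want)
			continue;	/* wrong fibre: cannot match */
		(*examined)++;
		len = strlen(ents[i].name);
		if (len > 2 && ents[i].name[len - 2] == '.' &&
		    ents[i].name[len - 1] == ext)
			hits++;
	}
	return hits;
}
```

In a real tree the entries of one fibre would be contiguous in key order, so a range scan could replace the linear loop; the loop is kept here only to make the skipped comparisons visible.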
* Re: Fibration questions
2004-07-21 6:36 ` David Dabbs
@ 2004-07-21 8:32 ` mjt
2004-07-22 4:08 ` David Masover
0 siblings, 1 reply; 46+ messages in thread
From: mjt @ 2004-07-21 8:32 UTC (permalink / raw)
To: David Dabbs; +Cc: 'David Masover', reiserfs-list

On Wed, Jul 21, 2004 at 01:36:15AM -0500, David Dabbs wrote:
>> Also, I think I want to reimplement (a subset of) Lustre using reiser4
>> as a cache. If I do any coding, it'll be on that.
>I'll have to look into that more, too. So many things to do, so little time
>to have fun.

This is going off-topic but still...

Do any of the Namesys guys want to comment on a reiser4-janitors list?

I think it'd be a great idea and people would get to learn the code
by really working it.

This would make it easier for people to implement stuff like the
aforementioned Lustre subset, new plugins, chdir to -x files or something.

Or help Namesys if they get around to doing this stuff first ;)

Also, would reiser4-newbies be a better list for that, would we need
two lists? I think semantically yes, practically no.

--
mjt
* Re: Fibration questions
2004-07-21 8:32 ` mjt
@ 2004-07-22 4:08 ` David Masover
2004-07-22 10:06 ` mjt
2004-07-22 10:10 ` Vitaly Fertman
0 siblings, 2 replies; 46+ messages in thread
From: David Masover @ 2004-07-22 4:08 UTC (permalink / raw)
To: Markus Törnqvist; +Cc: David Dabbs, reiserfs-list

Markus Törnqvist wrote:

| I think it'd be a great idea and people would get to learn the code
| by really working it.

I do! I do!

I've been stumbling around the reiser4 code, trying to make sense of
things. It looks very clean, very well implemented, sort of all right
documentation -- no idea where to start.

| This would make it easier for people to implement stuff like the
| aforementioned Lustre subset, new plugins, chdir to -x files or something.

I'm thinking of doing said Lustre subset as a more generic facility. I
definitely want reiser4 support, though, and it'd be really nice to:

- support remote 'metas' dirs as if they were local
- have our own area inside 'metas' which is always local, to allow
  things like forcing particular files to always be in the cache
- have almost no speed difference between local cache reads and local fs
  reads
- allow separate caching of different logical pieces of a file, to
  sanely allow files which are directories
- share an fs between a cache and entirely local files

Most of that would be best implemented as some form of plugin.

First question: Can I manually enable/disable a particular plugin for a
particular directory? (like how cryptocompress is supposed to be...)

Second question: Would such a setting be recursive? Can I tell it
whether to recurse or not?
* Re: Fibration questions
2004-07-22 4:08 ` David Masover
@ 2004-07-22 10:06 ` mjt
2004-07-22 18:14 ` Hans Reiser
2004-07-23 2:45 ` David Masover
2004-07-22 10:10 ` Vitaly Fertman
1 sibling, 2 replies; 46+ messages in thread
From: mjt @ 2004-07-22 10:06 UTC (permalink / raw)
To: David Masover; +Cc: David Dabbs, reiserfs-list

On Wed, Jul 21, 2004 at 11:08:13PM -0500, David Masover wrote:
>- have our own area inside 'metas' which is always local, to allow
>things like forcing particular files to always be in the cache

Hmm. This is interesting. Difficult to wrap my brain around it, though :))
Is it a bit like what I've understood union mounts to be?
Where would this data be stored, if it's in metas? Stat data on the
local partition that's not bound to a specific file? Wouldn't fsck
now fix that as a broken fs?-)

>First question: Can I manually enable/disable a particular plugin for a
>particular directory? (like how cryptocompress is supposed to be...)

Sure, but for example with fibration you can't re-fibrate a directory.
You have to move the stuff out and back in after you've changed the
policy.

With tail policies you must access the file in order to get it moved
out of tails, if you change the formatting to never.

The Namesys Guys may want to correct me on those above notions, if
they're totally wrong.

>Second question: Would such a setting be recursive? Can I tell it
>whether to recurse or not?

Isn't this unimplemented, and called hsets? Hereditary Plugin Sets,
or something? I think I read once that they are different plugins from
the current ones, so that you'd have to write them differently from
normal plugins.

It must have been documented somewhere in the Reiser4 source code docs,
can't imagine where else they would have been, so I'll just refresh
my memory when I get home..

Anyway, setting propagation is important, but it's also important to
handle conditions like (cool, pseudo-sql! ;)
UPDATE formatting SET policy='never\0' WHERE policy='smart\0' RECURSE;
instead of just
UPDATE formatting SET policy='never\0' RECURSE;
which may break something else...
--
mjt
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-22 10:06 ` mjt @ 2004-07-22 18:14 ` Hans Reiser 2004-07-23 2:45 ` David Masover 1 sibling, 0 replies; 46+ messages in thread From: Hans Reiser @ 2004-07-22 18:14 UTC (permalink / raw) To: Markus Törnqvist; +Cc: David Masover, David Dabbs, reiserfs-list Markus Törnqvist wrote: >On Wed, Jul 21, 2004 at 11:08:13PM -0500, David Masover wrote: > > >>- have our own area inside 'metas' which is always local, to allow >>things like forcing particular files to always be in the cache >> >> Is this the sticky bit or new? > >Hmm. This is interesting. Difficult to wrap my brain around it, though :)) >Is it a bit like what I've understoon union mounts to be? >Where would this data be stored, if it's in metas? Stat data on the >local partition that's not bound to a specific file? Wouldn't fsck >now fix that as a broken fs?-) > > > >>First question: Can I manually enable/disable a particular plugin for a >>particular directory? (like how cryptocompress is supposed to be...) >> >> > >Sure, but for example with fibration you can't re-fibrate a directory. >You have to move the stuff out and back in after you've changed the >policy. > >With tail policies you must access the file in order to get it moved >out of tails, if you change the formatting to never. > >The Namesys Guys may want to correct me on those above notions, if >they're totally wrong. > > Some plugins are mutable, some are not, and some are mutable when empty or some other condition applies. We should document it for all of them. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-22 10:06 ` mjt 2004-07-22 18:14 ` Hans Reiser @ 2004-07-23 2:45 ` David Masover 2004-07-23 9:42 ` mjt 1 sibling, 1 reply; 46+ messages in thread From: David Masover @ 2004-07-23 2:45 UTC (permalink / raw) To: Markus Törnqvist; +Cc: David Dabbs, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Markus Törnqvist wrote: | Anyway, setting propagation is important, but it's also important to | handle conditions like (cool, pseudo-sql! ;) | UPDATE formatting SET policy='never\0' WHERE policy='smart\0' RECURSE; | instead of just | UPDATE formatting SET policy='never\0' RECURSE; | which may break something else... Both should be allowed. Can that be done now? And with echo, not SQL. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iQIVAwUBQQB7xngHNmZLgCUhAQIDVA/+JRN5/b2SEzHwqZYGyFVBRyXGQkvnUiUo 2IAkguvwZ0CJvrpKo5u2WjDIn/V0v4P8miq7CoLavYSsYpd5dvNeeayRdMDz7cLv 9AcQxXlfxJfu0wPvscP7iEBybs3ayHmkLalDbpMybeJfn+T3/+65YrBMVIEnrIGs wz+4/j9ffEtC6jbij+Xt+XnI6GUUdK6jUSNP0lTe+Z/XGztZrd85uI4KuGsYjdrb QkI9IEoUj1wLNUcNEdZD7kHS70kK9BA1CD7d/rj0Pc2urJYYHQ55LHlRt1s7VsQU 9qhBC8xtUqtZQe48IfOQ4VHLoZPoVygba3lMhFfAsTHvkieavHykZauNI6uBmOeY uqc4gAZZtw8QvpnOYNsmDMyFo1Qlz9ZTNfi/BJQBpOvtFFqt6Q+NeMA2kSyyxd1u GlagtGQMxrtMY9pgBiRK5R7+RusiSbEe2Skda1vDMok8owoUpLSW57dzVVlWCuXC SPSdCSx23iQYBjNaKgeQtP1Zl4eVD1lsAAPq38tzK+Hx/66oWFlLLnlN5o2pbllI togm9bW8dG2wGAptYaixcq0XAz7vi2G+9aD1snIJsAtuTVi1FEciYvjry+5tXiLO LLogs6uIdMMJiQTr+S5781d/+KQ0YEo3dQXl/tilfIFUqWTKCJ7hqni/xE0kMDlS ejswFHGTEeI= =dUdg -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-23 2:45 ` David Masover
@ 2004-07-23 9:42 ` mjt
2004-07-23 18:21 ` David Masover
0 siblings, 1 reply; 46+ messages in thread
From: mjt @ 2004-07-23 9:42 UTC (permalink / raw)
To: David Masover; +Cc: David Dabbs, reiserfs-list

On Thu, Jul 22, 2004 at 09:45:26PM -0500, David Masover wrote:
>| UPDATE formatting SET policy='never\0' WHERE policy='smart\0' RECURSE;
>| instead of just
>| UPDATE formatting SET policy='never\0' RECURSE;
>| which may break something else...
>
>Both should be allowed. Can that be done now? And with echo, not SQL.

I can't check now but maybe:

for dir in $(find . -type d); do
    format=$(cat $dir/..metas/plugin/formatting)
    if [ $format == "smart\0" ]; then
        echo -e 'never\0' > $format
        for file in $(find $dir -type f); do
            cat file > /dev/null
        done
    fi
done

IIRC the access required to change the file's formatting policy on
the file system was read-only. If you actually had to change something,
replace cat with chmod +x && chmod -x or something.

Namesys guys want to comment?-)
--
mjt
^ permalink raw reply [flat|nested] 46+ messages in thread
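A possible cleaned-up version of the loop above, offered only as a sketch.
As posted, the script redirects into a file named after the policy value
('> $format') rather than into the pseudo-file, and reads a literal file
called 'file' instead of "$file". The version below assumes that every
directory exposes its policy at <dir>/metas/plugin/formatting, that the
policy names are 'smart' and 'never' as used in the thread, and that reading
a file is enough to trigger its conversion. None of this has been checked
against a reiser4 snapshot. convert_policy and META are names made up for
this example; META is a variable because the posts in this thread disagree
on the pseudo-directory name (metas, ..metas).

```shell
#!/bin/sh
# Sketch only: the pseudo-file path and policy names are taken from the
# thread and are assumptions, not a verified reiser4 interface.
META=${META:-metas}

# convert_policy ROOT FROM TO   (hypothetical helper)
# Switch every directory under ROOT whose formatting policy is FROM to TO.
convert_policy() {
    root=$1
    from=$2
    to=$3
    # Note: this loop breaks on directory names that contain newlines.
    find "$root" -type d | while IFS= read -r dir; do
        pf="$dir/$META/plugin/formatting"
        [ -f "$pf" ] || continue
        # Match FROM as a whole word, so that both a bare "smart\0" and a
        # longer status line such as "1 smart smart formatting" are accepted.
        if tr '\0' '\n' < "$pf" | grep -qw -- "$from"; then
            # Write the new value, NUL-terminated, to the pseudo-file itself.
            printf '%s\0' "$to" > "$pf"
            # Read each regular file directly inside this directory once.
            # Files whose names start with a dot are skipped by the glob.
            for f in "$dir"/*; do
                if [ -f "$f" ]; then
                    cat "$f" > /dev/null
                fi
            done
        fi
    done
}
```

It would be called as 'convert_policy . smart never'. Dropping the grep
test gives the unconditional variant of the pseudo-SQL.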
* Re: Fibration questions 2004-07-23 9:42 ` mjt @ 2004-07-23 18:21 ` David Masover 0 siblings, 0 replies; 46+ messages in thread From: David Masover @ 2004-07-23 18:21 UTC (permalink / raw) To: Markus Törnqvist; +Cc: David Dabbs, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Markus Törnqvist wrote: | On Thu, Jul 22, 2004 at 09:45:26PM -0500, David Masover wrote: | |>| UPDATE formatting SET policy='never\0' WHERE policy='smart\0' RECURSE; |>| instead of just |>| UPDATE formatting SET policy='never\0' RECURSE; |>| which may break something else... |> |>Both should be allowed. Can that be done now? And with echo, not SQL. | | | I can't check now but maybe: | | for dir in $(find . -type d); do | format=$(cat $dir/..metas/plugin/formatting) | if [ $format == "smart\0" ]; then | echo -e 'never\0' > $format | for file in $(find $dir -type f); do | cat file > /dev/null | done | fi | done I was thinking a plugin. Also, "metas" is _still_ the default on the auto snapshots. I change it to '...' whenever I feel like looking at the code. "find -type d" would most likely not work. I'm thinking of an attribute which would apply to every type of object, including files, also files which are directories, and so on. Mainly just a few flags -- right now I'm thinking "never purge me from cache" and "I'm a stub; if something actually tries accessing me, pull me from the network." 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iQIVAwUBQQFXDXgHNmZLgCUhAQIsSBAAiThNM1TkyvVQzkhK48RPzrAlM8T3rYTW QWYCdY5TjIQ5svt/m+hj27HPAFh1PCU/ZtRIE3lv01eGHmlz5/560r2v3VAIHlpn GQEJaw587hcvN+owUsj4PgajIWX1tJJ5fiyfV3Llq6QqJhb6jm8UxH9mX/03Gz0d p76AbtBF0z8xDxt87lEwTrhxWZmgas1H+qyPnCR1F4fFEn54H7DA4Sp9e+5rqtGJ 8p9kXlYgjigfxjITkXCYlA3k/9K3+T7bWvlM16P8Buf9O4HaUKhAv/s1mAx6iJVy FUPxzaz9hF169O89RGWAbbyxZ9k2Z7+VK9sxLoiqkTRKNqBfZv8sdQyQ9r9ZwBxc HKfF2Ndzjd3pR/NES1ukvECHqjsSmXI3fRFBB80x+o1hM/+MvoZjtCcn3H/Aes96 yKgwR3UFYnZfBi9PHOdQn2sOfGXJaEj1n37QeuyGEAmUZb/5Zf/ukMfYyhBfgqPY rZw7QSY71bhrbRu0k1/cZ6kkEvdyAA90+nAzwyHMqLktPZ/uZc324HfyudrAb6cy g7x3mM8Jj5Wm7XK1lD6OU1EsJQcNNt/dsRdZgT82tX4w+ZRbedYEhWqCM1ER83OS pwmWTiumpxbDgCWyrEDOyHtfihPJkPmz4xBHytKrX/pRKVCqNl28UPw32gJA0P4i uwkYWv8vnDI= =2D+S -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-22 4:08 ` David Masover
2004-07-22 10:06 ` mjt
@ 2004-07-22 10:10 ` Vitaly Fertman
2004-07-23 2:43 ` David Masover
2004-07-23 9:59 ` Christian Mayrhuber
1 sibling, 2 replies; 46+ messages in thread
From: Vitaly Fertman @ 2004-07-22 10:10 UTC (permalink / raw)
To: David Masover, Markus Törnqvist; +Cc: David Dabbs, reiserfs-list

> First question: Can I manually enable/disable a particular plugin for a
> particular directory? (like how cryptocompress is supposed to be...)

you can change a plugin for a file if it does not destroy its structure.
Thus for an empty directory you can :
# cat somedir/metas/plugin/hash ; echo
1 r5 r5 hash
# echo -e "tea\0" > somedir/metas/plugin/hash
# cat somedir/metas/plugin/hash ; echo
2 tea tea hash

> Second question: Would such a setting be recursive? Can I tell it
> whether to recurse or not?

plugins are inherited at the creation time from the parent of the object
being created. Thus in the above example all subdirs would have the
hash r5 before changing the 'somedir' hash to 'tea' and will have the
'tea' hash after.
--
Thanks,
Vitaly Fertman
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-22 10:10 ` Vitaly Fertman
@ 2004-07-23 2:43 ` David Masover
2004-07-23 9:09 ` Vitaly Fertman
2004-07-23 9:59 ` Christian Mayrhuber
1 sibling, 1 reply; 46+ messages in thread
From: David Masover @ 2004-07-23 2:43 UTC (permalink / raw)
To: Vitaly Fertman; +Cc: Markus Törnqvist, David Dabbs, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vitaly Fertman wrote:
|>First question: Can I manually enable/disable a particular plugin for a
|>particular directory? (like how cryptocompress is supposed to be...)
|
|
| you can change a plugin for a file if it does not destroy its structure.
| Thus for an empty directory you can :
| # cat somedir/metas/plugin/hash ; echo
| 1 r5 r5 hash
| # echo -e "tea\0" > somedir/metas/plugin/hash
| # cat somedir/metas/plugin/hash ; echo
| 2 tea tea hash
|
|
|>Second question: Would such a setting be recursive? Can I tell it
|>whether to recurse or not?
|
|
| plugins are inherited at the creation time from the parent of the object
| being created. Thus in the above example all subdirs would have the
| hash r5 before changing the 'somedir' hash to 'tea' and will have the
| 'tea' hash after.

wait -- what does this do:

echo -e "tea\0" > somedir/metas/plugin/hash
echo -e "something_differenet\0" > somedir/subdir/metas/plugins/hash
echo -e "r5\0" > somedir/metas/plugins/hash

What is the hash of "subdir" now?

What about new subdirs? What hash will they get?
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iQIVAwUBQQB7VngHNmZLgCUhAQLxLA/+PWfdMA1LVzwY3HjINl/ZsGD0TtrW6oxc eMJVbtGFXYeCSv5hV1bqE9IbSX/HgMEyqq/8TFvf4Cy66s1aUAcbflXiZQOzEMXa UwW69kGH7H74ENP9TYyrySLNpvx54rhm/uU+g0XTaQHQ85VjuGC/0fruspCITTGX +nLhMNDO/FPtokAHunug3dqBV7WI9oNhtj43lCslhlYAKK926GrEblYqFDFbEwIj p+DK9QSALxGq4FvkDbb+70sgL8dK1WEQ7x/j2gGLMKgXbwrqfjSkIv6Zhk2BmOwM W9sQRtkx6ovcJWmH6QhdwH8PScxreN/w22+JO28g8hhKUrb/YxR8RMR4U3kBflf0 xXonMAISVRBCT0hBuOPyOvFHuPXlKNruW9LExmNOyVE1NzmMCs97uY7FyEP87kJl cffhA1+z6EjDJC3yqYCQlG7lIAGtISxB0zJWHAhZVKPNxH1bYH5xxDc/VGvs6O2K 33jnta8MXMP7r4PBaCe0frE85krsqtk9NxviTf7REabJprXD1zmPa7w/XvGSynVK N71Y01fsLqKJoalXgedSpIAsohtBBDWPix+RJglmqULJddNWJppcayVlHCGJ8EcD YILWo/gWloSmYHPpppDAmDG7YkbJBc2FXeuIyxiP9oWiBJm61b10VK8btfyKmmtQ 2GyhYEmH69w= =wL4f -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-23 2:43 ` David Masover
@ 2004-07-23 9:09 ` Vitaly Fertman
2004-07-26 6:28 ` Hans Reiser
0 siblings, 1 reply; 46+ messages in thread
From: Vitaly Fertman @ 2004-07-23 9:09 UTC (permalink / raw)
To: David Masover; +Cc: Markus Törnqvist, David Dabbs, reiserfs-list

> wait -- what does this do:
> echo -e "tea\0" > somedir/metas/plugin/hash

will set the hash tea in 'somedir' if it is empty.
if it has 'subdir' already, nothing happens.

> echo -e "something_differenet\0" > somedir/subdir/metas/plugins/hash

will set the hash 'something_different' in 'somedir/subdir' if it is any of
[r5 | tea | fnv1 | rupasov | degenerate hash | all future hash plugins]
and if 'subdir' is empty.

> echo -e "r5\0" > somedir/metas/plugins/hash

nothing happens as 'subdir' exists.

> What is the hash of "subdir" now?

'something_different'

> What about new subdirs? What hash will they get?

new subobjects inherit all plugins from the object they are created
in at the creation time. future changes of plugins of the object will
not be applied to already created subobjects, only to new subobjects.

Note: if a plugin is used for rendering the object content then
changing it needs a conversion method. Thus if you create a file in
the dir then the hash plugin gets used, and you need to convert the
dir to change the hash successfully. As there are no such conversion
methods yet, you cannot change the hash of a non-empty dir.
--
Thanks,
Vitaly Fertman
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-23 9:09 ` Vitaly Fertman @ 2004-07-26 6:28 ` Hans Reiser 2004-07-26 10:11 ` Vitaly Fertman 0 siblings, 1 reply; 46+ messages in thread From: Hans Reiser @ 2004-07-26 6:28 UTC (permalink / raw) To: Vitaly Fertman Cc: David Masover, Markus Törnqvist, David Dabbs, reiserfs-list Vitaly Fertman wrote: >>wait -- what does this do: >>echo -e "tea\0" > somedir/metas/plugin/hash >> >> > >will set the hash tea in 'somedir' if it is empty. >if it has 'subdir' already, nothing happens. > > > >>echo -e "something_differenet\0" > somedir/subdir/metas/plugins/hash >> >> > >will set the hash 'something_differenet' if any of [r5 | tea | fnv1 | >rupasov | degenerate hash | all future hash plugins] in 'somedir/subdir' >if 'subdir' is empty. > > is something_different constrained to [r5 | tea | fnv1 | rupasov | degenerate hash | some future hash plugin existing at time of command] ? (I hope so.) > > >>echo -e "r5\0" > somedir/metas/plugins/hash >> >> > >nothing happens as 'subdir' exists. > > > >>What is the hash of "subdir" now? >> >> > >'somthing_different' > > > >>What about new subdirs? What hash will they get? >> >> > >new subobjects inherit all plugins from the object they are created >in at the creation time. future changes of plugins of the object will >not be applied to already created subobjects, only to new subobjects. > >Note: if a plugin is used for rendering the object content then its >changing needs a convertion method. Thus if you create a file in >the dir then the hash plugin gets used, and you need to convert the >dir to change the hash successfully. As there is no such convertion >methods yet, you cannot change the hash of not empty dir. > > > ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-26 6:28 ` Hans Reiser @ 2004-07-26 10:11 ` Vitaly Fertman 0 siblings, 0 replies; 46+ messages in thread From: Vitaly Fertman @ 2004-07-26 10:11 UTC (permalink / raw) To: Hans Reiser Cc: David Masover, Markus Törnqvist, David Dabbs, reiserfs-list On Monday 26 July 2004 10:28, Hans Reiser wrote: > Vitaly Fertman wrote: > >>wait -- what does this do: > >>echo -e "tea\0" > somedir/metas/plugin/hash > > > >will set the hash tea in 'somedir' if it is empty. > >if it has 'subdir' already, nothing happens. > > > >>echo -e "something_differenet\0" > somedir/subdir/metas/plugins/hash > > > >will set the hash 'something_differenet' if any of [r5 | tea | fnv1 | > >rupasov | degenerate hash | all future hash plugins] in 'somedir/subdir' > >if 'subdir' is empty. > > is something_different constrained to > > [r5 | tea | fnv1 | rupasov | degenerate hash | some future hash plugin > existing at time of command] > > ? yes, right. -- Thanks, Vitaly Fertman ^ permalink raw reply [flat|nested] 46+ messages in thread
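The rules agreed on in this sub-thread (a hash can be set only on an empty
directory, only to a known hash name, and children copy the parent's plugin
at creation time) can be tried out with a toy model on any ordinary
filesystem. This is not the reiser4 metas interface. The '.hash' marker
file and the functions is_empty, set_hash and make_subdir are invented for
illustration, and the list of hash names is simply the one quoted above.

```shell
#!/bin/sh
# Toy model only: an ordinary directory tree where each directory keeps
# the name of its hash plugin in a plain file called ".hash".
HASHES="r5 tea fnv1 rupasov degenerate"   # names as listed in the thread

# is_empty DIR: true when DIR holds nothing besides the .hash marker.
is_empty() {
    [ -z "$(ls -A "$1" | grep -v '^\.hash$')" ]
}

# set_hash DIR NAME: accepted only for a known hash name and an empty
# directory. A rejected request leaves the old value in place. The model
# returns non-zero in that case; the thread says the real interface just
# ignores the write silently.
set_hash() {
    case " $HASHES " in
        *" $2 "*) ;;
        *) return 1 ;;
    esac
    is_empty "$1" || return 1
    printf '%s\n' "$2" > "$1/.hash"
}

# make_subdir PARENT NAME: the child copies the parent's hash at creation
# time and is unaffected by later changes to the parent.
make_subdir() {
    mkdir "$1/$2" && cp "$1/.hash" "$1/$2/.hash"
}
```

With 'somedir' created empty and marked r5: 'set_hash somedir tea' is
accepted, 'make_subdir somedir subdir' gives subdir the tea hash, a later
'set_hash somedir r5' is rejected because somedir is no longer empty, and
'set_hash somedir/subdir fnv1' still works while subdir is empty.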
* Re: Fibration questions
2004-07-22 10:10 ` Vitaly Fertman
2004-07-23 2:43 ` David Masover
@ 2004-07-23 9:59 ` Christian Mayrhuber
2004-07-23 9:59 ` mjt
2004-07-23 10:05 ` mjt
1 sibling, 2 replies; 46+ messages in thread
From: Christian Mayrhuber @ 2004-07-23 9:59 UTC (permalink / raw)
To: reiserfs-list

On Thursday 22 July 2004 12:10, Vitaly Fertman wrote:
> > First question: Can I manually enable/disable a particular plugin for a
> > particular directory? (like how cryptocompress is supposed to be...)
>
> you can change a plugin for a file if it does not destroy its structure.
> Thus for an empty directory you can :
> # cat somedir/metas/plugin/hash ; echo
> 1 r5 r5 hash
> # echo -e "tea\0" > somedir/metas/plugin/hash

So if one does
# echo -e "tea" > somedir/metas/plugin/hash
it can crash reiser4, because of a non terminated C string?

It would be a good idea if the meta filesystem interface always auto
terminates strings (add + '\0') issued by echo. procfs does it that way.
--
lg, Chris
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-23 9:59 ` Christian Mayrhuber @ 2004-07-23 9:59 ` mjt 2004-07-23 18:13 ` David Masover 2004-07-23 10:05 ` mjt 1 sibling, 1 reply; 46+ messages in thread From: mjt @ 2004-07-23 9:59 UTC (permalink / raw) To: Christian Mayrhuber; +Cc: reiserfs-list On Fri, Jul 23, 2004 at 11:59:23AM +0200, Christian Mayrhuber wrote: >It would be a good idea if the meta filesystem interface always auto >terminates strings (add + '\0') issued by echo. procfs does it that way. Or add a small parser that allows \n so that cat output would look better... -- mjt ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-23 9:59 ` mjt @ 2004-07-23 18:13 ` David Masover 0 siblings, 0 replies; 46+ messages in thread From: David Masover @ 2004-07-23 18:13 UTC (permalink / raw) To: Markus Törnqvist; +Cc: Christian Mayrhuber, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Markus Törnqvist wrote: | On Fri, Jul 23, 2004 at 11:59:23AM +0200, Christian Mayrhuber wrote: | |>It would be a good idea if the meta filesystem interface always auto |>terminates strings (add + '\0') issued by echo. procfs does it that way. | | | Or add a small parser that allows \n so that cat output would look | better... And remove \n from input so we could really just do "echo foo > metas/bar" with no worries. Amen, if it could be done for only access through the "open" system call. That is, if someone wants speed, they'll use sys_reiser4, and if they use sys_reiser4, they want speed -- thus sys_reiser4 would not add \0 or add/remove \n. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iQIVAwUBQQFVS3gHNmZLgCUhAQJ86g/6AsEwzPGRmBIfHCOW/SBS/iErwxhRHYmj 1SOwgUan+Aa/G7YVQZziI6/wS3hjcjW0lzelI8LAJM7fomWprwHKmFP7a7BngOEh Phwmk0TGVvOiFZ3J34vfyBnYTeM15ttOfWDbJQI5x15K3nkPNgytgajIYKYtGkeL eBuEo0YUoJ7IGyFnJScqnC7QsLgR/ak5F2oJrZCKRskycVN+ceXG9xdFX5K6U6TM p8mfmRXGpKMtw9Ob1CQ+jjOQ98T2cFgGQ/00i3U2wfzrkFEcjOqZdNwIDbEKXHND Fvxz1X//WPsaZkXOAO9TUneWkthVjcrDCJCJfSG5FxSeE04JscxOLyxZlyJqB2nG YHIhQetrSI7hJqwT9eCWzO8ihpkmdqmfTHjEkmmpWbZ4L51e7uom9KQw0+SCCp0C ag+zHE7W32ysULqKT+ozpi/P7FaEKq5kizZ/KYUkIYrco1IMrPV9ZCUxA0YDINy/ MBTfxNrMy+HoTHDsy4EFrX3Dldi/hm7dDoBcDrmssfDiWcfKuJad+Zg/edSyRRcb NpjaaAiIX9LQP0JQsh/k4+zPSf5Gp+JU8T8IG8MxZQOTInvKAbXL7oqx2Zmieae7 bWscaJzC6HvLSUcBfHXJdM1+ELLrLoExlJFMMzqrdjTypxRHvkKM9qVm50JM6lfu h2t8BlnnKtI= =vUTa -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 46+ messages in thread
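Until the interface normalises its input as proposed above, the same
convenience can be had from a small user-space wrapper. A minimal sketch,
assuming the pseudo-file wants exactly the value followed by one NUL byte
(as in Vitaly's 'echo -e "tea\0"' example) and hands back NUL-terminated
text. meta_write and meta_read are hypothetical helper names, not part of
reiser4.

```shell
#!/bin/sh
# Hypothetical helpers; not part of reiser4 or any snapshot.

# meta_write FILE VALUE
# Write VALUE followed by exactly one NUL byte and no newline.
# printf is used because "echo -e" is not portable across shells.
meta_write() {
    printf '%s\0' "$2" > "$1"
}

# meta_read FILE
# Print FILE with NUL bytes turned into newlines, so the shell prompt
# does not end up glued to the output of a bare cat.
meta_read() {
    tr '\0' '\n' < "$1"
}
```

For example, 'meta_write somedir/metas/plugin/hash tea' sends the four
bytes t, e, a, NUL, which is what the explicit echo -e form is meant to
produce without its trailing newline.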
* Re: Fibration questions 2004-07-23 9:59 ` Christian Mayrhuber 2004-07-23 9:59 ` mjt @ 2004-07-23 10:05 ` mjt 1 sibling, 0 replies; 46+ messages in thread From: mjt @ 2004-07-23 10:05 UTC (permalink / raw) To: Christian Mayrhuber; +Cc: reiserfs-list On Fri, Jul 23, 2004 at 11:59:23AM +0200, Christian Mayrhuber wrote: > >So if one does ># echo -e "tea" > somedir/metas/plugin/hash >it can crash reiser4, because of a non terminated C string? I also meant to say that Reiser4 just silently ignores the request, at least last time I checked... -- mjt ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-21 6:20 ` David Masover 2004-07-21 6:36 ` David Dabbs @ 2004-07-22 8:03 ` Hans Reiser 2004-07-22 12:16 ` Nikita Danilov 1 sibling, 1 reply; 46+ messages in thread From: Hans Reiser @ 2004-07-22 8:03 UTC (permalink / raw) To: David Masover; +Cc: David Dabbs, reiserfs-list David Masover wrote: > > > Thus, a fibration plugin must rely on the name to decide how to fibrate > something, because at lookup time, this plugin is asked "I'm looking for > a file named foo, how is that fibrated?" Yes. > > > > Also, I think I want to reimplement (a subset of) Lustre using reiser4 > as a cache. If I do any coding, it'll be on that. You might ask nikita if that is what they hired him away from us for..... > > ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-22 8:03 ` Hans Reiser @ 2004-07-22 12:16 ` Nikita Danilov 2004-07-22 14:39 ` mjt 2004-07-23 2:40 ` David Masover 0 siblings, 2 replies; 46+ messages in thread From: Nikita Danilov @ 2004-07-22 12:16 UTC (permalink / raw) To: Hans Reiser; +Cc: David Masover, David Dabbs, reiserfs-list Hans Reiser writes: > David Masover wrote: > > > > > > > Thus, a fibration plugin must rely on the name to decide how to fibrate > > something, because at lookup time, this plugin is asked "I'm looking for > > a file named foo, how is that fibrated?" > > Yes. > > > > > > > > > Also, I think I want to reimplement (a subset of) Lustre using reiser4 > > as a cache. If I do any coding, it'll be on that. > > You might ask nikita if that is what they hired him away from us for..... > Nope. :) Ask Peter Braam (braam@clusterfs.com), whether he has any plans to use reiser4 as a back-end. This shouldn't be hard to implement, technically. > > > > > Nikita. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-22 12:16 ` Nikita Danilov @ 2004-07-22 14:39 ` mjt 2004-07-22 18:17 ` Hans Reiser 2004-07-22 19:57 ` Valdis.Kletnieks 2004-07-23 2:40 ` David Masover 1 sibling, 2 replies; 46+ messages in thread From: mjt @ 2004-07-22 14:39 UTC (permalink / raw) To: Nikita Danilov; +Cc: Hans Reiser, David Masover, David Dabbs, reiserfs-list On Thu, Jul 22, 2004 at 04:16:06PM +0400, Nikita Danilov wrote: >Ask Peter Braam (braam@clusterfs.com), whether he has any plans to use >reiser4 as a back-end. >This shouldn't be hard to implement, technically. Maybe a chance to bleed some Dell, Cray, HP and DataDirect money to Namesys ;) Has Namesys contacted these guys, btw, and asked for sponsorship? They are probably all companies with more money than they'd know how to spend and they'd benefit from having the best filesystem ever. -- mjt ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-22 14:39 ` mjt @ 2004-07-22 18:17 ` Hans Reiser 2004-07-22 18:26 ` mjt 2004-07-22 19:57 ` Valdis.Kletnieks 1 sibling, 1 reply; 46+ messages in thread From: Hans Reiser @ 2004-07-22 18:17 UTC (permalink / raw) To: Markus Törnqvist Cc: Nikita Danilov, David Masover, David Dabbs, reiserfs-list Markus Törnqvist wrote: >On Thu, Jul 22, 2004 at 04:16:06PM +0400, Nikita Danilov wrote: > > >>Ask Peter Braam (braam@clusterfs.com), whether he has any plans to use >>reiser4 as a back-end. >>This shouldn't be hard to implement, technically. >> >> > >Maybe a chance to bleed some Dell, Cray, HP and DataDirect money >to Namesys ;) > >Has Namesys contacted these guys, btw, and asked for sponsorship? > > No. >They are probably all companies with more money than they'd know how >to spend and they'd benefit from having the best filesystem ever. > > > The world is filled with companies with money. Getting them to part with it is more work than you might think. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-22 18:17 ` Hans Reiser @ 2004-07-22 18:26 ` mjt 0 siblings, 0 replies; 46+ messages in thread From: mjt @ 2004-07-22 18:26 UTC (permalink / raw) To: Hans Reiser; +Cc: Nikita Danilov, David Masover, David Dabbs, reiserfs-list On Thu, Jul 22, 2004 at 11:17:14AM -0700, Hans Reiser wrote: >The world is filled with companies with money. Getting them to part >with it is more work than you might think. One of the main reasons I'm not a businessman is that it must be frustrating as hell, and hanging around in this project only seems to verify that suspicion. But I also think that there are no means too desperate, like, I wouldn't think twice about trying, because every now and then trying has paid off. Even sometimes the desired results have come easier than suspected. I'm of course not trying to be presumptuous or anything, I just wondered if there's a potential here... -- mjt ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-22 14:39 ` mjt
2004-07-22 18:17 ` Hans Reiser
@ 2004-07-22 19:57 ` Valdis.Kletnieks
2004-07-22 21:05 ` mjt
1 sibling, 1 reply; 46+ messages in thread
From: Valdis.Kletnieks @ 2004-07-22 19:57 UTC (permalink / raw)
To: Markus Törnqvist; +Cc: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 778 bytes --]

On Thu, 22 Jul 2004 17:39:09 +0300, Markus Törnqvist said:

> Maybe a chance to bleed some Dell, Cray, HP and DataDirect money
> to Namesys ;)

Cray is still in business? ;)

> Has Namesys contacted these guys, btw, and asked for sponsorship?
> They are probably all companies with more money than they'd know how
> to spend and they'd benefit from having the best filesystem ever.

Umm.. *would* they benefit? Think it through - if the "best available
filesystem" suddenly gets 30% faster, they end up selling 30% smaller servers -
and there's more profit margin at the high end than at the low end. So you
can't really get much support from the hardware people (look at how rich Intel
has gotten from Windows bloating every release, and think about it...)

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions 2004-07-22 19:57 ` Valdis.Kletnieks @ 2004-07-22 21:05 ` mjt 2004-07-22 21:36 ` Valdis.Kletnieks 0 siblings, 1 reply; 46+ messages in thread From: mjt @ 2004-07-22 21:05 UTC (permalink / raw) To: Valdis.Kletnieks; +Cc: reiserfs-list On Thu, Jul 22, 2004 at 03:57:32PM -0400, Valdis.Kletnieks@vt.edu wrote: >Cray is still in business? ;) I got the name from clusterfs.com ;) Cray and Cluster File Systems are preparing the Lustre file system for deployment on the Red Storm supercomputer. Red Storm supercomputer.. hmm! >Umm.. *would* they benefit? Think it through - if the "best available >filesystem" suddenly gets 30% faster, they end up selling 30% smaller servers - >and there's more profit margin at the high end than at the low end. So you >can't really get much support from the hardware people (look at how rich Intel >has gotten from Windows bloating every release, and think about it...) Well, I think they would. Guys like Dell deploy big-ass servers, if they shipped it with a super-fast and advanced secure file system like Reiser4, I very much doubt anyone would go "Gee, let's buy this smaller model which costs less but that ships with a slower file system, which also happens to be less secure" Intel won't stop selling fastest processors to Dell just because Dell ships something with which they squeeze even more power out of the hardware. It's all about marketing, I guess. If they had a slogan like "More for a lesser price - Reiser4!" their sales would not at go down. Besides, they control the market, they keep pushing out faster and sharper PowerEdges and there's not a damned thing anyone can do about it, and they don't want to either, even less if it has a good file system. Customer companies allocate x currency units for hardware. That's what they always do, because x buys them a decent machine, and decent machines tend to have a constant cost through the ages. 
I mean the contemporary power versus money and an upgrade sequence passing through every few years. They will not allocate x-y because they want something less. People want a system they can trust, and when they buy from Dell, they trust Dell, and if Dell has covered its ass in Reiser4 development and ships it, it's a total win-win situation. That's my view on things. -- mjt ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Fibration questions
2004-07-22 21:05 ` mjt
@ 2004-07-22 21:36 ` Valdis.Kletnieks
2004-07-23 9:28 ` mjt
0 siblings, 1 reply; 46+ messages in thread
From: Valdis.Kletnieks @ 2004-07-22 21:36 UTC (permalink / raw)
To: Markus Törnqvist; +Cc: reiserfs-list

[-- Attachment #1: Type: text/plain, Size: 2339 bytes --]

On Fri, 23 Jul 2004 00:05:15 +0300, Markus Törnqvist said:

> Guys like Dell deploy big-ass servers, if they shipped it with a super-fast
> and advanced secure file system like Reiser4, I very much doubt anyone
> would go "Gee, let's buy this smaller model which costs less but that
> ships with a slower file system, which also happens to be less secure"

This of course explains why IE is the premier web browser on 95% of the
desktops, right? I mean, it's faster and more secure than all the alternatives,
right? ;)

And how many times has the global RAM market been put under severe strain
because the latest Windows upgrade needed more RAM, so everybody went out and
bought more RAM, and more RAM, and more RAM...

The *correct* model is "Gee, let's buy this new box, because the salesman told
us we need at least a Model 9000 to run this app, and *never* *mentioned* that
with proper tuning, we could actually get by with a Model 6000 that costs half
as much, and makes him half as much in sales commission....".

I mean.. *seriously* - think about it for a moment. A co-worker recently
spec'ed out a Dell 6600 for a project - paid US$20k or so for that one. I
needed another server for another project, ended up spec'ing a Dell 2650 for
about US$7k. Now *what* motivation does Dell have to sell both of us 2650's
rather than trying to talk us both into 6600's? (As it was, Dell got lucky I
didn't go for a box even *smaller* than a 2650... at $7K, trying to save MORE
money isn't worth it. And I'm quite sure that Sun wasn't overjoyed when we
replaced an E10K with 2 Sunfire 2900's and a number of Sunfire 240's - they
wanted to sell us a third 2900... ;)

OK... not *all* vendors are like that - Apple likes us. Probably has something
to do with the fact we bought 1,100+ units in one shot to make a supercomputer.
But then, there's *no* vendors that won't do whatever it takes to land a $5M
deal with lots of good PR attached to it...

But seriously - once the purchase price falls under $1M or so, vendors fast
start losing their desire to bend over backwards to help you out...

Also - in what way, exactly, is Reiser4 "more secure"? (Think carefully about
the Linux security model here, and where the LSM hooks are - almost all of it
happens at the VFS layer. And there's the whole xattr debacle too...)

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply [flat|nested] 46+ messages in thread

* Re: Fibration questions
2004-07-22 21:36 ` Valdis.Kletnieks
@ 2004-07-23 9:28 ` mjt
2004-07-23 22:42 ` Valdis.Kletnieks
0 siblings, 1 reply; 46+ messages in thread
From: mjt @ 2004-07-23 9:28 UTC (permalink / raw)
To: Valdis.Kletnieks; +Cc: reiserfs-list

On Thu, Jul 22, 2004 at 05:36:43PM -0400, Valdis.Kletnieks@vt.edu wrote:
>
>This of course explains why IE is the premier web browser on 95% of the
>desktops, right? I mean, it's faster and more secure than all the alternatives,
>right? ;)

I think desktops for all the Joe Q. Averages are pretty much a different
scene from servers..

>And how many times has the global RAM market been put under severe strain
>because the latest Windows upgrade needed more RAM, so everybody went out and
>bought more RAM, and more RAM, and more RAM...

But Windows isn't the only thing that starts requiring more RAM, and
if you can buy more for a lesser price, that's what you'll do, regardless.

>The *correct* model is "Gee, let's buy this new box, because the salesman told
>us we need at least a Model 9000 to run this app, and *never* *mentioned* that
>with proper tuning, we could actually get by with a Model 6000 that costs half
>as much, and makes him half as much in sales commission....".

But that happens.

Maybe I'm the inexperienced obnoxious adolescent again, as I'm in only
my second job so far, but I've noticed that both employers have the
principle that if you can get anything, even the slightest guarantee,
that something is faster and more stable at a somewhat higher cost, it's
worth it. Even if you'd be paying for a scapegoat-factor warranty.

OK, the first job didn't last _that_ long because it was a start-up and
started running out of money, so maybe we should have invested in poorer
servers, but then again, we got to see all our competitors go down in
flames. And the company was alive and kicking (with a smaller staff) until
it got bought by a bigger company, so it all ended quite well.

Tune an even faster solution and get even more power, it'll last us
all weekend, before it goes obsolete...

>I mean.. *seriously* - think about it for a moment. A co-worker recently
>spec'ed out a Dell 6600 for a project - paid US$20k or so for that one. I
>needed another server for another project, ended up spec'ing a Dell 2650 for
>about US$7k. Now *what* motivation does Dell have to sell both of us 2650's
>rather than trying to talk us both into 6600's? (As it was, Dell got lucky I didn't
>go for a box even *smaller* than a 2650... at $7K, trying to save MORE money
>isn't worth it. And I'm quite sure that Sun wasn't overjoyed when we

So a Dell 2650 could have handled what the 6600 did?

And they're still selling 6600s, how big an impact would Reiser4's speed
advantage have really on them? But it seems I'm in over my head now :)

But this isn't the only way of trying to get funding from a big company.
Speed, that is.

>Also - in what way, exactly, is Reiser4 "more secure"? (Think carefully about
>the Linux security model here, and where the LSM hooks are - almost all of it
>happens at the VFS layer. And there's the whole xattr debacle too...)

Should I have said safe instead of secure? Maybe that would be the better
English word for it. Like being safe at power failures.

Then there's view security, which should be implemented.

I make my meager living as a small-time administrator and writer of
web (and similar) magick in Python, so I don't know why the xattrs
couldn't be mapped to Reiser4 calls, but shouldn't it be technically possible?

Maybe that's a point on which to whore out at the prospect of big cash...

Maybe some potential sponsor would agree that the Reiser4 way is the better
way, they'd agree that Linux file systems are starting to suck compared to,
say, Apple's file system and they'd like to do something about it.

Also asking for a smaller sum of money might be a tactic, "That other company
asked for x money and they got it for their trivial thing, but we're asking
for x/10 to do so much more."

But it boils down to presenting this in a convincing manner...

But these are just ideas, I have absolutely zero marketing experience
so this should not be taken as a presumptuous manual on how to do things :)

--
mjt

* Re: Fibration questions
2004-07-23 9:28 ` mjt
@ 2004-07-23 22:42 ` Valdis.Kletnieks
0 siblings, 0 replies; 46+ messages in thread
From: Valdis.Kletnieks @ 2004-07-23 22:42 UTC (permalink / raw)
To: Markus Törnqvist; +Cc: reiserfs-list

On Fri, 23 Jul 2004 12:28:49 +0300, Markus Törnqvist said:

> I think desktops for all the Joe Q. Averages are pretty much a different
> scene from servers..

It's not as different as you might think. Remember in most corporations that
use Active Directory, all the infrastructure boxes (domain controllers, etc)
are Windows boxes too. Quite recently, an amazing number of webservers got
0wned because somebody browsed the net using IE while logged on at the server
console....

> >And how many times has the global RAM market been put under severe strain
> >because the latest Windows upgrade needed more RAM, so everybody went out and
> >bought more RAM, and more RAM, and more RAM...
>
> But Windows isn't the only thing that starts requiring more RAM, and
> if you can buy more for a lesser price, that's what you'll do, regardless.

No, the price of RAM went *UP*, dramatically, because demand was higher than
supply, so you were buying less for a higher price. The point is that the
manufacturers of RAM and systems had *no* incentive to do anything to stop it.

"Microsoft is expected to recommend that the "average" Longhorn PC feature a
dual-core CPU running at 4 to 6GHz; a minimum of 2 gigs of RAM; up to a
terabyte of storage; a 1 Gbit, built-in, Ethernet-wired port and an 802.11g
wireless link; and a graphics processor that runs three times faster than
those on the market today."

http://www.microsoft-watch.com/article2/0,1995,1581842,00.asp

Now *try* to convince me that Dell and HP saw this, and their first thought
was "Let's see if we can get it to run well on a single-core 3GHz with 1G of
RAM" ;) If that was their first thought, the second was "OK, I'm done
laughing, now I need to pick myself up off the floor...."

> Maybe I'm the inexperienced obnoxious adolescent again, as I'm in only
> my second job so far, but I've noticed that both employers have the
> principle that if you can get anything, even the slightest guarantee,
> that something is faster and more stable at a somewhat higher cost, it's
> worth it. Even if you'd be paying for a scapegoat-factor warranty.

Right. Which is why you end up *buying* that faster server at higher cost than
you might really need. Most managers have a *really* hard time dealing with
the concept "If you use this alternative, totally free, no-cost, software, it
will run faster and save you money".

> Tune an even faster solution and get even more power, it'll last us
> all weekend, before it goes obsolete...

You'd be *amazed* at how many sites *don't* have somebody on the payroll who
can do tuning well. Usually, it's whatever they remember from the MCSE exam.
Just because my shop has people experienced in tuning everything from old
ferrite-core systems to top-10 supercomputers doesn't mean every shop does. ;)

> So a Dell 2650 could have handled what the 6600 did?

No, the 2650 would certainly have gotten swamped, the two boxes are doing
different things. The point is that *DELL* didn't have any incentive to get me
to buy a 2650 instead of another 6600. And if I had little clue, and actually
talked to a Dell sales rep, they probably could have convinced me I needed a
6600.

> And they're still selling 6600s, how big an impact would Reiser4's speed
> advantage have really on them? But it seems I'm in over my head now :)

Trust me, it wouldn't have helped enough to get the 6600's workload to fit on
a 2650.

> Should I have said safe instead of secure? Maybe that would be the better
> English word for it.
> Like being safe at power failures.

Is it *demonstrably* better than ext3 with 'data=journal'?

> Then there's view security, which should be implemented.

Ahh.. but view security doesn't do you as much good as it could, mostly
because of the "support at the VFS level" issues.

> I make my meager living as a small-time administrator and writer of
> web (and similar) magick in Python, so I don't know why the xattrs
> couldn't be mapped to Reiser4 calls, but shouldn't it be technically possible?

I'll refrain from saying anything except "read the list archives"....

> But these are just ideas, I have absolutely zero marketing experience
> so this should not be taken as a presumptuous manual on how to do things :)

You'd have more luck not talking to the people who sell hardware or systems,
but to the people who *use* hardware and systems, or who sell
consulting/maintenance.

For instance, Google has multiple large server farms, each of which has 15K to
20K systems in it. They're a Linux shop, and would probably be willing to part
with a fairly large sum of cash if it meant their hardware upgrade costs went
down even 5%.

There's lots of places making money doing custom one-off solutions based on
Linux - for instance, most of IBM's Linux revenue comes from
consulting/support. A shop that's doing systems integration might well be
willing to pay $100K for another thing in their bag of tricks that lets them
land 20 contracts that make them $10K profit each, by being able to deliver a
solution that uses $15K less hardware...

The only people willing to pay for something that uses less hardware are
people who are either trying to save money by spending less on hardware, or
trying to make money by selling less of somebody's hardware while charging the
same amount for consulting/development...
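
The systems-integrator arithmetic in the message above can be sketched in a few
lines of C. This is only an illustration: the function names are hypothetical,
and the inputs ($100K paid once, 20 contracts, $10K profit each, $15K less
hardware per solution) are the round figures from the message, not measured
data.

```c
#include <assert.h>

/* Hypothetical helper: net gain for an integrator that pays `cost` once
 * and then lands `contracts` deals worth `profit_each` apiece. */
static long net_gain(long cost, long contracts, long profit_each)
{
    return contracts * profit_each - cost;
}

/* Hypothetical helper: smallest number of contracts whose combined
 * profit covers `cost` (rounds up). */
static long break_even_contracts(long cost, long profit_each)
{
    assert(profit_each > 0);    /* a loss-making contract never breaks even */
    return (cost + profit_each - 1) / profit_each;
}

/* Hypothetical helper: total hardware spend avoided by the customers. */
static long hardware_saved(long contracts, long saved_each)
{
    return contracts * saved_each;
}
```

With the figures from the message, net_gain(100000, 20, 10000) comes to
$100K, break_even_contracts(100000, 10000) is 10 contracts, and
hardware_saved(20, 15000) is $300K of hardware that the vendor did not sell,
which is the conflict of interest the message describes.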

* Re: Fibration questions
2004-07-22 12:16 ` Nikita Danilov
2004-07-22 14:39 ` mjt
@ 2004-07-23 2:40 ` David Masover
1 sibling, 0 replies; 46+ messages in thread
From: David Masover @ 2004-07-23 2:40 UTC (permalink / raw)
To: Nikita Danilov; +Cc: Hans Reiser, David Dabbs, reiserfs-list

Nikita Danilov wrote:

| Ask Peter Braam (braam@clusterfs.com), whether he has any plans to use
| reiser4 as a back-end.

I've never used Lustre, but I've heard it implements at least some ideas
from InterMezzo, which used other filesystems (supposedly ANY other
filesystem) as a cache.

| This shouldn't be hard to implement, technically.

That's why I want to do it. I need useful things with my name on them
-- my grades suck, and next year is Senior year. Of High school.