From: Theodore Tso <tytso@mit.edu>
To: Jeff Shanab <jshanab@earthlink.net>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Starting a grad project that may change kernel VFS. Early research
Date: Mon, 24 Aug 2009 21:26:38 -0400
Message-ID: <20090825012638.GO17684@mit.edu>
In-Reply-To: <4A93284C.7060604@earthlink.net>
On Mon, Aug 24, 2009 at 04:54:52PM -0700, Jeff Shanab wrote:
> I was thinking that a good way to handle this is that it starts with
> a file change in a directory. The directory entry contains a sum already
> for itself and all the subdirs and an adjustment is made immediately to
> that, it should be in the cache. Then we queue up the change to be sent
> to the parent(s?). These queued up events should be a low priority at a
> more human time like 1 second. If a large number of changes come to a
> directory, multiple adjustments hit the queue with the same (directory
> name, inode #?) and early ones are thrown out. So levels above would see
> at most a 1 per second low priority update.
Is this something that you want to be stored in the file system, or
just cached in memory? If it is going to be stored on disk, which
seems to be implied by your description, and it is only going to be
updated once a second, what happens if there is a system crash? Over
time, the values will go out of date. Fsck could fix this, sure, but
that means you have to do the equivalent of running "du -s" on the root
directory of the filesystem after an unclean shutdown.
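To make the crash concern concrete, here is a minimal userspace sketch
in Python. It assumes per-directory totals are persisted, that deferred
deltas are merged per parent in memory, and that a periodic flush moves
them up one level at a time. The class and method names (DeferredTotals,
file_changed, flush, crash) are made up for illustration and are not
part of any proposed kernel interface.

```python
class DeferredTotals:
    """Toy model of subtree size totals with deferred parent updates."""

    def __init__(self, parent_of):
        self.parent_of = parent_of                  # dir -> parent (None for root)
        self.on_disk = {d: 0 for d in parent_of}    # persisted subtree totals
        self.pending = {}                           # dir -> merged, unapplied delta

    def file_changed(self, directory, delta):
        # The directory's own total is adjusted immediately.
        self.on_disk[directory] += delta
        parent = self.parent_of[directory]
        if parent is not None:
            # Later changes for the same parent merge into one queue entry.
            self.pending[parent] = self.pending.get(parent, 0) + delta

    def flush(self):
        # Periodic pass: apply merged deltas, then queue them one level up.
        next_pending = {}
        for directory, delta in self.pending.items():
            self.on_disk[directory] += delta
            parent = self.parent_of[directory]
            if parent is not None:
                next_pending[parent] = next_pending.get(parent, 0) + delta
        self.pending = next_pending

    def crash(self):
        # In-memory queue is lost; persisted totals keep whatever they had.
        self.pending = {}


def crash_demo():
    tree = {"/": None, "/a": "/", "/a/b": "/a"}

    clean = DeferredTotals(tree)
    clean.file_changed("/a/b", 100)
    clean.flush()
    clean.flush()

    crashed = DeferredTotals(tree)
    crashed.file_changed("/a/b", 100)
    crashed.crash()

    return clean.on_disk, crashed.on_disk
```

In the crashed case the total for /a/b has been updated but / and /a
never hear about it, and nothing on disk records that they are stale.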
You could write the size changes in a journal, but that blows up the
size of information that would need to be stored in a journal. It
also slows down the very common operation of writing to a file, all for
the sake of speeding up the relatively uncommon "du -s" operation.
It's not at all clear that it's a worthwhile tradeoff.
In addition, how will you handle hard links? An inode can have
multiple hard links in different directories, and there is no way to
find all of the directories which might contain a hard link to a
particular inode, short of doing a brute force search. Hence if you
have a file living in src/linux/v2.6.29/README, and it is a hard link
to ~/hacker/linux/README, and a program appends data to the file
~/hacker/linux/README, this would also change the result of running du
-s src/linux/v2.6.29; however, there's no way for your extension to
know that.
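The effect is easy to reproduce from userspace. The following Python
sketch is only an illustration: the directory names "src" and "hacker"
and the helper tree_size are hypothetical, and it sums apparent file
sizes (st_size) rather than the disk blocks that du reports. It appends
through the link in one directory and measures the other directory
before and after.

```python
import os
import tempfile


def tree_size(root):
    """Sum st_size of files under root, counting each inode only once."""
    seen = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_size
    return total


def hard_link_demo():
    """Return the size of 'src' before and after an append made via 'hacker'."""
    with tempfile.TemporaryDirectory() as top:
        src = os.path.join(top, "src")
        hacker = os.path.join(top, "hacker")
        os.mkdir(src)
        os.mkdir(hacker)

        readme = os.path.join(src, "README")
        with open(readme, "wb") as f:
            f.write(b"x" * 100)

        # Second name for the same inode, in a different directory.
        os.link(readme, os.path.join(hacker, "README"))

        before = tree_size(src)
        with open(os.path.join(hacker, "README"), "ab") as f:
            f.write(b"y" * 50)
        after = tree_size(src)
        return before, after
```

Nothing under "src" was opened for writing, yet its total grows by the
50 appended bytes. The write path only knows the inode, not the other
directories that hold a name for it.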
> title: "User Metadata" aka "pet peeve reduction"
> I would like to maintain a few classifications of metadata, most
> optional and configurable.
Most Linux filesystems already have extended attributes that can be
used to store your proposed metadata. Changing user application
programs to store the keywords, etc., is an exercise in
application-level programming; the kernel-side support is already
there.
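As a rough sketch of what the application side could look like, the
Python below stores a keyword list in an extended attribute using
os.setxattr and os.getxattr, which are available on Linux only. The
attribute name "user.keywords", the comma-separated encoding, and the
helper names are assumptions chosen for illustration, not an existing
convention.

```python
import os
import tempfile


def tag_file(path, keywords):
    """Store a comma-separated keyword list in a user.* extended attribute."""
    os.setxattr(path, "user.keywords", ",".join(keywords).encode("utf-8"))


def read_tags(path):
    """Read the keywords back; return [] if the attribute is missing or unreadable."""
    try:
        raw = os.getxattr(path, "user.keywords")
    except OSError:
        return []
    return raw.decode("utf-8").split(",") if raw else []


def xattr_demo():
    """Round-trip two keywords, or return None if xattrs are unavailable here."""
    if not hasattr(os, "setxattr"):
        return None  # platform without the Linux xattr calls
    with tempfile.NamedTemporaryFile() as f:
        try:
            tag_file(f.name, ["kernel", "vfs"])
        except OSError:
            return None  # filesystem does not accept user.* attributes
        return read_tags(f.name)
```

Whether the call succeeds depends on the filesystem and its mount
options, so an application would need a fallback for the case where
user.* attributes are refused.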
- Ted