From: "J. Bruce Fields" <bfields@fieldses.org>
To: Michael Tharp <gxti@partiallystapled.com>
Cc: Ray Lee <ray-lk@madrabbit.org>,
	Douglas J Hunley <doug@hunley.homeip.net>,
	slocate@trakker.ca, linux-kernel@vger.kernel.org
Subject: Re: solving(?) the updatedb problem w/ the kernel cache
Date: Fri, 27 Jul 2007 13:24:52 -0400
Message-ID: <20070727172452.GD13965@fieldses.org>
In-Reply-To: <46AA26B9.7050904@partiallystapled.com>

On Fri, Jul 27, 2007 at 01:09:13PM -0400, Michael Tharp wrote:
> Ray Lee wrote:
> > But yes, if we had a full filesystem events notifier, then we could
> > just toss updatedb aside and have the benefit of a live index into the
> > system. It's been suggested before, at least by me. Other projects
> > want this as well, such as an on-demand virus scanner, or a live
> > backup to another site, or beagle/tracker who would like to index
> > documents on the fly. beagled already uses inotify, I think, but as it
> > takes over my system (in a bad way) whenever I tried to run it, I had
> > no choice but to remove it.
> 
> Beagle's problem is that it inspects the file contents, often far too
> closely. I, too, had to uninstall it after it started indexing 40GB raw
> huffyuv video files (probably treating them as text) and driving load
> averages through the roof. Just watching for structure changes won't be
> nearly as painful, assuming inotify can handle watching the entire
> filesystem tree.

Events notification only helps while you've got someone around to listen
to the events.  If you reboot (or even just log out?  I don't know),
then when you come back the only completely reliable way to find out
what's changed may be to re-read everything.

So I'd think that far more important and basic than events notification
would be ways to reliably tell when a file has changed by looking just
at the attributes.

You can just use everything that "stat" gives you and figure that if a
file is still at the same path, with the same ctime, mtime, size,
permissions, owner, inode number, etc., then it's probably the same
file.
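
As a rough illustration of the stat-based check, here is a minimal
Python sketch. The names Fingerprint, fingerprint and
probably_unchanged are made up for this example, and including the
device number alongside the inode number is my own assumption (inode
numbers are only unique within one filesystem). It records what
lstat() reports and treats a path as "probably the same file" when
nothing in that record has changed.

```python
import os
from typing import NamedTuple


class Fingerprint(NamedTuple):
    """Attributes from a single lstat() call (hypothetical helper type)."""
    ctime_ns: int
    mtime_ns: int
    size: int
    mode: int   # file type plus permission bits
    uid: int
    gid: int
    ino: int
    dev: int    # assumption: inode numbers are only unique per device


def fingerprint(path: str) -> Fingerprint:
    # lstat() describes a symlink itself rather than its target.
    st = os.lstat(path)
    return Fingerprint(st.st_ctime_ns, st.st_mtime_ns, st.st_size,
                       st.st_mode, st.st_uid, st.st_gid,
                       st.st_ino, st.st_dev)


def probably_unchanged(path: str, saved: Fingerprint) -> bool:
    """True if the path still exists and every recorded attribute matches."""
    try:
        return fingerprint(path) == saved
    except FileNotFoundError:
        return False
```

An indexer could store one Fingerprint per path and, after a restart,
re-read only the files for which probably_unchanged() returns False.
It still has to lstat() every path, and the check is a heuristic: it
cannot see a change that leaves all of these attributes identical.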

If that's not enough, then maybe you want a change attribute (that's
guaranteed to change even when changes happen within less than the
granularity of ctime), and generation number (that's bumped whenever an
inode number is reused).
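
To make the role of those two extra attributes concrete, here is a
small Python sketch of the comparison only. FileVersion, same_file and
the change_attr and generation fields are hypothetical names; how a
program would actually obtain those values is filesystem-specific and
is not shown here. The point is just the order of the checks: a
generation mismatch means the inode number was reused, and a change
attribute, when available, overrides a ctime that may be too coarse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileVersion:
    """Hypothetical record; the last two fields may be unavailable."""
    ino: int
    ctime_ns: int
    change_attr: Optional[int] = None   # assumed: changes on every change
    generation: Optional[int] = None    # assumed: bumped on inode reuse


def same_file(old: FileVersion, new: FileVersion) -> bool:
    if old.ino != new.ino:
        return False
    # Same inode number but different generation: the inode was reused.
    if old.generation is not None and new.generation is not None:
        if old.generation != new.generation:
            return False
    # Prefer the change attribute; it catches changes that happen
    # within less than the granularity of ctime.
    if old.change_attr is not None and new.change_attr is not None:
        return old.change_attr == new.change_attr
    # Fall back to ctime when no change attribute is available.
    return old.ctime_ns == new.ctime_ns
```

With only ctime to go on, two versions written within the same
timestamp tick compare equal; with a change attribute they do not.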

--b.

Thread overview: 5+ messages
2007-07-27 12:55 solving(?) the updatedb problem w/ the kernel cache Douglas J Hunley
2007-07-27 16:42 ` Ray Lee
2007-07-27 17:09   ` Michael Tharp
2007-07-27 17:24     ` J. Bruce Fields [this message]
2007-07-27 17:22   ` Kevin Lindsay
