From: Kevin Lindsay <kevin@trakker.ca>
To: Ray Lee <ray-lk@madrabbit.org>
Cc: Douglas J Hunley <doug@hunley.homeip.net>,
	slocate@trakker.ca, linux-kernel@vger.kernel.org
Subject: Re: solving(?) the updatedb problem w/ the kernel cache
Date: Fri, 27 Jul 2007 10:22:19 -0700
Message-ID: <20070727172219.GO3357@trakker.ca>
In-Reply-To: <2c0942db0707270942k784890a2ld44d1312dde02379@mail.gmail.com>

On Fri, Jul 27, 2007 at 09:42:27AM -0700, Ray Lee wrote:

> On 7/27/07, Douglas J Hunley <doug@hunley.homeip.net> wrote:
> > I've been following lkml for a little while (not understanding it all, but
> > following nonetheless <g>) and I've noticed that in a lot of the talks about
> > schedulers, elevators, and performance, the issue of running updatedb and its
> > effects on the kernel's fs cache seems to recur. I've also yet to see anyone
> > present a solution that others think is worth pursuing. I'm curious why we're
> > trying to solve the problem, when we can simply avoid the problem to begin
> > with by making use of inotify and introducing a new user-space
> > daemon, 'located'.
> 
> inotify doesn't scale for lots of directories. I have about 18,000
> directories under ~ on my laptop, and that's with a few source trees
> that I use infrequently tarballed up.
> 
> But yes, if we had a full filesystem events notifier, then we could
> just toss updatedb aside and have the benefit of a live index into the
> system. It's been suggested before, at least by me. Other projects
> want this as well, such as an on-demand virus scanner, or a live
> backup to another site, or beagle/tracker who would like to index
> documents on the fly. beagled already uses inotify, I think, but as it
> takes over my system (in a bad way) whenever I tried to run it, I had
> no choice but to remove it.
> 
> Perhaps it was choking on the 18k subdirectories, dunno.

The inotify interface requires you to explicitly watch each file and folder.
Like Ray, I am skeptical that using inotify to watch 18k directories is
very efficient, although it would be nice to be wrong. Perhaps someone
needs to take a peek at OS X's fsevents mechanism?
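
To put a number on the concern above: inotify watches are not recursive, so
a daemon would need roughly one watch per directory it wants to cover. The
following is only a rough sketch for estimating that count. The function
names are hypothetical, and it assumes the per-user limit is exposed at
/proc/sys/fs/inotify/max_user_watches on the running kernel.

```python
import os

# Assumed location of the per-user inotify watch limit on Linux.
MAX_WATCHES_PATH = "/proc/sys/fs/inotify/max_user_watches"


def count_directories(root):
    """Count root plus every directory beneath it.

    inotify watches are per-directory, so this approximates the number
    of watches needed to cover the whole tree. Symlinks are not followed.
    """
    total = 0
    for _dirpath, _dirnames, _filenames in os.walk(root):
        total += 1
    return total


def read_watch_limit(path=MAX_WATCHES_PATH):
    """Return the per-user watch limit, or None if it cannot be read."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    home = os.path.expanduser("~")
    needed = count_directories(home)
    limit = read_watch_limit()
    print(f"directories under {home}: {needed}")
    print(f"inotify watch limit: {limit}")
```

Comparing the two numbers shows whether a tree like Ray's would even fit
under the limit, before any question of efficiency comes up.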

The other problem with this approach is that the locate DB uses incremental
encoding. Each change to the filesystem may require re-encoding large portions
of the database, or possibly all of it. An alternative DB format would need to
be considered.
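
To illustrate why incremental encoding makes in-place updates awkward, here
is a simplified sketch of the idea. It is not the actual on-disk locate
format, and the function names are made up for the example. Each entry
stores only how many leading characters it shares with the previous path,
plus the remaining suffix.

```python
import os


def front_encode(paths):
    """Encode paths, in sorted order, as (shared_prefix_len, suffix) pairs."""
    encoded = []
    prev = ""
    for path in sorted(paths):
        # Number of leading characters shared with the previous entry.
        shared = len(os.path.commonprefix([prev, path]))
        encoded.append((shared, path[shared:]))
        prev = path
    return encoded


def front_decode(encoded):
    """Rebuild the full paths; each entry depends on the one before it."""
    paths = []
    prev = ""
    for shared, suffix in encoded:
        path = prev[:shared] + suffix
        paths.append(path)
        prev = path
    return paths


if __name__ == "__main__":
    paths = ["/usr/bin/awk", "/usr/bin/sed", "/usr/lib/libc.so"]
    for entry in front_encode(paths):
        print(entry)
```

Because every entry is expressed relative to its predecessor, an entry
cannot be decoded on its own, and inserting or removing a path changes the
encoding of the entry that follows it. In a flat file, everything after
that point then has to be rewritten.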

Kevin
