linux-fsdevel.vger.kernel.org archive mirror
From: "Jon Smirl" <jonsmirl@gmail.com>
To: "Chuck Lever" <chuck.lever@oracle.com>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: Beagle and logging inotify events
Date: Wed, 14 Nov 2007 08:44:11 -0500
Message-ID: <9e4733910711140544l3f311868n96d753ce0b70cee5@mail.gmail.com>
In-Reply-To: <45578746-916A-4F59-9A92-E7CEEFBC09B0@oracle.com>

On 11/14/07, Chuck Lever <chuck.lever@oracle.com> wrote:
> On Nov 13, 2007, at 7:04 PM, Jon Smirl wrote:
> > Is it feasible to do something like this in the linux file system
> > architecture?
> >
> > Beagle beats on my disk for an hour when I reboot. Of course I don't
> > like that and I shut Beagle off.
>
> Leopard, by the way, does exactly this: it has a daemon that starts
> at boot time and taps FSEvents then journals file system changes to a
> well-known file on local disk.

Logging file systems already have all of the needed info, and they know
what is going on with rollback/replay after a crash. How about a fs API
where Beagle holds a token for a checkpoint and can then ask for a
re-creation of the inotify events from that point forward? The file
system can always say "I can't do that" and trigger a full rebuild by
Beagle. Daemons that aren't coordinated with the file system have a
window during crash/reboot where they can get confused.
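
To make the checkpoint idea concrete, here is a rough user-space sketch in
Python. It is a toy model only: no such kernel interface exists, and every
name in it (ChangeJournal, JournalGap, checkpoint, replay, catch_up) is
made up for illustration. The bounded in-memory queue stands in for
whatever log the file system keeps, and the event masks only borrow their
names from inotify.

```python
from collections import deque
from dataclasses import dataclass


class JournalGap(Exception):
    """The journal can no longer replay from the requested checkpoint."""


@dataclass(frozen=True)
class Event:
    seq: int   # monotonically increasing sequence number
    mask: str  # e.g. "IN_MODIFY"; name borrowed from inotify
    path: str


class ChangeJournal:
    """Toy stand-in for a file system's change log (hypothetical API)."""

    def __init__(self, capacity):
        # Bounded history: the oldest entries fall off when it is full.
        self._events = deque(maxlen=capacity)
        self._next_seq = 0

    def record(self, mask, path):
        self._events.append(Event(self._next_seq, mask, path))
        self._next_seq += 1

    def checkpoint(self):
        """Return a token meaning 'everything before this has been seen'."""
        return self._next_seq

    def replay(self, token):
        """Return events with seq >= token, or raise JournalGap."""
        oldest = self._events[0].seq if self._events else self._next_seq
        if token < oldest or token > self._next_seq:
            # History was discarded (or the token is bogus): "I can't do that".
            raise JournalGap(token)
        return [e for e in self._events if e.seq >= token]


def catch_up(journal, token, full_rescan, apply_event):
    """Bring an index up to date and return the new checkpoint token."""
    try:
        events = journal.replay(token)
    except JournalGap:
        full_rescan()  # fall back to the expensive path
    else:
        for event in events:
            apply_event(event)
    return journal.checkpoint()
```

The indexer would store the returned token alongside its index and pass it
back in on the next start. The sketch is single-threaded; a real design
would also have to say what happens to events that arrive while a rescan
is running.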

Without low-level support like this, Beagle is forced to do a rescan on
every boot. Since I crash my machine all the time, the disk load from
rebooting is intolerable and I turn Beagle off. Even just turning the
machine on in the morning generates an annoyingly large load on the
disk.



>
> I don't see why this couldn't be done on Linux as well.
>
> > ---------- Forwarded message ----------
> > From: Jon Smirl <jonsmirl@gmail.com>
> > Date: Nov 13, 2007 4:44 PM
> > Subject: Re: Strange "beagle" interaction..
> > To: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: "J. Bruce Fields" <bfields@fieldses.org>, Junio C Hamano
> > <gitster@pobox.com>, Git Mailing List <git@vger.kernel.org>, Johannes
> > Schindelin <Johannes.Schindelin@gmx.de>
> >
> >
> > On 11/13/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> >>
> >>
> >> On Tue, 13 Nov 2007, J. Bruce Fields wrote:
> >>>
> >>> Last I ran across this, I believe I found it was adding extended
> >>> attributes to the file.
> >>
> >> Yeah, I just straced it and found the same thing. It's saving
> >> fingerprints and mtimes to files in the extended attributes.
> >
> > Things like Beagle need a guaranteed log of global inotify events.
> > That would let them efficiently find changes made since the last time
> > they updated their index.
> >
> > Right now every time Beagle starts it hasn't got a clue what has
> > changed in the file system since it was last run. This forces Beagle
> > to rescan the entire filesystem every time it is started. The xattrs
> > are used as cache to reduce this load somewhat.
> >
> > A better solution would be for the kernel to log inotify events to
> > disk in a manner that survives reboots. When Beagle starts it would
> > locate its last checkpoint and then process the logged inotify events
> > from that time forward. This inotify logging needs to be bullet proof
> > or it will mess up your Beagle index.
> >
> > Logged file systems already contain the logged inotify data (in their
> > own internal form). There's just no universal API for retrieving it in
> > a file system independent manner.
> >
> >>
> >>> Yeah, I just turned off beagle.  It looked to me like it was doing
> >>> something wrongheaded.
> >>
> >> Gaah. The problem is, setting xattrs does actually change ctime. Which
> >> means that if we want to make git play nice with beagle, I guess we have
> >> to just remove the comparison of ctime.
> >>
> >> Oh, well. Git doesn't *require* it, but I like the notion of checking the
> >> inode really really carefully. But it looks like it may not be an option,
> >> because of file indexers hiding stuff behind our backs.
> >>
> >> Or we could just tell people not to run beagle on their git trees, but I
> >> suspect some people will actually *want* to. Even if it flushes their disk
> >> caches.
> >>
> >>                 Linus
> >>
> >
> >
> > --
> > Jon Smirl
> > jonsmirl@gmail.com
> >
> >
>
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
>
>
>
>


-- 
Jon Smirl
jonsmirl@gmail.com
