FS callback routines

public inbox for linux-kernel@vger.kernel.org

* FS callback routines
@ 2001-01-09  1:21 Sean R. Bright
  2001-01-09 11:22 ` Philipp Matthias Hahn
  2001-01-09 11:34 ` Daniel Stodden
  0 siblings, 2 replies; 18+ messages in thread
From: Sean R. Bright @ 2001-01-09  1:21 UTC (permalink / raw)
  To: linux-kernel


	Ok, before I begin, don't shoot me down, but I had an idea for a kernel
modification and was wondering how feasible the group thought it was.

	I was writing a user space application to monitor a folder's contents.  The
folder itself contained 100 folders, and each of those contained 24 folders.
While writing the code to traverse the directory structure I realized that
instead of my software figuring out when things change, why not just have
the fs tell my application when something was updated.  For example, say we
had a function called watch_fs(), that took an inode reference and a
function pointer and maybe a bitmask of events to watch for.  When that
inode (or its children) were changed, why couldn't the fs code call the
callback function I specified?
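
To make the idea concrete, here is a user-space mock-up of what such an interface might look like. Everything in it is hypothetical: watch_fs() does not exist, the WATCH_* bits and fs_event() are invented for illustration, and a real kernel could not call a user-space function pointer directly, so it would have to deliver some kind of event instead.

```c
#include <stddef.h>

/* Hypothetical event bits; the names are illustrative only. */
enum { WATCH_MODIFY = 1u << 0, WATCH_CREATE = 1u << 1, WATCH_DELETE = 1u << 2 };

typedef void (*watch_cb)(unsigned long ino, unsigned event);

struct watch { unsigned long ino; unsigned mask; watch_cb cb; };

#define MAX_WATCHES 16
static struct watch watches[MAX_WATCHES];
static size_t nwatches;

/* Register cb for the events in mask on inode ino.
 * Returns 0 on success, -1 if the table is full or cb is NULL. */
int watch_fs(unsigned long ino, unsigned mask, watch_cb cb)
{
    if (nwatches == MAX_WATCHES || cb == NULL)
        return -1;
    watches[nwatches++] = (struct watch){ ino, mask, cb };
    return 0;
}

/* Stand-in for the fs side: invoke every callback whose inode and
 * mask match the event. Returns the number of callbacks invoked. */
int fs_event(unsigned long ino, unsigned event)
{
    int called = 0;
    for (size_t i = 0; i < nwatches; i++) {
        if (watches[i].ino == ino && (watches[i].mask & event)) {
            watches[i].cb(ino, event);
            called++;
        }
    }
    return called;
}

/* Example callback: just counts how often it was invoked. */
int hits;
void count_hit(unsigned long ino, unsigned event)
{
    (void)ino;
    (void)event;
    hits++;
}
```

The cost in this toy is one linear scan per event; the "or its children" part of the proposal is not modelled here at all.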

	I have no idea how expensive this would be or if it's even worth it at this
point.  It also wouldn't be portable at all considering that I know of no
other OS that does this (could be wrong).

	Like I said, I am not asking that this be (necessarily) implemented, I am
just curious as to what the perceived performance ramifications would be if
it were to be implemented, say, by a virgin kernel developer ;)

	Thanks,
	Sean
	elixer@erols.com



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: FS callback routines
  2001-01-09  1:21 FS callback routines Sean R. Bright
@ 2001-01-09 11:22 ` Philipp Matthias Hahn
  2001-01-09 11:34 ` Daniel Stodden
  1 sibling, 0 replies; 18+ messages in thread
From: Philipp Matthias Hahn @ 2001-01-09 11:22 UTC (permalink / raw)
  To: Sean R. Bright; +Cc: Linux Kernel Mailing List

On Mon, 8 Jan 2001, Sean R. Bright wrote:

> I was writing a user space application to monitor a folder's contents.  The
> folder itself contained 100 folders, and each of those contained 24 folders.
> While writing the code to traverse the directory structure I realized that
> instead of my software figuring out when things change, why not just have
> the fs tell my application when something was updated.  For example, say we
> had a function called watch_fs(), that took an inode reference and a
> function pointer and maybe a bitmask of events to watch for.  When that
> inode (or its children) were changed, why couldn't the fs code call the
> callback function I specified?
RTFM: linux-2.4.0/Documentation/dnotify.txt
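
For readers without the tree at hand, a minimal sketch of how a directory can be registered with dnotify through fcntl(). The helper name watch_directory() and the choice of SIGRTMIN + 1 are assumptions made for this example; F_NOTIFY, F_SETSIG and the DN_* flags are the Linux-specific pieces and need _GNU_SOURCE.

```c
#define _GNU_SOURCE   /* F_NOTIFY, F_SETSIG and the DN_* flags are Linux-specific */
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* File descriptor of the directory that last triggered a notification. */
static volatile sig_atomic_t last_event_fd = -1;

static void on_dnotify(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    last_event_fd = si->si_fd;   /* dnotify reports the directory fd only */
}

/* Open dir and ask for the given DN_* events on it.
 * Returns the directory fd (keep it open while watching) or -1. */
int watch_directory(const char *dir, unsigned long events)
{
    struct sigaction act;
    int fd;

    memset(&act, 0, sizeof act);
    act.sa_sigaction = on_dnotify;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_SIGINFO;
    if (sigaction(SIGRTMIN + 1, &act, NULL) < 0)
        return -1;

    fd = open(dir, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Use a real-time signal so si_fd is filled in, and keep the
     * registration alive after the first event with DN_MULTISHOT. */
    if (fcntl(fd, F_SETSIG, SIGRTMIN + 1) < 0 ||
        fcntl(fd, F_NOTIFY, events | DN_MULTISHOT) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would typically do something like watch_directory(".", DN_MODIFY | DN_CREATE) and then sleep in pause(), rescanning the directory when the signal arrives, since the notification says which directory changed but not which file.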

BYtE
Philipp
-- 
  / /  (_)__  __ ____  __ Philipp Hahn
 / /__/ / _ \/ // /\ \/ /
/____/_/_//_/\_,_/ /_/\_\ pmhahn@titan.lahn.de


* Re: FS callback routines
  2001-01-09  1:21 FS callback routines Sean R. Bright
  2001-01-09 11:22 ` Philipp Matthias Hahn
@ 2001-01-09 11:34 ` Daniel Stodden
  1 sibling, 0 replies; 18+ messages in thread
From: Daniel Stodden @ 2001-01-09 11:34 UTC (permalink / raw)
  To: elixer; +Cc: linux-kernel

"Sean R. Bright" <elixer@erols.com> writes:

> 	Ok, before I begin, don't shoot me down, but I had an idea for a kernel
> modification and was wondering how feasible the group thought it was.
> 
> 	I was writing a user space application to monitor a folder's contents.  The
> folder itself contained 100 folders, and each of those contained 24 folders.
> While writing the code to traverse the directory structure I realized that
> instead of my software figuring out when things change, why not just have
> the fs tell my application when something was updated.  For example, say we
> had a function called watch_fs(), that took an inode reference and a
> function pointer and maybe a bitmask of events to watch for.  When that
> inode (or its children) were changed, why couldn't the fs code call the
> callback function I specified?
> 
> 	I have no idea how expensive this would be or if it's even worth it at this
> point.  It also wouldn't be portable at all considering that I know of no
> other OS that does this (could be wrong).
> 
> 	Like I said, I am not asking that this be (necessarily) implemented, I am
> just curious as to what the perceived performance ramifications would be if
> it were to be implemented, say, by a virgin kernel developer ;)

you want to have a look at

http://oss.sgi.com/projects/fam/

resp. imon, the corresponding kernel modules. 

this has been around for quite some time now. enlightenment has
been/still is? using it since the earliest incarnations of its file
manager extension efm. (same with kde? not sure..)

i'm wondering whether this could get into the mainstream kernels soon?
i'm not really deep in the filesystem layers, but this sounds to me
like an extremely useful feature.

could anyone comment on section 2 of
http://oss.sgi.com/projects/fam/imon.txt ? would this actually be the
way to do it or is there any better method?


regards,
dns

-- 
___________________________________________________________________________
 mailto:stodden@in.tum.de

* Re: FS callback routines
@ 2001-01-11 16:39 Jesse Pollard
  2001-01-11 17:53 ` Daniel Phillips
  0 siblings, 1 reply; 18+ messages in thread
From: Jesse Pollard @ 2001-01-11 16:39 UTC (permalink / raw)
  To: phillips, Jamie Lokier, linux-kernel

Daniel Phillips <phillips@innominate.de>:
> Jamie Lokier wrote:
> > 
> > Daniel Phillips wrote:
> > >         DN_OPEN       A file in the directory was opened
> > >
> > > You open the top level directory and register for events.  When somebody
> > > opens a subdirectory of the top level directory, you receive
> > > notification and register for events on the subdirectory, and so on,
> > > down to the file that is actually modified.
> > 
> > If it worked, and I'm not sure the timing would be reliable enough, the
> > daemon would only have to have open every directory being accessed by
> > every program in the system.  Hmm.  Seems like overkill when you're only
> > interested in files that are being modified.
> 
> It gets to close some too.  Normally just the directories in the path to
> the file(s) being modified would be open.
> 
> Good point about the timing.  A directory should not disappear before an
> in-flight notification has been serviced.  I doubt the current scheme
> enforces this.  There is no more room for 'works most of the time' in
> this than there is in our memory page handling.
> 
> > It would be much, much more reliable to do a walk over d_parent in
> > dnotify.c.  Your idea is a nice way to flag kernel dentries such that
> > you don't do d_parent walks unnecessarily.
> 
> It's bottom-up vs top-down.  It's worth analyzing the top-down approach
> a little more, it does solve a lot of problems (and creates some as you
> pointed out, or at least makes some existing problems more obvious). 
> For make it's really quite nice.  The make daemon only needs to register
> in the top level directory of the source tree.  I think this solves the
> hard link problem too, because each path that's interested in
> notification will receive it.

It makes security checks impossible though. You would have to reboot
the system every time a directory changes permission to block unauthorized
monitoring of files that are no longer accessible by the user.
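
For comparison, the bottom-up alternative quoted above (a walk over d_parent) can be modelled in a few lines. This is a toy sketch only: struct node is an invented stand-in for a dentry, not the kernel's structure, and the masks are arbitrary bit values.

```c
#include <stddef.h>

/* Toy stand-in for a kernel dentry: only the parent link and a watch mask. */
struct node {
    struct node *parent;   /* NULL at the root; plays the role of d_parent */
    unsigned watch_mask;   /* events someone registered for on this directory */
    unsigned hits;         /* notifications delivered here (for the demo) */
};

/* Bottom-up delivery: start at the directory holding the changed file
 * and notify every ancestor whose mask matches the event.
 * Returns the number of directories notified. */
int notify_ancestors(struct node *dir, unsigned event)
{
    int notified = 0;
    for (struct node *n = dir; n != NULL; n = n->parent) {
        if (n->watch_mask & event) {
            n->hits++;      /* a permission recheck could go here */
            notified++;
        }
    }
    return notified;
}
```

In this model the check happens at delivery time on every matching ancestor, which is the place where access rights could be re-evaluated, at the price of doing the walk on each event.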

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: FS callback routines
  2001-01-11 16:39 Jesse Pollard
@ 2001-01-11 17:53 ` Daniel Phillips
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Phillips @ 2001-01-11 17:53 UTC (permalink / raw)
  To: Jesse Pollard, linux-kernel

Jesse Pollard wrote:
> 
> Daniel Phillips <phillips@innominate.de>:
> > Jamie Lokier wrote:
> > >
> > > Daniel Phillips wrote:
> > > >         DN_OPEN       A file in the directory was opened
> > > >
> > > > You open the top level directory and register for events.  When somebody
> > > > opens a subdirectory of the top level directory, you receive
> > > > notification and register for events on the subdirectory, and so on,
> > > > down to the file that is actually modified.
> > >
> > > If it worked, and I'm not sure the timing would be reliable enough, the
> > > daemon would only have to have open every directory being accessed by
> > > every program in the system.  Hmm.  Seems like overkill when you're only
> > > interested in files that are being modified.
> >
> > It gets to close some too.  Normally just the directories in the path to
> > the file(s) being modified would be open.
> >
> > Good point about the timing.  A directory should not disappear before an
> > in-flight notification has been serviced.  I doubt the current scheme
> > enforces this.  There is no more room for 'works most of the time' in
> > this than there is in our memory page handling.
> >
> > > It would be much, much more reliable to do a walk over d_parent in
> > > dnotify.c.  Your idea is a nice way to flag kernel dentries such that
> > > you don't do d_parent walks unnecessarily.
> >
> > It's bottom-up vs top-down.  It's worth analyzing the top-down approach
> > a little more, it does solve a lot of problems (and creates some as you
> > pointed out, or at least makes some existing problems more obvious).
> > For make it's really quite nice.  The make daemon only needs to register
> > in the top level directory of the source tree.  I think this solves the
> > hard link problem too, because each path that's interested in
> > notification will receive it.
> 
> It makes security checks impossible though. You would have to reboot
> the system every time a directory changes permission to block unauthorized
> monitoring of files that are no longer accessible by the user.

Heh.  *No reboots*.  At worst you would have to kill, but I don't see
what is impossible about this.  It's not worse than the current
situation, which is just to check permissions on open and trust they
don't change.  That is not a reason to give up and accept the status
quo:

In a separate thread (Re: Subtle MM bug) "Albert D. Cahalan" wrote:
> 
> Credentials could be changed on syscall exit. It is a bit like
> doing signals I think, with less overhead than making userspace
> muck around with signal handlers and synchronization crud.

IOW, I don't think this notification method makes things worse for
security.  In fact, it could have important benefits for security.  How
about a security daemon that gets notified every time a file is changed
and sounds alarms when it doesn't make sense?

--
Daniel

* Re: FS callback routines
@ 2001-01-10 15:18 Jesse Pollard
  0 siblings, 0 replies; 18+ messages in thread
From: Jesse Pollard @ 2001-01-10 15:18 UTC (permalink / raw)
  To: phillips, Jesse Pollard, linux-kernel

---------  Received message begins Here  ---------

> 
> Jesse Pollard wrote:
> > Daniel Phillips <phillips@innominate.de>:
> > > This may be the most significant new feature in 2.4.0, as it allows us
> > > to take a fundamentally different approach to many different problems.
> > > Three that come to mind: mail (get your mail instantly without polling);
> > > make (don't rely on timestamps to know when rebuilding is needed, don't
> > > scan huge directory trees on each build); locate (reindex only those
> > > directories that have changed, keep index database current).  As you
> > > noticed, there are many others.
> > > ...
> > 
> > It would also be very nice if the security of the feature could be
> > confirmed. The problem with SGI's implementation is that it becomes
> > possible to monitor files that you don't own, don't have access to,
> > or are not permitted to know even exist.
> 
> To receive notification about events in a given directory you have to be
> able to open it.  Is this adequate for your needs?

It depends on the implementation - One problem is that you may be able
to open the directory at the time you start monitoring, then the permission
is removed. Most implementations do not recheck access rights on each
notification (fair amount of overhead).

My belief is (and I could be wrong) that most such callbacks are done by
placing a watch list on/for a device:inode identifier. When activity on
a matching device:inode occurs, the matching callback is invoked. No path name
access rights rechecking is performed since the scan of the path could
easily overwhelm the operation being performed on the file.
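
A rough user-space illustration of that kind of watch list, assuming nothing about any real implementation: the names add_watch() and is_watched() are made up, and stat() stands in for the one-time path lookup.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

struct watch_key { dev_t dev; ino_t ino; };

#define MAX_KEYS 32
static struct watch_key watch_list[MAX_KEYS];
static size_t watch_count;

/* Resolve the path once and remember its device:inode pair.
 * This is the only point where path permissions come into play.
 * Returns 0 on success, -1 if the list is full or the lookup fails. */
int add_watch(const char *path)
{
    struct stat st;
    if (watch_count == MAX_KEYS || stat(path, &st) < 0)
        return -1;
    watch_list[watch_count].dev = st.st_dev;
    watch_list[watch_count].ino = st.st_ino;
    watch_count++;
    return 0;
}

/* Per-operation check: a plain identifier comparison,
 * with no path walk and no access recheck. */
int is_watched(dev_t dev, ino_t ino)
{
    for (size_t i = 0; i < watch_count; i++)
        if (watch_list[i].dev == dev && watch_list[i].ino == ino)
            return 1;
    return 0;
}
```

Nothing in is_watched() knows who registered the watch or whether they could still reach the file, which is exactly the gap described above.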

The only guard would be the path scan done at the time the callback is
established. The justification is that "It's just like opening a file, the
access rights are checked on open". A callback is not an open, UNLESS callbacks
can only be placed on open file id's.

In SGI's case, the file alteration monitor (a daemon) performs this activity
while running as root, and providing RPC access to remote systems. This RPC
is unauthenticated, permitting non-local users to track files that exist on
local hosts. Since RPC cannot verify the identity of the remote user, it
permits tracking of ANY file on the system (via RPC spoofing of user identity
in the RPC call).

This was determined to be A Bad Thing, and not allowed.

> > For these reasons, we have disabled the feature.
> 
> It's nice to have that option, isn't it? ;-)

It would be if it could be done in a secure manner.

Also useful (after callbacks work): an extension to callbacks to allow
catching file read/write/seek/truncate/ioctl actions, and being able to
perform actions on behalf of the owner of the file (as an implementation
of the original action). This might require the user to provide a "daemon"
to monitor the file, or to be activated by a callback established by the
owner of the file. (I know, there is a lot to consider when implementing
something like this - process environment, what executable is to be run
to do this, the interface to getting the users request/returning results..)

Use: ability to provide customized versions of the file based on the identity
of the user that opened the file - fields could be hidden/generated, queries
could be passed (via ioctl). This could provide a simple way to implement
a small data manager, but without the overhead (ie $$$) for a data base system
or the complexity of establishing a data base style client-server.

I do think such an extension should be permitted/denied based on a capability
though.

Ahh well, rambling ideas....

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: FS callback routines
@ 2001-01-09 14:05 Jesse Pollard
  2001-01-09 15:41 ` Daniel Phillips
  0 siblings, 1 reply; 18+ messages in thread
From: Jesse Pollard @ 2001-01-09 14:05 UTC (permalink / raw)
  To: phillips, Michael D. Crawford, Stephen Rothwell, linux-kernel

Daniel Phillips <phillips@innominate.de>:
> "Michael D. Crawford" wrote:
> > 
> > Regarding notification when there's a change to the filesystem:
> > 
> > This is one of the most significant things about the BeOS BFS filesystem, and
> > something I'd dearly love to see Linux adopt.  It makes an app very efficient,
> > you just get notified when a directory changes and you never waste time polling.
> > 
> > I think it would require changes to the VFS layer, not just to the filesystems,
> > because this is a concept POSIX filesystems do not presently possess.
> > 
> > The other is indexed filesystem attributes, for example a file can have its
> > mimetype in the filesystem, and any application can add an attribute and have it
> > indexed.
> > 
> > There's a method to do boolean queries on indexed attributes, and you can find
> > files in an entire filesystem that match a query in a blazingly short time, much
> > faster than walking the directory tree.
> > 
> > If you want to try out the BeOS, there's a free-as-in-beer version at
> > http://free.be.com for Pentium PC's.  You can also purchase a version that comes
> > for both PC's and certain PowerPC macs.
> > 
> > There are read-only versions of this for Linux which I believe are under the
> > GPL.  The original author is here:
> > 
> > http://hp.vector.co.jp/authors/VA008030/bfs/
> > 
> > He refers you to here to get a version that works under 2.2.16:
> > 
> > http://milosch.net/beos/
> > 
> > The author's intention was to take it read-write, but it's complex because it is
> > a journaling filesystem.
> > 
> > Daniel Berlin, a BeOS developer modified the Linux BFS driver so it works with
> > 2.4.0-test1.  I don't know if it works with 2.4.0.  The web site where it used
> > to be posted isn't there anymore, and the laptop where I had it is in for
> > repair.  I may have it on a backup, and I'll see if I can track Daniel down.
> > 
> > While Be, Inc.'s implementation is closed-source, the design of the BFS (_not_
> > "befs" as it is sometimes called) is explained in Practical File System Design
> > with the Be File System by Dominic Giampaolo, ISBN 1-55860-497-9.  Dominic has
> > since left Be and I understand works at Google now.
> 
> fs/dnotify.c:
> 
>    /*
>     * Directory notifications for Linux.
>     *
>     * Copyright (C) 2000 Stephen Rothwell
>     ...
> 
> The currently defined events are:
> 
> 	DN_ACCESS	A file in the directory was accessed (read)
> 	DN_MODIFY	A file in the directory was modified (write,truncate)
> 	DN_CREATE	A file was created in the directory
> 	DN_DELETE	A file was unlinked from directory
> 	DN_RENAME	A file in the directory was renamed
> 	DN_ATTRIB	A file in the directory had its attributes
> 			changed (chmod,chown)
> 
> It was done last year, quietly and without fanfare, by Stephen Rothwell:
> 
>   http://www.linuxcare.com/about-us/os-dev/rothwell.epl
> 
> This may be the most significant new feature in 2.4.0, as it allows us
> to take a fundamentally different approach to many different problems. 
> Three that come to mind: mail (get your mail instantly without polling);
> make (don't rely on timestamps to know when rebuilding is needed, don't
> scan huge directory trees on each build); locate (reindex only those
> directories that have changed, keep index database current).  As you
> noticed, there are many others.
> 
> Stephen, it would be very interesting to know more about the development
> process you went through and what motivated you to provide this
> fundamental facility.

It would also be very nice if the security of the feature could be
confirmed. The problem with SGI's implementation is that it becomes
possible to monitor files that you don't own, don't have access to,
or are not permitted to know even exist. For these reasons, we have
disabled the feature.

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: FS callback routines
  2001-01-09 14:05 Jesse Pollard
@ 2001-01-09 15:41 ` Daniel Phillips
  2001-01-10 10:48   ` Jamie Lokier
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Phillips @ 2001-01-09 15:41 UTC (permalink / raw)
  To: Jesse Pollard, linux-kernel

Jesse Pollard wrote:
> Daniel Phillips <phillips@innominate.de>:
> > This may be the most significant new feature in 2.4.0, as it allows us
> > to take a fundamentally different approach to many different problems.
> > Three that come to mind: mail (get your mail instantly without polling);
> > make (don't rely on timestamps to know when rebuilding is needed, don't
> > scan huge directory trees on each build); locate (reindex only those
> > directories that have changed, keep index database current).  As you
> > noticed, there are many others.
> > ...
> 
> It would also be very nice if the security of the feature could be
> confirmed. The problem with SGI's implementation is that it becomes
> possible to monitor files that you don't own, don't have access to,
> or are not permitted to know even exist.

To receive notification about events in a given directory you have to be
able to open it.  Is this adequate for your needs?

> For these reasons, we have disabled the feature.

It's nice to have that option, isn't it? ;-)

--
Daniel

* Re: FS callback routines
  2001-01-09 15:41 ` Daniel Phillips
@ 2001-01-10 10:48   ` Jamie Lokier
  0 siblings, 0 replies; 18+ messages in thread
From: Jamie Lokier @ 2001-01-10 10:48 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Jesse Pollard, linux-kernel

Daniel Phillips wrote:
> > It would also be very nice if the security of the feature could be
> > confirmed. The problem with SGI's implementation is that it becomes
> > possible to monitor files that you don't own, don't have access to,
> > or are not permitted to know even exist.
> 
> To receive notification about events in a given directory you have to be
> able to open it.  Is this adequate for your needs?

No, because to open a directory you only need read permission, whereas
to read attributes of files in the directory, you need execute
permission on the directory.
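
The distinction can be spelled out with the mode bits. The helpers below are illustrative names, and they deliberately look at the owner bits only, ignoring group/other bits, root, and anything like ACLs.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Owner may open the directory and list names: needs the read bit. */
int owner_can_list(mode_t dir_mode)
{
    return (dir_mode & S_IRUSR) != 0;
}

/* Owner may look up entries and read their attributes:
 * needs the search (execute) bit. */
int owner_can_stat_entries(mode_t dir_mode)
{
    return (dir_mode & S_IXUSR) != 0;
}

/* The gap: the directory can be opened (and so watched),
 * yet the attributes of the files inside are not readable. */
int watchable_but_opaque(mode_t dir_mode)
{
    return owner_can_list(dir_mode) && !owner_can_stat_entries(dir_mode);
}
```

A directory with mode 0400 is the case in question: it can be opened for notification, but a stat() on anything inside it would be refused.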

Also, you are getting notifications for unlinked files, which perhaps
you should not be able to know anything about.  (If the directory wasn't
accessible when the file was unlinked for example, but was made
accessible later).

-- Jamie

* Re: FS callback routines
@ 2001-01-08 23:12 Michael D. Crawford
  2001-01-09  2:37 ` Sean R. Bright
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Michael D. Crawford @ 2001-01-08 23:12 UTC (permalink / raw)
  To: linux-kernel

Regarding notification when there's a change to the filesystem:

This is one of the most significant things about the BeOS BFS filesystem, and
something I'd dearly love to see Linux adopt.  It makes an app very efficient,
you just get notified when a directory changes and you never waste time polling.

I think it would require changes to the VFS layer, not just to the filesystems,
because this is a concept POSIX filesystems do not presently possess.

The other is indexed filesystem attributes, for example a file can have its
mimetype in the filesystem, and any application can add an attribute and have it
indexed.

There's a method to do boolean queries on indexed attributes, and you can find
files in an entire filesystem that match a query in a blazingly short time, much
faster than walking the directory tree.

If you want to try out the BeOS, there's a free-as-in-beer version at
http://free.be.com for Pentium PC's.  You can also purchase a version that comes
for both PC's and certain PowerPC macs.

There are read-only versions of this for Linux which I believe are under the
GPL.  The original author is here:

http://hp.vector.co.jp/authors/VA008030/bfs/

He refers you to here to get a version that works under 2.2.16:

http://milosch.net/beos/

The author's intention was to take it read-write, but it's complex because it is
a journaling filesystem.

Daniel Berlin, a BeOS developer modified the Linux BFS driver so it works with
2.4.0-test1.  I don't know if it works with 2.4.0.  The web site where it used
to be posted isn't there anymore, and the laptop where I had it is in for
repair.  I may have it on a backup, and I'll see if I can track Daniel down.

While Be, Inc.'s implementation is closed-source, the design of the BFS (_not_
"befs" as it is sometimes called) is explained in Practical File System Design
with the Be File System by Dominic Giampaolo, ISBN 1-55860-497-9.  Dominic has
since left Be and I understand works at Google now.

-- 
Michael D. Crawford
GoingWare Inc. - Expert Software Development and Consulting
http://www.goingware.com/
crawford@goingware.com

   Tilting at Windmills for a Better Tomorrow.

* RE: FS callback routines
  2001-01-08 23:12 Michael D. Crawford
@ 2001-01-09  2:37 ` Sean R. Bright
  2001-01-09  3:48 ` David Weinehall
  2001-01-09 13:07 ` Daniel Phillips
  2 siblings, 0 replies; 18+ messages in thread
From: Sean R. Bright @ 2001-01-09  2:37 UTC (permalink / raw)
  To: 'Michael D. Crawford'; +Cc: linux-kernel

Agreed.

I will have a look at the URLs you passed along.  I was talking to a
colleague about this just after I sent the initial message and the number of
places where this would be useful suddenly became much more apparent to me
:)  For example, _ANY_ daemon process could be notified of configuration
changes when they happen, mail servers/spoolers would have immediate access
to the locations of files, without tying up the disk with polling.  All in
all a very good idea (and it's not even mine anymore it would seem, so I can
say that! :))

Thanks again,
Sean

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org
> [mailto:linux-kernel-owner@vger.kernel.org]On Behalf Of Michael D.
> Crawford
> Sent: Monday, January 08, 2001 6:12 PM
> To: linux-kernel@vger.kernel.org
> Subject: Re: FS callback routines
>
>
> Regarding notification when there's a change to the filesystem:
>
> This is one of the most significant things about the BeOS BFS
> filesystem, and
> something I'd dearly love to see Linux adopt.  It makes an
> app very efficient,
> you just get notified when a directory changes and you never
> waste time polling.
>
> I think it would require changes to the VFS layer, not just
> to the filesystems,
> because this is a concept POSIX filesystems do not presently possess.
>
> The other is indexed filesystem attributes, for example a
> file can have its
> mimetype in the filesystem, and any application can add an
> attribute and have it
> indexed.
>
> There's a method to do boolean queries on indexed attributes,
> and you can find
> files in an entire filesystem that match a query in a
> blazingly short time, much
> faster than walking the directory tree.
>
> If you want to try out the BeOS, there's a free-as-in-beer version at
> http://free.be.com for Pentium PC's.  You can also purchase a
> version that comes
> for both PC's and certain PowerPC macs.
>
> There are read-only versions of this for Linux which I
> believe are under the
> GPL.  The original author is here:
>
> http://hp.vector.co.jp/authors/VA008030/bfs/
>
> He refers you to here to get a version that works under 2.2.16:
>
> http://milosch.net/beos/
>
> The author's intention was to take it read-write, but it's
> complex because it is
> a journaling filesystem.
>
> Daniel Berlin, a BeOS developer modified the Linux BFS driver
> so it works with
> 2.4.0-test1.  I don't know if it works with 2.4.0.  The web
> site where it used
> to be posted isn't there anymore, and the laptop where I had
> it is in for
> repair.  I may have it on a backup, and I'll see if I can
> track Daniel down.
>
> While Be, Inc.'s implementation is closed-source, the design
> of the BFS (_not_
> "befs" as it is sometimes called) is explained in Practical
> File System Design
> with the Be File System by Dominic Giampaolo, ISBN
> 1-55860-497-9.  Dominic has
> since left Be and I understand works at Google now.
>
>
> --
> Michael D. Crawford
> GoingWare Inc. - Expert Software Development and Consulting
> http://www.goingware.com/
> crawford@goingware.com
>
>    Tilting at Windmills for a Better Tomorrow.


* Re: FS callback routines
  2001-01-08 23:12 Michael D. Crawford
  2001-01-09  2:37 ` Sean R. Bright
@ 2001-01-09  3:48 ` David Weinehall
  2001-01-09 13:07 ` Daniel Phillips
  2 siblings, 0 replies; 18+ messages in thread
From: David Weinehall @ 2001-01-09  3:48 UTC (permalink / raw)
  To: Michael D. Crawford; +Cc: linux-kernel

On Mon, Jan 08, 2001 at 11:12:24PM +0000, Michael D. Crawford wrote:

[snipped a lot of sane opinions]
> While Be, Inc.'s implementation is closed-source, the design of the
> BFS (_not_ "befs" as it is sometimes called) is explained in Practical
> File System Design with the Be File System by Dominic Giampaolo, ISBN
> 1-55860-497-9.  Dominic has since left Be and I understand works at
> Google now.

The reason why BFS is often referred to as BeFS is that there is
another file-system, far older than Be's filesystem AFAIK, called BFS;
the SCO Unixware Boot File System, which is already supported in the
Linux-kernel. Hence the misnomer BeFS. I think we should keep it that
way to avoid confusion... After all, BeFS does indicate pretty well what
file-system we mean, and other alternatives, such as be_bfs, or renaming
SCO BFS to sco_bfs or similar feels awkward.


/David Weinehall
  _                                                                 _
 // David Weinehall <tao@acc.umu.se> /> Northern lights wander      \\
//  Project MCA Linux hacker        //  Dance across the winter sky //
\>  http://www.acc.umu.se/~tao/    </   Full colour fire           </

* Re: FS callback routines
  2001-01-08 23:12 Michael D. Crawford
  2001-01-09  2:37 ` Sean R. Bright
  2001-01-09  3:48 ` David Weinehall
@ 2001-01-09 13:07 ` Daniel Phillips
  2001-01-10 10:56   ` Jamie Lokier
  2 siblings, 1 reply; 18+ messages in thread
From: Daniel Phillips @ 2001-01-09 13:07 UTC (permalink / raw)
  To: Michael D. Crawford, Stephen Rothwell, linux-kernel

"Michael D. Crawford" wrote:
> 
> Regarding notification when there's a change to the filesystem:
> 
> This is one of the most significant things about the BeOS BFS filesystem, and
> something I'd dearly love to see Linux adopt.  It makes an app very efficient,
> you just get notified when a directory changes and you never waste time polling.
> 
> I think it would require changes to the VFS layer, not just to the filesystems,
> because this is a concept POSIX filesystems do not presently possess.
> 
> The other is indexed filesystem attributes, for example a file can have its
> mimetype in the filesystem, and any application can add an attribute and have it
> indexed.
> 
> There's a method to do boolean queries on indexed attributes, and you can find
> files in an entire filesystem that match a query in a blazingly short time, much
> faster than walking the directory tree.
> 
> If you want to try out the BeOS, there's a free-as-in-beer version at
> http://free.be.com for Pentium PC's.  You can also purchase a version that comes
> for both PC's and certain PowerPC macs.
> 
> There are read-only versions of this for Linux which I believe are under the
> GPL.  The original author is here:
> 
> http://hp.vector.co.jp/authors/VA008030/bfs/
> 
> He refers you to here to get a version that works under 2.2.16:
> 
> http://milosch.net/beos/
> 
> The author's intention was to take it read-write, but it's complex because it is
> a journaling filesystem.
> 
> Daniel Berlin, a BeOS developer modified the Linux BFS driver so it works with
> 2.4.0-test1.  I don't know if it works with 2.4.0.  The web site where it used
> to be posted isn't there anymore, and the laptop where I had it is in for
> repair.  I may have it on a backup, and I'll see if I can track Daniel down.
> 
> While Be, Inc.'s implementation is closed-source, the design of the BFS (_not_
> "befs" as it is sometimes called) is explained in Practical File System Design
> with the Be File System by Dominic Giampaolo, ISBN 1-55860-497-9.  Dominic has
> since left Be and I understand works at Google now.

fs/dnotify.c:

   /*
    * Directory notifications for Linux.
    *
    * Copyright (C) 2000 Stephen Rothwell
    ...

The currently defined events are:

	DN_ACCESS	A file in the directory was accessed (read)
	DN_MODIFY	A file in the directory was modified (write,truncate)
	DN_CREATE	A file was created in the directory
	DN_DELETE	A file was unlinked from directory
	DN_RENAME	A file in the directory was renamed
	DN_ATTRIB	A file in the directory had its attributes
			changed (chmod,chown)

It was done last year, quietly and without fanfare, by Stephen Rothwell:

  http://www.linuxcare.com/about-us/os-dev/rothwell.epl

This may be the most significant new feature in 2.4.0, as it allows us
to take a fundamentally different approach to many different problems. 
Three that come to mind: mail (get your mail instantly without polling);
make (don't rely on timestamps to know when rebuilding is needed, don't
scan huge directory trees on each build); locate (reindex only those
directories that have changed, keep index database current).  As you
noticed, there are many others.
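
For readers who have not used the interface, here is a minimal sketch of
how a program might register for the events listed above. It assumes a
Linux system with glibc, where F_NOTIFY, F_SETSIG and the DN_* flags are
exposed by <fcntl.h> under _GNU_SOURCE. The names watch_dir, on_dnotify,
changed_fd and WATCH_SIGNAL are made up for this illustration and are not
part of dnotify itself.

```c
#define _GNU_SOURCE   /* glibc: exposes F_NOTIFY, F_SETSIG and the DN_* flags */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Illustrative choice: any real-time signal would do. */
#define WATCH_SIGNAL (SIGRTMIN + 1)

/* Descriptor of the directory that changed most recently, or -1. */
volatile sig_atomic_t changed_fd = -1;

/* dnotify reports which descriptor fired (si_fd), not which event. */
static void on_dnotify(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    changed_fd = si->si_fd;
}

/*
 * Open `path` and request WATCH_SIGNAL whenever one of the events in
 * `mask` (DN_MODIFY, DN_CREATE, ...) occurs in that directory.
 * Returns the open directory descriptor, or -1 on failure.
 */
int watch_dir(const char *path, unsigned long mask)
{
    struct sigaction sa;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_dnotify;
    sa.sa_flags = SA_SIGINFO;          /* needed to receive si_fd */
    sigemptyset(&sa.sa_mask);

    /* DN_MULTISHOT keeps the registration alive after the first event. */
    if (sigaction(WATCH_SIGNAL, &sa, NULL) < 0 ||
        fcntl(fd, F_SETSIG, WATCH_SIGNAL) < 0 ||
        fcntl(fd, F_NOTIFY, mask | DN_MULTISHOT) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would do something like watch_dir(".", DN_MODIFY | DN_CREATE),
then sleep in pause() and look at changed_fd when it wakes. Note that the
directory has to stay open for as long as it is being watched.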

Stephen, it would be very interesting to know more about the development
process you went through and what motivated you to provide this
fundamental facility.

--
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: FS callback routines
  2001-01-09 13:07 ` Daniel Phillips
@ 2001-01-10 10:56   ` Jamie Lokier
  2001-01-10 18:25     ` Daniel Phillips
  2001-01-11 14:30     ` Daniel Phillips
  0 siblings, 2 replies; 18+ messages in thread
From: Jamie Lokier @ 2001-01-10 10:56 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Michael D. Crawford, Stephen Rothwell, linux-kernel

Daniel Phillips wrote:
> It was done last year, quietly and without fanfare, by Stephen Rothwell:
> 
>   http://www.linuxcare.com/about-us/os-dev/rothwell.epl
> 
> This may be the most significant new feature in 2.4.0, as it allows us
> to take a fundamentally different approach to many different problems. 
> Three that come to mind:
[...]
>    mail (get your mail instantly without polling);

You'll be notified if _any_ mailbox is changed in your mail directory.
On a multiuser system that's going to be more often than a typical
polling interval, so you'll have to revert to polling.

> make (don't rely on timestamps to know when rebuilding is needed, don't
> scan huge directory trees on each build)

You will need to rescan the timestamps of files, but yes you can skip
subdirectories in which no file has changed.  But only if you're running
a "make daemon" on the same box as make, and if there aren't too many
directories.
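
The rescan step can be sketched roughly as follows. This is only an
illustration of the idea, not code from make or from any daemon: once a
notification says "something in this directory changed", the program
still has to stat the entries to find out which ones. The function name
count_modified_since and the fixed 4096-byte path buffer are assumptions
made for the example.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Count the regular files directly inside `dirpath` whose mtime is
 * later than `since`.  Returns -1 if the directory cannot be opened.
 * Subdirectories are not descended into; each would have its own watch.
 */
int count_modified_since(const char *dirpath, time_t since)
{
    DIR *dir;
    struct dirent *de;
    int count = 0;

    assert(dirpath != NULL);

    dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL) {
        char path[4096];
        struct stat st;
        int n = snprintf(path, sizeof path, "%s/%s", dirpath, de->d_name);

        if (n < 0 || (size_t)n >= sizeof path)
            continue;                  /* skip names that do not fit */
        if (stat(path, &st) == 0 && S_ISREG(st.st_mode) &&
            st.st_mtime > since)
            count++;
    }
    closedir(dir);
    return count;
}
```

The saving over a full tree walk is that this runs only for directories
that actually raised a notification, with `since` set to the time of the
previous build.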

> locate (reindex only those directories that have changed, keep index
> database current).

Not a chance.  dnotify doesn't work recursively, so you can't monitor
just a few top level directories like "/usr/lib".  And have you ever
tried to keep all 3000 directories on your filesystem open at the same
time?  Would you want to consume that much non-swappable memory, and
also prevent the directories from being removed from the filesystem?

Long ago I proposed something similar that works at the disk level, is
recursive, and the checks can be done without keeping directories open.
But I never wrote the code :(  That's interesting because it speeds up
make without needing a daemon, and really can speed up
locate/updatedb/find.

-- Jamie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: FS callback routines
  2001-01-10 10:56   ` Jamie Lokier
@ 2001-01-10 18:25     ` Daniel Phillips
  2001-01-11 14:30     ` Daniel Phillips
  1 sibling, 0 replies; 18+ messages in thread
From: Daniel Phillips @ 2001-01-10 18:25 UTC (permalink / raw)
  To: Jamie Lokier, linux-kernel

Jamie Lokier wrote:
> 
> Daniel Phillips wrote:
> > It was done last year, quietly and without fanfare, by Stephen Rothwell:
> >
> >   http://www.linuxcare.com/about-us/os-dev/rothwell.epl
> >
> > This may be the most significant new feature in 2.4.0, as it allows us
> > to take a fundamentally different approach to many different problems.
> > Three that come to mind:
> [...]
> >    mail (get your mail instantly without polling);
> 
> You'll be notified if _any_ mailbox is changed in your mail directory.
> On a multiuser system that's going to be more often than a typical
> polling interval, so you'll have to revert to polling.

?? I don't get it.  I'd expect a daemon to be notified and the daemon in
turn to notify me.

> > make (don't rely on timestamps to know when rebuilding is needed, don't
> > scan huge directory trees on each build)
> 
> You will need to rescan the timestamps of files, but yes you can skip
> subdirectories in which no file has changed.  But only if you're running
> a "make daemon" on the same box as make, and if there aren't too many
> directories.

A 'make daemon' is exactly what I was thinking of.  The 'too many
directories' thing is a problem, see below.

> > locate (reindex only those directories that have changed, keep index
> > database current).
> 
> Not a chance.  dnotify doesn't work recursively, so you can't monitor
> just a few top level directories like "/usr/lib".  And have you ever
> tried to keep all 3000 directories on your filesystem directories open
> at the same time?  Would you want to consume that much non-swappable
> memory, and also prevent the directories from being removed from the
> filesystem?

Yes, basic problem.  The notification has to be persistent across
directory open/close.  There had to be a reason why dnotify.c is just
140 lines long, right?  OK, that doesn't mean 'not a chance', it just
means the current implementation is inadequate.  Now at least I can
start writing programs that work this way, try them out, and think about
what has to be done to go the rest of the distance.

Another problem is not handling hard links properly.  That's not ok.

> Long ago I proposed something similar that works at the disk level, is
> recursive, and the checks can be done without keeping directories open.
> But I never wrote the code :(  That's interesting because it speeds up
> make without needing a daemon, and really can speed up
> locate/updatedb/find.

Yes, there are obvious flaws but it's now in and it's obvious where it's
going.  A simple-minded inadequate piece of code that's working beats a
perfect one that exists purely in the imagination, any day. :-)

Do you still have your original proposal around?

--
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: FS callback routines
  2001-01-10 10:56   ` Jamie Lokier
  2001-01-10 18:25     ` Daniel Phillips
@ 2001-01-11 14:30     ` Daniel Phillips
  2001-01-11 15:37       ` Jamie Lokier
  1 sibling, 1 reply; 18+ messages in thread
From: Daniel Phillips @ 2001-01-11 14:30 UTC (permalink / raw)
  To: Jamie Lokier, linux-kernel

Jamie Lokier wrote:
> Daniel Phillips wrote:
> > [things that can benefit from dnotify]
> > locate (reindex only those directories that have changed, keep index
> > database current).
> 
> Not a chance.  dnotify doesn't work recursively, so you can't monitor
> just a few top level directories like "/usr/lib".  And have you ever
> tried to keep all 3000 directories on your filesystem directories open
> at the same time?  Would you want to consume that much non-swappable
> memory, and also prevent the directories from being removed from the
> filesystem?
> 
> Long ago I proposed something similar that works at the disk level, is
> recursive, and the checks can be done without keeping directories open.
> But I never wrote the code :(  That's interesting because it speeds up
> make without needing a daemon, and really can speed up
> locate/updatedb/find.

It took a little while for the following to dawn on me: it's not hard to
make the dnotify scheme work recursively and you don't have to keep 3000
directories open.  To modify a file you must have opened all the
directories in its path.  We need a new event: 

        DN_OPEN       A file in the directory was opened

You open the top level directory and register for events.  When somebody
opens a subdirectory of the top level directory, you receive
notification and register for events on the subdirectory, and so on,
down to the file that is actually modified.
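
To make the top-down scheme concrete, here is a small user-space
simulation of the bookkeeping a daemon would do. DN_OPEN is only a
proposal and does not exist in dnotify, so nothing here talks to the
kernel: on_open_event stands in for "a DN_OPEN notification arrived for
this directory". All names (watch_set, add_watch, on_open_event) and the
fixed table sizes are assumptions for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_WATCHES  64
#define MAX_PATH_LEN 256

/* The set of directories the daemon is currently registered on. */
struct watch_set {
    char paths[MAX_WATCHES][MAX_PATH_LEN];
    int count;
};

int is_watched(const struct watch_set *ws, const char *path)
{
    for (int i = 0; i < ws->count; i++)
        if (strcmp(ws->paths[i], path) == 0)
            return 1;
    return 0;
}

/* Returns 1 if a watch was added, 0 if already present, -1 if no room. */
int add_watch(struct watch_set *ws, const char *path)
{
    assert(ws != NULL && path != NULL);
    if (is_watched(ws, path))
        return 0;
    if (ws->count == MAX_WATCHES || strlen(path) >= MAX_PATH_LEN)
        return -1;
    strcpy(ws->paths[ws->count++], path);
    return 1;
}

/*
 * Simulated delivery of the hypothetical DN_OPEN event: `name` was
 * opened inside `dir`.  If `dir` is watched, extend the watch to the
 * child.  A real daemon would first check that the child is a directory.
 */
int on_open_event(struct watch_set *ws, const char *dir, const char *name)
{
    char child[MAX_PATH_LEN];
    int n;

    if (!is_watched(ws, dir))
        return 0;                      /* not our subtree, ignore */
    n = snprintf(child, sizeof child, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof child)
        return -1;
    return add_watch(ws, child);
}
```

Starting from a single watch on the top level directory, each open event
pushes the watch one level further down, so only the directories on paths
that are actually in use end up in the set.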

--
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: FS callback routines
  2001-01-11 14:30     ` Daniel Phillips
@ 2001-01-11 15:37       ` Jamie Lokier
  2001-01-11 16:11         ` Daniel Phillips
  0 siblings, 1 reply; 18+ messages in thread
From: Jamie Lokier @ 2001-01-11 15:37 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

Daniel Phillips wrote:
>         DN_OPEN       A file in the directory was opened
> 
> You open the top level directory and register for events.  When somebody
> opens a subdirectory of the top level directory, you receive
> notification and register for events on the subdirectory, and so on,
> down to the file that is actually modified.

If it worked, and I'm not sure the timing would be reliable enough, the
daemon would only have to have open every directory being accessed by
every program in the system.  Hmm.  Seems like overkill when you're only
interested in files that are being modified.

It would be much, much more reliable to do a walk over d_parent in
dnotify.c.  Your idea is a nice way to flag kernel dentries such that
you don't do d_parent walks unnecessarily.
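
The bottom-up alternative might look roughly like the sketch below. It
uses a toy struct node rather than the kernel's struct dentry, and the
fields watched and pending are invented for the illustration; it is not
code from dnotify.c. The toy tree also ends in a NULL parent, which is an
assumption of the model rather than a statement about the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a directory entry; NOT the kernel's struct dentry. */
struct node {
    struct node *parent;   /* NULL at the top of this toy tree */
    int watched;           /* nonzero if someone registered here */
    int pending;           /* notifications queued for that watcher */
};

/*
 * Walk from the changed entry's directory up to the root, queueing one
 * notification on every watched ancestor.  Returns how many watchers
 * were notified.
 */
int notify_ancestors(struct node *changed)
{
    int notified = 0;

    assert(changed != NULL);

    for (struct node *n = changed->parent; n != NULL; n = n->parent) {
        if (n->watched) {
            n->pending++;
            notified++;
        }
    }
    return notified;
}
```

With this approach a single watch on a top level directory sees changes
anywhere beneath it, at the cost of a parent walk on every modification.
A per-entry flag saying "some ancestor is watched" would let the walk be
skipped in the common case.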

-- Jamie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: FS callback routines
  2001-01-11 15:37       ` Jamie Lokier
@ 2001-01-11 16:11         ` Daniel Phillips
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel Phillips @ 2001-01-11 16:11 UTC (permalink / raw)
  To: Jamie Lokier, linux-kernel

Jamie Lokier wrote:
> 
> Daniel Phillips wrote:
> >         DN_OPEN       A file in the directory was opened
> >
> > You open the top level directory and register for events.  When somebody
> > opens a subdirectory of the top level directory, you receive
> > notification and register for events on the subdirectory, and so on,
> > down to the file that is actually modified.
> 
> If it worked, and I'm not sure the timing would be reliable enough, the
> daemon would only have to have open every directory being accessed by
> every program in the system.  Hmm.  Seems like overkill when you're only
> interested in files that are being modified.

It gets to close some too.  Normally just the directories in the path to
the file(s) being modified would be open.

Good point about the timing.  A directory should not disappear before an
in-flight notification has been serviced.  I doubt the current scheme
enforces this.  There is no more room for 'works most of the time' in
this than there is in our memory page handling.

> It would be much, much more reliable to do a walk over d_parent in
> dnotify.c.  Your idea is a nice way to flag kernel dentries such that
> you don't do d_parent walks unnecessarily.

It's bottom-up vs top-down.  It's worth analyzing the top-down approach
a little more: it does solve a lot of problems (and creates some, as you
pointed out, or at least makes some existing problems more obvious).
For make it's really quite nice.  The make daemon only needs to register
in the top level directory of the source tree.  I think this solves the
hard link problem too, because each path that's interested in
notification will receive it.

--
Daniel

^ permalink raw reply	[flat|nested] 18+ messages in thread
