public inbox for linux-kernel@vger.kernel.org
* FS callback routines
@ 2001-01-09  1:21 Sean R. Bright
  2001-01-09 11:22 ` Philipp Matthias Hahn
  2001-01-09 11:34 ` Daniel Stodden
  0 siblings, 2 replies; 18+ messages in thread
From: Sean R. Bright @ 2001-01-09  1:21 UTC
  To: linux-kernel


	Ok, before I begin, don't shoot me down, but I had an idea for a kernel
modification and was wondering how feasible the group thought it was.

	I was writing a user space application to monitor a folder's contents.  The
folder itself contained 100 folders, and each of those contained 24 folders.
While writing the code to traverse the directory structure I realized that
instead of my software figuring out when things change, why not just have
the fs tell my application when something was updated.  For example, say we
had a function called watch_fs(), that took an inode reference and a
function pointer and maybe a bitmask of events to watch for.  When that
inode (or its children) were changed, why couldn't the fs code call the
callback function I specified?
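
To make the idea concrete, here is a minimal user-space sketch of the interface described above. Everything in it is hypothetical: `watch_fs()`, the `WATCH_*` event bits, and `fs_report_event()` (a stand-in for the filesystem code reporting a change) do not exist in any kernel. Inodes are represented as plain numbers, and only a watch on the exact inode is modelled, not its children.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event bits; not part of any real kernel interface. */
#define WATCH_MODIFY 0x1u
#define WATCH_CREATE 0x2u
#define WATCH_DELETE 0x4u

#define MAX_WATCHES 16

/* Callback type: receives the inode number and the events that matched. */
typedef void (*watch_cb)(unsigned long ino, unsigned int events);

struct watch {
    unsigned long ino;   /* watched inode number */
    unsigned int mask;   /* events the caller asked for */
    watch_cb cb;         /* function to call when one of them occurs */
};

static struct watch watches[MAX_WATCHES];
static size_t nwatches;

/* Register cb for the events in mask on inode ino.
 * Returns 0 on success, -1 on bad arguments or a full table. */
int watch_fs(unsigned long ino, watch_cb cb, unsigned int mask)
{
    if (cb == NULL || mask == 0 || nwatches == MAX_WATCHES)
        return -1;
    watches[nwatches].ino = ino;
    watches[nwatches].mask = mask;
    watches[nwatches].cb = cb;
    nwatches++;
    return 0;
}

/* Stand-in for the filesystem reporting a change: run every callback
 * registered on ino whose mask overlaps events. Returns how many ran. */
int fs_report_event(unsigned long ino, unsigned int events)
{
    int delivered = 0;

    for (size_t i = 0; i < nwatches; i++) {
        if (watches[i].ino == ino && (watches[i].mask & events)) {
            assert(watches[i].cb != NULL);  /* guaranteed by watch_fs() */
            watches[i].cb(ino, watches[i].mask & events);
            delivered++;
        }
    }
    return delivered;
}

/* Example callback: only counts how often it was invoked. */
int events_seen;

void count_event(unsigned long ino, unsigned int events)
{
    (void)ino;
    (void)events;
    events_seen++;
}
```

In this sketch every reported event costs a linear scan of the watch table, which hints at the performance question raised below: the lookup sits on the path of every filesystem change, so a real implementation would need it to be cheap when nobody is watching.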

	I have no idea how expensive this would be or if it's even worth it at this
point.  It also wouldn't be portable at all considering that I know of no
other OS that does this (could be wrong).

	Like I said, I am not asking that this be (necessarily) implemented, I am
just curious as to what the perceived performance ramifications would be if
it were to be implemented, say, by a virgin kernel developer ;)

	Thanks,
	Sean
	elixer@erols.com



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

* Re: FS callback routines
@ 2001-01-11 16:39 Jesse Pollard
  2001-01-11 17:53 ` Daniel Phillips
  0 siblings, 1 reply; 18+ messages in thread
From: Jesse Pollard @ 2001-01-11 16:39 UTC
  To: phillips, Jamie Lokier, linux-kernel

Daniel Phillips <phillips@innominate.de>:
> Jamie Lokier wrote:
> > 
> > Daniel Phillips wrote:
> > >         DN_OPEN       A file in the directory was opened
> > >
> > > You open the top level directory and register for events.  When somebody
> > > opens a subdirectory of the top level directory, you receive
> > > notification and register for events on the subdirectory, and so on,
> > > down to the file that is actually modified.
> > 
> > If it worked, and I'm not sure the timing would be reliable enough, the
> > daemon would only have to have open every directory being accessed by
> > every program in the system.  Hmm.  Seems like overkill when you're only
> > interested in files that are being modified.
> 
> It gets to close some too.  Normally just the directories in the path to
> the file(s) being modified would be open.
> 
> Good point about the timing.  A directory should not disappear before an
> in-flight notification has been serviced.  I doubt the current scheme
> enforces this.  There is no more room for 'works most of the time' in
> this than there is in our memory page handling.
> 
> > It would be much, much more reliable to do a walk over d_parent in
> > dnotify.c.  Your idea is a nice way to flag kernel dentries such that
> > you don't do d_parent walks unnecessarily.
> 
> It's bottom-up vs top-down.  It's worth analyzing the top-down approach
> a little more, it does solve a lot of problems (and creates some as you
> pointed out, or at least makes some existing problems more obvious). 
> For make it's really quite nice.  The make daemon only needs to register
> in the top level directory of the source tree.  I think this solves the
> hard link problem too, because each path that's interested in
> notification will receive it.

It makes security checks impossible though. You would have to reboot
the system every time a directory changes permission to block unauthorized
monitoring of files that are no longer accessible by the user.
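
The bottom-up alternative mentioned in the quoted text (a walk over d_parent) can be pictured with a toy model. This is only a sketch: `struct toy_dentry` and `toy_notify_up()` are made-up names, and the structure keeps nothing of the kernel's real dentry beyond a parent link and a flag saying whether anyone registered for events on that directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry: a parent link and a watch flag only. */
struct toy_dentry {
    struct toy_dentry *parent;  /* NULL at the root */
    bool watched;               /* someone registered for events here */
};

/* Bottom-up delivery: starting at the directory that contains the
 * changed entry, follow parent links to the root and count every
 * watched directory on the way. Returns the number of directories
 * that would be notified. */
int toy_notify_up(const struct toy_dentry *changed)
{
    int notified = 0;

    assert(changed != NULL);
    for (const struct toy_dentry *d = changed->parent; d != NULL; d = d->parent) {
        if (d->watched)
            notified++;
    }
    return notified;
}
```

With this shape a watcher registers once at the top of a tree and the cost moves to the event side: each change walks as many parent links as the entry is deep, which is why a way to skip unnecessary walks is attractive.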

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: FS callback routines
@ 2001-01-10 15:18 Jesse Pollard
  0 siblings, 0 replies; 18+ messages in thread
From: Jesse Pollard @ 2001-01-10 15:18 UTC
  To: phillips, Jesse Pollard, linux-kernel

---------  Received message begins Here  ---------

> 
> Jesse Pollard wrote:
> > Daniel Phillips <phillips@innominate.de>:
> > > This may be the most significant new feature in 2.4.0, as it allows us
> > > to take a fundamentally different approach to many different problems.
> > > Three that come to mind: mail (get your mail instantly without polling);
> > > make (don't rely on timestamps to know when rebuilding is needed, don't
> > > scan huge directory trees on each build); locate (reindex only those
> > > directories that have changed, keep index database current).  As you
> > > noticed, there are many others.
> > > ...
> > 
> > It would also be very nice if the security of the feature could be
> > confirmed. The problem with SGI's implementation is that it becomes
> > possible to monitor files that you don't own, don't have access to,
> > or are not permitted to know even exist.
> 
> To receive notification about events in a given directory you have to be
> able to open it.  Is this adequate for your needs?

It depends on the implementation - One problem is that you may be able
to open the directory at the time you start monitoring, then the permission
is removed. Most implementations do not recheck access rights on each
notification (fair amount of overhead).

My belief is (and I could be wrong) that most such callbacks are done by
placing a watch list on/for a device:inode identifier. When activity on
a matching device:inode occurs, the matching callback is invoked. No path name
access rights rechecking is performed since the scan of the path could
easily overwhelm the operation being performed on the file.

The only guard would be the path scan done at the time the callback is
established. The justification is that "It's just like opening a file, the
access rights are checked on open". A callback is not an open, UNLESS callbacks
can only be placed on open file id's.
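
The difference between checking access only when the callback is established and rechecking on each notification can be shown with a small model. This is a sketch under stated assumptions, not a description of any real implementation: the `di_*` names are invented, and the permission model is a toy table indexed by uid that stands in for a full path-name access check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DI_MAX  8   /* watch table size */
#define DI_UIDS 4   /* toy permission table size */

/* A watch keyed by device:inode, remembering who registered it. */
struct di_watch {
    unsigned long dev;
    unsigned long ino;
    unsigned int uid;
    bool active;
};

static struct di_watch di_table[DI_MAX];

/* Toy permission model (assumption): one flag per uid says whether
 * that user can currently reach the watched object. */
static bool di_access_ok[DI_UIDS] = { true, true, true, true };

void di_set_access(unsigned int uid, bool allowed)
{
    assert(uid < DI_UIDS);
    di_access_ok[uid] = allowed;
}

static bool di_may_access(unsigned int uid)
{
    return uid < DI_UIDS && di_access_ok[uid];
}

/* Access is checked once, at registration, as it would be on open(). */
int di_watch_add(unsigned long dev, unsigned long ino, unsigned int uid)
{
    if (!di_may_access(uid))
        return -1;
    for (size_t i = 0; i < DI_MAX; i++) {
        if (!di_table[i].active) {
            di_table[i] = (struct di_watch){ dev, ino, uid, true };
            return 0;
        }
    }
    return -1;  /* table full */
}

/* Count the watchers that would be told about activity on dev:ino.
 * With recheck == false only the device:inode match is tested, so a
 * watcher whose access was revoked after registering is still told. */
int di_notify(unsigned long dev, unsigned long ino, bool recheck)
{
    int told = 0;

    for (size_t i = 0; i < DI_MAX; i++) {
        const struct di_watch *w = &di_table[i];

        if (!w->active || w->dev != dev || w->ino != ino)
            continue;
        if (recheck && !di_may_access(w->uid))
            continue;
        told++;
    }
    return told;
}
```

Registering as uid 1, revoking that uid's access, and then calling `di_notify()` with `recheck` set to false still reports one watcher; with `recheck` set to true it reports none, at the price of an access check on every event.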

In SGI's case, the file alteration monitor (a daemon) performs this activity
while running as root, and providing RPC access to remote systems. This RPC
is unauthenticated, permitting non-local users to track files that exist on
local hosts. Since RPC cannot verify the identity of the remote user, it
permits tracking of ANY file on the system (via RPC spoofing of user identity
in the RPC call).

This was determined to be A Bad Thing, and not allowed.

> > For these reasons, we have disabled the feature.
> 
> It's nice to have that option, isn't it? ;-)

It would be if it could be done in a secure manner.

Also useful (after callbacks work): an extension to callbacks to allow
catching file read/write/seek/truncate/ioctl actions, and being able to
perform actions on behalf of the owner of the file (as an implementation
of the original action). This might require the user to provide a "daemon"
to monitor the file, or to be activated by a callback established by the
owner of the file. (I know, there is a lot to consider when implementing
something like this - process environment, what executable is to be run
to do this, the interface to getting the users request/returning results..)

Use: ability to provide customized versions of the file based on the identity
of the user that opened the file - fields could be hidden/generated, queries
could be passed (via ioctl). This could provide a simple way to implement
a small data manager, but without the overhead (ie $$$) for a data base system
or the complexity of establishing a data base style client-server.

I do think such an extension should be permitted/denied based on a capability
though.

Ahh well, rambling ideas....

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: FS callback routines
@ 2001-01-09 14:05 Jesse Pollard
  2001-01-09 15:41 ` Daniel Phillips
  0 siblings, 1 reply; 18+ messages in thread
From: Jesse Pollard @ 2001-01-09 14:05 UTC
  To: phillips, Michael D. Crawford, Stephen Rothwell, linux-kernel

Daniel Phillips <phillips@innominate.de>:
> "Michael D. Crawford" wrote:
> > 
> > Regarding notification when there's a change to the filesystem:
> > 
> > This is one of the most significant things about the BeOS BFS filesystem, and
> > something I'd dearly love to see Linux adopt.  It makes an app very efficient,
> > you just get notified when a directory changes and you never waste time polling.
> > 
> > I think it would require changes to the VFS layer, not just to the filesystems,
> > because this is a concept POSIX filesystems do not presently possess.
> > 
> > The other is indexed filesystem attributes, for example a file can have its
> > mimetype in the filesystem, and any application can add an attribute and have it
> > indexed.
> > 
> > There's a method to do boolean queries on indexed attributes, and you can find
> > files in an entire filesystem that match a query in a blazingly short time, much
> > faster than walking the directory tree.
> > 
> > If you want to try out the BeOS, there's a free-as-in-beer version at
> > http://free.be.com for Pentium PC's.  You can also purchase a version that comes
> > for both PC's and certain PowerPC macs.
> > 
> > There are read-only versions of this for Linux which I believe are under the
> > GPL.  The original author is here:
> > 
> > http://hp.vector.co.jp/authors/VA008030/bfs/
> > 
> > He refers you to here to get a version that works under 2.2.16:
> > 
> > http://milosch.net/beos/
> > 
> > The author's intention was to take it read-write, but it's complex because it is
> > a journaling filesystem.
> > 
> > Daniel Berlin, a BeOS developer, modified the Linux BFS driver so it works with
> > 2.4.0-test1.  I don't know if it works with 2.4.0.  The web site where it used
> > to be posted isn't there anymore, and the laptop where I had it is in for
> > repair.  I may have it on a backup, and I'll see if I can track Daniel down.
> > 
> > While Be, Inc.'s implementation is closed-source, the design of the BFS (_not_
> > "befs" as it is sometimes called) is explained in Practical File System Design
> > with the Be File System by Dominic Giampaolo, ISBN 1-55860-497-9.  Dominic has
> > since left Be and I understand works at Google now.
> 
> fs/dnotify.c:
> 
>    /*
>     * Directory notifications for Linux.
>     *
>     * Copyright (C) 2000 Stephen Rothwell
>     ...
> 
> The currently defined events are:
> 
> 	DN_ACCESS	A file in the directory was accessed (read)
> 	DN_MODIFY	A file in the directory was modified (write,truncate)
> 	DN_CREATE	A file was created in the directory
> 	DN_DELETE	A file was unlinked from directory
> 	DN_RENAME	A file in the directory was renamed
> 	DN_ATTRIB	A file in the directory had its attributes
> 			changed (chmod,chown)
> 
> It was done last year, quietly and without fanfare, by Stephen Rothwell:
> 
>   http://www.linuxcare.com/about-us/os-dev/rothwell.epl
> 
> This may be the most significant new feature in 2.4.0, as it allows us
> to take a fundamentally different approach to many different problems. 
> Three that come to mind: mail (get your mail instantly without polling);
> make (don't rely on timestamps to know when rebuilding is needed, don't
> scan huge directory trees on each build); locate (reindex only those
> directories that have changed, keep index database current).  As you
> noticed, there are many others.
> 
> Stephen, it would be very interesting to know more about the development
> process you went through and what motivated you to provide this
> fundamental facility.

It would also be very nice if the security of the feature could be
confirmed. The problem with SGI's implementation is that it becomes
possible to monitor files that you don't own, don't have access to,
or are not permitted to know even exist. For these reasons, we have
disabled the feature.
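
For reference, the DN_* events in the quoted list are requested from user space through fcntl() with F_NOTIFY on an open directory descriptor. The sketch below shows that registration step only. `dn_watch()`, `dn_on_sigio()`, and `dn_event_pending` are illustrative names, not part of dnotify, and the code assumes a Linux system with glibc.

```c
#define _GNU_SOURCE   /* makes glibc's <fcntl.h> expose F_NOTIFY and DN_* */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set from the signal handler when a notification arrives. */
volatile sig_atomic_t dn_event_pending;

static void dn_on_sigio(int sig)
{
    (void)sig;
    dn_event_pending = 1;
}

/* Open dir_path and ask for the dnotify events in mask.
 * Returns the directory fd (keep it open while watching), or -1. */
int dn_watch(const char *dir_path, unsigned long mask)
{
    struct sigaction sa;
    int fd;

    assert(dir_path != NULL);

    /* dnotify delivers SIGIO by default, and SIGIO's default action
     * terminates the process, so install a handler first. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = dn_on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) < 0)
        return -1;

    /* The caller must be able to open the directory at this point. */
    fd = open(dir_path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    if (fcntl(fd, F_NOTIFY, mask) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would typically pass something like `DN_MODIFY | DN_CREATE | DN_MULTISHOT`; DN_MULTISHOT, which dnotify also defines although it is not in the list quoted above, keeps the registration active after the first event. Because registration needs a descriptor obtained from open(), the access check in this sketch happens once, at open time.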

-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.

* Re: FS callback routines
@ 2001-01-08 23:12 Michael D. Crawford
  2001-01-09  2:37 ` Sean R. Bright
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Michael D. Crawford @ 2001-01-08 23:12 UTC
  To: linux-kernel

Regarding notification when there's a change to the filesystem:

This is one of the most significant things about the BeOS BFS filesystem, and
something I'd dearly love to see Linux adopt.  It makes an app very efficient,
you just get notified when a directory changes and you never waste time polling.

I think it would require changes to the VFS layer, not just to the filesystems,
because this is a concept POSIX filesystems do not presently possess.

The other is indexed filesystem attributes, for example a file can have its
mimetype in the filesystem, and any application can add an attribute and have it
indexed.

There's a method to do boolean queries on indexed attributes, and you can find
files in an entire filesystem that match a query in a blazingly short time, much
faster than walking the directory tree.

If you want to try out the BeOS, there's a free-as-in-beer version at
http://free.be.com for Pentium PC's.  You can also purchase a version that comes
for both PC's and certain PowerPC macs.

There are read-only versions of this for Linux which I believe are under the
GPL.  The original author is here:

http://hp.vector.co.jp/authors/VA008030/bfs/

He refers you to here to get a version that works under 2.2.16:

http://milosch.net/beos/

The author's intention was to take it read-write, but it's complex because it is
a journaling filesystem.

Daniel Berlin, a BeOS developer, modified the Linux BFS driver so it works with
2.4.0-test1.  I don't know if it works with 2.4.0.  The web site where it used
to be posted isn't there anymore, and the laptop where I had it is in for
repair.  I may have it on a backup, and I'll see if I can track Daniel down.

While Be, Inc.'s implementation is closed-source, the design of the BFS (_not_
"befs" as it is sometimes called) is explained in Practical File System Design
with the Be File System by Dominic Giampaolo, ISBN 1-55860-497-9.  Dominic has
since left Be and I understand works at Google now.


-- 
Michael D. Crawford
GoingWare Inc. - Expert Software Development and Consulting
http://www.goingware.com/
crawford@goingware.com

   Tilting at Windmills for a Better Tomorrow.


end of thread, other threads:[~2001-01-11 17:56 UTC | newest]

Thread overview: 18+ messages
2001-01-09  1:21 FS callback routines Sean R. Bright
2001-01-09 11:22 ` Philipp Matthias Hahn
2001-01-09 11:34 ` Daniel Stodden
  -- strict thread matches above, loose matches on Subject: below --
2001-01-11 16:39 Jesse Pollard
2001-01-11 17:53 ` Daniel Phillips
2001-01-10 15:18 Jesse Pollard
2001-01-09 14:05 Jesse Pollard
2001-01-09 15:41 ` Daniel Phillips
2001-01-10 10:48   ` Jamie Lokier
2001-01-08 23:12 Michael D. Crawford
2001-01-09  2:37 ` Sean R. Bright
2001-01-09  3:48 ` David Weinehall
2001-01-09 13:07 ` Daniel Phillips
2001-01-10 10:56   ` Jamie Lokier
2001-01-10 18:25     ` Daniel Phillips
2001-01-11 14:30     ` Daniel Phillips
2001-01-11 15:37       ` Jamie Lokier
2001-01-11 16:11         ` Daniel Phillips
