linux-fsdevel.vger.kernel.org archive mirror
* Announcing Trapfs: a small lookup trapping filesystem, like autofs and devfs
@ 2004-11-02 10:33 Adam J. Richter
  2004-11-01 21:43 ` Jamie Lokier
  2004-11-02 15:44 ` Mike Waychison
  0 siblings, 2 replies; 6+ messages in thread
From: Adam J. Richter @ 2004-11-02 10:33 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: michael.waychison, thockin

 	I am pleased to announce trapfs, a virtual file system that
allows a user level program to trap dcache misses and fill them in
before the caller returns.  In many cases, it can provide the
functionality of autofs or devfs, but is smaller, at under 3kB .text +
.data, and 591 lines of source code, including some lengthy comments.
That is one third the source code line count of autofs, and just over a
fifth of its .data+.text size.  I also subjectively believe that trapfs
will usually be simpler to configure (although I don't know that it
completely obsoletes anything).  Documentation/filesystems/trapfs.txt
shows several example applications of trapfs using shell scripts or
very small programs.
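
As a rough illustration of what such a "very small program" has to deal
with, here is a minimal C sketch of the helper's side of the interface.
It relies only on what the patch below does: any arguments from the
helper= mount option come first, and the event name ("LOOKUP") and the
path relative to the mount point are always the last two arguments.  The
names trap_request, trap_parse_args and trap_is_lookup are hypothetical
and are not part of trapfs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper-side view of one trapfs invocation. */
struct trap_request {
	const char *event;	/* event name, e.g. "LOOKUP" */
	const char *path;	/* path relative to the trapfs mount point */
	int n_extra;		/* number of arguments taken from helper= */
};

/*
 * trapfs appends the event name and the path after any arguments given
 * in the helper= mount option, so they are the last two elements of argv.
 * Returns 0 on success, -1 if argv is too short to hold both.
 */
static int trap_parse_args(int argc, char **argv, struct trap_request *req)
{
	assert(req != NULL);
	if (argc < 3)
		return -1;
	req->event = argv[argc - 2];
	req->path = argv[argc - 1];
	req->n_extra = argc - 3;	/* everything between argv[0] and the event */
	return 0;
}

/* Nonzero if this request is a lookup miss that the helper should handle. */
static int trap_is_lookup(const struct trap_request *req)
{
	return strcmp(req->event, "LOOKUP") == 0;
}
```

A helper's main() would call trap_parse_args(argc, argv, &req) first and
exit quietly for any event it does not recognize, so that it keeps
working if more event types are ever added.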

	I also have a trapfs-based devfs which I am running now and
cleaning up for release in the next few days.  Trapfs can also be used
to provide create-on-demand device file functionality for some
non-devfs systems.

	Some of you may recall that almost two years ago I posted a
devfs reimplementation based on ramfs that was less than a quarter of
the size of the original devfs.  Trapfs is derived from that.
( http://marc.theaimsgroup.com/?l=linux-kernel&m=104138806530375&w=2 )

	My previous devfs code shrink was generally well received, but
not integrated due to "stable kernel" issues and my not pushing it at
the time.  This time, I would like to get trapfs and trapfs-based
devfs into the stock kernel pretty promptly.  So, please take a good
look at it and tell me what you think.

	If only trivially-fixed problems are identified, then I hope
to run and regenerate the patch against -bk11, fix any problems that
are identified, and then shop the patch to linux-hotplug and perhaps
lkml before cleaning up and posting the devfs patch in the next couple
of days.  I suspect the devfs changes will draw some tangential
discussions, so here is your chance to have a more focused, more
technical discussion of trapfs first.

	Finally, I will mention the deficiencies of trapfs that I'm
already aware of, but which I think should not block integration.

	1. It has at least one race condition.  While the first
process is blocking, waiting for a lookup target to be filled in,
other processes that attempt to access the file will just see whatever
state the file system is actually in with respect to that file
(typically "file not found", but perhaps an incomplete file in the
case of an ftp mirror, for example). There certainly are applications
where you need something like lufs or fuse, and some special process
state or alternative mount point for providing more reliable semantics
might be worth exploring, but I think that trapfs in its present form
will be useful enough to people to be worth integrating now.

	2. Like sysfs and ramfs, trapfs uses a struct inode and a
struct dentry for every file system node, consuming something like 500
bytes per node.  I think that at some point in the future, it might be
useful to implement some kind of release of struct inode's for device
files at least, similar to the "sysfs backing store" patch that is
supposedly on its way into the stock kernel.

	3. Until the trapfs helper exits, it is impossible to
control-C out of the access that invoked the helper.  This is a
deficiency of the synchronous call_usermodehelper interface.  Every
kernel facility that uses call_usermodehelper has this problem.  There
are a number of ways to fix synchronous call_usermodehelper, and I
surely expect trapfs to use whatever solution is implemented.
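
For the "incomplete file" half of deficiency 1, a helper can at least
avoid exposing partially written data under the final name by filling a
temporary file and renaming it into place.  The sketch below is ordinary
POSIX C, not part of the patch, and install_atomically is a hypothetical
name.  It does nothing about the "file not found" window, and it assumes
that creating the temporary name inside a trapfs mount may itself
generate a LOOKUP event, so a real helper would want to return
immediately for names it does not manage.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Write len bytes of data to dir/name without ever exposing a partially
 * written file under the final name: fill a temporary file in the same
 * directory, then rename() it into place.
 * Returns 0 on success, -1 on any error.
 */
static int install_atomically(const char *dir, const char *name,
			      const void *data, size_t len)
{
	char tmp[4096], final[4096];
	const char *p = data;
	int n, fd;

	assert(dir != NULL && name != NULL);
	if (strchr(name, '/'))		/* name must be a single path component */
		return -1;

	n = snprintf(tmp, sizeof(tmp), "%s/.%s.tmp.%ld", dir, name, (long)getpid());
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;
	n = snprintf(final, sizeof(final), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(final))
		return -1;

	fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t w = write(fd, p, len);
		if (w <= 0) {		/* treat any short-circuit as failure */
			close(fd);
			unlink(tmp);
			return -1;
		}
		p += w;
		len -= (size_t)w;
	}

	/* rename() replaces the final name in a single step. */
	if (close(fd) != 0 || rename(tmp, final) != 0) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

Keeping the temporary file in the same directory as the target matters,
because rename() cannot move a file across file systems.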

	Comments?

Signed-off-by: Adam J. Richter <adam@yggdrasil.com>

                    __     ______________ 
Adam J. Richter        \ /
adam@yggdrasil.com      | g g d r a s i l



--- linux-2.6.10-rc1-bk9/Documentation/filesystems/trapfs.txt	1969-12-31 16:00:00.000000000 -0800
+++ linux/Documentation/filesystems/trapfs.txt	2004-11-02 00:02:14.000000000 -0800
@@ -0,1 +1,312 @@
+Trapping File System Guide
+Version 0.1
+
+
+1. INSTRUCTIONS FOR THE IMPATIENT
+
+	% modprobe trapfs
+	% mount -t trapfs -o helper=/path/to/helper/program whatever /mnt
+	% ls /mnt/foo
+	# Notice that "/path/to/helper/program LOOKUP foo" was executed.
+
+
+2. OVERVIEW (from the Kconfig help)
+
+	Trapfs allows user level programs to implement file systems
+that are filled in on demand.  It works by invoking a configurable
+helper program on the first attempt to access a nonexistent file.  The
+access waits until the helper finishes, so the helper can install the
+missing file if desired.
+
+	Using this facility, a shell script or small C program can
+implement a file system that automatically mounts remote file systems
+or creates device files on demand, similar to autofs or devfs,
+respectively.  Trapfs is, however, daemonless and, perhaps
+consequently, smaller than either of these, and may avoid some
+recursion problems.
+
+	Trapfs might also be useful for debugging programs where you
+want to trap the first access to a particular file or perhaps in
+automatic installation of missing commands or libraries by specifying a
+trapfs file system in certain search paths.
+
+
+3. GETTING STARTED
+
+3.1 MOUNTING THE FILE SYSTEM
+
+	First, build and boot a kernel with trapfs either compiled in
+or built as a module.  If you compile trapfs as a module, you may have
+to load it, although, since the module name ("trapfs.ko") and the file
+system name ("trapfs") match, that may be unnecessary if you've
+configured automatic module loading.
+
+	% modprobe trapfs
+
+	Now let's mount a trapfs file system on /mnt.
+
+	% mount -t trapfs blah /mnt
+
+	The file system will behave exactly like a ramfs file system.
+In fact, trapfs is derived from ramfs.  You can create files,
+directories, symbolic links and device nodes in it, and they will
+exist only in the computer's main memory.  The contents of the file
+system will disappear as soon as you unmount it.
+
+	If you mount multiple instances of trapfs, you will get
+separate file systems.
+
+
+3.2. THE HELPER PROGRAM
+
+	What distinguishes trapfs from ramfs is that it can invoke a
+user level helper program when an attempt is made to open or stat a
+nonexistent file for the first time (if the name is not already in the
+dcache).  The user level program is set with the "helper" mount
+option.  It is possible to set, clear or change the helper command at
+any time, so let's go back to our example and put a helper command on
+the trapfs file system that we mounted on /mnt:
+
+	% mount -o remount,helper=/tmp/helper /mnt
+
+	If you use the file system in /mnt now, nothing appears to have
+changed.  Now let's put a simple shell script in /tmp/helper:
+
+	% cat > /tmp/helper
+	#!/bin/sh
+	echo "$*" > /dev/console
+	^D
+	% chmod a+x /tmp/helper
+
+	Now you should see console messages like "LOOKUP foo" when you
+try to access the file /mnt/foo for the first time.
+
+	You can also pass arguments to the helper program by using
+spaces in the helper mount option, like so:
+
+	% mount -o remount,helper='/tmp/helper my_argument' /mnt
+
+	If you do this, your console messages will start to look
+something like "my_argument LOOKUP foo".  The arguments that
+you specify come before "LOOKUP foo" to facilitate the use of
+command interpreters, like, say, helper='/usr/bin/perl handler.pl'.
+Arguments also make it easy to pass things like the mount point or
+configuration files, which should make it easier to write facilities
+that work on multiple mount points.
+
+	You can also deactivate the helper at any time, like so:
+
+	% mount -o remount,helper='' /mnt
+
+4. PRACTICAL EXAMPLES
+
+4.1 AN NFS AUTOMOUNTER
+
+	% cat > /usr/sbin/trapfs-automount
+	#!/bin/sh
+	topdir=$1
+	host=${3%%/*}
+	dir=$topdir/$host
+	mkdir $dir
+	mount -t nfs $host:/ $dir
+	^D
+	% chmod a+x /usr/sbin/trapfs-automount
+	% mkdir /auto
+	% mount -t trapfs -o helper="/usr/sbin/trapfs-automount /auto" x /auto
+
+	Notice how we pass the additional argument "/auto" to the
+trapfs-automount command.
+
+	If you want automatic unmount after a timeout, you'll probably
+want to do something a little more elaborate, perhaps with a script that
+runs from cron.
+
+
+4.2 DEMAND LOADING OF DEVICE DRIVERS
+
+	A version of devfs that uses trapfs is under development and
+running on the system I am using to write this document, but I am
+still cleaning it up.  Here is how it should work, although I have
+not yet actually tried devfs_helper on it.
+
+	The devfs_helper program was originally written for a stripped down
+rewrite of devfs, from which trapfs is derived.  It can read your
+/etc/devfs.conf file (the file previously used to configure
+devfsd) and load modules specified by "LOOKUP" commands.  Other
+devfs.conf commands are ignored.
+
+	% ftp ftp.yggdrasil.com
+	login: anonymous
+	password: guest
+	ftp> cd /pub/dist/device_control/devfs
+	ftp> get devfs_helper-0.2.tar.gz
+	.....
+	ftp> quit
+	% tar xfpvz devfs_helper-0.2.tar.gz
+	% cd devfs_helper-0.2
+	% make
+	% make install
+	% mkdir /tmp/tmpdev
+	% mount -t devfs blah /tmp/tmpdev
+	% cp -apRx /dev/* /tmp/tmpdev/
+	% mount -t devfs -o helper=/sbin/devfs_helper blah /dev
+	% mount -t msdos /dev/floppy/0 /mnt
+
+	The above example should load the floppy.ko kernel module
+if you have a line in your /etc/devfs.conf file like this:
+
+	LOOKUP	floppy		EXECUTE modprobe floppy
+
+
+	You should also be able to use trapfs in this fashion to get
+automatic loading of kernel modules on non-devfs systems, although
+you'll need something like the larger udev to create the
+device files once the device drivers are registered.
+
+
+4.3 DEBUGGING A PROGRAM TRYING TO ACCESS A FILE
+
+	% cat > /tmp/call-sleep
+	#!/bin/sh
+	sleep 30
+	^D
+	% mount -t trapfs -o helper=/tmp/call-sleep foo /mnt
+	% mv .bashrc .bashrc-
+	% ln -s /mnt/whatever .bashrc
+	% gdb /bin/sh
+	GNU gdb 5.2
+	[blah blah blah]
+	(gdb) run
+	[program eventually hangs.  Switch to another terminal session.  You
+         cannot control-C out of it, a trapfs bug from call_usermodehelper.]
+	% ps axf
+	[Find the process under gdb.  Let's say it's pid 1152.]
+	% kill -SEGV 1152
+	% ps auxww | grep sleep
+	[Find the sleeping trapfs helper.  Let's say it's pid 1120.]
+	% kill -9 1120
+	[Now back at the first session, running gdb on /bin/sh.]
+	Program received signal SIGSEGV, Segmentation fault.
+	0xb7f303d4 in __libc_open () at __libc_open:-1
+	-1      __libc_open: No such file or directory.
+        	in __libc_open
+	(gdb) where
+	#0  0xb7f303d4 in __libc_open () at __libc_open:-1
+	#1  0xb7f8b4c0 in __DTOR_END__ () from /lib/libc.so.6
+	#2  0x080921ef in _evalfile (filename=0x80dc788 "/tmp/junk/.bashrc", flags=9)
+	    at evalfile.c:85
+	#3  0x08092635 in maybe_execute_file (
+	    fname=0xfffffffe <Address 0xfffffffe out of bounds>, 
+	    force_noninteractive=1) at evalfile.c:218
+	#4  0x08059fe8 in run_startup_files () at shell.c:1019
+	#5  0x08059849 in main (argc=1, argv=0xbfffebc4, env=0xbfffebcc) at shell.c:581
+	#6  0xb7e88e02 in __libc_start_main (main=0x8059380 <main>, argc=1, 
+	    ubp_av=0xbfffebc4, init=0x805897c <_init>, 
+	    fini=0xb80005ac <_dl_debug_mask>, rtld_fini=0x8000, stack_end=0x0)
+	    at ../sysdeps/generic/libc-start.c:129
+
+
+4.4 AUTOMATIC LOADING OF MISSING PROGRAMS
+
+	% cat > /usr/sbin/missing-program
+	#!/bin/sh
+	my-automatic-network-downloader $2
+	^D
+	% chmod a+x /usr/sbin/missing-program
+	% mount -t trapfs -o helper=/usr/sbin/missing-program glorp /mnt
+	% PATH=$PATH:/mnt:$PATH
+	# We include $PATH a second time so that the program can be
+	# found after it is installed.
+	% kdevelop		# Or some other program you don't have...
+
+	...or maybe something like this...
+
+	% cat > /usr/sbin/missing-program
+	#!/bin/sh
+	export DISPLAY=:0
+	konqueror http://www.google.com/search?q="download+$2" &
+	^D
+	% chmod a+x /usr/sbin/missing-program
+	% mount -t trapfs -o helper=/usr/sbin/missing-program glorp /mnt
+	% xhost localhost
+	% PATH=$PATH:/mnt
+	% kdevelop
+
+4.4.1 AUTOMATIC LOADING OF MISSING LIBRARIES
+
+	Same as above, but with this line:
+	% LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/mnt:$LD_LIBRARY_PATH
+
+4.5. OTHER USES?
+
+     I would be interested in hearing about any other uses that you come up
+with for trapfs, especially if I can include them in this document.
+
+
+5. KERNEL DEVELOPER ANSWERS ABOUT IMPLEMENTATION DECISIONS
+
+5.1 Q:  Why doesn't trapfs provide REGISTER and UNREGISTER events when
+        new nodes are created or deleted from the file system, as the
+	mini-devfs implementation from which it is derived did?
+
+    A:  {,UN}REGISTER in Richard Gooch's implementation of devfs enabled
+        things like automatically setting permissions and sound settings
+	on your sound device when the driver was loaded, even if
+	the loading of the driver had not been caused by devfs.
+	For trapfs-based devfs, I expect to implement that in
+	a more complex way by shadowing the real devfs file system
+	and creating {,UN}REGISTER events as updates are propagated
+	from the real devfs to /dev.  The advantages of this would be
+	that module initialization would not be blockable by a user
+	level program, and events like a device quickly appearing and
+	disappearing could be coalesced (i.e., ignored in this case).
+
+	I'm not convinced that {,UN}REGISTER has to go, but I haven't
+	seen any compelling uses for it, and I know it's politically
+	easier to add a feature than to remove one, especially if anyone
+	has developed a dependence on it and does not want to port.  So,
+	I'm starting out with trapfs not providing {,UN}REGISTER events.
+
+5.2 Q:	Why isn't trapfs implemented as an overlay file system?  I'd
+	like to be able to apply it to, say, /dev, without having
+	to start out with /dev being a devfs file system.  I could
+	also demand load certain facilities based on accesses to /proc.
+
+    A:	There are about two dozen routines in inode_operations and
+	file_operations that would require pass-through versions, and
+	they are not as trivial as you might think because of
+	locking issues involved in going through the vfs layer again.
+	Also, currently, trapfs, like ramfs and sysfs, uses a struct
+	inode and a struct dentry in kernel low memory for every
+	existing node in the file system, about half a kilobyte
+	per entry.  So, trapfs would need to be converted to allow
+	inode structures to be released if it were to overlay
+	potentially large directory trees.  There are also issues
+	about keeping track of the underlying file system changing
+	out from under trapfs.  Perhaps in the future this
+	can be implemented.
+
+
+5.3. Q: Why isn't trapfs a built-in kernel facility that can be
+	applied to any file, like dnotify?  That would also have
+	the above advantages and could eliminate the mount complexity
+	of overlays.
+
+     A:	I thought about breaking out a separate inode->directory_operations
+	from inode->inode_operations. Since all of those operations that
+	I want to intercept are called with inode->i_sem held, it
+	follows that it would be SMP-safe to change an inode->dir_ops
+	pointer dynamically, which would allow stacking of inode
+	operations.  This approach could be used to remove the special
+	case code that is used to implement dnotify.  But what happens
+	when the inode is freed?  It would be necessary to intercept
+	the victim superblock's superblock_operations.drop_inode
+	routine, which could get pretty messy, especially, if, for
+	example, more than one trap was being used on the same file
+	system.  Perhaps if drop_inode were moved to struct
+	inode_operations this would be easier.
+
+
+Adam J. Richter (adam@yggdrasil.com)
+2004.11.01
--- linux-2.6.10-rc1-bk9/fs/Kconfig	2004-10-31 00:25:33.000000000 -0700
+++ linux/fs/Kconfig	2004-11-01 22:45:34.000000000 -0800
@@ -846,6 +846,29 @@
 
 	Designers of embedded systems may wish to say N here to conserve space.
 
+config TRAP_FS
+	tristate "access trapping file system"
+	help
+	  Trapfs allows user level programs to implement file systems that
+	  are filled in on demand.  It works by invoking a configurable
+	  helper program on the first attempt to access a nonexistent file.
+	  The access waits until the helper finishes.
+
+	  Using this facility, a shell script or small C program can
+	  implement a file system that automatically mounts remote file
+	  systems or creates device files on demand, similar to autofs or
+	  devfs, respectively.  Trapfs is, however, daemonless and, perhaps
+	  consequently, smaller than either of these, and may avoid some
+	  recursion problems.
+
+	  Trapfs might also be useful for debugging programs where you want
+	  to trap an attempt to access a particular file or perhaps in
+	  automatic installation of missing commands or libraries by
+	  specifying a trapfs file system in certain search paths.
+
+	  Please see <file:Documentation/filesystems/trapfs.txt> for
+	  more information.
+
 config DEVFS_FS
 	bool "/dev file system support (OBSOLETE)"
 	depends on EXPERIMENTAL
--- linux-2.6.10-rc1-bk9/fs/Makefile	2004-10-31 00:25:33.000000000 -0700
+++ linux/fs/Makefile	2004-11-01 22:45:39.000000000 -0800
@@ -61,6 +61,7 @@
 obj-$(CONFIG_VFAT_FS)		+= vfat/
 obj-$(CONFIG_BFS_FS)		+= bfs/
 obj-$(CONFIG_ISO9660_FS)	+= isofs/
+obj-$(CONFIG_TRAP_FS)		+= trapfs/  # Before devfs so devfs can use it.
 obj-$(CONFIG_DEVFS_FS)		+= devfs/
 obj-$(CONFIG_HFSPLUS_FS)	+= hfsplus/ # Before hfs to find wrapped HFS+
 obj-$(CONFIG_HFS_FS)		+= hfs/
--- linux-2.6.10-rc1-bk9/fs/trapfs/Makefile	1969-12-31 16:00:00.000000000 -0800
+++ linux/fs/trapfs/Makefile	2004-11-01 22:45:30.000000000 -0800
@@ -0,0 +1,6 @@
+# Makefile for "trapfs", a file system that can invoke a user level
+# helper program when an attempt is made to access a nonexistent file.
+
+obj-$(CONFIG_TRAP_FS)	+= trapfs.o
+
+trapfs-objs := 		inode.o notify.o
--- linux-2.6.10-rc1-bk9/fs/trapfs/inode.c	1969-12-31 16:00:00.000000000 -0800
+++ linux/fs/trapfs/inode.c	2004-11-01 22:52:43.000000000 -0800
@@ -0,0 +1,365 @@
+/*
+  Access trapping file system (trapfs).
+  Derived from ramfs by Adam J. Richter via a rewrite of devfs.
+  Ramfs was written by Linus Torvalds.
+
+  Copyright (C) 2000 Linus Torvalds.
+  		2000 Transmeta Corp.
+ 		2002-2004 Yggdrasil Computing, Inc.
+ 
+  This file is released under the GPL.
+ 
+  trapfs is split into two files:
+
+ 	inode.c - a ramfs-based filesystem.  It only exports one symbol:
+		  trapfs_init(), which is used only for early initialization
+		  by devfs_init when trapfs is compiled into the kernel
+		  to support devfs.
+
+ 	notify.c - Invokes the user level helper program.
+*/
+
+#include <linux/module.h>
+#include <linux/pagemap.h>
+#include <linux/backing-dev.h>
+#include <linux/parser.h>
+#include <linux/namei.h>
+#include <linux/mount.h>
+#include <linux/trap_fs_sb.h>
+
+static struct super_operations trapfs_ops;
+static struct address_space_operations trapfs_aops;
+static struct file_operations trapfs_file_operations;
+static struct inode_operations trapfs_dir_inode_operations;
+
+static struct backing_dev_info trapfs_backing_dev_info = {
+	.ra_pages	= 0,	/* No readahead */
+	.memory_backed	= 1,	/* Does not contribute to dirty memory */
+};
+
+static struct inode *trapfs_get_inode(struct super_block *sb, int mode, dev_t dev)
+{
+	struct inode * inode = new_inode(sb);
+
+	if (!inode)
+		return NULL;
+
+	inode->i_mode = mode;
+	inode->i_uid = current->fsuid;
+	inode->i_gid = current->fsgid;
+	inode->i_blksize = PAGE_CACHE_SIZE;
+	inode->i_blocks = 0;
+	inode->i_mapping->a_ops = &trapfs_aops;
+	inode->i_mapping->backing_dev_info = &trapfs_backing_dev_info;
+	inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+	
+	switch (mode & S_IFMT) {
+	default:
+		init_special_inode(inode, mode, dev);
+		break;
+	case S_IFREG:
+		inode->i_fop = &trapfs_file_operations;
+		break;
+	case S_IFDIR:
+		inode->i_op = &trapfs_dir_inode_operations;
+		inode->i_fop = &simple_dir_operations;
+
+		/* directory inodes start off with i_nlink == 2 (for "." entry) */
+		inode->i_nlink++;
+		break;
+	case S_IFLNK:
+		inode->i_op = &page_symlink_inode_operations;
+		break;
+	}
+
+	return inode;
+}
+
+static struct dentry * trapfs_lookup(struct inode *dir,
+				    struct dentry *dentry,
+				    struct nameidata *nd)
+{
+	/*
+	  We must do simple_lookup before trapfs_event to prevent
+	  a duplicate dentry from being created if the trapfs_helper
+	  program attempts to access the same file name in /dev.
+	  If simple_lookup returns non-NULL, then that is due to an error
+	  like a malformed file name, so we do not invoke trapfs_event.
+	  If the file is not found but there was no other error,
+	  simple_lookup returns NULL, and that is the only case
+	  in which we want to generate a notification.
+
+	  We also filter out the final path element of mknod, mkdir
+	  and symlink, because invoking the helper for mknod and mkdir
+	  could lead to deadlock when trapfs loads a device driver
+	  kernel module.  One would think that the way to filter
+	  would be to look at nd->flags to check that LOOKUP_CREATE
+	  is set and LOOKUP_OPEN is clear, but instead, the vfs
+	  layer passes nd==NULL in these cases via a routine
+	  called lookup_create (without any leading underscores),
+	  so we filter out the case where nd == NULL.
+
+	  Filtering out nd==NULL has the unintended side-effect of
+	  filtering out the final path component of arguments to
+	  rmdir, unlink and rename (both source and destination).
+	  For rmdir, unlink, and the source argument to rename,
+	  that's fine, since nobody cares about attempts to remove
+	  nonexistent files.  We're probably also OK skipping the
+	  notifications with regard to the destination argument to
+	  rename, although that is less clear.
+	*/
+
+	struct dentry *result = simple_lookup(dir, dentry, nd);
+
+	if (result == NULL && nd != NULL)
+		trapfs_event("LOOKUP", dir, dentry);
+
+	return result;
+}
+
+
+/*
+ * File creation. Allocate an inode, and we're done..
+ */
+/* SMP-safe */
+static int
+trapfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
+{
+	struct inode * inode = trapfs_get_inode(dir->i_sb, mode, dev);
+	int error = -ENOSPC;
+
+	if (inode) {
+		d_instantiate(dentry, inode);
+		dget(dentry);	/* Extra count - pin the dentry in core */
+		error = 0;
+	}
+
+	return error;
+}
+
+static int trapfs_mkdir(struct inode * dir, struct dentry * dentry, int mode)
+{
+	int retval = trapfs_mknod(dir, dentry, mode | S_IFDIR, 0);
+	if (!retval)
+		dir->i_nlink++;
+	return retval;
+}
+
+static int trapfs_create(struct inode *dir,
+			struct dentry *dentry,
+			int mode,
+			struct nameidata *nd)
+{
+	return trapfs_mknod(dir, dentry, mode | S_IFREG, 0);
+}
+
+
+static int trapfs_symlink(struct inode * dir, struct dentry *dentry, const char * symname)
+{
+	struct inode *inode;
+	int error = -ENOSPC;
+
+	inode = trapfs_get_inode(dir->i_sb, S_IFLNK|S_IRWXUGO, 0);
+	if (inode) {
+		int l = strlen(symname)+1;
+		error = page_symlink(inode, symname, l);
+		if (!error) {
+			d_instantiate(dentry, inode);
+			dget(dentry);
+		} else
+			iput(inode);
+	}
+	return error;
+}
+
+
+enum trapfs_mount_options { NO_MATCH = 0, CLEAR_HELPER, SET_HELPER };
+
+/* match_token() stops only at a NULL pattern, so terminate the table. */
+static struct match_token option_table[] = {
+	{CLEAR_HELPER,		"helper=" },
+	{SET_HELPER,		"helper=%s" },
+	{NO_MATCH,		NULL },
+};
+
+static int
+set_helper_str(struct trapfs_super *sb_info, substring_t *arg)
+{
+	char *dup;
+
+	if (arg->from == arg->to)
+		dup = NULL;
+	else {
+		dup = match_strdup(arg);
+		if (!dup)
+			return -ENOMEM;
+	}
+
+	down_write(&sb_info->helper_rwsem);
+	kfree(sb_info->helper);
+	sb_info->helper = dup;
+	up_write(&sb_info->helper_rwsem);
+	return 0;
+}
+
+static int
+trapfs_remount_set_options (struct super_block *sb, int *flags, char *options)
+{
+	char *opt;
+	substring_t arg;
+	int token;
+	int err = 0;
+	struct trapfs_super *sb_info = sb->s_fs_info;
+
+	while ((opt = strsep (&options, ",")) != NULL && !err) {
+
+	  	/* We rely on arg pointing to a zero length string
+		   if an option with no arguments was matched. */
+		arg.from = arg.to = opt;
+
+		token = match_token(opt, option_table, &arg);
+
+		switch (token) {
+
+		case CLEAR_HELPER:
+		case SET_HELPER:
+			err = set_helper_str(sb_info, &arg);
+			break;
+
+		default:
+			err = -EINVAL;
+			break;
+		}
+	}
+
+	return err;
+}
+
+static struct address_space_operations trapfs_aops = {
+	.readpage	= simple_readpage,
+	.prepare_write	= simple_prepare_write,
+	.commit_write	= simple_commit_write
+};
+
+static struct file_operations trapfs_file_operations = {
+	.owner		= THIS_MODULE,
+	.read		= generic_file_read,
+	.write		= generic_file_write,
+	.mmap		= generic_file_mmap,
+	.fsync		= simple_sync_file,
+	.sendfile	= generic_file_sendfile,
+};
+
+static struct inode_operations trapfs_dir_inode_operations = {
+	.create		= trapfs_create,
+	.lookup		= trapfs_lookup,
+	.link		= simple_link,
+	.unlink		= simple_unlink,
+	.symlink	= trapfs_symlink,
+	.mkdir		= trapfs_mkdir,
+	.rmdir		= simple_rmdir,
+	.mknod		= trapfs_mknod,
+	.rename		= simple_rename,
+};
+
+static struct super_operations trapfs_ops = {
+	.statfs		= simple_statfs,
+	.remount_fs	= trapfs_remount_set_options,
+	.drop_inode	= generic_delete_inode,
+};
+
+static int trapfs_fill_super(struct super_block * sb, void * data, int silent)
+{
+	struct inode *inode;
+	struct dentry *root;
+	int err;
+	struct trapfs_super *sb_info;
+
+	sb->s_blocksize = PAGE_CACHE_SIZE;
+	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
+	sb->s_magic = TRAP_FS_SUPER_MAGIC;
+	sb->s_op = &trapfs_ops;
+	inode = trapfs_get_inode(sb, S_IFDIR | 0755, 0);
+	if (!inode)
+		return -ENOMEM;
+
+	root = d_alloc_root(inode);
+	if (!root) {
+		iput(inode);
+		return -ENOMEM;
+	}
+	sb->s_root = root;
+
+	sb_info = kmalloc(sizeof(struct trapfs_super), GFP_KERNEL);
+	if (sb_info == NULL)
+		err = -ENOMEM;
+	else {
+		sb->s_fs_info = sb_info;
+
+		init_rwsem(&sb_info->helper_rwsem);
+
+		sb_info->helper = NULL;
+		err = trapfs_remount_set_options(sb, NULL, data);
+		if (!err)
+			return 0;
+
+		kfree(sb->s_fs_info);
+	}
+
+	/* dput() of the root dentry also releases its inode; no iput() here. */
+	dput(root);
+	sb->s_root = NULL;
+	return err;
+}
+
+static struct super_block *trapfs_get_sb(struct file_system_type *fs_type,
+					int flags,
+					const char *dev_name,
+					void *data)
+{
+	return get_sb_nodev(fs_type, flags, data, trapfs_fill_super);
+}
+
+static void trapfs_kill_super(struct super_block *sb)
+{
+	struct trapfs_super *sb_info = sb->s_fs_info;
+
+	kfree(sb_info->helper);
+	kfree(sb_info);
+	kill_litter_super(sb);
+}
+
+static struct file_system_type trapfs_fs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "trapfs",
+	.get_sb		= trapfs_get_sb,
+	.kill_sb	= trapfs_kill_super,
+};
+
+/* Not static, because it can also be called from init_devfs_fs. */
+int __init trapfs_init(void)
+{
+	static int initialized;	/* = 0 */
+	int err;
+
+	if (initialized)
+		err = 0;
+	else {
+		err = register_filesystem(&trapfs_fs_type);
+		if (!err)
+			initialized = 1;
+	}
+
+	return err;
+}
+
+
+static void __exit trapfs_exit(void)
+{
+	unregister_filesystem(&trapfs_fs_type);
+}
+
+module_init(trapfs_init)
+module_exit(trapfs_exit)
+
+MODULE_LICENSE("GPL");
--- linux-2.6.10-rc1-bk9/fs/trapfs/notify.c	1969-12-31 16:00:00.000000000 -0800
+++ linux/fs/trapfs/notify.c	2004-11-01 22:52:43.000000000 -0800
@@ -0,0 +1,190 @@
+/*
+  Access trapping file system (trapfs)
+
+  notify.c -- Invoke the user level notification program.
+
+
+  Written by Adam J. Richter
+  Copyright (C) 2004 Yggdrasil Computing, Inc.
+
+  This program is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 2 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+  GNU General Public License for more details.
+
+  You should have received a copy of the GNU General Public License
+  along with this program; if not, write to the Free Software
+  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/fs.h>
+#include <linux/mount.h>
+#include <linux/trap_fs_sb.h>
+
+static int path_len (struct dentry *de, struct dentry *root)
+{
+	int len = 0;
+	while (de != root) {
+		len += de->d_name.len + 1;	/* count the '/' */
+		de = de->d_parent;
+	}
+	return len;		/* -1 because we omit the leading '/',
+				   +1 because we include trailing '\0' */
+}
+
+static int trapfs_generate_path (struct dentry *de, char *path, int buflen)
+{
+	struct dentry *mnt_root = de->d_parent->d_inode->i_sb->s_root;
+	int len;
+	char *path_orig = path;
+
+	if (de == NULL || de == mnt_root)
+		return -EINVAL;
+
+	spin_lock(&dcache_lock);
+	len = path_len(de, mnt_root);
+	if (len > buflen) {
+		spin_unlock(&dcache_lock);
+		return -ENAMETOOLONG;
+	}
+
+	path += len - 1;
+	*path = '\0';
+
+	for (;;) {
+		path -= de->d_name.len;
+		memcpy(path, de->d_name.name, de->d_name.len);
+		de = de->d_parent;
+		if (de == mnt_root)
+			break;
+		*(--path) = '/';
+	}
+		
+	spin_unlock(&dcache_lock);
+
+	BUG_ON(path != path_orig);
+
+	return 0;
+}
+
+static inline int
+trapfs_calc_argc(const char *str_in, int *str_len)
+{
+	const char *str = str_in;
+	int argc = 0;
+	while (*str) {
+		while (*str == ' ' || *str == '\t')
+			str++;
+		argc++;
+		while (*str != ' ' && *str != '\t' && *str)
+			str++;
+	}
+	*str_len = str - str_in;
+	return argc;
+}
+
+static char **
+trapfs_gen_argv(char *str_in, int argc_extra, int *argc_out)
+{
+	int argc;
+	char **argv;
+	char *str_out;
+	int str_len;
+
+	if (!str_in)
+		return NULL;
+
+	while (*str_in == ' ' || *str_in == '\t')
+		str_in++;
+
+	if (*str_in == '\0')
+		return NULL;
+
+	argc = trapfs_calc_argc(str_in, &str_len);
+
+	argv = kmalloc(((argc + argc_extra) * sizeof(char*)) + str_len + 1,
+		       GFP_KERNEL);
+	if (!argv)
+		return NULL;
+
+	str_out = (char*) (argv + argc + argc_extra);
+
+	argc = 0;
+	while (*str_in) {
+		argv[argc++] = str_out;
+
+		while (*str_in != ' ' && *str_in != '\t' && *str_in)
+			*(str_out++) = *(str_in++);
+
+		*(str_out++) = '\0';
+
+		while (*str_in == ' ' || *str_in == '\t')
+			str_in++;
+	}
+	*argc_out = argc;
+	return argv;
+}
+
+
+/*
+  Warning: trapfs_event releases and retakes parent->i_sem.
+*/
+void trapfs_event(const char *event, struct inode *parent,
+		  struct dentry *dentry)
+{
+	char path[64];
+	struct trapfs_super *sb_info;
+	char **argv;
+	int argc;
+
+	if (trapfs_generate_path(dentry, path, sizeof(path)) == 0) {
+
+		up(&parent->i_sem);
+
+		sb_info = parent->i_sb->s_fs_info;
+
+		/*
+		  FIXME.  We would not need the extra memory allocation,
+		  string copying, error branch and lines of source code
+		  due to err_strdup(), and we could put trapfs_gen_argv
+		  into the ioctl that sets the command string, if
+		  call_usermodehelper and execve had a callback to inform
+		  us when execve was done copying argv and envp.  With
+		  such a facility, we could just hold sb_info->helper_rwsem
+		  up to that point, without having to make a copy of the
+		  argument (which we currently do) or hold the semaphore
+		  until the helper process exits (which would cause a
+		  deadlock if a helper process ever tried to change
+		  the helper string of a file system, and there is
+		  not even a rw_down_read_interruptible
+		  that would make the deadlock breakable).
+		*/
+
+		down_read(&sb_info->helper_rwsem);
+		argv = trapfs_gen_argv(sb_info->helper, 3, &argc);
+		up_read(&sb_info->helper_rwsem);
+
+		if (argv != NULL) {
+			static char *envp[] =
+				{"PATH=/bin:/sbin:/usr/bin:/usr/sbin",
+				 "HOME=/", NULL };
+
+			argv[argc++] = (char*) event;
+			argv[argc++] = path;
+			argv[argc] = NULL;
+
+			call_usermodehelper(argv[0], argv, envp, 1);
+			kfree(argv);
+		}
+
+		down(&parent->i_sem);
+	}
+}
+
--- linux-2.6.10-rc1-bk9/include/linux/trap_fs_sb.h	1969-12-31 16:00:00.000000000 -0800
+++ linux/include/linux/trap_fs_sb.h	2004-11-01 22:50:10.000000000 -0800
@@ -0,0 +1,30 @@
+/* Access trapping file system (trapfs)
+
+  Written by Adam J. Richter
+  Copyright (C) 2002-2004 Yggdrasil Computing, Inc.
+
+  You can redistribute and/or modify this file under the terms of the
+  GNU General Public License as published by the Free Software Foundation;
+  either version 2 of the License, or (at your option) any later version.
+*/
+
+#ifndef TRAPFS_INTERNAL_H
+#define TRAPFS_INTERNAL_H
+
+#include <linux/fs.h>
+#include <linux/rwsem.h>
+
+
+#define TRAP_FS_SUPER_MAGIC	0x1DEA
+
+struct trapfs_super {
+	struct rw_semaphore	helper_rwsem;
+	char 			*helper;
+};
+
+extern int __init trapfs_init(void); /* For early initialization by devfs. */
+
+extern void trapfs_event(const char *event, struct inode *parent,
+			 struct dentry *dentry);
+
+#endif /* TRAPFS_INTERNAL_H */
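
	For readers who want to experiment with the argument packing
done by trapfs_gen_argv() outside the kernel, here is a minimal user
space sketch of the same idea: one allocation holds the pointer array
(with argc_extra spare slots) followed by the copied words.  The names
calc_argc() and gen_argv() are made up for this illustration, and
calc_argc() is only a guess at what trapfs_calc_argc() computes; it is
not taken from the patch.

```c
#include <stdlib.h>

/* Hypothetical stand-in for trapfs_calc_argc(): count the words in str and
   report, via *str_len, the bytes needed to store them with one '\0' each. */
static int calc_argc(const char *str, int *str_len)
{
	int argc = 0, len = 0;

	while (*str) {
		while (*str == ' ' || *str == '\t')
			str++;
		if (!*str)
			break;
		argc++;
		while (*str && *str != ' ' && *str != '\t') {
			str++;
			len++;
		}
		len++;		/* terminator for this word */
	}
	*str_len = len;
	return argc;
}

/* User space analogue of trapfs_gen_argv(): a single malloc() holds the
   pointer array (argc + argc_extra slots) followed by the copied words. */
static char **gen_argv(const char *str_in, int argc_extra, int *argc_out)
{
	int argc, str_len;
	char **argv;
	char *str_out;

	if (!str_in)
		return NULL;
	while (*str_in == ' ' || *str_in == '\t')
		str_in++;
	if (*str_in == '\0')
		return NULL;

	argc = calc_argc(str_in, &str_len);
	argv = malloc((argc + argc_extra) * sizeof(char *) + str_len);
	if (!argv)
		return NULL;
	/* The string area starts right after the last pointer slot. */
	str_out = (char *)(argv + argc + argc_extra);

	argc = 0;
	while (*str_in) {
		argv[argc++] = str_out;
		while (*str_in && *str_in != ' ' && *str_in != '\t')
			*str_out++ = *str_in++;
		*str_out++ = '\0';
		while (*str_in == ' ' || *str_in == '\t')
			str_in++;
	}
	*argc_out = argc;
	return argv;
}
```

	A caller that asks for three extra slots can append two more
arguments and the terminating NULL without reallocating, which is how
trapfs_event() uses the result, and a single free() releases both the
pointers and the strings.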

* Re: Announcing Trapfs: a small lookup trapping filesystem, like autofs and devfs
@ 2004-11-02 17:17 Adam J. Richter
  0 siblings, 0 replies; 6+ messages in thread
From: Adam J. Richter @ 2004-11-02 17:17 UTC (permalink / raw)
  To: Michael.Waychison; +Cc: linux-fsdevel, thockin

>>  = Adam Richter
>   = Mike Waychison

>> 	1. [Trapfs] has at least one race condition.  While the first
>> process is blocking, waiting for a lookup target to be filled in,
>> other processes that attempt to access the file will just see whatever
>> state the file system is actually in with respect to that file
>> (typically "file not found", but perhaps an incomplete file in the
>> case of an ftp mirror, for example). There certainly are applications
>> where you need something like lufs or fuse, or some special process
>> state or alternative mount point for providing more reliable semantics
>> might be worth exploring, but I think that trapfs in its present form
>> will be useful enough to people to be worth integrating now.

>This race condition is what really irks me about trapfs. It may not be
>an issue for device nodes and/or directories, but for IFREG I can see it
>as being a problem.

	I certainly do not claim that the "--helper=" facility (in
trapfs or added to tmpfs) is bullet-proof, but I would be interested
in knowing a likely scenario of what you are describing.

>In Autofs NG, I work around this issue by having _all_
>lookups/revalidates on a node wait for the node to be ready. In autofs,
>the point is to mount a filesystem on a directory, so to avoid deadlock,
>I instead pass a file descriptor of the target location (cause I know it
>will be a directory). The helper performs the mount elsewhere and
>'moves' the mounted filesystem[s] onto the directory by fd (new API
>required).

	If you want something that might actually provide correct
semantics without a new API or a new daemon, might I suggest having
two mount points, one that traps the lookups and another one that
the invoked user level handlers would typically use.  There would
be nothing special about these processes.  They would just be given
path names in a different location to use.  Come to think of it,
I think you could do this with the current trapfs/tmpfs code by
having tmpfs_lookup() trap only if nd->mnt had a certain
value (for example, not trapping from the first mount point that
did a lookup).  From user level, it would look something like this:

mount -t tmpfs -o handler='/bin/autofs-handler /private' foo /private

# Signal to tmpfs not to trap from this mount point:
ls /private/x > /dev/null 2>&1

mkdir /private/mirror
mount --bind /private/mirror /public

	Accesses to /public would trap, and the handler would do its
work in /private/mirror, unless it wanted recursion or wanted to
create a new mount point, in which case it would work in /public.
The tmpfs_lookup() routine would be slightly different in that it
would block all lookups on the private node until the handler
returned.  Conceivably, the handler could be called multiple times,
so it would be up to the handler instances to synchronize via flock
or something if they needed to.
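
	As a rough illustration of that kind of synchronization, a
handler written in C could serialize its instances with flock(2) on a
lock file.  This is only a sketch: with_handler_lock(), count_call()
and the lock file location are hypothetical names chosen for the
example, not part of trapfs or tmpfs.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Run fill() while holding an exclusive flock on lock_path, so that
   concurrent handler instances do their work one at a time.
   Returns fill()'s result, or -1 if the lock could not be taken. */
static int with_handler_lock(const char *lock_path,
			     int (*fill)(void *arg), void *arg)
{
	int fd, ret;

	fd = open(lock_path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX) < 0) {	/* blocks until we own the lock */
		close(fd);
		return -1;
	}
	ret = fill(arg);
	flock(fd, LOCK_UN);
	close(fd);
	return ret;
}

/* Example callback (hypothetical): just counts how often it was run. */
static int count_call(void *arg)
{
	++*(int *)arg;
	return 0;
}
```

	A real handler would pass a callback that creates the
requested node, and would probably keep the lock file somewhere under
the private mount point so that it is not itself trapped.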

	Also note that I suggest doing the "mount --bind" from
/private/mirror rather than /private.  That way, the handler can
do its work in /private/inaccessible-sandbox and then (atomically)
rename its results into the appropriate place in /private/mirror/.
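
	A minimal sketch of that last step, assuming the sandbox and
the mirror are directories on the same file system (rename(2) is only
atomic within one file system).  The function name publish_file() and
its arguments are hypothetical, chosen for this example only.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write data to tmp_path (in the sandbox), then rename it to final_path
   (in the mirror).  Readers of final_path see either no file or the
   complete file, never a partial one.  A short write is treated as an
   error.  Returns 0 on success, -1 on error. */
static int publish_file(const char *tmp_path, const char *final_path,
			const char *data)
{
	size_t len = strlen(data);
	ssize_t n;
	int fd;

	fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	n = write(fd, data, len);
	if (close(fd) < 0 || n != (ssize_t)len) {
		unlink(tmp_path);
		return -1;
	}
	if (rename(tmp_path, final_path) < 0) {
		unlink(tmp_path);
		return -1;
	}
	return 0;
}
```

	With the layout described above, tmp_path would be something
under /private/inaccessible-sandbox and final_path something under
/private/mirror/.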

	I suspect that for many cases "correctness" is not worth even
this much effort, but it could be implemented in probably ~200 lines of
kernel code.  I say ~200 lines, because I think you might have to
write a set of dentry_ops to ensure that lookup() is called again
if it is blocking on a user level helper and somebody else tries to
resolve the same dentry.

	I may have more to say about the rest of your email after I've
taken a look at your autofsng code that you mention.  However, I
probably won't be able to do that for another 12 hours, so I'm sending
this response now.

                    __     ______________
Adam J. Richter        \ /
adam@yggdrasil.com      | g g d r a s i l

* Re: Announcing Trapfs: a small lookup trapping filesystem, like autofs and devfs
@ 2004-11-02  6:50 Adam J. Richter
  0 siblings, 0 replies; 6+ messages in thread
From: Adam J. Richter @ 2004-11-02  6:50 UTC (permalink / raw)
  To: greg, jamie, linux-fsdevel

On Mon, 1 Nov 2004 14:04:59 -0800, Greg KH wrote:
>On Mon, Nov 01, 2004 at 09:43:53PM +0000, Jamie Lokier wrote:
>> 
>> It would be nice if opening a non-existent file in /dev would trigger
>> a hotplug/udev event - but otherwise have a perfectly normal
>> tmpfs-like filesystem.  IMHO that would fix udev nicely.

>I've been considering creating a fs based on tmpfs that does just that
>for udev, if only to keep issues like this from coming up all the time
>:)

>> Is trapfs suitable for that?

>trapfs looks to be based on ramfs, not tmpfs, so I don't think it would
>work out as well.  But Adam might have a different idea.

	If I understand correctly, there are three or four
advantages of using tmpfs:

	1. You can set the maximum number of inodes.

	2. You can set the maximum number of blocks used for plain
	   file data.

	3. The plain file data blocks can be swapped out.

	4. Extended attributes.

	For traditional devfs and autofs applications, neither of
which create data files, only limiting the number of inodes and
maybe extended attributes might be useful.  It would be trivial
to add limiting the number of inodes to trapfs, but there might
be other applications that would benefit from these features,
and the memory cost of adding the callback mechanism to tmpfs
should be low.

	After I read Greg's mail, I tried making an fs/helper.c
interface that would only be compiled in when a file system
that used it was also being compiled in.  I also started
adding the support to tmpfs, but ran into one problem that
will make the patches big: tmpfs currently allows super_block->s_fs_info
to be NULL to indicate that there are no limits, and once that
is set, it cannot be changed by a remount command.  That would
have to be changed to support recording the user level helper
string.

	So, I would like to know if there is opposition to patching
tmpfs this way before I invest the better part of a day in it.
Is there someone else I should cc about this?

	I should also mention that I can think of a couple of
disadvantages to using tmpfs.

	1. tmpfs is also derived from ramfs (or vice-versa?).
It appears to keep a struct inode and a struct dentry in unswappable
low memory for each node in the file system, like ramfs, sysfs and trapfs
(but less than an approach involving a sysfs directory with
attributes for every device, at least until the sysfs).  Conceivably,
I might end up having a separate more memory-efficient devfs file
system that would share the common fs/helper.c code with tmpfs.
On the other hand, if tmpfs is the collection point
for virtual filesystem hacks, it might be possible to allow
tmpfs to drop and reconstruct just device inodes.  Also, I
think it would be a good idea to look at shrinking struct inode
and perhaps struct dentry, although this will require changing a lot of
file systems (for example, the stat information could be stored
as a pointer for virtual file systems that want to keep it around
while letting the rest of struct inode be freed or for file systems
where this information is one of a few common constants).

	2. I do not know if tmpfs can be initialized early enough
to provide the devfs root device for initrd.  This probably
should be okay even if it can't, but I mention it for
completeness.

                    __     ______________
Adam J. Richter        \ /
adam@yggdrasil.com      | g g d r a s i l


end of thread, other threads:[~2004-11-02 18:25 UTC | newest]

Thread overview: 6+ messages
2004-11-02 10:33 Announcing Trapfs: a small lookup trapping filesystem, like autofs and devfs Adam J. Richter
2004-11-01 21:43 ` Jamie Lokier
2004-11-01 22:04   ` Greg KH
2004-11-02 15:44 ` Mike Waychison
  -- strict thread matches above, loose matches on Subject: below --
2004-11-02 17:17 Adam J. Richter
2004-11-02  6:50 Adam J. Richter
