linux-fsdevel.vger.kernel.org archive mirror
From: Mike Waychison <Michael.Waychison@Sun.COM>
To: "Adam J. Richter" <adam@yggdrasil.com>
Cc: linux-fsdevel@vger.kernel.org, Tim Hockin <thockin@hockin.org>
Subject: Re: Announcing Trapfs: a small lookup trapping filesystem, like autofs and devfs
Date: Tue, 02 Nov 2004 10:44:14 -0500
Message-ID: <4187AB4E.7070403@sun.com>
In-Reply-To: <200411021033.iA2AXBq10563@freya.yggdrasil.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Adam,

Interesting concept. This is in fact where I could see autofsng going
in the future. Comments below.

Adam J. Richter wrote:
>  	I am pleased to announce trapfs, a virtual file system that
> allows a user level program to trap dcache misses and fill them in
> before the caller returns.  In many cases, it can provide the
> functionality of autofs or devfs, but is smaller, at under 3kB .text +
> .data, and 591 lines of source code, including some lengthy comments.
> That is one third the source code line count of autofs, and just over a
> fifth of its .data+.text size.  I also subjectively believe that trapfs
> will usually be simpler to configure (although I don't know that it
> completely obsoletes anything).  Documentation/filesystems/trapfs.txt
> shows several example applications of trapfs using shell scripts or
> very small programs.
> 
> 	I also have a trapfs-based devfs which I am running now and
> cleaning up for release in the next few days.  Trapfs can also be used
> to provide create-on-demand device file functionality for some
> non-devfs systems.
> 
> 	Some of you may recall that almost two years ago I posted a
> devfs reimplementation based on ramfs that was less than a quarter of
> the size of the original devfs.  Trapfs is derived from that.
> ( http://marc.theaimsgroup.com/?l=linux-kernel&m=104138806530375&w=2 )
> 
> 	My previous devfs code shrink was generally well received, but
> not integrated due to "stable kernel" issues and my not pushing it at
> the time.  This time, I would like to get trapfs and trapfs-based
> devfs into the stock kernel pretty promptly.  So, please take a good
> look at it and tell me what you think.
> 
> 	If only trivially-fixed problems are identified, then I hope
> to run and regenerate the patch against -bk11, fix any problems that
> are identified, and then shop the patch to linux-hotplug and perhaps
> lkml before cleaning up and posting the devfs patch in the next couple
> of days.  I suspect the devfs changes will draw some tangential
> discussions, so here is your chance to have a more focused, more
> technical discussion of trapfs first.
> 
> 	Finally, I will mention the deficiencies of trapfs that I'm
> already aware of, but which I think should not block integration.
> 
> 	1. It has at least one race condition.  While the first
> process is blocking, waiting for a lookup target to be filled in,
> other processes that attempt to access the file will just see whatever
> state the file system is actually in with respect to that file
> (typically "file not found", but perhaps an incomplete file in the
> case of an ftp mirror, for example). There certainly are applications
> where you need something like lufs or fuse, or some special process
> state or alternative mount point for providing more reliable semantics
> might be worth exploring, but I think that trapfs in its present form
> will be useful enough to people to be worth integrating now.

This race condition is what really irks me about trapfs. It may not be
an issue for device nodes and/or directories, but for regular files
(S_IFREG) I can see it being a problem.

In Autofs NG, I work around this issue by having _all_
lookups/revalidates on a node wait for the node to be ready. In autofs,
the point is to mount a filesystem on a directory, so to avoid deadlock,
I instead pass a file descriptor of the target location (because I know it
will be a directory). The helper performs the mount elsewhere and
'moves' the mounted filesystem[s] onto the directory by fd (new API
required).
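
To make the "everyone waits until the node is ready" idea concrete, here
is a minimal user-space sketch using pthreads. It is only an analogy and
not the Autofs NG code: the names pending_node, pending_node_wait and
pending_node_complete are made up for illustration, and the in-kernel
equivalent would use kernel wait primitives rather than pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a node that is still being filled in. */
struct pending_node {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cv;
    bool            ready;   /* set once the helper has finished */
    int             result;  /* 0 on success, negative on failure */
};

void pending_node_init(struct pending_node *n)
{
    pthread_mutex_init(&n->lock, NULL);
    pthread_cond_init(&n->ready_cv, NULL);
    n->ready = false;
    n->result = 0;
}

/* Every lookup/revalidate calls this, so nobody sees a half-built node. */
int pending_node_wait(struct pending_node *n)
{
    int result;

    pthread_mutex_lock(&n->lock);
    while (!n->ready)                      /* loop guards against spurious wakeups */
        pthread_cond_wait(&n->ready_cv, &n->lock);
    result = n->result;
    pthread_mutex_unlock(&n->lock);
    return result;
}

/* Called exactly once, by whoever ran the helper. */
void pending_node_complete(struct pending_node *n, int result)
{
    pthread_mutex_lock(&n->lock);
    assert(!n->ready);                     /* completing twice would be a bug */
    n->result = result;
    n->ready = true;
    pthread_cond_broadcast(&n->ready_cv);  /* wake all waiting lookups */
    pthread_mutex_unlock(&n->lock);
}
```

The first caller would start the helper and later call
pending_node_complete(); every other caller just calls
pending_node_wait() and gets the same result once it is available.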

Off the top of my head, I think the following sequence of events may
allow you to deal with raciness. It's ugly, but it may work:

- - Lookups/revalidates from path walks block until call_usermodehelper
returns.
- - The helper application gets an anonymous directory, possibly in the
form of an fd, that nobody else can access.
- - The helper then creates the magic file / directory / socket / device
within that anonymous directory, presumably with a magic name ("FOO123"
for this example).
- - The helper returns, and the trapfs then 'moves' "FOO123" from the
anonymous directory to the dentry it was originally trying to fill in.
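
The steps above have a rough user-space analogue: build the object in a
private staging directory, then rename(2) it into place so that other
processes see either nothing or the finished file. The sketch below is
only that analogy, under the assumption that staging directory and
target live on the same filesystem. publish_via_staging and the
".stage-" prefix are hypothetical names; in trapfs the final 'move'
would be done by the kernel, not by the helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create `name` under `base` with the given contents, without ever
 * exposing a partially written file at base/name.
 * Returns 0 on success, -1 on failure.
 */
int publish_via_staging(const char *base, const char *name,
                        const char *contents)
{
    char stage[PATH_MAX], tmp[PATH_MAX], dst[PATH_MAX];
    size_t len;
    ssize_t written;
    int fd, n, rc = -1;

    assert(base && name && contents);

    n = snprintf(stage, sizeof stage, "%s/.stage-XXXXXX", base);
    if (n < 0 || n >= (int)sizeof stage)
        return -1;
    if (mkdtemp(stage) == NULL)        /* private: created with mode 0700 */
        return -1;

    /* "FOO123" is the placeholder name used in the text above. */
    n = snprintf(tmp, sizeof tmp, "%s/FOO123", stage);
    if (n < 0 || n >= (int)sizeof tmp)
        goto out_rmdir;
    n = snprintf(dst, sizeof dst, "%s/%s", base, name);
    if (n < 0 || n >= (int)sizeof dst)
        goto out_rmdir;

    fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        goto out_rmdir;
    len = strlen(contents);
    written = write(fd, contents, len);
    if (close(fd) != 0 || written != (ssize_t)len) {
        unlink(tmp);                   /* short write or I/O error: give up */
        goto out_rmdir;
    }

    if (rename(tmp, dst) == 0)         /* the atomic 'move' into place */
        rc = 0;
    else
        unlink(tmp);

out_rmdir:
    rmdir(stage);                      /* staging dir is empty by now */
    return rc;
}
```

A caller would use it as publish_via_staging("/some/dir", "node", "data").
Note that this user-space version still leaves the staging directory's
name visible in the parent, which a truly anonymous directory would not.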

This leaves you with trying to 'expire' data, which is necessary for
autofs-like functionality.

Also wrt autofs functionality, this lacks the ability to 'trap' on the
root directory of a filesystem and the ability to 'ghost' directories
and only perform actions when they are walked into. See the autofsng
tree at http://autofsng.bkbits.net/linux-2.6-autofsng to see how I do
this using ->follow_link traps.


> 
> 	2. Like sysfs and ramfs, trapfs uses a struct inode and a
> struct dentry for every file system node, consuming something like 500
> bytes per node.  I think that at some point in the future, it might be
> useful to implement some kind of release of struct inode's for device
> files at least, similar to the "sysfs backing store" patch that is
> supposedly on its way into the stock kernel.
> 
> 	3. Until the trapfs helper exits, it is impossible to
> control-C out of the access that invoked the helper.  This is a
> deficiency of the synchronous call_usermodehelper interface.  Every
> kernel facility that uses call_usermodehelper has this problem.  There
> are a number of ways to fix synchronous call_usermodehelper, and I
> surely expect trapfs to use whatever solution is implemented.
> 

Autofs NG also has this issue. I'm still considering what it would take
to push this call into a schedule_work invocation. I don't think a
call_usermodehelper_interruptible is what is needed.

- --
Mike Waychison
Sun Microsystems, Inc.
1 (650) 352-5299 voice
1 (416) 202-8336 voice

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NOTICE:  The opinions expressed in this email are held by me,
and may not represent the views of Sun Microsystems, Inc.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBh6tOdQs4kOxk3/MRAu84AJ0TQohRa+Ecgh4RTP5yUSQDLarfswCgnNwj
3aYs3DSsrp0dDKVF978cFJU=
=udKJ
-----END PGP SIGNATURE-----
