public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Alexander Viro <viro@math.psu.edu>
Cc: Linus Torvalds <torvalds@transmeta.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	autofs@linux.kernel.org
Subject: Re: Fix for SMP deadlock in autofs4
Date: Fri, 20 Apr 2001 18:46:14 -0700	[thread overview]
Message-ID: <20010420184613.B12962@goop.org> (raw)
In-Reply-To: <Pine.LNX.4.31.0104201158290.5632-100000@penguin.transmeta.com> <Pine.GSO.4.21.0104201516480.21455-100000@weyl.math.psu.edu>
In-Reply-To: <Pine.GSO.4.21.0104201516480.21455-100000@weyl.math.psu.edu>; from viro@math.psu.edu on Fri, Apr 20, 2001 at 03:53:45PM -0400

[-- Attachment #1: Type: text/plain, Size: 3727 bytes --]

On Fri, Apr 20, 2001 at 03:53:45PM -0400, Alexander Viro wrote:
> > Why are we doing the mntget/dget at all? We hold the spinlock, so we know
> > they are not going away. Not doing the mntget/dget means that we (a) run
> > faster and (b) don't have the bug, because we don't need to put the damn
> > things.
> > 
> > Comments?
> 
> It looks like you are right, but I wonder how the hell did that code
> happen at all. Looks like somewhere around 2.4.0-test10-pre* dcache_lock
> was moved out of is_tree_busy() and covered dget/dput. Hmm... Might be
> my fault - I don't remember doing that, but...

I did it.  I couldn't see a point in continiously taking and releasing
the dcache lock, since it just increased complexity and expire is not
a performance-critical path (ie, it happens rarely).

I kept the dget/put out caution and ignorance, but they're clearly
problematic.  I'm happy to drop them if holding dcache_lock is enough
to keep the tree stable while I traverse it.

> Removing that will require an obvious change in is_tree_busy() (shift
> count by 1). However, the real question is WTF are we trying to 
> get in autofs4_expire() - it returns dentry without grabbing a
> reference to it. The only thing that saves us is that we have a
> ramfs-style situation (dentries are pinned until we rmdir) and
> everything up to the point where we silently forget about dentry
> is covered by BKL. Since ->rmdir() is under BKL too it's enough,
> but... Eww... 

The dentry it returns is always an autofs4 dentry, and autofs4 always
keeps a refcount on its dentries like ramfs (because like ramfs,
autofs4 exists only in the dcache).

> Jeremy, what are you really trying to do there? is_tree_busy()
> seems to be written in assumption that mnt/dentry is not a
> mountpoint but root of a subtree with something mounted on its
> leaves. And autofs4_expire() traverses the list of root's
> subdirectories, picks one that has nothing busy mounted in
> _its_ subdirectories and essentially pass the name to caller.
> Which sends that name (of first-level subdirectory) to
> userland.

Exactly right.

> Is that what you really want there? It looks very odd - why don't we pass
> the names of actual mountpoints? What's wrong with the case when foo/bar
> is busy, but foo/baz is not?

Say for example you have an autofs4 filesystem mounted on /net.  When you
do a "cd /net/host", all of host's exported NFS filesystems are mounted
on the directory /net/host; obviously the mountpoint /net/host is an
autofs4 directory.

autofs4_expire traverses the directories in its root and finds the ones
which are currently unused and have been idle for some time.  Since all
the filesystems mounted under /net/host are part of the same logical
tree, it examines them as a single unit so they can be umounted as a
single unit.

Note that /net/host may not itself be a mountpoint.  If host doesn't
export / but only, say, /home and /usr/local, then there'll be a tree of
skeleton directories in the autofs4 filesystem to create the paths
up to the mountpoints (so there'll be /net/host/{home,usr/local}).
But because everything under /net/host is treated as a single unit, it's
only correct for autofs4_expire to return /net/host, not /net/host/home
or /net/host/usr/local.

The simplifying assumption I make is that there's a single root directory
with a number of sub-directories; each subdirectory is treated as
a single unit.  They general case would be to mark some directories
as being the root of an atomic set, and other directories simply being
structural, but the need has never come up (and it can be worked around by
having nested autofs filesystems).

	J

[-- Attachment #2: Type: application/pgp-signature, Size: 240 bytes --]

  reply	other threads:[~2001-04-21  1:46 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-04-20  8:49 Fix for SMP deadlock in autofs4 Jeremy Fitzhardinge
2001-04-20  9:00 ` Alexander Viro
2001-04-20  9:11   ` Jeremy Fitzhardinge
2001-04-20 19:01 ` Linus Torvalds
2001-04-20 19:53   ` Alexander Viro
2001-04-21  1:46     ` Jeremy Fitzhardinge [this message]
2001-04-21  5:59       ` Linus Torvalds
2001-04-21  6:21         ` Alexander Viro
2001-04-21  7:04           ` Jeremy Fitzhardinge
2001-04-21  7:02         ` Jeremy Fitzhardinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20010420184613.B12962@goop.org \
    --to=jeremy@goop.org \
    --cc=autofs@linux.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@transmeta.com \
    --cc=viro@math.psu.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox