From: Al Viro <viro@ZenIV.linux.org.uk>
To: Ian Kent <raven@themaw.net>
Cc: Nick Piggin <npiggin@suse.de>,
Trond Myklebust <Trond.Myklebust@netapp.com>,
David Howells <dhowells@redhat.com>,
Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 0/3] Fixes for vfs-scale and vfs-automount
Date: Thu, 24 Feb 2011 01:58:17 +0000 [thread overview]
Message-ID: <20110224015817.GQ22723@ZenIV.linux.org.uk> (raw)
In-Reply-To: <1297779902.13007.86.camel@perseus>
On Tue, Feb 15, 2011 at 10:25:02PM +0800, Ian Kent wrote:
> I'm seeing a reference being gained, or perhaps not being released
> quickly enough after a close(2) on a file handle open on a mount point
> being umounted seen in the backtrace [1] below. I can't say if there is
> more than one like this because the BUG() stops the umounting. I'm
> thinking of modifying the code to try and continue to see what else I
> can find out.
>
> I get this quite reliably running the test described above.
>
> I've looked long and hard at the vfs-scale path walking code (and the
> vfs-automount code) and I can't yet see how this could be possible.
>
> Here are some observations I've made so far.
>
> In this segement of code in autofs4:
> int autofs4_d_manage(struct dentry *dentry, bool mounting_here, bool rcu_walk)
> {
> struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
>
> DPRINTK("dentry=%p %.*s",
> dentry, dentry->d_name.len, dentry->d_name.name);
>
> /* The daemon never waits. */
> if (autofs4_oz_mode(sbi) || mounting_here) {
> if (!d_mountpoint(dentry))
> return -EISDIR;
> return 0;
> }
>
> /* We need to sleep, so we need pathwalk to be in ref-mode */
> if (rcu_walk)
> return -ECHILD;
[snip]
> If I move the
>
> /* We need to sleep, so we need pathwalk to be in ref-mode */
> if (rcu_walk)
> return -ECHILD;
>
> above the
>
> /* The daemon never waits. */
> if (autofs4_oz_mode(sbi) || mounting_here) {
> if (!d_mountpoint(dentry))
> return -EISDIR;
> return 0;
> }
>
> I almost never see the problem in the first stage of the test, that
> being the nobrowse configuration, but almost always see it in the second
> stage, the so called browse configuration. Which amounts to saying that
> the problem appears to happen more often when mount point directories in
> the automount headachy exist before being mounted on and are not removed
> when they are expired. Unfortunately it isn't as simple as that either
> since the automount map itself is fairly complex. Still, I thought it
> worth mentioning.
Curious... The only caller affected by that transposition is
__follow_mount_rcu() and it would have to
* be called from do_lookup()
* have path->dentry pointing to a mountpoint
* being called by the daemon.
So basically you are forcing the daemon to try and drop out of RCU mode
when it reaches a mountpoint on autofs.
> The most recent curious thing is that if I change the test a bit to use
> bind mounts on local paths instead of NFS mounts I don't get this BUG()
> at all. Don't get me wrong, I'm not saying this is necessarily an NFS
> problem, it may be that the reduced latency of bind mounts changes the
> test behavior and hides the bug. So I'd appreciate some help from NFS
> folks in case it is something that needs to be changed in NFS, since I
> can't see anything that might cause it in the NFS code myself.
It might be "kicks the thing out of RCU mode as soon as we reach into a
directory on NFS"... Try binds down into sysfs; ->d_revalidate() will
act there as well.
> If anyone would like to try and run the test themselves I'll work on
> decoupling it from the RHTS infrastructure within which it is currently
> implemented.
Ho-hum... I can reach RHTS, but I'd rather do that at home boxen, if
possible... Has it been reproduced on UP boxen with SMP kernels, BTW?
next prev parent reply other threads:[~2011-02-24 1:58 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20110118040449.23109.33071.stgit@localhost6.localdomain6>
[not found] ` <1295420817.3076.24.camel@perseus>
2011-02-15 14:25 ` [PATCH 0/3] Fixes for vfs-scale and vfs-automount Ian Kent
2011-02-23 7:22 ` Ian Kent
2011-02-23 16:37 ` Linus Torvalds
2011-02-24 1:58 ` Al Viro [this message]
2011-02-24 3:03 ` Ian Kent
2011-02-24 3:14 ` Al Viro
2011-02-24 3:28 ` Ian Kent
2011-02-24 3:58 ` Al Viro
2011-02-24 5:47 ` Al Viro
2011-02-24 7:23 ` Ian Kent
2011-02-24 6:34 ` Ian Kent
2011-02-24 7:07 ` Al Viro
2011-02-24 10:07 ` Ian Kent
2011-02-24 14:59 ` Al Viro
2011-02-24 15:18 ` Al Viro
2011-02-25 3:07 ` Ian Kent
2011-02-24 19:10 ` Al Viro
2011-02-24 10:21 ` Ian Kent
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110224015817.GQ22723@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=Trond.Myklebust@netapp.com \
--cc=akpm@linux-foundation.org \
--cc=dhowells@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=npiggin@suse.de \
--cc=raven@themaw.net \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).