From: NeilBrown <neilb@suse.de>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
Kinglong Mee <kinglongmee@gmail.com>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH RFC] NFSD: fix cannot umounting mount points under pseudo root
Date: Fri, 1 May 2015 12:23:33 +1000
Message-ID: <20150501122333.1476c999@notabene.brown>
In-Reply-To: <20150501020324.GP889@ZenIV.linux.org.uk>
On Fri, 1 May 2015 03:03:24 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote:
> On Fri, May 01, 2015 at 11:53:26AM +1000, NeilBrown wrote:
> > While writing that I began to wonder if lookup_one_len is really the right
> > interface to be used, even though it was introduced (in 2.3.99pre2-4)
> > specifically for nfsd.
> > The problem is that it assumes things about the filesystem. So it makes
> > perfect sense for various filesystems to use it on themselves, but I'm not
> > sure how *right* it is for nfsd (or cachefiles etc) to use it on some
> > *other* filesystem.
> > The particular issue is that it avoids the d_revalidate call.
> > Both vfat and reiserfs have that call ... I wonder if that could ever be a
> > problem.
> >
> > So I'm really leaning towards creating a variant of kern_path_mountpoint and
> > using a variant of that which takes a length.
>
> NAK. As in, "no way in hell". And yes, lookup_one_len() *does* revalidate -
> RTFS(lookup_dcache), please.
Damn - I always seem to get lost when I'm following those call paths.
  lookup_one_len -> __lookup_hash -> lookup_dcache -> d_lookup,d_revalidate -> __d_lookup
                                  -> lookup_real -> i_op->lookup
I think I was confusing __lookup_hash with __d_lookup in my thoughts.
>
> What kind of consistency warranties do callers expect, BTW? You do realize
> that between iterate_dir() and callbacks an entry might have been removed
> and/or replaced?
For READDIR_PLUS, lookup_one_len is called on each name and it requires
i_mutex, so the code currently holds i_mutex over the whole sequence.
This is triggering a deadlock.
We could just grab/drop i_mutex over each call to lookup_one_len(), but that
sort of thing is usually frowned upon, and we don't really always *need*
i_mutex if the lookup can be served from the d_cache.
So I'm looking for the best way to perform the lookup without holding i_mutex
for too long.
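
To make the trade-off concrete, here is a rough single-threaded userspace model (not kernel code) of the two locking shapes. Every rdp_* name is hypothetical. The "lock" is only a flag guarded by assertions, standing in for i_mutex, and rdp_lookup_locked() stands in for lookup_one_len().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory inode; "locked" models i_mutex. */
struct rdp_dir {
    int locked;
    unsigned long lock_acquisitions;
    unsigned long lookups;
};

static void rdp_lock(struct rdp_dir *d)
{
    assert(!d->locked);          /* the model lock is not recursive */
    d->locked = 1;
    d->lock_acquisitions++;
}

static void rdp_unlock(struct rdp_dir *d)
{
    assert(d->locked);
    d->locked = 0;
}

/* Stand-in for lookup_one_len(): the caller must already hold the lock. */
static void rdp_lookup_locked(struct rdp_dir *d, const char *name)
{
    (void)name;
    assert(d->locked);
    d->lookups++;
}

/* Shape 1: hold the lock over the whole sequence of names. */
static void rdp_hold_across(struct rdp_dir *d, const char *const *names, size_t n)
{
    rdp_lock(d);
    for (size_t i = 0; i < n; i++)
        rdp_lookup_locked(d, names[i]);
    rdp_unlock(d);
}

/* Shape 2: grab and drop the lock around each individual lookup. */
static void rdp_lock_per_name(struct rdp_dir *d, const char *const *names, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        rdp_lock(d);
        rdp_lookup_locked(d, names[i]);
        rdp_unlock(d);
    }
}
```

For n names, the first shape takes the lock once but keeps it held across everything done between lookups. The second takes it n times but never holds it outside a single lookup.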
It sounds like you are suggesting something like lookup_one_len_unlocked(),
which .... uhm...
I was going to say uses lookup_dcache, but that needs i_mutex.
It calls d_lookup(), which doesn't seem to really need i_mutex, and
d_revalidate().
Does the latter need i_mutex? I don't think so.
So maybe it is just how d_lookup handles failure that needs i_mutex.
So lookup_one_len_unlocked() could call d_lookup and d_revalidate and if
that all worked nicely, return the result. If it didn't, grab i_mutex and try
again??
Or do we just wear the cost of taking i_mutex for each name in the directory
during READDIR_PLUS?
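
The "try the cache first, fall back to the lock" idea can be sketched as a userspace model. This is only an illustration under stated assumptions, not the kernel implementation. All model_* names are hypothetical, the lock is a plain flag standing in for i_mutex, model_cache_find() plays the role of d_lookup(), the "valid" field plays the role of a d_revalidate() result, and model_lookup_slow() plays the role of the filesystem's ->lookup. Real dcache concerns such as reference counts, RCU and negative dentries are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_CACHE_SLOTS 8

/* Hypothetical stand-in for a cached dentry. */
struct model_dentry {
    char name[32];
    int valid;      /* what a d_revalidate-like check would report */
    int in_use;
};

/* Hypothetical stand-in for a directory; "locked" models i_mutex. */
struct model_dir {
    struct model_dentry cache[MODEL_CACHE_SLOTS];
    int locked;
    unsigned long lock_acquisitions;
};

static void model_lock(struct model_dir *d)
{
    assert(!d->locked);
    d->locked = 1;
    d->lock_acquisitions++;
}

static void model_unlock(struct model_dir *d)
{
    assert(d->locked);
    d->locked = 0;
}

/* Cache probe: needs no lock in this model. */
static struct model_dentry *model_cache_find(struct model_dir *d, const char *name)
{
    for (size_t i = 0; i < MODEL_CACHE_SLOTS; i++)
        if (d->cache[i].in_use && strcmp(d->cache[i].name, name) == 0)
            return &d->cache[i];
    return NULL;
}

/* Slow path: the caller must hold the lock. Re-checks the cache first. */
static struct model_dentry *model_lookup_slow(struct model_dir *d, const char *name)
{
    assert(d->locked);
    if (strlen(name) >= sizeof(d->cache[0].name))
        return NULL;                    /* name too long for the model */

    struct model_dentry *e = model_cache_find(d, name);
    for (size_t i = 0; !e && i < MODEL_CACHE_SLOTS; i++)
        if (!d->cache[i].in_use)
            e = &d->cache[i];
    if (!e)
        return NULL;                    /* model cache is full */

    strcpy(e->name, name);              /* length checked above */
    e->in_use = 1;
    e->valid = 1;                       /* freshly looked up, so valid */
    return e;
}

/* Optimistic lookup: only take the lock if the cached answer is unusable. */
static struct model_dentry *model_lookup_unlocked(struct model_dir *d, const char *name)
{
    struct model_dentry *e = model_cache_find(d, name);
    if (e && e->valid)
        return e;                       /* fast path, lock never touched */

    model_lock(d);
    e = model_lookup_slow(d, name);
    model_unlock(d);
    return e;
}
```

In this model a directory whose names are all cached and valid is walked without taking the lock at all. A miss or a failed validity check costs one lock acquisition for that name only.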
Thanks,
NeilBrown