From: NeilBrown <neilb@suse.de>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	Kinglong Mee <kinglongmee@gmail.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH RFC] NFSD: fix cannot umounting mount points under pseudo root
Date: Fri, 1 May 2015 12:23:33 +1000
Message-ID: <20150501122333.1476c999@notabene.brown>
In-Reply-To: <20150501020324.GP889@ZenIV.linux.org.uk>

On Fri, 1 May 2015 03:03:24 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote:

> On Fri, May 01, 2015 at 11:53:26AM +1000, NeilBrown wrote:
> > While writing that I began to wonder if lookup_one_len is really the right
> > interface to be used, even though it was introduced (in 2.3.99pre2-4)
> > specifically for nfsd.
> > The problem is that it assumes things about the filesystem.  So it makes
> > perfect sense for various filesystems to use it on themselves, but I'm not
> > sure how *right* it is for nfsd (or cachefiles etc) to use it on some
> > *other* filesystem.
> > The particular issue is that it avoids the d_revalidate call.
> > Both vfat and reiserfs have that call ... I wonder if that could ever be a
> > problem.
> > 
> > So I'm really leaning towards creating a variant of kern_path_mountpoint and
> > using a variant of that which takes a length.
> 
> NAK.  As in, "no way in hell".  And yes, lookup_one_len() *does* revalidate -
> RTFS(lookup_dcache), please.

Damn - I always seem to get lost when I'm following those call paths.
 lookup_one_len -> __lookup_hash -> lookup_dcache -> d_lookup,d_revalidate -> __d_lookup
                                 -> lookup_real -> i_op->lookup

I think I was confusing __lookup_hash with __d_lookup in my thoughts.

> 
> What kind of consistency warranties do callers expect, BTW?  You do realize
> that between iterate_dir() and callbacks an entry might have been removed
> and/or replaced?

For READDIR_PLUS, lookup_one_len is called on each name and it requires
i_mutex, so the code currently holds i_mutex over the whole sequence.
This is triggering a deadlock.

We could just grab/drop i_mutex over each call to lookup_one_len(), but that
sort of thing is usually frowned upon, and we don't really always *need*
i_mutex if the lookup can be served from the d_cache.
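
A minimal user-space sketch of the two locking options above, assuming a
pthread mutex as a stand-in for i_mutex and a plain array of names as a
stand-in for the directory. None of this is kernel code: struct locked_dir,
resolve_holding_lock() and resolve_lock_per_name() are hypothetical names
used only to illustrate the trade-off.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a directory; 'lock' stands in for i_mutex. */
struct locked_dir {
    pthread_mutex_t lock;
    const char **names;          /* entries the directory contains */
    size_t count;
    unsigned long acquisitions;  /* how many times 'lock' was taken */
};

static void dir_lock(struct locked_dir *dir)
{
    pthread_mutex_lock(&dir->lock);
    dir->acquisitions++;         /* updated while holding the lock */
}

static void dir_unlock(struct locked_dir *dir)
{
    pthread_mutex_unlock(&dir->lock);
}

/* Caller must hold dir->lock, mirroring how lookup_one_len needs i_mutex. */
static int lookup_held(const struct locked_dir *dir, const char *name)
{
    for (size_t i = 0; i < dir->count; i++)
        if (strcmp(dir->names[i], name) == 0)
            return 1;
    return 0;
}

/* Option 1: hold the lock over the whole sequence of lookups. */
size_t resolve_holding_lock(struct locked_dir *dir,
                            const char **wanted, size_t n)
{
    size_t found = 0;

    assert(dir && wanted);
    dir_lock(dir);
    for (size_t i = 0; i < n; i++)
        found += lookup_held(dir, wanted[i]);
    dir_unlock(dir);
    return found;
}

/* Option 2: grab and drop the lock around each individual lookup. */
size_t resolve_lock_per_name(struct locked_dir *dir,
                             const char **wanted, size_t n)
{
    size_t found = 0;

    assert(dir && wanted);
    for (size_t i = 0; i < n; i++) {
        dir_lock(dir);
        found += lookup_held(dir, wanted[i]);
        dir_unlock(dir);
    }
    return found;
}
```

Both functions resolve the same names. The first takes the lock once but
keeps it held for the whole loop, so anything slow inside a lookup blocks
every other user of the directory. The second takes the lock once per name,
which bounds the hold time at the cost of n acquisitions.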

So I'm looking for the best way to perform the lookup without holding i_mutex
for too long.

It sounds like you are suggesting something like lookup_one_len_unlocked(),
which .... uhm...

I was going to say it would use lookup_dcache, but that needs i_mutex.
lookup_dcache calls d_lookup(), which doesn't seem to really need i_mutex,
and d_revalidate().
Does the latter need i_mutex?  I don't think so.
So maybe it is just how d_lookup handles failure that needs i_mutex.

So lookup_one_len_unlocked() could call d_lookup and d_revalidate and if
that all worked nicely, return the result. If it didn't, grab i_mutex and try
again??
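
The "try the cache first, take the lock only on failure" idea can be
sketched in user space as below. This is a model under stated assumptions,
not a proposed kernel implementation: dir_lock stands in for i_mutex,
cache_lock stands in for whatever internal synchronization the dcache has of
its own, the 'valid' flag stands in for a successful d_revalidate, and every
name here (struct cached_dir, cached_lookup(), real_lookup(),
lookup_unlocked_model()) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8

struct cache_entry {
    const char *name;   /* NULL means the slot is empty */
    int valid;          /* 0 means the entry would fail revalidation */
};

struct cached_dir {
    pthread_mutex_t dir_lock;    /* stands in for i_mutex */
    pthread_mutex_t cache_lock;  /* stands in for the cache's own locking */
    struct cache_entry cache[CACHE_SLOTS];
    const char **on_disk;        /* names the model "filesystem" holds */
    size_t on_disk_count;
    unsigned long slow_lookups;  /* protected by dir_lock */
};

/* Fast path: cache lookup plus revalidation; never touches dir_lock. */
static int cached_lookup(struct cached_dir *dir, const char *name)
{
    int hit = 0;

    pthread_mutex_lock(&dir->cache_lock);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct cache_entry *e = &dir->cache[i];
        if (e->name && strcmp(e->name, name) == 0) {
            hit = e->valid;
            break;
        }
    }
    pthread_mutex_unlock(&dir->cache_lock);
    return hit;
}

/* Record a name in the cache, reusing a stale slot for it if one exists. */
static void cache_insert(struct cached_dir *dir, const char *name)
{
    struct cache_entry *slot = NULL;

    pthread_mutex_lock(&dir->cache_lock);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct cache_entry *e = &dir->cache[i];
        if (e->name && strcmp(e->name, name) == 0) {
            slot = e;
            break;
        }
        if (!e->name && !slot)
            slot = e;
    }
    if (slot) {                  /* a full cache simply skips caching */
        slot->name = name;
        slot->valid = 1;
    }
    pthread_mutex_unlock(&dir->cache_lock);
}

/* Slow path: ask the "filesystem". Caller must hold dir_lock. */
static int real_lookup(struct cached_dir *dir, const char *name)
{
    dir->slow_lookups++;
    for (size_t i = 0; i < dir->on_disk_count; i++) {
        if (strcmp(dir->on_disk[i], name) == 0) {
            cache_insert(dir, dir->on_disk[i]);
            return 1;
        }
    }
    return 0;
}

/* Try without dir_lock; on a miss, take dir_lock and try again. */
int lookup_unlocked_model(struct cached_dir *dir, const char *name)
{
    int found;

    assert(dir && name);
    if (cached_lookup(dir, name))
        return 1;

    pthread_mutex_lock(&dir->dir_lock);
    found = cached_lookup(dir, name);   /* may have been filled meanwhile */
    if (!found)
        found = real_lookup(dir, name);
    pthread_mutex_unlock(&dir->dir_lock);
    return found;
}
```

In this model a name that is already cached and valid never takes dir_lock,
so only cache misses and stale entries pay for the lock. Whether the real
d_lookup/d_revalidate pair is safe to call without i_mutex is exactly the
open question above; the sketch assumes it is.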

Or do we just wear the cost of taking i_mutex for each name in the directory
during READDIR_PLUS?

Thanks,
NeilBrown

