From: NeilBrown <neilb@suse.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Kinglong Mee <kinglongmee@gmail.com>,
"J. Bruce Fields" <bfields@fieldses.org>,
"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
linux-fsdevel@vger.kernel.org,
Trond Myklebust <trond.myklebust@primarydata.com>
Subject: Re: [PATCH 10/10 v7] nfsd: Allows user un-mounting filesystem where nfsd exports base on
Date: Fri, 24 Jul 2015 12:05:15 +1000 [thread overview]
Message-ID: <20150724120515.023322d4@noble> (raw)
In-Reply-To: <20150713044553.GN17109@ZenIV.linux.org.uk>
On Mon, 13 Jul 2015 05:45:53 +0100 Al Viro <viro@ZenIV.linux.org.uk>
wrote:
> On Mon, Jul 13, 2015 at 02:20:59PM +1000, NeilBrown wrote:
>
> > Actually, with that change to pin_kill, this side of things becomes
> > really easy.
> > All expXXX_pin_kill needs to do is call your new cache_delete_entry.
> > If that doesn't cause the entry to be put, then something else has a
> > temporary reference which will be put soon. In any case, pin_kill()
> > will wait long enough, but not indefinitely.
> > No need for kref_get_unless_zero() or any of that.
>
> No. You are seriously misunderstanding what ->kill() is for and what the
> existing instances are doing. Again, there is no promise whatsoever that
> the object containing fs_pin instance will *survive* past ->kill().
> At all.
>
> RTFS, please. What is sorely missing in this recurring patchset is a clear
> description of lifetime rules and ordering (who waits for whom and how long).
> For all the objects involved.
Good point. Let me try.
Entries in the sunrpc 'cache' each contain some 'key' fields and some
'content' fields.
The key fields are set by the .init() method when the entry is
created, which can happen in a call to sunrpc_cache_lookup() or to
sunrpc_cache_update().
The content fields are set by the .update() method when a value is
provided for the cache entry. This happens in sunrpc_cache_update();
A cache entry can be not-valid, negative, or valid.
It starts non-valid when sunrpc_cache_lookup() fails to find the search
key and so creates a new entry (and sets up the key with .init).
It then transitions to either negative or valid.
This can happen through sunrpc_cache_update() or through an error when
instigating an up-call, in which case it goes to negative.
Once it is negative or valid, it stays that way until it is released.
If sunrpc_cache_update is called on an entry that is not not-valid,
then a new entry is created and the old one is marked as expired.
A cache search will find the new one before the old.
The vfsmount object is involved in two separate caches.
It is part of the content of svc_expkey and part of the key of
svc_export.
An svc_expkey entry is only ever held transiently. It is held while an
update is being processed, and it is held briefly while mapping a
filehandle to a mnt+dentry.
Firstly part of the filehandle is used to acccess the svc_expkey cache
to get the vfsmnt. Then that vfsmnt plus the client identification is
looked up in the svc_export cache to find the export options. Then the
svc_expkey cache entry is released.
So it is only held during a lookup of another cache. This can take an
arbitrarily long time as the lookup can go to rpc.mountd in user-space.
The svc_export cache entry can be held for the duration of a single NFS
request. It is stored in the 'struct svc_fh' file handle structure
which is release at the end of handling the request.
The vfsmnt and dentry are only "used" to validate the filehandle and
then while that filehandle is still active.
To avoid having unmount hang while nfsd is performing an upcall to
mountd, we need to legitimize the vfsmnt in the svc_expkey. If that
fails, exp_find_key() can fail and we would never perform the lookup on
svc_export.
If it succeeds, then the legitimacy can be handed over to the svc_export
cache entry, which could then continue to own it, or could hand it on
to the svc_fh.
The latter is *probably* cleanest.
i.e. an svc_fh should always own a reference to exp->ex_path.mnt, and
fh_put must put it.
exp_find_key needs to legitimize ek->ek_path.mnt, so a successful
return from exp_find implies an active refernece to ->ex_path.mnt.
If exp_find fails, it needs to mnt_put(ek->ek_path.mnt).
All callers of exp_find need to mnt_put(exp->ex_path.mnt) when they
decide not to use the exp, and must otherwise store it in an svc_fh.
With this, pin_kill() should only need to wait for exp_find_key() to
discover that it cannot legitimize the mount, or for expkey_path() to
replace the key via sunrpc_cache_update(), or maybe for cache_clean()
to discard an old entry.
Hopefully that makes it all clear.
Thanks,
NeilBrown
next prev parent reply other threads:[~2015-07-24 2:05 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-11 12:46 [PATCH 00/10 v7] NFSD: Pin to vfsmount for nfsd exports cache Kinglong Mee
2015-07-11 12:49 ` [PATCH 05/10 v7] sunrpc: Store cache_detail in seq_file's private, directly Kinglong Mee
2015-07-11 12:49 ` [PATCH 06/10 v7] sunrpc/nfsd: Remove redundant code by exports seq_operations functions Kinglong Mee
2015-07-11 12:51 ` [PATCH 08/10 v7] sunrpc: New helper cache_delete_entry for deleting cache_head directly Kinglong Mee
[not found] ` <55A11010.6050005-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-07-11 12:47 ` [PATCH 01/10 v7] fs_pin: Initialize value for fs_pin explicitly Kinglong Mee
2015-07-11 12:47 ` [PATCH 02/10 v7] fs_pin: Export functions for specific filesystem Kinglong Mee
2015-07-11 12:48 ` [PATCH 03/10 v7] path: New helpers path_get_pin/path_put_unpin for path pin Kinglong Mee
2015-07-11 12:48 ` [PATCH 04/10 v7] fs: New helper legitimize_mntget() for getting a legitimize mnt Kinglong Mee
2015-07-11 12:50 ` [PATCH 07/10 v7] sunrpc: Switch to using list_head instead single list Kinglong Mee
[not found] ` <55A11112.8080502-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-07-11 12:54 ` Christoph Hellwig
2015-07-13 1:30 ` NeilBrown
2015-07-13 8:27 ` Kinglong Mee
2015-07-11 12:51 ` [PATCH 09/10 v7] sunrpc: Support get_ref/put_ref for reference change in cache_head Kinglong Mee
2015-07-11 12:52 ` [PATCH 10/10 v7] nfsd: Allows user un-mounting filesystem where nfsd exports base on Kinglong Mee
[not found] ` <55A111A8.2040701-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-07-13 3:39 ` NeilBrown
2015-07-13 4:02 ` Al Viro
[not found] ` <20150713040258.GM17109-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2015-07-13 5:19 ` NeilBrown
2015-07-13 6:02 ` Al Viro
2015-07-13 4:20 ` NeilBrown
2015-07-13 4:45 ` Al Viro
[not found] ` <20150713044553.GN17109-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2015-07-13 5:21 ` NeilBrown
2015-07-13 6:02 ` NeilBrown
2015-07-13 6:08 ` Al Viro
[not found] ` <20150713060802.GP17109-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2015-07-13 6:32 ` NeilBrown
2015-07-13 6:43 ` Al Viro
2015-07-15 3:49 ` NeilBrown
2015-07-15 4:57 ` Al Viro
2015-07-15 6:51 ` NeilBrown
2015-07-24 2:05 ` NeilBrown [this message]
2015-07-27 2:28 ` Kinglong Mee
[not found] ` <55B59764.1020506-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-07-27 2:51 ` NeilBrown
2015-07-27 3:17 ` Kinglong Mee
2015-07-15 21:07 ` J. Bruce Fields
[not found] ` <20150715210756.GE21669-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2015-07-15 23:40 ` NeilBrown
2015-07-16 20:51 ` J. Bruce Fields
[not found] ` <20150716205148.GC10673-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2015-07-21 21:58 ` NeilBrown
2015-07-22 15:08 ` J. Bruce Fields
[not found] ` <20150722150840.GH22718-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2015-07-23 23:46 ` export table lookup: was " NeilBrown
2015-07-24 19:48 ` J. Bruce Fields
2015-07-25 0:40 ` NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150724120515.023322d4@noble \
--to=neilb@suse.com \
--cc=bfields@fieldses.org \
--cc=kinglongmee@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@primarydata.com \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox