From: Chuck Lever III <chuck.lever@oracle.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 3/3] nfsd: fix nfsd_file_unhash_and_dispose
Date: Fri, 30 Sep 2022 20:23:12 +0000
Message-ID: <E06BAB8A-42DA-484E-96C7-EE7A9254476C@oracle.com>
In-Reply-To: <649ea8a13435734a54fc6755bd6599c2cacc3a53.camel@kernel.org>



> On Sep 30, 2022, at 3:42 PM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Fri, 2022-09-30 at 19:29 +0000, Chuck Lever III wrote:
>> 
>>> On Sep 30, 2022, at 3:15 PM, Jeff Layton <jlayton@kernel.org> wrote:
>>> 
>>> This function is called for two reasons:
>>> 
>>> We're either shutting down and purging the filecache, or we've gotten a
>>> notification about a file delete, so we want to go ahead and unhash it
>>> so that it'll get cleaned up when we close.
>>> 
>>> We're either walking the hashtable or doing a lookup in it and we
>>> don't take a reference in either case. What we want to do in both cases
>>> is to try and unhash the object and put it on the dispose list if that
>>> was successful. If it's no longer hashed, then we don't want to touch
>>> it, with the assumption being that something else is already cleaning
>>> up the sentinel reference.
>>> 
>>> Instead of trying to selectively decrement the refcount in this
>>> function, just unhash it, and if that was successful, move it to the
>>> dispose list. Then, the disposal routine will just clean that up as
>>> usual.
>>> 
>>> Also, just make this a void function, drop the WARN_ON_ONCE, and the
>>> comments about deadlocking since the nature of the purported deadlock
>>> is no longer clear.
>>> 
>>> Signed-off-by: Jeff Layton <jlayton@kernel.org>
>>> ---
>>> fs/nfsd/filecache.c | 32 ++++++--------------------------
>>> 1 file changed, 6 insertions(+), 26 deletions(-)
>>> 
>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>> index 58f4d9267f4a..16bd71a3894e 100644
>>> --- a/fs/nfsd/filecache.c
>>> +++ b/fs/nfsd/filecache.c
>>> @@ -408,19 +408,14 @@ nfsd_file_unhash(struct nfsd_file *nf)
>>> /*
>>> * Return true if the file was unhashed.
>>> */
>> 
>> If you're changing the function to return void, the above
>> comment is now stale.
>> 
>>> -static bool
>>> +static void
>>> nfsd_file_unhash_and_dispose(struct nfsd_file *nf, struct list_head *dispose)
>>> {
>>> 	trace_nfsd_file_unhash_and_dispose(nf);
>>> -	if (!nfsd_file_unhash(nf))
>>> -		return false;
>>> -	/* keep final reference for nfsd_file_lru_dispose */
>> 
>> This comment has been stale since nfsd_file_lru_dispose() was
>> renamed or removed. The only trouble I have is there isn't a
>> comment left that explains why we're not decrementing the hash
>> table reference here. ("don't have to" is enough to say about
>> it, but there should be something).
>> 
>> 
> 
> How about this?
> 
> +       if (nfsd_file_unhash(nf)) {
> +               /*
> +                * Unhashing was successful. Transfer it to the dispose list
> +                * so that we can put the sentinel reference for it later.
> +                */

Right idea, but I would say nothing more than "nfsd_file_dispose_list()
will put the sentinel reference later."


> +               nfsd_file_lru_remove(nf);
> +               list_add(&nf->nf_lru, dispose);
> +       }

I was staring at this earlier today and thinking it needed cleaning
up. This looks right to me.
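
For anyone following along, the ownership transfer discussed here can be
modelled in userspace. This is only a rough sketch, not the kernel code:
struct model_file and every model_* helper are hypothetical stand-ins
(an atomic flag for the hashed bit, a plain counter for nf_ref, a singly
linked list for the dispose list). It assumes the hash table holds one
"sentinel" reference per hashed entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct nfsd_file. */
struct model_file {
	atomic_bool hashed;       /* models the "hashed" flag */
	atomic_int ref;           /* models nf_ref, including the sentinel ref */
	struct model_file *next;  /* models linkage on the dispose list */
	bool freed;               /* set when the last reference is put */
};

/* Set up a hashed entry holding @refs references (sentinel included). */
static void model_file_init(struct model_file *f, int refs)
{
	atomic_init(&f->hashed, true);
	atomic_init(&f->ref, refs);
	f->next = NULL;
	f->freed = false;
}

/* Returns true only for the one caller that actually clears the flag. */
static bool model_unhash(struct model_file *f)
{
	return atomic_exchange(&f->hashed, false);
}

/* No refcount change here: the sentinel ref just moves to @dispose. */
static void model_unhash_and_dispose(struct model_file *f,
				     struct model_file **dispose)
{
	if (model_unhash(f)) {
		/* model_dispose_list() will put the sentinel reference later. */
		f->next = *dispose;
		*dispose = f;
	}
}

/* Put the sentinel ref of each queued entry; return how many were freed. */
static int model_dispose_list(struct model_file **dispose)
{
	int freed = 0;

	while (*dispose) {
		struct model_file *f = *dispose;

		*dispose = f->next;
		f->next = NULL;
		assert(atomic_load(&f->ref) > 0);
		if (atomic_fetch_sub(&f->ref, 1) == 1) {
			f->freed = true;
			freed++;
		}
	}
	return freed;
}
```

In this model a second model_unhash_and_dispose() call on the same entry
is a no-op, because whoever cleared the flag first already owns the
sentinel reference. An entry with other references outstanding survives
model_dispose_list() with its count reduced by one.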


> In this case, we're basically transferring the sentinel reference to the
> "dispose" list. Later, we'll call nfsd_file_dispose_list and drop it.
> 
> Now that we don't have such onerous spinlocking in this code, we might
> be able to just put each reference as we go instead of deferring it to a
> list and putting them all at the end. That's probably best done later in
> a separate patch however.
> 
> 
>>> -	if (refcount_dec_not_one(&nf->nf_ref))
>>> -		return true;
>>> -
>>> -	nfsd_file_lru_remove(nf);
>>> -	list_add(&nf->nf_lru, dispose);
>>> -	return true;
>>> +	if (nfsd_file_unhash(nf)) {
>>> +		nfsd_file_lru_remove(nf);
>>> +		list_add(&nf->nf_lru, dispose);
>>> +	}
>>> }
>>> 
>>> static void
>>> @@ -564,8 +559,6 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
>>> * @lock: LRU list lock (unused)
>>> * @arg: dispose list
>>> *
>>> - * Note this can deadlock with nfsd_file_cache_purge.
>>> - *
>>> * Return values:
>>> *   %LRU_REMOVED: @item was removed from the LRU
>>> *   %LRU_ROTATE: @item is to be moved to the LRU tail
>>> @@ -750,8 +743,6 @@ nfsd_file_close_inode(struct inode *inode)
>>> *
>>> * Walk the LRU list and close any entries that have not been used since
>>> * the last scan.
>>> - *
>>> - * Note this can deadlock with nfsd_file_cache_purge.
>>> */
>>> static void
>>> nfsd_file_delayed_close(struct work_struct *work)
>>> @@ -893,16 +884,12 @@ nfsd_file_cache_init(void)
>>> 	goto out;
>>> }
>>> 
>>> -/*
>>> - * Note this can deadlock with nfsd_file_lru_cb.
>>> - */
>>> static void
>>> __nfsd_file_cache_purge(struct net *net)
>>> {
>>> 	struct rhashtable_iter iter;
>>> 	struct nfsd_file *nf;
>>> 	LIST_HEAD(dispose);
>>> -	bool del;
>>> 
>>> 	rhashtable_walk_enter(&nfsd_file_rhash_tbl, &iter);
>>> 	do {
>>> @@ -912,14 +899,7 @@ __nfsd_file_cache_purge(struct net *net)
>>> 		while (!IS_ERR_OR_NULL(nf)) {
>>> 			if (net && nf->nf_net != net)
>>> 				continue;
>>> -			del = nfsd_file_unhash_and_dispose(nf, &dispose);
>>> -
>>> -			/*
>>> -			 * Deadlock detected! Something marked this entry as
>>> -			 * unhased, but hasn't removed it from the hash list.
>>> -			 */
>>> -			WARN_ON_ONCE(!del);
>>> -
>>> +			nfsd_file_unhash_and_dispose(nf, &dispose);
>>> 			nf = rhashtable_walk_next(&iter);
>>> 		}
>>> 
>>> -- 
>>> 2.37.3
>>> 
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> 
> -- 
> Jeff Layton <jlayton@kernel.org>

--
Chuck Lever




