public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever III <chuck.lever@oracle.com>, Neil Brown <neilb@suse.de>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition
Date: Mon, 31 Oct 2022 06:08:33 -0400	[thread overview]
Message-ID: <fb57d2cb6769dbc123e15e76ec2c23b1fa9f32be.camel@kernel.org> (raw)
In-Reply-To: <1737B8C1-5B93-4887-A673-F9AFA6ED32C0@oracle.com>

On Mon, 2022-10-31 at 02:51 +0000, Chuck Lever III wrote:
> 
> > On Oct 30, 2022, at 5:45 PM, NeilBrown <neilb@suse.de> wrote:
> > 
> > On Sat, 29 Oct 2022, Chuck Lever III wrote:
> > > 
> > > > On Oct 28, 2022, at 2:57 PM, Jeff Layton <jlayton@kernel.org> wrote:
> > > > 
> > > > The list_lru_add and list_lru_del functions use list_empty checks to see
> > > > whether the object is already on the LRU. That's fine in most cases, but
> > > > we occasionally repurpose nf_lru after unhashing. It's possible for an
> > > > LRU removal to remove it from a different list altogether if we lose a
> > > > race.
> 
> Can that issue be resolved by simply adding a "struct list_head nf_dispose"
> field? That might be more straightforward than adding conditional logic.
> 

Yes, though that would take more memory.

> 
> > > I've never seen that happen. lru field re-use is actually used in other
> > > places in the kernel. Shouldn't we try to find and fix such races?
> > > 
> > > Wasn't the idea to reduce the complexity of nfsd_file_put ?
> > > 
> > 
> > I think the nfsd filecache is different from those "other places"
> > because of nfsd_file_close_inode() and related code.  If I understand
> > correctly, nfsd can remove a file from the filecache while it is still
> > in use.
> 
> Not sure about that; I think nfsd_file_close_inode() is invoked only
> when a file is deleted. I could be remembering incorrectly, but that
> seems like a really difficult race to hit.
> 

I think that's correct. We have a notifier set up that tells us when the
link count goes to zero. At that point we want to unhash the thing and
try to get it out of the cache ASAP.

FWIW, the main impetus for this was NFS reexport. Keeping unlinked files
in the cache could cause the reexporting server to do a bunch of
unnecessary silly-renaming.

> > In this case it needs to be unhashed and removed from the lru -
> > and then added to a dispose list - while it might still be active for
> > some IO request.
> > 
> > Prior to Commit 8d0d254b15cc ("nfsd: fix nfsd_file_unhash_and_dispose")
> > unhash_and_dispose() wouldn't add to the dispose list unless the
> > refcount was one.  I'm not sure how racy this test was, but it would
> > mean that it is unlikely for an nfsd_file to be added to the dispose list
> > if it was still in use.
> > 
> > After that commit it seems more likely that a nfsd_file will be added to
> > a dispose list while it is in use.
> 
> After it's linked to a dispose list via nf_lru, list_lru_add won't put
> it on the LRU -- it becomes a no-op because nf_lru is now !empty. I
> think we would have seen LRU corruption pretty quickly. Re-reading
> Jeff's patch description, that might not be the problem he's trying
> to address here.
> 
> But, it would be easy to do some reality testing. I think you could
> add a WARN_ON or tracepoint in nfsd_file_free() or somewhere in the
> dispose-list path to catch an in-use nfsd_file?
> 
> 
> > As we add/remove things to a dispose list without a lock, we need to be
> > careful that no other thread will add the nfsd_file to an lru at the
> > same time.  refcounts will no longer provide that protection.  Maybe we
> > should restore the refcount protection, or maybe we need a bit as Jeff
> > has added here.
> 
> I'm not opposed to defensive changes, in general. This one seems to be
> adding significant complexity without a clear hazard. I'd like to have
> a better understanding of exactly what misbehavior is being addressed.
> 

The real danger is that we could end up trying to remove an entry from
the LRU, when it's actually sitting on a dispose list. Without some way
to distinguish where it is, we could cause memory corruption.

-- 
Jeff Layton <jlayton@kernel.org>

  reply	other threads:[~2022-10-31 10:08 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-28 18:57 [PATCH v3 0/4] nfsd: clean up refcounting in the filecache Jeff Layton
2022-10-28 18:57 ` [PATCH v3 1/4] nfsd: remove the pages_flushed statistic from filecache Jeff Layton
2022-10-28 19:41   ` Chuck Lever III
2022-10-28 18:57 ` [PATCH v3 2/4] nfsd: rework refcounting in filecache Jeff Layton
2022-10-28 19:49   ` Chuck Lever III
2022-10-28 20:13     ` Jeff Layton
2022-10-28 20:39       ` Chuck Lever III
2022-10-28 21:03         ` Jeff Layton
2022-10-28 21:23           ` Chuck Lever III
2022-10-31  9:40             ` Jeff Layton
2022-11-01 13:58       ` Chuck Lever III
2022-11-01 14:19         ` Jeff Layton
2022-10-28 18:57 ` [PATCH v3 3/4] nfsd: close race between unhashing and LRU addition Jeff Layton
2022-10-28 19:50   ` Chuck Lever III
2022-10-28 20:04     ` Jeff Layton
2022-10-30 21:45     ` NeilBrown
2022-10-31  2:51       ` Chuck Lever III
2022-10-31 10:08         ` Jeff Layton [this message]
2022-10-31 13:14           ` Chuck Lever III
2022-10-31 13:28             ` Jeff Layton
2022-10-31 10:01       ` Jeff Layton
2022-10-28 18:57 ` [PATCH v3 4/4] nfsd: start non-blocking writeback after adding nfsd_file to the LRU Jeff Layton
2022-10-28 19:50   ` Chuck Lever III
2022-10-28 20:30     ` Jeff Layton
2022-10-28 20:57       ` Chuck Lever III
2022-10-31  9:36         ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fb57d2cb6769dbc123e15e76ec2c23b1fa9f32be.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox