linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "bfields@fieldses.org" <bfields@fieldses.org>
To: Trond Myklebust <trondmy@hammerspace.com>
Cc: "chuck.lever@oracle.com" <chuck.lever@oracle.com>,
	"olivier@bm-services.com" <olivier@bm-services.com>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"carnil@debian.org" <carnil@debian.org>
Subject: Re: Kernel panic / list_add corruption when in nfsd4_run_cb_work
Date: Wed, 24 Nov 2021 11:10:38 -0500	[thread overview]
Message-ID: <20211124161038.GC30602@fieldses.org> (raw)
In-Reply-To: <0dbe620703eb27f36c02b4e001e74d67390bce9e.camel@hammerspace.com>

On Wed, Nov 24, 2021 at 03:59:47PM +0000, Trond Myklebust wrote:
> On Wed, 2021-11-24 at 10:29 -0500, Bruce Fields wrote:
> > On Mon, Nov 22, 2021 at 03:17:28PM +0000, Chuck Lever III wrote:
> > > 
> > > 
> > > > On Nov 22, 2021, at 4:15 AM, Olivier Monaco
> > > > <olivier@bm-services.com> wrote:
> > > > 
> > > > Hi,
> > > > 
> > > > I think my problem is related to this thread.
> > > > 
> > > > We are running a VMware vCenter platform running 9 groups of
> > > > virtual machines. Each group includes a VM with NFS for file
> > > > sharing and 3 VMs with NFS clients, so we are running 9
> > > > independent file servers.
> > > 
> > > I've opened https://bugzilla.linux-nfs.org/show_bug.cgi?id=371
> > > 
> > > Just a random thought: would enabling KASAN shed some light?
> > 
> > In fact, we've gotten reports from Redhat QE of a KASAN use-after-
> > free
> > warning in the laundromat thread, which I think might be the same
> > bug.
> > We've been getting occasional reports of problems here for a long
> > time,
> > but they've been very hard to reproduce.
> > 
> > After fooling with their reproducer, I think I finally have it.
> > Embarrasingly, it's nothing that complicated.  You can make it much
> > easier to reproduce by adding an msleep call after the vfs_setlease
> > in
> > nfs4_set_delegation.
> > 
> > If it's possible to run a patched kernel in production, you could try
> > the following, and I'd definitely be interested in any results.
> > 
> > Otherwise, you can probably work around the problem by disabling
> > delegations.  Something like
> > 
> >         sudo echo "fs.leases-enable = 0" >/etc/sysctl.d/nfsd-
> > workaround.conf
> > 
> > should do it.
> > 
> > Not sure if this fix is best or if there's something simpler.
> > 
> > --b.
> > 
> > commit 6de51237589e
> > Author: J. Bruce Fields <bfields@redhat.com>
> > Date:   Tue Nov 23 22:31:04 2021 -0500
> > 
> >     nfsd: fix use-after-free due to delegation race
> >     
> >     A delegation break could arrive as soon as we've called
> > vfs_setlease.  A
> >     delegation break runs a callback which immediately (in
> >     nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru. 
> > If we
> >     then exit nfs4_set_delegation without hashing the delegation, it
> > will be
> >     freed as soon as the callback is done with it, without ever being
> >     removed from del_recall_lru.
> >     
> >     Symptoms show up later as use-after-free or list corruption
> > warnings,
> >     usually in the laundromat thread.
> >     
> >     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index bfad94c70b84..8e063f49240b 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -5159,15 +5159,16 @@ nfs4_set_delegation(struct nfs4_client *clp,
> > struct svc_fh *fh,
> >                 locks_free_lock(fl);
> >         if (status)
> >                 goto out_clnt_odstate;
> > +
> >         status = nfsd4_check_conflicting_opens(clp, fp);
> > -       if (status)
> > -               goto out_unlock;
> >  
> >         spin_lock(&state_lock);
> >         spin_lock(&fp->fi_lock);
> > -       if (fp->fi_had_conflict)
> > +       if (status || fp->fi_had_conflict) {
> > +               list_del_init(&dp->dl_recall_lru);
> > +               dp->dl_time++;
> >                 status = -EAGAIN;
> > -       else
> > +       } else
> >                 status = hash_delegation_locked(dp, fp);
> >         spin_unlock(&fp->fi_lock);
> >         spin_unlock(&state_lock);
> 
> Why won't this leak a reference to the stid?

I'm not seeing it.

> Afaics nfsd_break_one_deleg() does take a reference before launching
> the callback daemon, and that reference is released when the
> delegation is later processed by the laundromat.

Right.  Basically, there are two "long-lived" references, one taken as
long as the callback's using it, one taken as long as it's "hashed" (see
hash_delegation_locked/unhash_delegation_locked).

In the -EAGAIN case above, we're holding a temporary reference which
will be dropped on exit from this function; if a callback's in progress,
it will then drop the final reference.

> Hmm... Isn't the real bug here rather that the laundromat is corrupting
> both the nn->client_lru and nn->del_recall_lru lists because it is
> using list_add() instead of list_move() when adding these objects to
> the reaplist?

If that were the problem, I think we'd be hitting it all the time.
Looking....  No, unhash_delegation_locked did a list_del_init().

(Is the WARN_ON there too ugly?)

--b.

  parent reply	other threads:[~2021-11-24 16:10 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-11  7:59 Kernel panic / list_add corruption when in nfsd4_run_cb_work Salvatore Bonaccorso
2020-10-12 14:26 ` J. Bruce Fields
2020-10-12 15:41   ` Salvatore Bonaccorso
2020-10-12 16:33     ` J. Bruce Fields
2020-10-18  9:39       ` Salvatore Bonaccorso
2021-10-06 18:46         ` Salvatore Bonaccorso
2021-11-22  9:15           ` Olivier Monaco
2021-11-22 15:17             ` Chuck Lever III
2021-11-24 15:29               ` Bruce Fields
2021-11-24 15:59                 ` Trond Myklebust
2021-11-24 16:10                   ` Trond Myklebust
2021-11-24 16:10                   ` bfields [this message]
2021-11-24 17:14                     ` Trond Myklebust
2021-11-24 22:06                       ` bfields
2021-11-24 22:17                         ` Trond Myklebust
2021-12-01 22:33                           ` bfields

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211124161038.GC30602@fieldses.org \
    --to=bfields@fieldses.org \
    --cc=carnil@debian.org \
    --cc=chuck.lever@oracle.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=olivier@bm-services.com \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).