linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Trond Myklebust <trondmy@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: Anna Schumaker <Anna.Schumaker@netapp.com>, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 2/2] NFSv4: Fix a state manager thread deadlock regression
Date: Sun, 17 Sep 2023 22:27:47 -0400	[thread overview]
Message-ID: <cd36fda66e2ef007bcf23dcec25f5bb8cd68dc27.camel@kernel.org> (raw)
In-Reply-To: <169500033147.8274.17464264746177119631@noble.neil.brown.name>

On Mon, 2023-09-18 at 11:25 +1000, NeilBrown wrote:
> On Mon, 18 Sep 2023, trondmy@kernel.org wrote:
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > 
> > Commit 4dc73c679114 reintroduces the deadlock that was fixed by
> > commit
> > aeabb3c96186 ("NFSv4: Fix a NFSv4 state manager deadlock") because
> > it
> > prevents the setup of new threads to handle reboot recovery, while
> > the
> > older recovery thread is stuck returning delegations.
> 
> I hadn't realised that the state manager thread did these two
> distinct
> tasks and might need to be doing both at once - requiring two
> threads.
> Thanks for highlighting that.
> 
> It seems to me that even with the new code, we can still get a
> deadlock
> when swap is enabled, as we only ever run one thread in that case.
> Is that correct, or did I miss something?

That is correct. I did try to point this out when you were submitting
the swap patches, but my understanding was that you were assuming that
delegations would not be enabled when swap is enabled.

> 
> Maybe we need two threads - a state manager and a delegation recall
> handler.  And when swap is enabled, both need to be running
> permanently
> ??

Possibly.

> 
> > 
> > Fixes: 4dc73c679114 ("NFSv4: keep state manager thread active if
> > swap is enabled")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> > ---
> >  fs/nfs/nfs4proc.c  |  4 +++-
> >  fs/nfs/nfs4state.c | 38 ++++++++++++++++++++++++++------------
> >  2 files changed, 29 insertions(+), 13 deletions(-)
> > 
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 5deeaea8026e..a19e809cad16 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -10652,7 +10652,9 @@ static void nfs4_disable_swap(struct inode
> > *inode)
> >          */
> >         struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
> >  
> > -       nfs4_schedule_state_manager(clp);
> > +       set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
> > +       clear_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp->cl_state);
> > +       wake_up_var(&clp->cl_state);
> >  }
> >  
> >  static const struct inode_operations nfs4_dir_inode_operations = {
> > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > index 0bc160fbabec..5751a6886da4 100644
> > --- a/fs/nfs/nfs4state.c
> > +++ b/fs/nfs/nfs4state.c
> > @@ -1209,16 +1209,26 @@ void nfs4_schedule_state_manager(struct
> > nfs_client *clp)
> >  {
> >         struct task_struct *task;
> >         char buf[INET6_ADDRSTRLEN + sizeof("-manager") + 1];
> > +       struct rpc_clnt *clnt = clp->cl_rpcclient;
> > +       bool swapon = false;
> >  
> > -       if (clp->cl_rpcclient->cl_shutdown)
> > +       if (clnt->cl_shutdown)
> >                 return;
> >  
> >         set_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state);
> > -       if (test_and_set_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp-
> > >cl_state) != 0) {
> > -               wake_up_var(&clp->cl_state);
> > -               return;
> > +
> > +       if (atomic_read(&clnt->cl_swapper)) {
> > +               swapon =
> > !test_and_set_bit(NFS4CLNT_MANAGER_AVAILABLE,
> > +                                          &clp->cl_state);
> > +               if (!swapon) {
> > +                       wake_up_var(&clp->cl_state);
> > +                       return;
> > +               }
> >         }
> > -       set_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state);
> > +
> > +       if (test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp-
> > >cl_state) != 0)
> > +               return;
> > +
> >         __module_get(THIS_MODULE);
> >         refcount_inc(&clp->cl_count);
> >  
> > @@ -1235,8 +1245,9 @@ void nfs4_schedule_state_manager(struct
> > nfs_client *clp)
> >                         __func__, PTR_ERR(task));
> >                 if (!nfs_client_init_is_complete(clp))
> >                         nfs_mark_client_ready(clp, PTR_ERR(task));
> > +               if (swapon)
> > +                       clear_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp-
> > >cl_state);
> >                 nfs4_clear_state_manager_bit(clp);
> > -               clear_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp-
> > >cl_state);
> >                 nfs_put_client(clp);
> >                 module_put(THIS_MODULE);
> >         }
> > @@ -2748,22 +2759,25 @@ static int nfs4_run_state_manager(void
> > *ptr)
> >  
> >         allow_signal(SIGKILL);
> >  again:
> > -       set_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state);
> >         nfs4_state_manager(clp);
> > -       if (atomic_read(&cl->cl_swapper)) {
> > +
> > +       if (test_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp->cl_state) &&
> > +           !test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp-
> > >cl_state)) {
> >                 wait_var_event_interruptible(&clp->cl_state,
> >                                             
> > test_bit(NFS4CLNT_RUN_MANAGER,
> >                                                       &clp-
> > >cl_state));
> > -               if (atomic_read(&cl->cl_swapper) &&
> > -                   test_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state))
> > +               if (!atomic_read(&cl->cl_swapper))
> > +                       clear_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp-
> > >cl_state);
> > +               if (refcount_read(&clp->cl_count) > 1 &&
> > !signalled())
> >                         goto again;
> >                 /* Either no longer a swapper, or were signalled */
> > +               clear_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp-
> > >cl_state);
> > +               nfs4_clear_state_manager_bit(clp);
> >         }
> > -       clear_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp->cl_state);
> >  
> >         if (refcount_read(&clp->cl_count) > 1 && !signalled() &&
> >             test_bit(NFS4CLNT_RUN_MANAGER, &clp->cl_state) &&
> > -           !test_and_set_bit(NFS4CLNT_MANAGER_AVAILABLE, &clp-
> > >cl_state))
> > +           !test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp-
> > >cl_state))
> >                 goto again;
> >  
> >         nfs_put_client(clp);
> > -- 
> > 2.41.0
> > 
> > 
> 


  reply	other threads:[~2023-09-18  2:29 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-17 23:05 [PATCH 1/2] NFSv4: Fix a nfs4_state_manager() race trondmy
2023-09-17 23:05 ` [PATCH 2/2] NFSv4: Fix a state manager thread deadlock regression trondmy
2023-09-18  1:25   ` NeilBrown
2023-09-18  2:27     ` Trond Myklebust [this message]
2023-09-20 19:38   ` Anna Schumaker
2023-09-21  0:15     ` Trond Myklebust
2023-09-22 17:22       ` Olga Kornievskaia
2023-09-22 19:05         ` Trond Myklebust
2023-09-22 21:00           ` Olga Kornievskaia
2023-09-22 21:06             ` Trond Myklebust
2023-09-24 17:08               ` Trond Myklebust
2023-09-26 14:55                 ` Anna Schumaker
2023-09-26 14:31               ` Olga Kornievskaia
2023-09-25 22:28           ` NeilBrown
2023-09-25 22:44             ` Trond Myklebust
2023-09-25 23:04               ` NeilBrown
2023-09-25 23:20                 ` Trond Myklebust
2023-09-18  1:17 ` [PATCH 1/2] NFSv4: Fix a nfs4_state_manager() race NeilBrown
2023-09-18  2:20   ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cd36fda66e2ef007bcf23dcec25f5bb8cd68dc27.camel@kernel.org \
    --to=trondmy@kernel.org \
    --cc=Anna.Schumaker@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).