All of lore.kernel.org
 help / color / mirror / Atom feed
From: Scott Mayhew <smayhew@redhat.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: jlayton@kernel.org, linux-nfs@vger.kernel.org
Subject: Re: [PATCH] nfsd: CB_RECALL can race with FREE_STATEID
Date: Tue, 30 Apr 2019 14:58:45 -0400	[thread overview]
Message-ID: <20190430185845.GG15226@coeurl.usersys.redhat.com> (raw)
In-Reply-To: <20190418213730.GA1891@fieldses.org>

On Thu, 18 Apr 2019, J. Bruce Fields wrote:

> On Thu, Apr 18, 2019 at 04:50:24PM -0400, Scott Mayhew wrote:
> > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > 
> > > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > > > While trying to track down some issues involving large numbers of
> > > > delegations being recalled/revoked, I caught the server setting
> > > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding to
> > > > CB_RECALLs.  It turns out that the client had already done a
> > > > TEST_STATEID and FREE_STATEID for a delegation being recalled by the
> > > > time it received the CB_RECALL.
> > > 
> > > That's interesting, thanks!
> > > 
> > > This exception seems awfully narrow, though.
> > > 
> > > If we get back any NFS-level error at all, then I think the callback
> > > channel is working (am I wrong?)
> > 
> > Correct, if the client replies with either NFS4ERR_DELAY or
> > NFS4ERR_BAD_STATEID, the server will retry 1 time (see dl_retries).
> > After that, we fall thru and nfsd4_cb_recall_done() returns -1 which
> > causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.
> > 
> > > and telling the client to set up a new
> > > one is probably not going to help.  The best we can do is probably just
> > > give up
> > 
> > That's what the patch is essentially doing.  Or are you saying don't
> > even bother with the checks but still return 1 so we don't set the
> > SEQ4_STATUS_CB_PATH_DOWN flag?
> 
> Right, I don't see any point returning -1 (which ends up setting
> CB_PATH_DOWN) in any case where we get an nfs-level error.  If the
> client got so far as returning an error, then the callback path is
> working.
> 
> I'm not sure exactly what errors *should* result in CB_PATH_DOWN,
> though.  ETIMEDOUT, ENOTCONN, EIO?

I'm not sure either.  Looking at
call_status/call_timeout/rpc_check_timeout, it looks to me like ENOTCONN
will be translated to ETIMEDOUT because nfsd4_run_cb_work sets the 
RPC_TASK_SOFTCONN flag in the call to rpc_call_async.

It looks like call_status can return EHOSTDOWN, ENETDOWN, EHOSTUNREACH,
ENETUNREACH, and EPERM... should those be handled as well?

-Scott

> And maybe we should be checking for
> those in nfsd4_cb_done, and do away with the convention that -1 means
> CB_PATH_DOWN.  I don't think there's a reason individual callback ops
> would need different rules for when to mark the callback channel down.
> 
> --b.
> 
> > 
> > > and let the client deal with the ensuing
> > > RECALLABLE_STATE_REVOKED flag.
> > 
> > The client's already dealing with the RECALLABLE_STATE_REVOKED flag,
> > that's why it sent a TEST_STATEID and FREE_STATEID before it got this
> > particular CB_RECALL.  The idea behind the patch is to not give the
> > state manager on the client additional work by setting CB_PATH_DOWN when
> > the callback channel is clearly working...
> > 
> > -Scott
> > > 
> > > --b.
> > > 
> > > > 
> > > > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > > > ---
> > > >  fs/nfsd/nfs4state.c | 8 ++++++++
> > > >  1 file changed, 8 insertions(+)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index 6a45fb00c5fc..e88e429133a8 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -3958,6 +3958,14 @@ static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
> > > >  			rpc_delay(task, 2 * HZ);
> > > >  			return 0;
> > > >  		}
> > > > +		/*
> > > > +		 * Race: client may have done a FREE_STATEID before
> > > > +		 * receiving the CB_RECALL.
> > > > +		 */
> > > > +		if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID &&
> > > > +				refcount_read(&dp->dl_stid.sc_count) == 1 &&
> > > > +				list_empty(&dp->dl_recall_lru))
> > > > +			return 1;
> > > >  		/*FALLTHRU*/
> > > >  	default:
> > > >  		return -1;
> > > > -- 
> > > > 2.17.2

  reply	other threads:[~2019-04-30 18:58 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-18 13:24 [PATCH] nfsd: CB_RECALL can race with FREE_STATEID Scott Mayhew
2019-04-18 15:13 ` J. Bruce Fields
2019-04-18 20:50   ` Scott Mayhew
2019-04-18 21:37     ` J. Bruce Fields
2019-04-30 18:58       ` Scott Mayhew [this message]
2019-04-30 19:03         ` Trond Myklebust
2019-05-02 11:35           ` Scott Mayhew
2019-05-02 11:49             ` Trond Myklebust
2019-04-18 22:03     ` Trond Myklebust
2019-04-18 23:42       ` bfields
2019-04-30 18:46       ` Scott Mayhew

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190430185845.GG15226@coeurl.usersys.redhat.com \
    --to=smayhew@redhat.com \
    --cc=bfields@fieldses.org \
    --cc=jlayton@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.