public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever <chuck.lever@oracle.com>, Neil Brown <neilb@suse.de>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] nfsd: always handle RPC_SIGNALLED earlier in nfsd4_cb_sequence_done()
Date: Wed, 22 Jan 2025 10:44:43 -0500	[thread overview]
Message-ID: <6196ed07e6db1bf664d2f10280cbe0e80857583e.camel@kernel.org> (raw)
In-Reply-To: <e85a0515-7972-4428-9270-a982073adcb4@oracle.com>

On Wed, 2025-01-22 at 10:20 -0500, Chuck Lever wrote:
> On 1/22/25 10:10 AM, Jeff Layton wrote:
> > The v4.0 client always restarts the callback when the connection is shut
> > down (which is indicated by RPC_SIGNALLED()). The RPC is then requeued
> > and the result eventually should complete (or be aborted).
> > 
> > The v4.1 code instead processes the result and only later decides to
> > restart the call. Even more problematic is the fact that it releases the
> > slot beforehand. The restarted call may get a new slot, which would
> > could break DRC handling.
> 
> "break DRC handling" -- I'd like to understand this.
> 
> NFSD always sets cachethis to false in CB_SEQUENCE, so there is no DRC
> for these operations. The only thing the client saves is the slot
> sequence number IIUC.
> 
> Is retrying an uncached operation via a different slot a problem?
> 

Ahh, I missed that we always set cachethis to false. So, there is
probably now a problem with the DRC after all. Still, I don't see a
good argument for processing the CB_SEQUENCE result, when we intend to
retransmit the call anyway.

> 
> > Change nfsd4_cb_sequence_done() such that RPC_SIGNALLED() is handled the
> > same way irrespective of the minorversion. Check it early, before
> > processing the CB_SEQUENCE result, and immediately restart the call.
> 
> It would be lovely to have some test cases that exercise the server's
> backchannel retry logic.
> 

It sure would. I think that sort of thing would require a synthetic
client (like pynfs).

Since we're on the subject too, pynfs is broken on Fedora 41. That
moved to python 3.13, which dropped support for the the xdrlib module.
There is a forked xdrlib3 module which may be a suitable replacement,
but I need to see if I can get it to work.

> 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >   fs/nfsd/nfs4callback.c | 32 ++++++++++++++++----------------
> >   1 file changed, 16 insertions(+), 16 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> > index 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..05afdf94c6478c7d698b059fa33dd9e7af49b91e 100644
> > --- a/fs/nfsd/nfs4callback.c
> > +++ b/fs/nfsd/nfs4callback.c
> > @@ -1334,21 +1334,24 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   	struct nfsd4_session *session = clp->cl_cb_session;
> >   	bool ret = true;
> >   
> > -	if (!clp->cl_minorversion) {
> > -		/*
> > -		 * If the backchannel connection was shut down while this
> > -		 * task was queued, we need to resubmit it after setting up
> > -		 * a new backchannel connection.
> > -		 *
> > -		 * Note that if we lost our callback connection permanently
> > -		 * the submission code will error out, so we don't need to
> > -		 * handle that case here.
> > -		 */
> > -		if (RPC_SIGNALLED(task))
> > -			goto need_restart;
> > +	/*
> > +	 * If the backchannel connection was shut down while this
> > +	 * task was queued, resubmit it after setting up a new backchannel
> > +	 * connection.
> > +	 *
> > +	 * If the backchannel connection is permanently lost, the submission
> > +	 * code will error out, so there is no need to handle that case here.
> > +	 *
> > +	 * For the v4.1+ case, do this without releasing the slot or doing
> > +	 * any further processing. It's possible that the restarted call will
> > +	 * be a retransmission of an earlier call to the server and that will
> > +	 * help ensure that the DRC works correctly.
> > +	 */
> > +	if (RPC_SIGNALLED(task))
> > +		goto need_restart;
> >   
> > +	if (!clp->cl_minorversion)
> >   		return true;
> > -	}
> >   
> >   	if (cb->cb_held_slot < 0)
> >   		goto need_restart;
> > @@ -1403,9 +1406,6 @@ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback
> >   	}
> >   	trace_nfsd_cb_free_slot(task, cb);
> >   	nfsd41_cb_release_slot(cb);
> > -
> > -	if (RPC_SIGNALLED(task))
> > -		goto need_restart;
> >   out:
> >   	return ret;
> >   retry_nowait:
> > 
> > ---
> > base-commit: a8acd0de47f22619d62d548b86bcfc9a4de2b2c6
> > change-id: 20250122-nfsd-6-14-77c1ad5d9f01
> > 
> > Best regards,
> 
> 

-- 
Jeff Layton <jlayton@kernel.org>

  parent reply	other threads:[~2025-01-22 15:44 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-22 15:10 [PATCH] nfsd: always handle RPC_SIGNALLED earlier in nfsd4_cb_sequence_done() Jeff Layton
2025-01-22 15:20 ` Chuck Lever
2025-01-22 15:35   ` Tom Talpey
2025-01-22 15:44   ` Jeff Layton [this message]
2025-01-22 16:06     ` Chuck Lever
2025-01-22 16:50       ` Jeff Layton
2025-01-23 14:37         ` Chuck Lever
2025-01-23 14:57           ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6196ed07e6db1bf664d2f10280cbe0e80857583e.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=Dai.Ngo@oracle.com \
    --cc=chuck.lever@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=okorniev@redhat.com \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox