public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever <chuck.lever@oracle.com>, Neil Brown <neilb@suse.de>,
	Olga Kornievskaia <okorniev@redhat.com>,
	Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] nfsd: always handle RPC_SIGNALLED earlier in nfsd4_cb_sequence_done()
Date: Thu, 23 Jan 2025 09:57:55 -0500	[thread overview]
Message-ID: <cc09b2ddeae3fc83602d854e692e45ca1a799abc.camel@kernel.org> (raw)
In-Reply-To: <5e3f1c01-8e3f-4a18-86b4-c9494d71f30a@oracle.com>

On Thu, 2025-01-23 at 09:37 -0500, Chuck Lever wrote:
> On 1/22/25 11:50 AM, Jeff Layton wrote:
> > On Wed, 2025-01-22 at 11:06 -0500, Chuck Lever wrote:
> > > On 1/22/25 10:44 AM, Jeff Layton wrote:
> > > > On Wed, 2025-01-22 at 10:20 -0500, Chuck Lever wrote:
> > > > > On 1/22/25 10:10 AM, Jeff Layton wrote:
> > > > > > The v4.0 client always restarts the callback when the connection is shut
> > > > > > down (which is indicated by RPC_SIGNALLED()). The RPC is then requeued
> > > > > > and the result eventually should complete (or be aborted).
> > > > > > 
> > > > > > The v4.1 code instead processes the result and only later decides to
> > > > > > restart the call. Even more problematic is the fact that it releases the
> > > > > > slot beforehand. The restarted call may get a new slot, which would
> > > > > > could break DRC handling.
> > > > > 
> > > > > "break DRC handling" -- I'd like to understand this.
> > > > > 
> > > > > NFSD always sets cachethis to false in CB_SEQUENCE, so there is no DRC
> > > > > for these operations. The only thing the client saves is the slot
> > > > > sequence number IIUC.
> > > > > 
> > > > > Is retrying an uncached operation via a different slot a problem?
> > > > > 
> > > > 
> > > > Ahh, I missed that we always set cachethis to false. So, there is
> > > > probably now a problem with the DRC after all. Still, I don't see a
> > > > good argument for processing the CB_SEQUENCE result, when we intend to
> > > > retransmit the call anyway.
> > > 
> > > I expect that the rationale is that the slot sequence number needs to be
> > > advanced appropriately before the slot can be used again.
> > 
> > Once RPC_SIGNALLED returns true, the callback code can either trust the
> > result of the rpc_task or not. If it's going to trust that result, then
> > there is no need to restart the call.
> > 
> > If it's not going to trust it, then the RPC call might as well have not
> > happened, and there is no need to increment the slot sequence number or
> > do anything else.
> > 
> > Is my understanding wrong here?
> 
> It might be.
> 
> The callback client is careful to initialize cb_seq_status to 1 before
> it sends the RPC call. Thus if cb_seq_status is any value other than 1
> in nfsd4_cb_sequence_done(), that means an RPC reply was received
> successfully and the XDR decoder was successful.
> 
> RPC_SIGNALLED doesn't have anything to do with whether a reply arrived
> or can be trusted.
>


It may indicate that the reply from the client is no longer applicable.

CB_SEQUENCE is for synchronizing the communications on the callback
channel. If RPC_SIGNALLED() is true, then that means that the callback
client has been torn down and will be recreated, so we probably don't
want to be doing anything that might cause that to happen again on the
new one.

Today, this code both processes the CB_SEQUENCE reply _and_ restarts
the call. That seems wrong. It sounds like we really only want to
restart the call when the client didn't respond, or the response was
malformed (-ESERVERFAULT case).

> nfsd4_cb_sequence_done() needs to process the reply unconditionally,
> otherwise the server and client will disagree on the slot sequence
> number for that slot.
> 

In that case, there is no need to restart the call just because it was
RPC_SIGNALLED.
-- 
Jeff Layton <jlayton@kernel.org>

      reply	other threads:[~2025-01-23 14:57 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-22 15:10 [PATCH] nfsd: always handle RPC_SIGNALLED earlier in nfsd4_cb_sequence_done() Jeff Layton
2025-01-22 15:20 ` Chuck Lever
2025-01-22 15:35   ` Tom Talpey
2025-01-22 15:44   ` Jeff Layton
2025-01-22 16:06     ` Chuck Lever
2025-01-22 16:50       ` Jeff Layton
2025-01-23 14:37         ` Chuck Lever
2025-01-23 14:57           ` Jeff Layton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cc09b2ddeae3fc83602d854e692e45ca1a799abc.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=Dai.Ngo@oracle.com \
    --cc=chuck.lever@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=okorniev@redhat.com \
    --cc=tom@talpey.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox