From: Chuck Lever <chuck.lever@oracle.com>
To: Dai Ngo <dai.ngo@oracle.com>, Christoph Hellwig <hch@lst.de>
Cc: jlayton@kernel.org, neilb@ownmail.net, okorniev@redhat.com,
tom@talpey.com, linux-nfs@vger.kernel.org
Subject: Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
Date: Mon, 3 Nov 2025 15:15:40 -0500
Message-ID: <40c969cf-9898-48f5-88bb-d6bed7b54a9c@oracle.com>
In-Reply-To: <f4ddebf0-7039-47c9-8e20-9622c8b33ddd@oracle.com>

On 11/3/25 2:14 PM, Dai Ngo wrote:
>> and I disagree that fencing is harsh, because
>> NFS4ERR_RETRY_UNCACHED_REP is supposed to be quite rare, and of course
>> there are other ways this error can happen.
>
> Yes, this error should be rare. But is fencing the client a correct
> solution for it? IMHO, NFS4ERR_RETRY_UNCACHED_REP means the client has
> received the request and replied to the server; the server just did
> not see the reply, which can happen for many reasons.

Fencing seems appropriate when there is a clear indication that the
client and server state are out of sync. The question is why, and how do
we prevent that situation from occurring? And, when we get into this
state, what is the correct recovery?

I don't see NFSD doing this short-circuit when processing a CB_RECALL
response, for instance.

> I think in this case we should just
> mark the back channel down and let the client recover it, instead of
> fencing the client.

Clearly the backchannel needs to recover properly from
NFS4ERR_RETRY_UNCACHED_REP, and if it goes into a loop, something is not
right. I don't think this is the correct fix for looping, either.

I don't understand why, after the server indicates a backchannel fault,
the client and server don't replace the session. The server is trying
to re-use what is obviously an incorrect slot sequence ID; it shouldn't
expect any different result by retrying.

So, yes, there are one or more real bugs here. But ignoring a sign that
state synchrony has been lost is not the right fix.
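
For reference, the slot replay rule behind the retry argument above can
be sketched as follows. This is a minimal illustration that assumes the
session slot semantics described in RFC 8881; struct slot, slot_check()
and the enum values are hypothetical names and are not taken from NFSD
or from the Linux client. On the backchannel the server is the requester
and the client is the replier that runs a check of this kind.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not NFSD or client code. */
enum slot_result {
	SLOT_NEW_REQUEST,        /* seqid advanced by one: execute the request */
	SLOT_REPLAY_CACHED,      /* same seqid, reply was cached: resend it */
	SLOT_RETRY_UNCACHED_REP, /* same seqid, but no reply was cached */
	SLOT_SEQ_MISORDERED,     /* any other seqid */
};

struct slot {
	uint32_t seqid;     /* seqid of the last request accepted on this slot */
	bool reply_cached;  /* whether the reply to that request was cached */
};

/* Replier-side check of an incoming sequence ID against one slot. */
static enum slot_result slot_check(struct slot *s, uint32_t req_seqid,
				   bool cachethis)
{
	/* Unsigned arithmetic, so the comparison also holds across wraparound. */
	if (req_seqid == s->seqid + 1) {
		s->seqid = req_seqid;
		s->reply_cached = cachethis;
		return SLOT_NEW_REQUEST;
	}

	/* A retry of the last request: the slot state does not change. */
	if (req_seqid == s->seqid)
		return s->reply_cached ? SLOT_REPLAY_CACHED
				       : SLOT_RETRY_UNCACHED_REP;

	return SLOT_SEQ_MISORDERED;
}
```

In this sketch, presenting the same seqid again on a slot whose last
reply was not cached returns SLOT_RETRY_UNCACHED_REP every time, because
a retry never modifies the slot. Only a request carrying the next seqid,
or a replacement session with fresh slots, changes the outcome.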
--
Chuck Lever