* [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-01 18:25 Dai Ngo
@ 2025-11-01 18:25 ` Dai Ngo
0 siblings, 0 replies; 22+ messages in thread
From: Dai Ngo @ 2025-11-01 18:25 UTC (permalink / raw)
To: chuck.lever, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
An NFS4ERR_RETRY_UNCACHED_REP error means the client has seen and replied
to the layout recall, so no fencing is needed.
Fixes: f99d4fbdae67 ("nfsd: add SCSI layout support")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
fs/nfsd/nfs4layouts.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 683bd1130afe..cfce8bfd396c 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -715,6 +715,12 @@ nfsd4_cb_layout_done(struct nfsd4_callback *cb, struct rpc_task *task)
nfsd_file_put(fl);
}
return 1;
+ case -NFS4ERR_RETRY_UNCACHED_REP:
+ /*
+ * client has seen and replied to this request,
+ * no need to fence the client
+ */
+ return 1;
case -NFS4ERR_NOMATCHING_LAYOUT:
trace_nfsd_layout_recall_done(&ls->ls_stid.sc_stateid);
task->tk_status = 0;
--
2.47.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 0/3] NFSD: Fix problem with nfsd4_scsi_fence_client
@ 2025-11-01 18:51 Dai Ngo
2025-11-01 18:51 ` [PATCH 1/3] NFSD: Fix problem with nfsd4_scsi_fence_client using the wrong reservation type Dai Ngo
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Dai Ngo @ 2025-11-01 18:51 UTC (permalink / raw)
To: chuck.lever, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
[resend with subject line]
This patch series includes the following:
. Fix problem with nfsd4_scsi_fence_client using the wrong reservation
type.
. No need to fence the client when the server receives
NFS4ERR_RETRY_UNCACHED_REP.
This error means the client has seen and replied to the layout recall.
. Add trace point to track SCSI preempt operation.
fs/nfsd/blocklayout.c | 7 +++++--
fs/nfsd/nfs4layouts.c | 6 ++++++
fs/nfsd/trace.h | 36 ++++++++++++++++++++++++++++++++++++
3 files changed, 47 insertions(+), 2 deletions(-)
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/3] NFSD: Fix problem with nfsd4_scsi_fence_client using the wrong reservation type
2025-11-01 18:51 [PATCH 0/3] NFSD: Fix problem with nfsd4_scsi_fence_client Dai Ngo
@ 2025-11-01 18:51 ` Dai Ngo
2025-11-03 11:42 ` Christoph Hellwig
2025-11-01 18:51 ` [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error Dai Ngo
2025-11-01 18:51 ` [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation Dai Ngo
2 siblings, 1 reply; 22+ messages in thread
From: Dai Ngo @ 2025-11-01 18:51 UTC (permalink / raw)
To: chuck.lever, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
The reservation type argument for the pr_preempt call should match the
one used in nfsd4_block_get_device_info_scsi.
Fixes: f99d4fbdae67 ("nfsd: add SCSI layout support")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
fs/nfsd/blocklayout.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
index fde5539cf6a6..13ff7f474190 100644
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -342,7 +342,7 @@ nfsd4_scsi_fence_client(struct nfs4_layout_stateid *ls, struct nfsd_file *file)
struct block_device *bdev = file->nf_file->f_path.mnt->mnt_sb->s_bdev;
bdev->bd_disk->fops->pr_ops->pr_preempt(bdev, NFSD_MDS_PR_KEY,
- nfsd4_scsi_pr_key(clp), 0, true);
+ nfsd4_scsi_pr_key(clp), PR_EXCLUSIVE_ACCESS_REG_ONLY, true);
}
const struct nfsd4_layout_ops scsi_layout_ops = {
--
2.47.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-01 18:51 [PATCH 0/3] NFSD: Fix problem with nfsd4_scsi_fence_client Dai Ngo
2025-11-01 18:51 ` [PATCH 1/3] NFSD: Fix problem with nfsd4_scsi_fence_client using the wrong reservation type Dai Ngo
@ 2025-11-01 18:51 ` Dai Ngo
2025-11-03 11:45 ` Christoph Hellwig
2025-11-01 18:51 ` [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation Dai Ngo
2 siblings, 1 reply; 22+ messages in thread
From: Dai Ngo @ 2025-11-01 18:51 UTC (permalink / raw)
To: chuck.lever, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
An NFS4ERR_RETRY_UNCACHED_REP error means the client has seen and replied
to the layout recall, so no fencing is needed.
Fixes: f99d4fbdae67 ("nfsd: add SCSI layout support")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
fs/nfsd/nfs4layouts.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 683bd1130afe..cfce8bfd396c 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -715,6 +715,12 @@ nfsd4_cb_layout_done(struct nfsd4_callback *cb, struct rpc_task *task)
nfsd_file_put(fl);
}
return 1;
+ case -NFS4ERR_RETRY_UNCACHED_REP:
+ /*
+ * client has seen and replied to this request,
+ * no need to fence the client
+ */
+ return 1;
case -NFS4ERR_NOMATCHING_LAYOUT:
trace_nfsd_layout_recall_done(&ls->ls_stid.sc_stateid);
task->tk_status = 0;
--
2.47.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation.
2025-11-01 18:51 [PATCH 0/3] NFSD: Fix problem with nfsd4_scsi_fence_client Dai Ngo
2025-11-01 18:51 ` [PATCH 1/3] NFSD: Fix problem with nfsd4_scsi_fence_client using the wrong reservation type Dai Ngo
2025-11-01 18:51 ` [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error Dai Ngo
@ 2025-11-01 18:51 ` Dai Ngo
2025-11-02 15:40 ` Chuck Lever
2 siblings, 1 reply; 22+ messages in thread
From: Dai Ngo @ 2025-11-01 18:51 UTC (permalink / raw)
To: chuck.lever, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
Add a trace point to print the client IP address, device name and
status of the SCSI pr_preempt command.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
---
fs/nfsd/blocklayout.c | 5 ++++-
fs/nfsd/trace.h | 36 ++++++++++++++++++++++++++++++++++++
2 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/blocklayout.c b/fs/nfsd/blocklayout.c
index 13ff7f474190..5fb0b6d59fdc 100644
--- a/fs/nfsd/blocklayout.c
+++ b/fs/nfsd/blocklayout.c
@@ -13,6 +13,7 @@
#include "pnfs.h"
#include "filecache.h"
#include "vfs.h"
+#include "trace.h"
#define NFSDDBG_FACILITY NFSDDBG_PNFS
@@ -340,9 +341,11 @@ nfsd4_scsi_fence_client(struct nfs4_layout_stateid *ls, struct nfsd_file *file)
{
struct nfs4_client *clp = ls->ls_stid.sc_client;
struct block_device *bdev = file->nf_file->f_path.mnt->mnt_sb->s_bdev;
+ int status;
- bdev->bd_disk->fops->pr_ops->pr_preempt(bdev, NFSD_MDS_PR_KEY,
+ status = bdev->bd_disk->fops->pr_ops->pr_preempt(bdev, NFSD_MDS_PR_KEY,
nfsd4_scsi_pr_key(clp), PR_EXCLUSIVE_ACCESS_REG_ONLY, true);
+ trace_nfsd_pnfs_fence(clp, bdev->bd_disk->disk_name, status);
}
const struct nfsd4_layout_ops scsi_layout_ops = {
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 6e2c8e2aab10..6a8c840bcf2a 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -2613,6 +2613,42 @@ DEFINE_EVENT(nfsd_vfs_getattr_class, __name, \
DEFINE_NFSD_VFS_GETATTR_EVENT(nfsd_vfs_getattr);
DEFINE_NFSD_VFS_GETATTR_EVENT(nfsd_vfs_statfs);
+DECLARE_EVENT_CLASS(nfsd_pnfs_class,
+ TP_PROTO(
+ const struct nfs4_client *clp,
+ const char *dev,
+ int error
+ ),
+ TP_ARGS(clp, dev, error),
+ TP_STRUCT__entry(
+ __string(dev, dev)
+ __array(unsigned char, addr, sizeof(struct sockaddr_in6))
+ __field(u32, error)
+ ),
+ TP_fast_assign(
+ memcpy(__entry->addr, &clp->cl_addr,
+ sizeof(struct sockaddr_in6));
+ __assign_str(dev);
+ __entry->error = error;
+ ),
+ TP_printk("client=%pISpc dev=%s error=%d",
+ __entry->addr,
+ __get_str(dev),
+ __entry->error
+ )
+);
+
+#define DEFINE_NFSD_PNFS_ERR_EVENT(name) \
+DEFINE_EVENT(nfsd_pnfs_class, nfsd_pnfs_##name, \
+ TP_PROTO( \
+ const struct nfs4_client *clp, \
+ const char *dev, \
+ int error \
+ ), \
+ TP_ARGS(clp, dev, error))
+
+DEFINE_NFSD_PNFS_ERR_EVENT(fence);
+
#endif /* _NFSD_TRACE_H */
#undef TRACE_INCLUDE_PATH
--
2.47.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation.
2025-11-01 18:51 ` [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation Dai Ngo
@ 2025-11-02 15:40 ` Chuck Lever
2025-11-03 20:44 ` Dai Ngo
2025-11-04 0:32 ` Dai Ngo
0 siblings, 2 replies; 22+ messages in thread
From: Chuck Lever @ 2025-11-02 15:40 UTC (permalink / raw)
To: Dai Ngo, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
On 11/1/25 2:51 PM, Dai Ngo wrote:
> + TP_STRUCT__entry(
> + __string(dev, dev)
> + __array(unsigned char, addr, sizeof(struct sockaddr_in6))
> + __field(u32, error)
> + ),
> + TP_fast_assign(
> + memcpy(__entry->addr, &clp->cl_addr,
> + sizeof(struct sockaddr_in6));
> + __assign_str(dev);
> + __entry->error = error;
> + ),
> + TP_printk("client=%pISpc dev=%s error=%d",
> + __entry->addr,
> + __get_str(dev),
> + __entry->error
> + )
Have a look at the nfsd_cs_slot_class event class (fs/nfsd/trace.h) to
see how to use the trace subsystem's native sockaddr handling.
Do you want to record anything else here? The cl_boot/cl_id, perhaps? I
guess there's no XID available... but is there a relevant state ID?
--
Chuck Lever
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/3] NFSD: Fix problem with nfsd4_scsi_fence_client using the wrong reservation type
2025-11-01 18:51 ` [PATCH 1/3] NFSD: Fix problem with nfsd4_scsi_fence_client using the wrong reservation type Dai Ngo
@ 2025-11-03 11:42 ` Christoph Hellwig
0 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2025-11-03 11:42 UTC (permalink / raw)
To: Dai Ngo; +Cc: chuck.lever, jlayton, neilb, okorniev, tom, hch, linux-nfs
On Sat, Nov 01, 2025 at 11:51:33AM -0700, Dai Ngo wrote:
> - nfsd4_scsi_pr_key(clp), 0, true);
> + nfsd4_scsi_pr_key(clp), PR_EXCLUSIVE_ACCESS_REG_ONLY, true);
Please avoid the overly long line. Otherwise this looks good to me.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-01 18:51 ` [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error Dai Ngo
@ 2025-11-03 11:45 ` Christoph Hellwig
2025-11-03 14:16 ` Chuck Lever
0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2025-11-03 11:45 UTC (permalink / raw)
To: Dai Ngo; +Cc: chuck.lever, jlayton, neilb, okorniev, tom, hch, linux-nfs
On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
> NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
> to the layout recall, no fencing is needed.
RFC 5661 specifies that error as:
The requester has attempted a retry of a Compound that it previously
requested not be placed in the reply cache.
which to me doesn't imply a positive action here. But I'm not an
expert at reply cache semantics, so I'll leave others to correct me.
But please add a more detailed comment and commit log as this is
completely unintuitive.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 11:45 ` Christoph Hellwig
@ 2025-11-03 14:16 ` Chuck Lever
2025-11-03 18:50 ` Dai Ngo
0 siblings, 1 reply; 22+ messages in thread
From: Chuck Lever @ 2025-11-03 14:16 UTC (permalink / raw)
To: Christoph Hellwig, Dai Ngo; +Cc: jlayton, neilb, okorniev, tom, linux-nfs
On 11/3/25 6:45 AM, Christoph Hellwig wrote:
> On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
>> NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
>> to the layout recall, no fencing is needed.
>
> RFC 5661 specifies that error as:
>
> The requester has attempted a retry of a Compound that it previously
> requested not be placed in the reply cache.
>
> which to me doesn't imply a positive action here.
Agreed, this status code seems like a loss of synchronization of session
state between the client and server, or an implementation bug. Ie, it
seems to mean that at the very least, session re-negotiation is needed,
at first blush. Should the server mark a callback channel FAULT, for
instance?
> But I'm not an
> expert at reply cache semantics, so I'll leave others to correct me.
> But please add a more detailed comment and commit log as this is
> completely unintuitive.
The session state and the state of the layout are at two different
and separate layers. Connect the dots to show that not fencing is
the correct action and will result in recovery of full backchannel
operation while maintaining the integrity of the file's content.
So IMHO reviewers need this patch description to provide:
- How this came up during your testing (and maybe a small reproducer)
- An explanation of why leaving the client unfenced is appropriate
- A discussion of what will happen when the server subsequently sends
another operation on this session slot
--
Chuck Lever
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 14:16 ` Chuck Lever
@ 2025-11-03 18:50 ` Dai Ngo
2025-11-03 18:57 ` Chuck Lever
2025-11-03 19:22 ` Jeff Layton
0 siblings, 2 replies; 22+ messages in thread
From: Dai Ngo @ 2025-11-03 18:50 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig; +Cc: jlayton, neilb, okorniev, tom, linux-nfs
On 11/3/25 6:16 AM, Chuck Lever wrote:
> On 11/3/25 6:45 AM, Christoph Hellwig wrote:
>> On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
>>> NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
>>> to the layout recall, no fencing is needed.
>> RFC 5661 specifies that error as:
>>
>> The requester has attempted a retry of a Compound that it previously
>> requested not be placed in the reply cache.
>>
>> which to me doesn't imply a positive action here.
> Agreed, this status code seems like a loss of synchronization of session
> state between the client and server, or an implementation bug. Ie, it
> seems to mean that at the very least, session re-negotiation is needed,
> at first blush. Should the server mark a callback channel FAULT, for
> instance?
>
>
>> But I'm not an
>> expert at reply cache semantics, so I'll leave others to correct me.
>> But please add a more detailed comment and commit log as this is
>> completely unintuitive.
> The session state and the state of the layout are at two different
> and separate layers. Connect the dots to show that not fencing is
> the correct action and will result in recovery of full backchannel
> operation while maintaining the integrity of the file's content.
>
> So IMHO reviewers need this patch description to provide:
>
> - How this came up during your testing (and maybe a small reproducer)
>
> - An explanation of why leaving the client unfenced is appropriate
>
> - A discussion of what will happen when the server subsequently sends
> another operation on this session slot
Here is the sequence of events that leads to NFS4ERR_RETRY_UNCACHED_REP:
1. Server sends CB_LAYOUTRECALL with stateID seqid 2
2. Client replies NFS4ERR_NOMATCHING_LAYOUT
3. Server does not receive the reply due to hard hang - no server thread
available to service the reply (I will post a fix for this problem)
4. Server RPC times out waiting for the reply, nfsd4_cb_sequence_done
is called with cb_seq_status == 1, nfsd4_mark_cb_fault is called
and the request is re-queued.
5. Client receives the same CB_LAYOUTRECALL with stateID seqid 2
again and this time client replies with NFS4ERR_RETRY_UNCACHED_REP.
This process repeats forever from step 4.
In this scenario, the server does not have a chance to service the reply,
therefore nfsd4_cb_layout_done was not called and no fencing happens. However,
if somehow a server thread becomes available and nfsd4_cb_layout_done is
called with the NFS4ERR_RETRY_UNCACHED_REP error, then the client is fenced.
This stops the client from accessing the SCSI target for all layouts, which
I think is a bit harsh and unnecessary.
This problem can be easily reproduced by running the git test.
-Dai
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 18:50 ` Dai Ngo
@ 2025-11-03 18:57 ` Chuck Lever
2025-11-03 19:14 ` Dai Ngo
2025-11-03 19:22 ` Jeff Layton
1 sibling, 1 reply; 22+ messages in thread
From: Chuck Lever @ 2025-11-03 18:57 UTC (permalink / raw)
To: Dai Ngo, Christoph Hellwig; +Cc: jlayton, neilb, okorniev, tom, linux-nfs
On 11/3/25 1:50 PM, Dai Ngo wrote:
>
> On 11/3/25 6:16 AM, Chuck Lever wrote:
>> On 11/3/25 6:45 AM, Christoph Hellwig wrote:
>>> On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
>>>> NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
>>>> to the layout recall, no fencing is needed.
>>> RFC 5661 specifies that error as:
>>>
>>> The requester has attempted a retry of a Compound that it previously
>>> requested not be placed in the reply cache.
>>>
>>> which to me doesn't imply a positive action here.
>> Agreed, this status code seems like a loss of synchronization of session
>> state between the client and server, or an implementation bug. Ie, it
>> seems to mean that at the very least, session re-negotiation is needed,
>> at first blush. Should the server mark a callback channel FAULT, for
>> instance?
>>
>>
>>> But I'm not an
>>> expert at reply cache semantics, so I'll leave others to correct me.
>>> But please add a more detailed comment and commit log as this is
>>> completely unintuitive.
>> The session state and the state of the layout are at two different
>> and separate layers. Connect the dots to show that not fencing is
>> the correct action and will result in recovery of full backchannel
>> operation while maintaining the integrity of the file's content.
>>
>> So IMHO reviewers need this patch description to provide:
>>
>> - How this came up during your testing (and maybe a small reproducer)
>>
>> - An explanation of why leaving the client unfenced is appropriate
>>
>> - A discussion of what will happen when the server subsequently sends
>> another operation on this session slot
>
> Here is the sequence of events that leads to NFS4ERR_RETRY_UNCACHED_REP:
>
> 1. Server sends CB_LAYOUTRECALL with stateID seqid 2
> 2. Client replies NFS4ERR_NOMATCHING_LAYOUT
> 3. Server does not receive the reply due to hard hang - no server thread
> available to service the reply (I will post a fix for this problem)
> 4. Server RPC times out waiting for the reply, nfsd4_cb_sequence_done
> is called with cb_seq_status == 1, nfsd4_mark_cb_fault is called
> and the request is re-queued.
> 5. Client receives the same CB_LAYOUTRECALL with stateID seqid 2
> again and this time client replies with NFS4ERR_RETRY_UNCACHED_REP.
>
> This process repeats forever from step 4.
>
> In this scenario, the server does not have a chance to service the reply
> therefor nfsd4_cb_layout_done was not called so no fencing happens.
> However,
> if somehow a server thread becomes available and nfsd4_cb_layout_done is
> called with NFS4ERR_RETRY_UNCACHED_REP error then the client is fenced.
> This stops the client from accessing the SCSI target for all layouts which
> I think it's a bit harsh and unnecessary.
>
> This problem can be easily reproduced by running the git test.
The problem is step 3, above. NFS4ERR_RETRY_UNCACHED_REP is not a
fix for that, and I disagree that fencing is harsh, because
NFS4ERR_RETRY_UNCACHED_REP is supposed to be quite rare, and of course
there are other ways this error can happen.
I don't understand the assessment that "the server does not have a
chance to service the reply". The server /sends/ replies. For the
backchannel, there should be an nfsd thread waiting for the reply...
unless I've misunderstood something.
--
Chuck Lever
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 18:57 ` Chuck Lever
@ 2025-11-03 19:14 ` Dai Ngo
2025-11-03 20:03 ` Dai Ngo
2025-11-03 20:15 ` Chuck Lever
0 siblings, 2 replies; 22+ messages in thread
From: Dai Ngo @ 2025-11-03 19:14 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig; +Cc: jlayton, neilb, okorniev, tom, linux-nfs
On 11/3/25 10:57 AM, Chuck Lever wrote:
> On 11/3/25 1:50 PM, Dai Ngo wrote:
>> On 11/3/25 6:16 AM, Chuck Lever wrote:
>>> On 11/3/25 6:45 AM, Christoph Hellwig wrote:
>>>> On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
>>>>> NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
>>>>> to the layout recall, no fencing is needed.
>>>> RFC 5661 specifies that error as:
>>>>
>>>> The requester has attempted a retry of a Compound that it previously
>>>> requested not be placed in the reply cache.
>>>>
>>>> which to me doesn't imply a positive action here.
>>> Agreed, this status code seems like a loss of synchronization of session
>>> state between the client and server, or an implementation bug. Ie, it
>>> seems to mean that at the very least, session re-negotiation is needed,
>>> at first blush. Should the server mark a callback channel FAULT, for
>>> instance?
>>>
>>>
>>>> But I'm not an
>>>> expert at reply cache semantics, so I'll leave others to correct me.
>>>> But please add a more detailed comment and commit log as this is
>>>> completely unintuitive.
>>> The session state and the state of the layout are at two different
>>> and separate layers. Connect the dots to show that not fencing is
>>> the correct action and will result in recovery of full backchannel
>>> operation while maintaining the integrity of the file's content.
>>>
>>> So IMHO reviewers need this patch description to provide:
>>>
>>> - How this came up during your testing (and maybe a small reproducer)
>>>
>>> - An explanation of why leaving the client unfenced is appropriate
>>>
>>> - A discussion of what will happen when the server subsequently sends
>>> another operation on this session slot
>> Here is the sequence of events that leads to NFS4ERR_RETRY_UNCACHED_REP:
>>
>> 1. Server sends CB_LAYOUTRECALL with stateID seqid 2
>> 2. Client replies NFS4ERR_NOMATCHING_LAYOUT
>> 3. Server does not receive the reply due to hard hang - no server thread
>> available to service the reply (I will post a fix for this problem)
>> 4. Server RPC times out waiting for the reply, nfsd4_cb_sequence_done
>> is called with cb_seq_status == 1, nfsd4_mark_cb_fault is called
>> and the request is re-queued.
>> 5. Client receives the same CB_LAYOUTRECALL with stateID seqid 2
>> again and this time client replies with NFS4ERR_RETRY_UNCACHED_REP.
>>
>> This process repeats forever from step 4.
>>
>> In this scenario, the server does not have a chance to service the reply
>> therefor nfsd4_cb_layout_done was not called so no fencing happens.
>> However,
>> if somehow a server thread becomes available and nfsd4_cb_layout_done is
>> called with NFS4ERR_RETRY_UNCACHED_REP error then the client is fenced.
>> This stops the client from accessing the SCSI target for all layouts which
>> I think it's a bit harsh and unnecessary.
>>
>> This problem can be easily reproduced by running the git test.
> The problem is step 3, above. NFS4ERR_RETRY_UNCACHED_REP is not a
> fix for that,
Agreed, as I said, I will post a separate fix for the hang.
> and I disagree that fencing is harsh, because
> NFS4ERR_RETRY_UNCACHED_REP is supposed to be quite rare, and of course
> there are other ways this error can happen.
Yes, this error should be rare. But is fencing the client a correct
solution for it? IMHO, NFS4ERR_RETRY_UNCACHED_REP means the client has
received and replied to the request; it's just that the server did not
see the reply, for any of a number of reasons. I think in this case we
should just mark the backchannel down and let the client recover it,
instead of fencing the client.
>
> I don't understand the assessment that "the server does not have a
> chance to service the reply". The server /sends/ replies. For the
> backchannel, there should be an nfsd thread waiting for the reply...
> unless I've misunderstood something.
If all the server threads are waiting in __break_lease then there is
no server thread available to service the replies, or any incoming
requests from the client. That's the hard hang problem that I mentioned
above.
-Dai
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 18:50 ` Dai Ngo
2025-11-03 18:57 ` Chuck Lever
@ 2025-11-03 19:22 ` Jeff Layton
2025-11-03 19:36 ` Dai Ngo
1 sibling, 1 reply; 22+ messages in thread
From: Jeff Layton @ 2025-11-03 19:22 UTC (permalink / raw)
To: Dai Ngo, Chuck Lever, Christoph Hellwig; +Cc: neilb, okorniev, tom, linux-nfs
On Mon, 2025-11-03 at 10:50 -0800, Dai Ngo wrote:
> On 11/3/25 6:16 AM, Chuck Lever wrote:
> > On 11/3/25 6:45 AM, Christoph Hellwig wrote:
> > > On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
> > > > NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
> > > > to the layout recall, no fencing is needed.
> > > RFC 5661 specifies that error as:
> > >
> > > The requester has attempted a retry of a Compound that it previously
> > > requested not be placed in the reply cache.
> > >
> > > which to me doesn't imply a positive action here.
> > Agreed, this status code seems like a loss of synchronization of session
> > state between the client and server, or an implementation bug. Ie, it
> > seems to mean that at the very least, session re-negotiation is needed,
> > at first blush. Should the server mark a callback channel FAULT, for
> > instance?
> >
> >
> > > But I'm not an
> > > expert at reply cache semantics, so I'll leave others to correct me.
> > > But please add a more detailed comment and commit log as this is
> > > completely unintuitive.
> > The session state and the state of the layout are at two different
> > and separate layers. Connect the dots to show that not fencing is
> > the correct action and will result in recovery of full backchannel
> > operation while maintaining the integrity of the file's content.
> >
> > So IMHO reviewers need this patch description to provide:
> >
> > - How this came up during your testing (and maybe a small reproducer)
> >
> > - An explanation of why leaving the client unfenced is appropriate
> >
> > - A discussion of what will happen when the server subsequently sends
> > another operation on this session slot
>
> Here is the sequence of events that leads to NFS4ERR_RETRY_UNCACHED_REP:
>
> 1. Server sends CB_LAYOUTRECALL with stateID seqid 2
> 2. Client replies NFS4ERR_NOMATCHING_LAYOUT
> 3. Server does not receive the reply due to hard hang - no server thread
> available to service the reply (I will post a fix for this problem)
> 4. Server RPC times out waiting for the reply, nfsd4_cb_sequence_done
> is called with cb_seq_status == 1, nfsd4_mark_cb_fault is called
> and the request is re-queued.
> 5. Client receives the same CB_LAYOUTRECALL with stateID seqid 2
> again and this time client replies with NFS4ERR_RETRY_UNCACHED_REP.
>
> This process repeats forever from step 4.
>
I'm a little confused here. I could see that you might not be able to
process a LAYOUTRETURN if all nfsd threads were blocked waiting for the
break_layout(), but I don't get why that would block a CB_LAYOUTRECALL
reply.
For the server, CB_LAYOUTRECALL is a client RPC (server acts as client
and vice versa). A CB_LAYOUTRECALL shouldn't depend on having a nfsd
thread available, since it runs in the context of a workqueue thread.
What am I missing?
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 19:22 ` Jeff Layton
@ 2025-11-03 19:36 ` Dai Ngo
2025-11-03 19:40 ` Jeff Layton
0 siblings, 1 reply; 22+ messages in thread
From: Dai Ngo @ 2025-11-03 19:36 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, Christoph Hellwig
Cc: neilb, okorniev, tom, linux-nfs
On 11/3/25 11:22 AM, Jeff Layton wrote:
> On Mon, 2025-11-03 at 10:50 -0800, Dai Ngo wrote:
>> On 11/3/25 6:16 AM, Chuck Lever wrote:
>>> On 11/3/25 6:45 AM, Christoph Hellwig wrote:
>>>> On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
>>>>> NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
>>>>> to the layout recall, no fencing is needed.
>>>> RFC 5661 specifies that error as:
>>>>
>>>> The requester has attempted a retry of a Compound that it previously
>>>> requested not be placed in the reply cache.
>>>>
>>>> which to me doesn't imply a positive action here.
>>> Agreed, this status code seems like a loss of synchronization of session
>>> state between the client and server, or an implementation bug. Ie, it
>>> seems to mean that at the very least, session re-negotiation is needed,
>>> at first blush. Should the server mark a callback channel FAULT, for
>>> instance?
>>>
>>>
>>>> But I'm not an
>>>> expert at reply cache semantics, so I'll leave others to correct me.
>>>> But please add a more detailed comment and commit log as this is
>>>> completely unintuitive.
>>> The session state and the state of the layout are at two different
>>> and separate layers. Connect the dots to show that not fencing is
>>> the correct action and will result in recovery of full backchannel
>>> operation while maintaining the integrity of the file's content.
>>>
>>> So IMHO reviewers need this patch description to provide:
>>>
>>> - How this came up during your testing (and maybe a small reproducer)
>>>
>>> - An explanation of why leaving the client unfenced is appropriate
>>>
>>> - A discussion of what will happen when the server subsequently sends
>>> another operation on this session slot
>> Here is the sequence of events that leads to NFS4ERR_RETRY_UNCACHED_REP:
>>
>> 1. Server sends CB_LAYOUTRECALL with stateID seqid 2
>> 2. Client replies NFS4ERR_NOMATCHING_LAYOUT
>> 3. Server does not receive the reply due to hard hang - no server thread
>> available to service the reply (I will post a fix for this problem)
>> 4. Server RPC times out waiting for the reply, nfsd4_cb_sequence_done
>> is called with cb_seq_status == 1, nfsd4_mark_cb_fault is called
>> and the request is re-queued.
>> 5. Client receives the same CB_LAYOUTRECALL with stateID seqid 2
>> again and this time client replies with NFS4ERR_RETRY_UNCACHED_REP.
>>
>> This process repeats forever from step 4.
>>
> I'm a little confused here. I could see that you might not be able to
> process a LAYOUTRETURN if all nfsd threads were blocked waiting for the
> break_layout(), but I don't get why that would blocks a CB_LAYOUTRECALL
> reply.
>
> For the server, CB_LAYOUTRECALL is a client RPC (server acts as client
> and vice versa). A CB_LAYOUTRECALL shouldn't depend on having a nfsd
> thread available, since it runs in the context of a workqueue thread.
>
> What am I missing?
This is call stack when the server receives the callback reply:
receive_cb_reply+1
svc_tcp_recvfrom+3531
svc_handle_xprt+3747
svc_recv+511
nfsd+588
kthread+916
ret_from_fork+479
ret_from_fork_asm+26
As shown, it requires an nfsd thread to service it. The nfsd thread eventually
calls xprt_complete_rqst to wake up the RPC task waiting for the reply.
-Dai
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 19:36 ` Dai Ngo
@ 2025-11-03 19:40 ` Jeff Layton
0 siblings, 0 replies; 22+ messages in thread
From: Jeff Layton @ 2025-11-03 19:40 UTC (permalink / raw)
To: Dai Ngo, Chuck Lever, Christoph Hellwig; +Cc: neilb, okorniev, tom, linux-nfs
On Mon, 2025-11-03 at 11:36 -0800, Dai Ngo wrote:
> On 11/3/25 11:22 AM, Jeff Layton wrote:
> > On Mon, 2025-11-03 at 10:50 -0800, Dai Ngo wrote:
> > > On 11/3/25 6:16 AM, Chuck Lever wrote:
> > > > On 11/3/25 6:45 AM, Christoph Hellwig wrote:
> > > > > On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
> > > > > > NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
> > > > > > to the layout recall, no fencing is needed.
> > > > > RFC 5661 specifies that error as:
> > > > >
> > > > > The requester has attempted a retry of a Compound that it previously
> > > > > requested not be placed in the reply cache.
> > > > >
> > > > > which to me doesn't imply a positive action here.
> > > > Agreed, this status code seems like a loss of synchronization of session
> > > > state between the client and server, or an implementation bug. Ie, it
> > > > seems to mean that at the very least, session re-negotiation is needed,
> > > > at first blush. Should the server mark a callback channel FAULT, for
> > > > instance?
> > > >
> > > >
> > > > > But I'm not an
> > > > > expert at reply cache semantics, so I'll leave others to correct me.
> > > > > But please add a more detailed comment and commit log as this is
> > > > > completely unintuitive.
> > > > The session state and the state of the layout are at two different
> > > > and separate layers. Connect the dots to show that not fencing is
> > > > the correct action and will result in recovery of full backchannel
> > > > operation while maintaining the integrity of the file's content.
> > > >
> > > > So IMHO reviewers need this patch description to provide:
> > > >
> > > > - How this came up during your testing (and maybe a small reproducer)
> > > >
> > > > - An explanation of why leaving the client unfenced is appropriate
> > > >
> > > > - A discussion of what will happen when the server subsequently sends
> > > > another operation on this session slot
> > > Here is the sequence of events that leads to NFS4ERR_RETRY_UNCACHED_REP:
> > >
> > > 1. Server sends CB_LAYOUTRECALL with stateID seqid 2
> > > 2. Client replies NFS4ERR_NOMATCHING_LAYOUT
> > > 3. Server does not receive the reply due to hard hang - no server thread
> > > available to service the reply (I will post a fix for this problem)
> > > 4. Server RPC times out waiting for the reply, nfsd4_cb_sequence_done
> > > is called with cb_seq_status == 1, nfsd4_mark_cb_fault is called
> > > and the request is re-queued.
> > > 5. Client receives the same CB_LAYOUTRECALL with stateID seqid 2
> > > again and this time client replies with NFS4ERR_RETRY_UNCACHED_REP.
> > >
> > > This process repeats forever from step 4.
> > >
> > I'm a little confused here. I could see that you might not be able to
> > process a LAYOUTRETURN if all nfsd threads were blocked waiting for the
> > break_layout(), but I don't get why that would block a CB_LAYOUTRECALL
> > reply.
> >
> > For the server, CB_LAYOUTRECALL is a client RPC (server acts as client
> > and vice versa). A CB_LAYOUTRECALL shouldn't depend on having a nfsd
> > thread available, since it runs in the context of a workqueue thread.
> >
> > What am I missing?
>
> This is the call stack when the server receives the callback reply:
>
> receive_cb_reply+1
> svc_tcp_recvfrom+3531
> svc_handle_xprt+3747
> svc_recv+511
> nfsd+588
> kthread+916
> ret_from_fork+479
> ret_from_fork_asm+26
>
> As shown, it requires an NFSD thread to service it. The NFSD thread eventually
> calls xprt_complete_rqst to wake up the RPC task waiting for the reply.
>
Thanks. That's the bit I was missing. I'll look forward to seeing your
fix!
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 19:14 ` Dai Ngo
@ 2025-11-03 20:03 ` Dai Ngo
2025-11-03 20:15 ` Chuck Lever
1 sibling, 0 replies; 22+ messages in thread
From: Dai Ngo @ 2025-11-03 20:03 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig; +Cc: jlayton, neilb, okorniev, tom, linux-nfs
On 11/3/25 11:14 AM, Dai Ngo wrote:
>
> On 11/3/25 10:57 AM, Chuck Lever wrote:
>> On 11/3/25 1:50 PM, Dai Ngo wrote:
>>> On 11/3/25 6:16 AM, Chuck Lever wrote:
>>>> On 11/3/25 6:45 AM, Christoph Hellwig wrote:
>>>>> On Sat, Nov 01, 2025 at 11:51:34AM -0700, Dai Ngo wrote:
>>>>>> NFS4ERR_RETRY_UNCACHED_REP error means client has seen and replied
>>>>>> to the layout recall, no fencing is needed.
>>>>> RFC 5661 specifies that error as:
>>>>>
>>>>> The requester has attempted a retry of a Compound that it
>>>>> previously
>>>>> requested not be placed in the reply cache.
>>>>>
>>>>> which to me doesn't imply a positive action here.
>>>> Agreed, this status code seems like a loss of synchronization of
>>>> session
>>>> state between the client and server, or an implementation bug. Ie, it
>>>> seems to mean that at the very least, session re-negotiation is
>>>> needed,
>>>> at first blush. Should the server mark a callback channel FAULT, for
>>>> instance?
>>>>
>>>>
>>>>> But I'm not an
>>>>> expert at reply cache semantics, so I'll leave others to correct me.
>>>>> But please add a more detailed comment and commit log as this is
>>>>> completely unintuitive.
>>>> The session state and the state of the layout are at two different
>>>> and separate layers. Connect the dots to show that not fencing is
>>>> the correct action and will result in recovery of full backchannel
>>>> operation while maintaining the integrity of the file's content.
>>>>
>>>> So IMHO reviewers need this patch description to provide:
>>>>
>>>> - How this came up during your testing (and maybe a small reproducer)
>>>>
>>>> - An explanation of why leaving the client unfenced is appropriate
>>>>
>>>> - A discussion of what will happen when the server subsequently sends
>>>> another operation on this session slot
>>> Here is the sequence of events that leads to
>>> NFS4ERR_RETRY_UNCACHED_REP:
>>>
>>> 1. Server sends CB_LAYOUTRECALL with stateID seqid 2
>>> 2. Client replies NFS4ERR_NOMATCHING_LAYOUT
>>> 3. Server does not receive the reply due to hard hang - no server
>>> thread
>>> available to service the reply (I will post a fix for this problem)
>>> 4. Server RPC times out waiting for the reply, nfsd4_cb_sequence_done
>>> is called with cb_seq_status == 1, nfsd4_mark_cb_fault is called
>>> and the request is re-queued.
>>> 5. Client receives the same CB_LAYOUTRECALL with stateID seqid 2
>>> again and this time client replies with NFS4ERR_RETRY_UNCACHED_REP.
>>>
>>> This process repeats forever from step 4.
>>>
>>> In this scenario, the server does not have a chance to service the
>>> reply, therefore nfsd4_cb_layout_done is not called and no fencing
>>> happens. However, if somehow a server thread becomes available and
>>> nfsd4_cb_layout_done is called with the NFS4ERR_RETRY_UNCACHED_REP
>>> error, then the client is fenced. This stops the client from
>>> accessing the SCSI target for all layouts, which I think is a bit
>>> harsh and unnecessary.
>>>
>>> This problem can be easily reproduced by running the git test.
>> The problem is step 3, above. NFS4ERR_RETRY_UNCACHED_REP is not a
>> fix for that,
>
> Agreed, as I said, I will post a separate fix for the hang.
>
>> and I disagree that fencing is harsh, because
>> NFS4ERR_RETRY_UNCACHED_REP is supposed to be quite rare, and of course
>> there are other ways this error can happen.
>
> Yes, this error should be rare. But is fencing the client the correct
> solution for it? IMHO, NFS4ERR_RETRY_UNCACHED_REP means the client has
> received and replied to the request; it's just that the server did not
> see the reply, for any of several reasons. I think in this case we
> should just mark the back channel down and let the client recover it,
> instead of fencing the client.
>
>>
>> I don't understand the assessment that "the server does not have a
>> chance to service the reply". The server /sends/ replies. For the
>> backchannel, there should be an nfsd thread waiting for the reply...
>> unless I've misunderstood something.
>
> If all the server threads are waiting in __break_lease, then there is
> no available server thread to service the replies, or any incoming
> requests from the client. That's the hard hang problem that I mentioned
> above.
I can either (1) drop this patch and keep the existing behavior, or
(2) mark the back channel down and let recovery take place.
What do you think?
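For reference, option (2) would amount to something like the following in
nfsd4_cb_layout_done (a hypothetical sketch, not a tested change; it assumes a
fault-marking helper such as nfsd4_mark_cb_fault() is callable from
nfs4layouts.c, which may require exporting it):

```c
	case -NFS4ERR_RETRY_UNCACHED_REP:
		/*
		 * The client has already processed this recall; the
		 * reply was lost, not refused. Instead of fencing,
		 * flag the backchannel and let the client drive
		 * session recovery.
		 */
		nfsd4_mark_cb_fault(ls->ls_stid.sc_client);
		return 1;
```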
-Dai
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 19:14 ` Dai Ngo
2025-11-03 20:03 ` Dai Ngo
@ 2025-11-03 20:15 ` Chuck Lever
2025-11-03 20:36 ` Dai Ngo
1 sibling, 1 reply; 22+ messages in thread
From: Chuck Lever @ 2025-11-03 20:15 UTC (permalink / raw)
To: Dai Ngo, Christoph Hellwig; +Cc: jlayton, neilb, okorniev, tom, linux-nfs
On 11/3/25 2:14 PM, Dai Ngo wrote:
>> and I disagree that fencing is harsh, because
>> NFS4ERR_RETRY_UNCACHED_REP is supposed to be quite rare, and of course
>> there are other ways this error can happen.
>
> Yes, this error should be rare. But is fencing the client the correct
> solution for it? IMHO, NFS4ERR_RETRY_UNCACHED_REP means the client has
> received and replied to the request; it's just that the server did not
> see the reply, for any of several reasons.
Fencing seems appropriate when there is a clear indication that the
client and server state are out of sync. The question is why, and how do
we prevent that situation from occurring? And, when we get into this
state, what is the correct recovery?
I don't see NFSD doing this short-circuit when processing a CB_RECALL
response, for instance.
> I think in this case we should just
> mark the back channel down and let the client recover it, instead of
> fencing the client.
Clearly the backchannel needs to recover properly from
NFS4ERR_RETRY_UNCACHED_REP, and if it goes into a loop, something is not
right. I don't think this is the correct fix for looping, either.
I don't understand why, after the server indicates a backchannel fault,
the client and server don't replace the session. The server is trying
to re-use what is obviously an incorrect slot sequence ID; it shouldn't
expect any different result by retrying.
So, yes, there are one or more real bugs here. But ignoring a sign that
state synchrony has been lost is not the right fix.
--
Chuck Lever
* Re: [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error
2025-11-03 20:15 ` Chuck Lever
@ 2025-11-03 20:36 ` Dai Ngo
0 siblings, 0 replies; 22+ messages in thread
From: Dai Ngo @ 2025-11-03 20:36 UTC (permalink / raw)
To: Chuck Lever, Christoph Hellwig; +Cc: jlayton, neilb, okorniev, tom, linux-nfs
On 11/3/25 12:15 PM, Chuck Lever wrote:
> On 11/3/25 2:14 PM, Dai Ngo wrote:
>>> and I disagree that fencing is harsh, because
>>> NFS4ERR_RETRY_UNCACHED_REP is supposed to be quite rare, and of course
>>> there are other ways this error can happen.
>> Yes, this error should be rare. But is fencing the client the correct
>> solution for it? IMHO, NFS4ERR_RETRY_UNCACHED_REP means the client has
>> received and replied to the request; it's just that the server did not
>> see the reply, for any of several reasons.
> Fencing seems appropriate when there is a clear indication that the
> client and server state are out of sync. The question is why, and how do
> we prevent that situation from occurring? And, when we get into this
> state, what is the correct recovery?
>
> I don't see NFSD doing this short-circuit when processing a CB_RECALL
> response, for instance.
>
>
>> I think in this case we should just
>> mark the back channel down and let the client recover it, instead of
>> fencing the client.
> Clearly the backchannel needs to recover properly from
> NFS4ERR_RETRY_UNCACHED_REP, and if it goes into a loop, something is not
> right. I don't think this is the correct fix for looping, either.
>
> I don't understand why, after the server indicates a backchannel fault,
> the client and server don't replace the session. The server is trying
> to re-use what is obviously an incorrect slot sequence ID; it shouldn't
> expect any different result by retrying.
>
> So, yes, there are one or more real bugs here. But ignoring a sign that
> state synchrony has been lost is not the right fix.
OK, I'll drop this patch. This error occurs due to the hard hang problem.
In that case, no back channel recovery, new session, etc. takes place;
the callback code just keeps retrying forever.
Let's fix the hang problem first and we can take care of this error later
if it still happens.
-Dai
* Re: [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation.
2025-11-02 15:40 ` Chuck Lever
@ 2025-11-03 20:44 ` Dai Ngo
2025-11-03 21:00 ` Chuck Lever
2025-11-04 0:32 ` Dai Ngo
1 sibling, 1 reply; 22+ messages in thread
From: Dai Ngo @ 2025-11-03 20:44 UTC (permalink / raw)
To: Chuck Lever, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
On 11/2/25 7:40 AM, Chuck Lever wrote:
> On 11/1/25 2:51 PM, Dai Ngo wrote:
>> + TP_STRUCT__entry(
>> + __string(dev, dev)
>> + __array(unsigned char, addr, sizeof(struct sockaddr_in6))
>> + __field(u32, error)
>> + ),
>> + TP_fast_assign(
>> + memcpy(__entry->addr, &clp->cl_addr,
>> + sizeof(struct sockaddr_in6));
>> + __assign_str(dev);
>> + __entry->error = error;
>> + ),
>> + TP_printk("client=%pISpc dev=%s error=%d",
>> + __entry->addr,
>> + __get_str(dev),
>> + __entry->error
>> + )
> Have a look at the nfsd_cs_slot_class event class (fs/nfsd/trace.h) to
> see how to use the trace subsystem's native sockaddr handling.
will fix.
>
> Do you want to record anything else here? The cl_boot/cl_id, perhaps? I
> guess there's no XID available... but is there a relevant state ID?
There is a layout stateid. For the fencing operation, I think it helps
to display the client IP, the affected block device, and the status of
the operation. IMHO, I don't see any additional value that the
cl_boot/cl_id or the layout stateid can add.
-Dai
* Re: [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation.
2025-11-03 20:44 ` Dai Ngo
@ 2025-11-03 21:00 ` Chuck Lever
0 siblings, 0 replies; 22+ messages in thread
From: Chuck Lever @ 2025-11-03 21:00 UTC (permalink / raw)
To: Dai Ngo, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
On 11/3/25 3:44 PM, Dai Ngo wrote:
>
> On 11/2/25 7:40 AM, Chuck Lever wrote:
>> Do you want to record anything else here? The cl_boot/cl_id, perhaps? I
>> guess there's no XID available... but is there a relevant state ID?
>
> There is a layout stateid. For the fencing operation, I think it helps
> to display the client IP, the affected block device, and the status of
> the operation. IMHO, I don't see any additional value that the
> cl_boot/cl_id or the layout stateid can add.
The theory goes that these two items define the server's boot epoch.
Maybe what we really want is the net namespace number, since that makes
the IP address captured by the trace event more meaningful.
--
Chuck Lever
* Re: [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation.
2025-11-02 15:40 ` Chuck Lever
2025-11-03 20:44 ` Dai Ngo
@ 2025-11-04 0:32 ` Dai Ngo
2025-11-04 14:05 ` Chuck Lever
1 sibling, 1 reply; 22+ messages in thread
From: Dai Ngo @ 2025-11-04 0:32 UTC (permalink / raw)
To: Chuck Lever, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
On 11/2/25 7:40 AM, Chuck Lever wrote:
> On 11/1/25 2:51 PM, Dai Ngo wrote:
>> + TP_STRUCT__entry(
>> + __string(dev, dev)
>> + __array(unsigned char, addr, sizeof(struct sockaddr_in6))
>> + __field(u32, error)
>> + ),
>> + TP_fast_assign(
>> + memcpy(__entry->addr, &clp->cl_addr,
>> + sizeof(struct sockaddr_in6));
>> + __assign_str(dev);
>> + __entry->error = error;
>> + ),
>> + TP_printk("client=%pISpc dev=%s error=%d",
>> + __entry->addr,
>> + __get_str(dev),
>> + __entry->error
>> + )
> Have a look at the nfsd_cs_slot_class event class (fs/nfsd/trace.h) to
> see how to use the trace subsystem's native sockaddr handling.
The native sockaddr handling used by nfsd_cs_slot_class seems
to be broken:
[opc]# trace-cmd record -e nfsd_slot_seqid_conf -e nfsd_slot_seqid_unconf
[opc]# trace-cmd report
CPU 0 is empty
...
CPU 15 is empty
cpus=16
nfsd-2784 [011] 1205.355839: nfsd_slot_seqid_unconf: [FAILED TO PARSE] seqid=1 slot_seqid=0 cl_boot=1762214337 cl_id=3172331794 addr=
[opc]#
After changing 'nfsd_pnfs_class' to use native sockaddr handling:
[opc]# trace-cmd record -e nfsd_pnfs_fence
kworker/u65:0-7458 [001] 933.253811: nfsd_pnfs_fence: [FAILED TO PARSE] addr=ARRAY[02, 00, 02, f9, 64, 64, fa, 2b, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] netns_ino=4026531833 dev=sdb error=24
[opc]#
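For concreteness, the sockaddr-native form of the event class looks roughly
like this (an illustrative sketch only, not the exact change I tested; the
TP_PROTO arguments and the fixed sockaddr_in6 length are assumptions):

```c
DECLARE_EVENT_CLASS(nfsd_pnfs_class,
	TP_PROTO(const struct nfs4_client *clp, const char *dev, u32 error),
	TP_ARGS(clp, dev, error),
	TP_STRUCT__entry(
		__string(dev, dev)
		/* dynamic sockaddr field instead of a raw __array */
		__sockaddr(addr, sizeof(struct sockaddr_in6))
		__field(u32, error)
	),
	TP_fast_assign(
		__assign_str(dev);
		__assign_sockaddr(addr, &clp->cl_addr,
				  sizeof(struct sockaddr_in6));
		__entry->error = error;
	),
	TP_printk("client=%pISpc dev=%s error=%u",
		__get_sockaddr(addr), __get_str(dev), __entry->error
	)
);
```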
-Dai
>
> Do you want to record anything else here? The cl_boot/cl_id, perhaps? I
> guess there's no XID available... but is there a relevant state ID?
>
* Re: [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation.
2025-11-04 0:32 ` Dai Ngo
@ 2025-11-04 14:05 ` Chuck Lever
0 siblings, 0 replies; 22+ messages in thread
From: Chuck Lever @ 2025-11-04 14:05 UTC (permalink / raw)
To: Dai Ngo, jlayton, neilb, okorniev, tom, hch; +Cc: linux-nfs
On 11/3/25 7:32 PM, Dai Ngo wrote:
>
> On 11/2/25 7:40 AM, Chuck Lever wrote:
>> On 11/1/25 2:51 PM, Dai Ngo wrote:
>>> + TP_STRUCT__entry(
>>> + __string(dev, dev)
>>> + __array(unsigned char, addr, sizeof(struct sockaddr_in6))
>>> + __field(u32, error)
>>> + ),
>>> + TP_fast_assign(
>>> + memcpy(__entry->addr, &clp->cl_addr,
>>> + sizeof(struct sockaddr_in6));
>>> + __assign_str(dev);
>>> + __entry->error = error;
>>> + ),
>>> + TP_printk("client=%pISpc dev=%s error=%d",
>>> + __entry->addr,
>>> + __get_str(dev),
>>> + __entry->error
>>> + )
>> Have a look at the nfsd_cs_slot_class event class (fs/nfsd/trace.h) to
>> see how to use the trace subsystem's native sockaddr handling.
>
> The native sockaddr handling used by nfsd_cs_slot_class seems
> to be broken:
>
> [opc]# trace-cmd record -e nfsd_slot_seqid_conf -e nfsd_slot_seqid_unconf
> [opc]# trace-cmd report
> CPU 0 is empty
> ...
> CPU 15 is empty
> cpus=16
> nfsd-2784 [011] 1205.355839: nfsd_slot_seqid_unconf:
> [FAILED TO PARSE] seqid=1 slot_seqid=0 cl_boot=1762214337
> cl_id=3172331794 addr=
> [opc]#
>
> After changing 'nfsd_pnfs_class' to use native sockaddr handling:
>
> [opc]# trace-cmd record -e nfsd_pnfs_fence
> kworker/u65:0-7458 [001] 933.253811: nfsd_pnfs_fence: [FAILED TO PARSE]
> addr=ARRAY[02, 00, 02, f9, 64, 64, fa, 2b, 00, 00, 00, 00, 00, 00, 00,
> 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00, 00] netns_ino=4026531833
> dev=sdb error=24
> [opc]#
FAILED TO PARSE is from the user space library. It's far out of date.
--
Chuck Lever
end of thread, other threads:[~2025-11-04 14:06 UTC | newest]
Thread overview: 22+ messages
2025-11-01 18:51 [PATCH 0/3] NFSD: Fix problem with nfsd4_scsi_fence_client Dai Ngo
2025-11-01 18:51 ` [PATCH 1/3] NFSD: Fix problem with nfsd4_scsi_fence_client using the wrong reservation type Dai Ngo
2025-11-03 11:42 ` Christoph Hellwig
2025-11-01 18:51 ` [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error Dai Ngo
2025-11-03 11:45 ` Christoph Hellwig
2025-11-03 14:16 ` Chuck Lever
2025-11-03 18:50 ` Dai Ngo
2025-11-03 18:57 ` Chuck Lever
2025-11-03 19:14 ` Dai Ngo
2025-11-03 20:03 ` Dai Ngo
2025-11-03 20:15 ` Chuck Lever
2025-11-03 20:36 ` Dai Ngo
2025-11-03 19:22 ` Jeff Layton
2025-11-03 19:36 ` Dai Ngo
2025-11-03 19:40 ` Jeff Layton
2025-11-01 18:51 ` [PATCH 3/3] NFSD: Add trace point for SCSI fencing operation Dai Ngo
2025-11-02 15:40 ` Chuck Lever
2025-11-03 20:44 ` Dai Ngo
2025-11-03 21:00 ` Chuck Lever
2025-11-04 0:32 ` Dai Ngo
2025-11-04 14:05 ` Chuck Lever
-- strict thread matches above, loose matches on Subject: below --
2025-11-01 18:25 Dai Ngo
2025-11-01 18:25 ` [PATCH 2/3] NFSD: Do not fence the client on NFS4ERR_RETRY_UNCACHED_REP error Dai Ngo