linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Fix unwanted memory overwrites
@ 2025-10-07 16:04 Chuck Lever
  2025-10-07 16:04 ` [PATCH v2 1/4] NFSD: Skip close replay processing if XDR encoding fails Chuck Lever
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Chuck Lever @ 2025-10-07 16:04 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

<rtm@csail.mit.edu> reported some memory overwrites that can be
triggered by NFS client input. I was able to observe overwrites
by enabling KASAN and running his reproducer [1].

NFSD caches COMPOUNDs containing only a single SEQUENCE operation
whether the client requests it to or not, in order to work around a
deficiency in the NFSv4.1 protocol. However, the predicate that
identifies solo SEQUENCE operations was incorrect.

Changes since v1:
* Reordered patches
* Disable caching of solo SEQUENCE operations
* Additional clean up

Chuck Lever (4):
  NFSD: Skip close replay processing if XDR encoding fails
  NFSD: Fix the "is this a solo SEQUENCE" predicate
  NFSD: Do not cache solo SEQUENCE operations
  NFSD: Move nfsd4_cache_this()

 fs/nfsd/nfs4state.c | 23 +++++++++++++++++++++++
 fs/nfsd/nfs4xdr.c   |  3 +--
 fs/nfsd/xdr4.h      | 21 ---------------------
 3 files changed, 24 insertions(+), 23 deletions(-)

-- 
2.51.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/4] NFSD: Skip close replay processing if XDR encoding fails
  2025-10-07 16:04 [PATCH v2 0/4] Fix unwanted memory overwrites Chuck Lever
@ 2025-10-07 16:04 ` Chuck Lever
  2025-10-07 16:04 ` [PATCH v2 2/4] NFSD: Fix the "is this a solo SEQUENCE" predicate Chuck Lever
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Chuck Lever @ 2025-10-07 16:04 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever, rtm

From: Chuck Lever <chuck.lever@oracle.com>

The close replay logic added by commit 9411b1d4c7df ("nfsd4: cleanup
handling of nfsv4.0 closed stateid's") cannot be done if encoding
failed due to a short send buffer; there's no guarantee that the
operation encoder has actually encoded the data that is being copied
to the replay cache.

I think there are deeper problems here. Is stateid sequencing
screwed up if XDR encoding fails? Does NFSv4.1+ even need to care
about this?

Reported-by: <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/linux-nfs/c3628d57-94ae-48cf-8c9e-49087a28cec9@oracle.com/T/#t
Fixes: 9411b1d4c7df ("nfsd4: cleanup handling of nfsv4.0 closed stateid's")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4xdr.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 230bf53e39f7..85b773a65670 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -5937,8 +5937,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
 		 */
 		warn_on_nonidempotent_op(op);
 		xdr_truncate_encode(xdr, op_status_offset + XDR_UNIT);
-	}
-	if (so) {
+	} else if (so) {
 		int len = xdr->buf->len - (op_status_offset + XDR_UNIT);
 
 		so->so_replay.rp_status = op->status;
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/4] NFSD: Fix the "is this a solo SEQUENCE" predicate
  2025-10-07 16:04 [PATCH v2 0/4] Fix unwanted memory overwrites Chuck Lever
  2025-10-07 16:04 ` [PATCH v2 1/4] NFSD: Skip close replay processing if XDR encoding fails Chuck Lever
@ 2025-10-07 16:04 ` Chuck Lever
  2025-10-07 17:18   ` Jeff Layton
  2025-10-07 16:04 ` [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations Chuck Lever
  2025-10-07 16:04 ` [PATCH v2 4/4] NFSD: Move nfsd4_cache_this() Chuck Lever
  3 siblings, 1 reply; 18+ messages in thread
From: Chuck Lever @ 2025-10-07 16:04 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever, rtm

From: Chuck Lever <chuck.lever@oracle.com>

RFC 8881 Section 2.10.6.4 says:

> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> cache a reply except if an error is returned by the SEQUENCE or
> CB_SEQUENCE operation (see Section 2.10.6.1.2).

This text also appears in RFC 5661. Further, Section 2.10.6.1.2
says:

> Any time SEQUENCE or CB_SEQUENCE returns an error, the sequence ID
> of the slot MUST NOT change. The replier MUST NOT modify the reply
> cache entry for the slot whenever an error is returned from
> SEQUENCE or CB_SEQUENCE.

NFSD caches the result of solo SEQUENCE operations, but rtm's
reproducer sends two operations in the failing COMPOUND. NFSD should
not attempt to cache the reply.

The logic in nfsd4_is_solo_sequence() is incorrect: it checks the
current operation index, not the total count of operations in the
COMPOUND. If the SEQUENCE operation, which is always operation 1,
fails in a multi-operation compound, resp->opcnt is always 1. Thus
when a SEQUENCE operation fails, nfsd4_is_solo_sequence() always
returns true.

Reported-by: <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/linux-nfs/c3628d57-94ae-48cf-8c9e-49087a28cec9@oracle.com/T/#t
Fixes: 468de9e54a90 ("nfsd41: expand solo sequence check")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/xdr4.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index ee0570cbdd9e..d1837a10b0c2 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -926,7 +926,8 @@ struct nfsd4_compoundres {
 static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
 {
 	struct nfsd4_compoundargs *args = resp->rqstp->rq_argp;
-	return resp->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
+
+	return args->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
 }
 
 /*
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-07 16:04 [PATCH v2 0/4] Fix unwanted memory overwrites Chuck Lever
  2025-10-07 16:04 ` [PATCH v2 1/4] NFSD: Skip close replay processing if XDR encoding fails Chuck Lever
  2025-10-07 16:04 ` [PATCH v2 2/4] NFSD: Fix the "is this a solo SEQUENCE" predicate Chuck Lever
@ 2025-10-07 16:04 ` Chuck Lever
  2025-10-07 17:19   ` Jeff Layton
                     ` (2 more replies)
  2025-10-07 16:04 ` [PATCH v2 4/4] NFSD: Move nfsd4_cache_this() Chuck Lever
  3 siblings, 3 replies; 18+ messages in thread
From: Chuck Lever @ 2025-10-07 16:04 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

 RFC 8881 Section 2.10.6.1.3 says:

> On a per-request basis, the requester can choose to direct the
> replier to cache the reply to all operations after the first
> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.

RFC 8881 Section 2.10.6.4 further says:

> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> cache a reply except if an error is returned by the SEQUENCE or
> CB_SEQUENCE operation (see Section 2.10.6.1.2).

This suggests to me that the spec authors do not expect an NFSv4.1
server implementation to ever cache the result of a SEQUENCE
operation (except perhaps as part of a successful multi-operation
COMPOUND).

NFSD attempts to cache the result of solo SEQUENCE operations,
however. This is because the protocol does not permit servers to
respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
always caches solo SEQUENCE operations, then it never has occasion
to return that status code.

However, clients use solo SEQUENCE to query the current status of a
session slot. A cached reply will return stale information to the
client, and could result in an infinite loop.

Change the check in nfsd4_

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/xdr4.h | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index d1837a10b0c2..9619e78f0ed2 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -931,18 +931,19 @@ static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
 }
 
 /*
- * The session reply cache only needs to cache replies that the client
- * actually asked us to.  But it's almost free for us to cache compounds
- * consisting of only a SEQUENCE op, so we may as well cache those too.
- * Also, the protocol doesn't give us a convenient response in the case
- * of a replay of a solo SEQUENCE op that wasn't cached
- * (RETRY_UNCACHED_REP can only be returned in the second op of a
- * compound).
+ * Solo SEQUENCE operations are not supposed respect the setting in the
+ * sa_cachethis field, since that field controls whether the operations
+ * /after/ the SEQUENCE are preserved in the slot reply cache. Because
+ * clients might use a solo SEQUENCE to query the current state of the
+ * session or slot, a cached reply would return stale data to the client.
+ *
+ * Therefore NFSD treats solo SEQUENCE as an uncached operation no matter
+ * how the sa_cachethis field is set.
  */
 static inline bool nfsd4_cache_this(struct nfsd4_compoundres *resp)
 {
-	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS)
-		|| nfsd4_is_solo_sequence(resp);
+	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
+		!nfsd4_is_solo_sequence(resp);
 }
 
 static inline bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 4/4] NFSD: Move nfsd4_cache_this()
  2025-10-07 16:04 [PATCH v2 0/4] Fix unwanted memory overwrites Chuck Lever
                   ` (2 preceding siblings ...)
  2025-10-07 16:04 ` [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations Chuck Lever
@ 2025-10-07 16:04 ` Chuck Lever
  2025-10-07 17:20   ` Jeff Layton
  3 siblings, 1 reply; 18+ messages in thread
From: Chuck Lever @ 2025-10-07 16:04 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

nfsd4_cache_this() has one call site, and is not related to XDR at
all. It doesn't belong in fs/nfsd/xdr4.h.

As a clean-up, move this function (and its helper) to nfs4state.c,
next to its caller.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4state.c | 23 +++++++++++++++++++++++
 fs/nfsd/xdr4.h      | 23 -----------------------
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index c9053ef4d79f..2c4fa4a23463 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3476,6 +3476,29 @@ gen_callback(struct nfs4_client *clp, struct nfsd4_setclientid *se, struct svc_r
 	return;
 }
 
+static bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
+{
+	struct nfsd4_compoundargs *args = resp->rqstp->rq_argp;
+
+	return args->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
+}
+
+/*
+ * Solo SEQUENCE operations are not supposed respect the setting in the
+ * sa_cachethis field, since that field controls whether the operations
+ * /after/ the SEQUENCE are preserved in the slot reply cache. Because
+ * clients might use a solo SEQUENCE to query the current state of the
+ * session or slot, a cached reply would return stale data to the client.
+ *
+ * Therefore NFSD treats solo SEQUENCE as an uncached operation no matter
+ * how the sa_cachethis field is set.
+ */
+static bool nfsd4_cache_this(struct nfsd4_compoundres *resp)
+{
+	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
+		!nfsd4_is_solo_sequence(resp);
+}
+
 /*
  * Cache a reply. nfsd4_check_resp_size() has bounded the cache size.
  */
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 9619e78f0ed2..6f0129ea754d 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -923,29 +923,6 @@ struct nfsd4_compoundres {
 	struct nfsd4_compound_state	cstate;
 };
 
-static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
-{
-	struct nfsd4_compoundargs *args = resp->rqstp->rq_argp;
-
-	return args->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
-}
-
-/*
- * Solo SEQUENCE operations are not supposed respect the setting in the
- * sa_cachethis field, since that field controls whether the operations
- * /after/ the SEQUENCE are preserved in the slot reply cache. Because
- * clients might use a solo SEQUENCE to query the current state of the
- * session or slot, a cached reply would return stale data to the client.
- *
- * Therefore NFSD treats solo SEQUENCE as an uncached operation no matter
- * how the sa_cachethis field is set.
- */
-static inline bool nfsd4_cache_this(struct nfsd4_compoundres *resp)
-{
-	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
-		!nfsd4_is_solo_sequence(resp);
-}
-
 static inline bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
 {
 	struct nfsd4_compoundres *resp = rqstp->rq_resp;
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/4] NFSD: Fix the "is this a solo SEQUENCE" predicate
  2025-10-07 16:04 ` [PATCH v2 2/4] NFSD: Fix the "is this a solo SEQUENCE" predicate Chuck Lever
@ 2025-10-07 17:18   ` Jeff Layton
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff Layton @ 2025-10-07 17:18 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever, rtm

On Tue, 2025-10-07 at 12:04 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> RFC 8881 Section 2.10.6.4 says:
> 
> > If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> > cache a reply except if an error is returned by the SEQUENCE or
> > CB_SEQUENCE operation (see Section 2.10.6.1.2).
> 
> This text also appears in RFC 5661. Further, Section 2.10.6.1.2
> says:
> 
> > Any time SEQUENCE or CB_SEQUENCE returns an error, the sequence ID
> > of the slot MUST NOT change. The replier MUST NOT modify the reply
> > cache entry for the slot whenever an error is returned from
> > SEQUENCE or CB_SEQUENCE.
> 
> NFSD caches the result of solo SEQUENCE operations, but rtm's
> reproducer sends two operations in the failing COMPOUND. NFSD should
> not attempt to cache the reply.
> 
> The logic in nfsd4_is_solo_sequence() is incorrect: it checks the
> current operation index, not the total count of operations in the
> COMPOUND. If the SEQUENCE operation, which is always operation 1,
> fails in a multi-operation compound, resp->opcnt is always 1. Thus
> when a SEQUENCE operation fails, nfsd4_is_solo_sequence() always
> returns true.
> 
> Reported-by: <rtm@csail.mit.edu>
> Closes: https://lore.kernel.org/linux-nfs/c3628d57-94ae-48cf-8c9e-49087a28cec9@oracle.com/T/#t
> Fixes: 468de9e54a90 ("nfsd41: expand solo sequence check")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/xdr4.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index ee0570cbdd9e..d1837a10b0c2 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -926,7 +926,8 @@ struct nfsd4_compoundres {
>  static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
>  {
>  	struct nfsd4_compoundargs *args = resp->rqstp->rq_argp;
> -	return resp->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
> +
> +	return args->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
>  }
>  
>  /*

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-07 16:04 ` [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations Chuck Lever
@ 2025-10-07 17:19   ` Jeff Layton
  2025-10-07 20:05   ` Chuck Lever
  2025-10-07 22:21   ` Calum Mackay
  2 siblings, 0 replies; 18+ messages in thread
From: Jeff Layton @ 2025-10-07 17:19 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

On Tue, 2025-10-07 at 12:04 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
>  RFC 8881 Section 2.10.6.1.3 says:
> 
> > On a per-request basis, the requester can choose to direct the
> > replier to cache the reply to all operations after the first
> > operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
> > csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
> 
> RFC 8881 Section 2.10.6.4 further says:
> 
> > If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> > cache a reply except if an error is returned by the SEQUENCE or
> > CB_SEQUENCE operation (see Section 2.10.6.1.2).
> 
> This suggests to me that the spec authors do not expect an NFSv4.1
> server implementation to ever cache the result of a SEQUENCE
> operation (except perhaps as part of a successful multi-operation
> COMPOUND).
> 
> NFSD attempts to cache the result of solo SEQUENCE operations,
> however. This is because the protocol does not permit servers to
> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
> always caches solo SEQUENCE operations, then it never has occasion
> to return that status code.
> 
> However, clients use solo SEQUENCE to query the current status of a
> session slot. A cached reply will return stale information to the
> client, and could result in an infinite loop.
> 
> Change the check in nfsd4_
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/xdr4.h | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index d1837a10b0c2..9619e78f0ed2 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -931,18 +931,19 @@ static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
>  }
>  
>  /*
> - * The session reply cache only needs to cache replies that the client
> - * actually asked us to.  But it's almost free for us to cache compounds
> - * consisting of only a SEQUENCE op, so we may as well cache those too.
> - * Also, the protocol doesn't give us a convenient response in the case
> - * of a replay of a solo SEQUENCE op that wasn't cached
> - * (RETRY_UNCACHED_REP can only be returned in the second op of a
> - * compound).
> + * Solo SEQUENCE operations are not supposed respect the setting in the
> + * sa_cachethis field, since that field controls whether the operations
> + * /after/ the SEQUENCE are preserved in the slot reply cache. Because
> + * clients might use a solo SEQUENCE to query the current state of the
> + * session or slot, a cached reply would return stale data to the client.
> + *
> + * Therefore NFSD treats solo SEQUENCE as an uncached operation no matter
> + * how the sa_cachethis field is set.
>   */
>  static inline bool nfsd4_cache_this(struct nfsd4_compoundres *resp)
>  {
> -	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS)
> -		|| nfsd4_is_solo_sequence(resp);
> +	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
> +		!nfsd4_is_solo_sequence(resp);
>  }
>  
>  static inline bool nfsd4_last_compound_op(struct svc_rqst *rqstp)

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 4/4] NFSD: Move nfsd4_cache_this()
  2025-10-07 16:04 ` [PATCH v2 4/4] NFSD: Move nfsd4_cache_this() Chuck Lever
@ 2025-10-07 17:20   ` Jeff Layton
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff Layton @ 2025-10-07 17:20 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

On Tue, 2025-10-07 at 12:04 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> nfsd4_cache_this() has one call site, and is not related to XDR at
> all. It doesn't belong in fs/nfsd/xdr4.h.
> 
> As a clean-up, move this function (and its helper) to nfs4state.c,
> next to its caller.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/nfs4state.c | 23 +++++++++++++++++++++++
>  fs/nfsd/xdr4.h      | 23 -----------------------
>  2 files changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index c9053ef4d79f..2c4fa4a23463 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3476,6 +3476,29 @@ gen_callback(struct nfs4_client *clp, struct nfsd4_setclientid *se, struct svc_r
>  	return;
>  }
>  
> +static bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
> +{
> +	struct nfsd4_compoundargs *args = resp->rqstp->rq_argp;
> +
> +	return args->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
> +}
> +
> +/*
> + * Solo SEQUENCE operations are not supposed respect the setting in the
> + * sa_cachethis field, since that field controls whether the operations
> + * /after/ the SEQUENCE are preserved in the slot reply cache. Because
> + * clients might use a solo SEQUENCE to query the current state of the
> + * session or slot, a cached reply would return stale data to the client.
> + *
> + * Therefore NFSD treats solo SEQUENCE as an uncached operation no matter
> + * how the sa_cachethis field is set.
> + */
> +static bool nfsd4_cache_this(struct nfsd4_compoundres *resp)
> +{
> +	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
> +		!nfsd4_is_solo_sequence(resp);
> +}
> +
>  /*
>   * Cache a reply. nfsd4_check_resp_size() has bounded the cache size.
>   */
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index 9619e78f0ed2..6f0129ea754d 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -923,29 +923,6 @@ struct nfsd4_compoundres {
>  	struct nfsd4_compound_state	cstate;
>  };
>  
> -static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
> -{
> -	struct nfsd4_compoundargs *args = resp->rqstp->rq_argp;
> -
> -	return args->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
> -}
> -
> -/*
> - * Solo SEQUENCE operations are not supposed respect the setting in the
> - * sa_cachethis field, since that field controls whether the operations
> - * /after/ the SEQUENCE are preserved in the slot reply cache. Because
> - * clients might use a solo SEQUENCE to query the current state of the
> - * session or slot, a cached reply would return stale data to the client.
> - *
> - * Therefore NFSD treats solo SEQUENCE as an uncached operation no matter
> - * how the sa_cachethis field is set.
> - */
> -static inline bool nfsd4_cache_this(struct nfsd4_compoundres *resp)
> -{
> -	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
> -		!nfsd4_is_solo_sequence(resp);
> -}
> -
>  static inline bool nfsd4_last_compound_op(struct svc_rqst *rqstp)
>  {
>  	struct nfsd4_compoundres *resp = rqstp->rq_resp;

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-07 16:04 ` [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations Chuck Lever
  2025-10-07 17:19   ` Jeff Layton
@ 2025-10-07 20:05   ` Chuck Lever
  2025-10-07 22:12     ` NeilBrown
  2025-10-07 22:21   ` Calum Mackay
  2 siblings, 1 reply; 18+ messages in thread
From: Chuck Lever @ 2025-10-07 20:05 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: linux-nfs

On 10/7/25 12:04 PM, Chuck Lever wrote:
>  RFC 8881 Section 2.10.6.1.3 says:
> 
>> On a per-request basis, the requester can choose to direct the
>> replier to cache the reply to all operations after the first
>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
> RFC 8881 Section 2.10.6.4 further says:
> 
>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
>> cache a reply except if an error is returned by the SEQUENCE or
>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
> This suggests to me that the spec authors do not expect an NFSv4.1
> server implementation to ever cache the result of a SEQUENCE
> operation (except perhaps as part of a successful multi-operation
> COMPOUND).
> 
> NFSD attempts to cache the result of solo SEQUENCE operations,
> however. This is because the protocol does not permit servers to
> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
> always caches solo SEQUENCE operations, then it never has occasion
> to return that status code.
> 
> However, clients use solo SEQUENCE to query the current status of a
> session slot. A cached reply will return stale information to the
> client, and could result in an infinite loop.

The pynfs SEQ9f test is now failing with this change. This test:

- Sends a CREATE_SESSION
- Sends a solo SEQUENCE with sa_cachethis set
- Sends the same operation without changing the slot sequence number

The test expects the server's response to be NFS4_OK. NFSD now returns
NFS4ERR_SEQ_FALSE_RETRY instead.

It's possible the test is wrong, but how should it be fixed?

Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
COMPOUND containing a solo SEQUENCE?

When reporting a retransmitted solo SEQUENCE, what is the correct status
code?


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-07 20:05   ` Chuck Lever
@ 2025-10-07 22:12     ` NeilBrown
  2025-10-08 13:04       ` Chuck Lever
  0 siblings, 1 reply; 18+ messages in thread
From: NeilBrown @ 2025-10-07 22:12 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs

On Wed, 08 Oct 2025, Chuck Lever wrote:
> On 10/7/25 12:04 PM, Chuck Lever wrote:
> >  RFC 8881 Section 2.10.6.1.3 says:
> > 
> >> On a per-request basis, the requester can choose to direct the
> >> replier to cache the reply to all operations after the first
> >> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
> >> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
> > RFC 8881 Section 2.10.6.4 further says:
> > 
> >> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> >> cache a reply except if an error is returned by the SEQUENCE or
> >> CB_SEQUENCE operation (see Section 2.10.6.1.2).
> > This suggests to me that the spec authors do not expect an NFSv4.1
> > server implementation to ever cache the result of a SEQUENCE
> > operation (except perhaps as part of a successful multi-operation
> > COMPOUND).
> > 
> > NFSD attempts to cache the result of solo SEQUENCE operations,
> > however. This is because the protocol does not permit servers to
> > respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
> > always caches solo SEQUENCE operations, then it never has occasion
> > to return that status code.
> > 
> > However, clients use solo SEQUENCE to query the current status of a
> > session slot. A cached reply will return stale information to the
> > client, and could result in an infinite loop.
> 
> The pynfs SEQ9f test is now failing with this change. This test:
> 
> - Sends a CREATE_SESSION
> - Sends a solo SEQUENCE with sa_cachethis set
> - Sends the same operation without changing the slot sequence number
> 
> The test expects the server's response to be NFS4_OK. NFSD now returns
> NFS4ERR_SEQ_FALSE_RETRY instead.
> 
> It's possible the test is wrong, but how should it be fixed?
> 
> Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
> COMPOUND containing a solo SEQUENCE?
> 
> When reporting a retransmitted solo SEQUENCE, what is the correct status
> code?

Interesting question....
To help with context: you wrote:

   However, clients use solo SEQUENCE to query the current status of a
   session slot.  A cached reply will return stale information to the
   client, and could result in an infinite loop.

Could you please expand on that?  What in the reply might be stale, and
how might that result in an infinite loop?

Could a reply to a replayed singleton SEQUENCE simple always return the
current info, rather than cached info?

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-07 16:04 ` [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations Chuck Lever
  2025-10-07 17:19   ` Jeff Layton
  2025-10-07 20:05   ` Chuck Lever
@ 2025-10-07 22:21   ` Calum Mackay
  2 siblings, 0 replies; 18+ messages in thread
From: Calum Mackay @ 2025-10-07 22:21 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo,
	Tom Talpey
  Cc: Calum Mackay, linux-nfs, Chuck Lever

On 07/10/2025 5:04 pm, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
>   RFC 8881 Section 2.10.6.1.3 says:
> 
>> On a per-request basis, the requester can choose to direct the
>> replier to cache the reply to all operations after the first
>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
> 
> RFC 8881 Section 2.10.6.4 further says:
> 
>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
>> cache a reply except if an error is returned by the SEQUENCE or
>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
> 
> This suggests to me that the spec authors do not expect an NFSv4.1
> server implementation to ever cache the result of a SEQUENCE
> operation (except perhaps as part of a successful multi-operation
> COMPOUND).
> 
> NFSD attempts to cache the result of solo SEQUENCE operations,
> however. This is because the protocol does not permit servers to
> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
> always caches solo SEQUENCE operations, then it never has occasion
> to return that status code.
> 
> However, clients use solo SEQUENCE to query the current status of a
> session slot. A cached reply will return stale information to the
> client, and could result in an infinite loop.
> 
> Change the check in nfsd4_
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>   fs/nfsd/xdr4.h | 19 ++++++++++---------
>   1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index d1837a10b0c2..9619e78f0ed2 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -931,18 +931,19 @@ static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
>   }
>   
>   /*
> - * The session reply cache only needs to cache replies that the client
> - * actually asked us to.  But it's almost free for us to cache compounds
> - * consisting of only a SEQUENCE op, so we may as well cache those too.
> - * Also, the protocol doesn't give us a convenient response in the case
> - * of a replay of a solo SEQUENCE op that wasn't cached
> - * (RETRY_UNCACHED_REP can only be returned in the second op of a
> - * compound).
> + * Solo SEQUENCE operations are not supposed respect the setting in the

nit: perhaps -> not supposed to respect

cheers,
c.


> + * sa_cachethis field, since that field controls whether the operations
> + * /after/ the SEQUENCE are preserved in the slot reply cache. Because
> + * clients might use a solo SEQUENCE to query the current state of the
> + * session or slot, a cached reply would return stale data to the client.
> + *
> + * Therefore NFSD treats solo SEQUENCE as an uncached operation no matter
> + * how the sa_cachethis field is set.
>    */
>   static inline bool nfsd4_cache_this(struct nfsd4_compoundres *resp)
>   {
> -	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS)
> -		|| nfsd4_is_solo_sequence(resp);
> +	return (resp->cstate.slot->sl_flags & NFSD4_SLOT_CACHETHIS) &&
> +		!nfsd4_is_solo_sequence(resp);
>   }
>   
>   static inline bool nfsd4_last_compound_op(struct svc_rqst *rqstp)



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-07 22:12     ` NeilBrown
@ 2025-10-08 13:04       ` Chuck Lever
  2025-10-08 22:03         ` NeilBrown
  0 siblings, 1 reply; 18+ messages in thread
From: Chuck Lever @ 2025-10-08 13:04 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs

On 10/7/25 6:12 PM, NeilBrown wrote:
> On Wed, 08 Oct 2025, Chuck Lever wrote:
>> On 10/7/25 12:04 PM, Chuck Lever wrote:
>>>  RFC 8881 Section 2.10.6.1.3 says:
>>>
>>>> On a per-request basis, the requester can choose to direct the
>>>> replier to cache the reply to all operations after the first
>>>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
>>>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
>>> RFC 8881 Section 2.10.6.4 further says:
>>>
>>>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
>>>> cache a reply except if an error is returned by the SEQUENCE or
>>>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
>>> This suggests to me that the spec authors do not expect an NFSv4.1
>>> server implementation to ever cache the result of a SEQUENCE
>>> operation (except perhaps as part of a successful multi-operation
>>> COMPOUND).
>>>
>>> NFSD attempts to cache the result of solo SEQUENCE operations,
>>> however. This is because the protocol does not permit servers to
>>> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
>>> always caches solo SEQUENCE operations, then it never has occasion
>>> to return that status code.
>>>
>>> However, clients use solo SEQUENCE to query the current status of a
>>> session slot. A cached reply will return stale information to the
>>> client, and could result in an infinite loop.
>>
>> The pynfs SEQ9f test is now failing with this change. This test:
>>
>> - Sends a CREATE_SESSION
>> - Sends a solo SEQUENCE with sa_cachethis set
>> - Sends the same operation without changing the slot sequence number
>>
>> The test expects the server's response to be NFS4_OK. NFSD now returns
>> NFS4ERR_SEQ_FALSE_RETRY instead.
>>
>> It's possible the test is wrong, but how should it be fixed?
>>
>> Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
>> COMPOUND containing a solo SEQUENCE?
>>
>> When reporting a retransmitted solo SEQUENCE, what is the correct status
>> code?
> 
> Interesting question....
> To help with context: you wrote:
> 
>    However, clients use solo SEQUENCE to query the current status of a
>    session slot.  A cached reply will return stale information to the
>    client, and could result in an infinite loop.
> 
> Could you please expand on that?  What in the reply might be stale, and
> how might that result in an infinite loop?
> 
> Could a reply to a replayed singleton SEQUENCE simple always return the
> current info, rather than cached info?

If a cached reply is returned to the client, the slot sequence number
doesn't change, and neither do the SEQ4_STATUS flags.

The only real recovery in this case is to destroy the session, which
will remove the cached reply.

We've determined that the Linux NFS client never asserts sa_cachethis
when sending a solo SEQUENCE, so the questions above might be academic.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-08 13:04       ` Chuck Lever
@ 2025-10-08 22:03         ` NeilBrown
  2025-10-09 12:56           ` Chuck Lever
  0 siblings, 1 reply; 18+ messages in thread
From: NeilBrown @ 2025-10-08 22:03 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs

On Thu, 09 Oct 2025, Chuck Lever wrote:
> On 10/7/25 6:12 PM, NeilBrown wrote:
> > On Wed, 08 Oct 2025, Chuck Lever wrote:
> >> On 10/7/25 12:04 PM, Chuck Lever wrote:
> >>>  RFC 8881 Section 2.10.6.1.3 says:
> >>>
> >>>> On a per-request basis, the requester can choose to direct the
> >>>> replier to cache the reply to all operations after the first
> >>>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
> >>>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
> >>> RFC 8881 Section 2.10.6.4 further says:
> >>>
> >>>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> >>>> cache a reply except if an error is returned by the SEQUENCE or
> >>>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
> >>> This suggests to me that the spec authors do not expect an NFSv4.1
> >>> server implementation to ever cache the result of a SEQUENCE
> >>> operation (except perhaps as part of a successful multi-operation
> >>> COMPOUND).
> >>>
> >>> NFSD attempts to cache the result of solo SEQUENCE operations,
> >>> however. This is because the protocol does not permit servers to
> >>> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
> >>> always caches solo SEQUENCE operations, then it never has occasion
> >>> to return that status code.
> >>>
> >>> However, clients use solo SEQUENCE to query the current status of a
> >>> session slot. A cached reply will return stale information to the
> >>> client, and could result in an infinite loop.
> >>
> >> The pynfs SEQ9f test is now failing with this change. This test:
> >>
> >> - Sends a CREATE_SESSION
> >> - Sends a solo SEQUENCE with sa_cachethis set
> >> - Sends the same operation without changing the slot sequence number
> >>
> >> The test expects the server's response to be NFS4_OK. NFSD now returns
> >> NFS4ERR_SEQ_FALSE_RETRY instead.
> >>
> >> It's possible the test is wrong, but how should it be fixed?
> >>
> >> Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
> >> COMPOUND containing a solo SEQUENCE?
> >>
> >> When reporting a retransmitted solo SEQUENCE, what is the correct status
> >> code?
> > 
> > Interesting question....
> > To help with context: you wrote:
> > 
> >    However, clients use solo SEQUENCE to query the current status of a
> >    session slot.  A cached reply will return stale information to the
> >    client, and could result in an infinite loop.
> > 
> > Could you please expand on that?  What in the reply might be stale, and
> > how might that result in an infinite loop?
> > 
> > Could a reply to a replayed singleton SEQUENCE simple always return the
> > current info, rather than cached info?
> 
> If a cached reply is returned to the client, the slot sequence number
> doesn't change, and neither do the SEQ4_STATUS flags.

Why is that a problem?  And importantly: how can it result in an
infinite loop?

> 
> The only real recovery in this case is to destroy the session, which
> will remove the cached reply.

What "case" that needs to be recovered from?

> 
> We've determined that the Linux NFS client never asserts sa_cachethis
> when sending a solo SEQUENCE, so the questions above might be academic.

It would still be nice to have clear agreement on what the spec allows
and expects.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-08 22:03         ` NeilBrown
@ 2025-10-09 12:56           ` Chuck Lever
  2025-10-09 23:29             ` NeilBrown
  0 siblings, 1 reply; 18+ messages in thread
From: Chuck Lever @ 2025-10-09 12:56 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs

On 10/8/25 6:03 PM, NeilBrown wrote:
> On Thu, 09 Oct 2025, Chuck Lever wrote:
>> On 10/7/25 6:12 PM, NeilBrown wrote:
>>> On Wed, 08 Oct 2025, Chuck Lever wrote:
>>>> On 10/7/25 12:04 PM, Chuck Lever wrote:
>>>>>  RFC 8881 Section 2.10.6.1.3 says:
>>>>>
>>>>>> On a per-request basis, the requester can choose to direct the
>>>>>> replier to cache the reply to all operations after the first
>>>>>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
>>>>>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
>>>>> RFC 8881 Section 2.10.6.4 further says:
>>>>>
>>>>>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
>>>>>> cache a reply except if an error is returned by the SEQUENCE or
>>>>>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
>>>>> This suggests to me that the spec authors do not expect an NFSv4.1
>>>>> server implementation to ever cache the result of a SEQUENCE
>>>>> operation (except perhaps as part of a successful multi-operation
>>>>> COMPOUND).
>>>>>
>>>>> NFSD attempts to cache the result of solo SEQUENCE operations,
>>>>> however. This is because the protocol does not permit servers to
>>>>> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
>>>>> always caches solo SEQUENCE operations, then it never has occasion
>>>>> to return that status code.
>>>>>
>>>>> However, clients use solo SEQUENCE to query the current status of a
>>>>> session slot. A cached reply will return stale information to the
>>>>> client, and could result in an infinite loop.
>>>>
>>>> The pynfs SEQ9f test is now failing with this change. This test:
>>>>
>>>> - Sends a CREATE_SESSION
>>>> - Sends a solo SEQUENCE with sa_cachethis set
>>>> - Sends the same operation without changing the slot sequence number
>>>>
>>>> The test expects the server's response to be NFS4_OK. NFSD now returns
>>>> NFS4ERR_SEQ_FALSE_RETRY instead.
>>>>
>>>> It's possible the test is wrong, but how should it be fixed?
>>>>
>>>> Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
>>>> COMPOUND containing a solo SEQUENCE?
>>>>
>>>> When reporting a retransmitted solo SEQUENCE, what is the correct status
>>>> code?
>>>
>>> Interesting question....
>>> To help with context: you wrote:
>>>
>>>    However, clients use solo SEQUENCE to query the current status of a
>>>    session slot.  A cached reply will return stale information to the
>>>    client, and could result in an infinite loop.
>>>
>>> Could you please expand on that?  What in the reply might be stale, and
>>> how might that result in an infinite loop?
>>>
>>> Could a reply to a replayed singleton SEQUENCE simple always return the
>>> current info, rather than cached info?
>>
>> If a cached reply is returned to the client, the slot sequence number
>> doesn't change, and neither do the SEQ4_STATUS flags.
> 
> Why is that a problem?  And importantly: how can it result in an
> infinite loop?

My remark was "could result in an infinite loop" -- subjunctive, meaning
we haven't seen that behavior in this specific case, but there is a
risk, if the client has a bug, for example. I can drop that remark, if
it vexes.

The actual issue at hand is that the client can set the server up for
a memory overwrite. If CREATE_SESSION does not request a large enough
ca_maxresponsesize_cached, but the server expects to be able to cache
SEQUENCE operations anyway, a solo SEQUENCE will cause the server to
write into a cache that does not exist.

Either NFSD needs to unconditionally reserve the slot cache space for
solo SEQUENCE, if it intends to unconditionally cache those; or it
shouldn't cache them at all.

The language of the spec suggests that the authors did not expect
servers to cache solo SEQUENCE operations. It states that sa_cachethis
refers to caching COMPOUND operations /after/ the SEQUENCE operation.


>> The only real recovery in this case is to destroy the session, which
>> will remove the cached reply.
> 
> What "case" that needs to be recovered from?

When the session slot has become unusable because server and client have
different ideas of what the slot sequence number is.

The client and server might stop using a slot in that situation.
Destroying the session is the only way to get rid of the slot entirely.

But IMHO this is a rat hole.


>> We've determined that the Linux NFS client never asserts sa_cachethis
>> when sending a solo SEQUENCE, so the questions above might be academic.
> 
> It would still be nice to have clear agreement on what the spec allows
> and expects.
RFC 8881 Section 2.10.6 does not provide any guidance about how a server
MUST respond either when sa_cachethis is asserted on a solo SEQUENCE,
nor does it say explicitly how to respond when a client replays such a
SEQUENCE operation.

I can ask on nfsv4@ietf.org, but I suspect we will not get a consensus
response there either.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-09 12:56           ` Chuck Lever
@ 2025-10-09 23:29             ` NeilBrown
  2025-10-10 13:03               ` Chuck Lever
  0 siblings, 1 reply; 18+ messages in thread
From: NeilBrown @ 2025-10-09 23:29 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs

On Thu, 09 Oct 2025, Chuck Lever wrote:
> On 10/8/25 6:03 PM, NeilBrown wrote:
> > On Thu, 09 Oct 2025, Chuck Lever wrote:
> >> On 10/7/25 6:12 PM, NeilBrown wrote:
> >>> On Wed, 08 Oct 2025, Chuck Lever wrote:
> >>>> On 10/7/25 12:04 PM, Chuck Lever wrote:
> >>>>>  RFC 8881 Section 2.10.6.1.3 says:
> >>>>>
> >>>>>> On a per-request basis, the requester can choose to direct the
> >>>>>> replier to cache the reply to all operations after the first
> >>>>>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
> >>>>>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
> >>>>> RFC 8881 Section 2.10.6.4 further says:
> >>>>>
> >>>>>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> >>>>>> cache a reply except if an error is returned by the SEQUENCE or
> >>>>>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
> >>>>> This suggests to me that the spec authors do not expect an NFSv4.1
> >>>>> server implementation to ever cache the result of a SEQUENCE
> >>>>> operation (except perhaps as part of a successful multi-operation
> >>>>> COMPOUND).
> >>>>>
> >>>>> NFSD attempts to cache the result of solo SEQUENCE operations,
> >>>>> however. This is because the protocol does not permit servers to
> >>>>> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
> >>>>> always caches solo SEQUENCE operations, then it never has occasion
> >>>>> to return that status code.
> >>>>>
> >>>>> However, clients use solo SEQUENCE to query the current status of a
> >>>>> session slot. A cached reply will return stale information to the
> >>>>> client, and could result in an infinite loop.
> >>>>
> >>>> The pynfs SEQ9f test is now failing with this change. This test:
> >>>>
> >>>> - Sends a CREATE_SESSION
> >>>> - Sends a solo SEQUENCE with sa_cachethis set
> >>>> - Sends the same operation without changing the slot sequence number
> >>>>
> >>>> The test expects the server's response to be NFS4_OK. NFSD now returns
> >>>> NFS4ERR_SEQ_FALSE_RETRY instead.
> >>>>
> >>>> It's possible the test is wrong, but how should it be fixed?
> >>>>
> >>>> Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
> >>>> COMPOUND containing a solo SEQUENCE?
> >>>>
> >>>> When reporting a retransmitted solo SEQUENCE, what is the correct status
> >>>> code?
> >>>
> >>> Interesting question....
> >>> To help with context: you wrote:
> >>>
> >>>    However, clients use solo SEQUENCE to query the current status of a
> >>>    session slot.  A cached reply will return stale information to the
> >>>    client, and could result in an infinite loop.
> >>>
> >>> Could you please expand on that?  What in the reply might be stale, and
> >>> how might that result in an infinite loop?
> >>>
> >>> Could a reply to a replayed singleton SEQUENCE simple always return the
> >>> current info, rather than cached info?
> >>
> >> If a cached reply is returned to the client, the slot sequence number
> >> doesn't change, and neither do the SEQ4_STATUS flags.
> > 
> > Why is that a problem?  And importantly: how can it result in an
> > infinite loop?
> 
> My remark was "could result in an infinite loop" -- subjunctive, meaning
> we haven't seen that behavior in this specific case, but there is a
> risk, if the client has a bug, for example. I can drop that remark, if
> it vexes.

I think dropping it would be best.

> 
> The actual issue at hand is that the client can set the server up for
> a memory overwrite. If CREATE_SESSION does not request a large enough
> ca_maxresponsesize_cached, but the server expects to be able to cache
> SEQUENCE operations anyway, a solo SEQUENCE will cause the server to
> write into a cache that does not exist.

OK, this looks like an important issue.  Adding that to the commit
message would help.

> 
> Either NFSD needs to unconditionally reserve the slot cache space for
> solo SEQUENCE, if it intends to unconditionally cache those; or it
> shouldn't cache them at all.
> 
> The language of the spec suggests that the authors did not expect
> servers to cache solo SEQUENCE operations. It states that sa_cachethis
> refers to caching COMPOUND operations /after/ the SEQUENCE operation.
> 
> 
> >> The only real recovery in this case is to destroy the session, which
> >> will remove the cached reply.
> > 
> > What "case" that needs to be recovered from?
> 
> When the session slot has become unusable because server and client have
> different ideas of what the slot sequence number is.
> 
> The client and server might stop using a slot in that situation.
> Destroying the session is the only way to get rid of the slot entirely.
> 
> But IMHO this is a rat hole.

OK, I think I understand now.

According to RFC 5661

  The value of the sa_sequenceid argument relative to the cached
   sequence ID on the slot falls into one of three cases.

of which the second case is

   o  If sa_sequenceid and the cached sequence ID are the same, this is
      a retry, and the server replies with what is recorded in the reply
      cache.  The lease is possibly renewed as described below.

So the server *must* cache the reply for the most recent SEQUENCE in
each slot.

The language in that section seems to suggest the "cached sequence ID"
is stored in the slot's reply cache.
Our struct nfsd4_slot stores the cached sequence ID not as part of the
generic reply "sa_data[]" but as a dedicated field.  We could store the
rest of the SEQUENCE reply in dedicated fields and leave sa_data to
store the replies to ops 2 onwards.

The discussion of ca_maxresponsesize_cached always takes about "if
sa_cachethis is set" it seems to apply only to those ops where that
is relevant - it isn't relevant for SEQUENCE.

So I think I would be in favour of always having enough space to cache a
full SEQUENCE reply.  The decision of whether it goes in sa_data[] or in
dedicated fields depends on how clean the code looks.

Thanks,
NeilBrown



> 
> 
> >> We've determined that the Linux NFS client never asserts sa_cachethis
> >> when sending a solo SEQUENCE, so the questions above might be academic.
> > 
> > It would still be nice to have clear agreement on what the spec allows
> > and expects.
> RFC 8881 Section 2.10.6 does not provide any guidance about how a server
> MUST respond either when sa_cachethis is asserted on a solo SEQUENCE,
> nor does it say explicitly how to respond when a client replays such a
> SEQUENCE operation.
> 
> I can ask on nfsv4@ietf.org, but I suspect we will not get a consensus
> response there either.
> 
> 
> -- 
> Chuck Lever
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-09 23:29             ` NeilBrown
@ 2025-10-10 13:03               ` Chuck Lever
  2025-10-11  0:55                 ` NeilBrown
  0 siblings, 1 reply; 18+ messages in thread
From: Chuck Lever @ 2025-10-10 13:03 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs

On 10/9/25 7:29 PM, NeilBrown wrote:
> On Thu, 09 Oct 2025, Chuck Lever wrote:
>> On 10/8/25 6:03 PM, NeilBrown wrote:
>>> On Thu, 09 Oct 2025, Chuck Lever wrote:
>>>> On 10/7/25 6:12 PM, NeilBrown wrote:
>>>>> On Wed, 08 Oct 2025, Chuck Lever wrote:
>>>>>> On 10/7/25 12:04 PM, Chuck Lever wrote:
>>>>>>>  RFC 8881 Section 2.10.6.1.3 says:
>>>>>>>
>>>>>>>> On a per-request basis, the requester can choose to direct the
>>>>>>>> replier to cache the reply to all operations after the first
>>>>>>>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
>>>>>>>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
>>>>>>> RFC 8881 Section 2.10.6.4 further says:
>>>>>>>
>>>>>>>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
>>>>>>>> cache a reply except if an error is returned by the SEQUENCE or
>>>>>>>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
>>>>>>> This suggests to me that the spec authors do not expect an NFSv4.1
>>>>>>> server implementation to ever cache the result of a SEQUENCE
>>>>>>> operation (except perhaps as part of a successful multi-operation
>>>>>>> COMPOUND).
>>>>>>>
>>>>>>> NFSD attempts to cache the result of solo SEQUENCE operations,
>>>>>>> however. This is because the protocol does not permit servers to
>>>>>>> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
>>>>>>> always caches solo SEQUENCE operations, then it never has occasion
>>>>>>> to return that status code.
>>>>>>>
>>>>>>> However, clients use solo SEQUENCE to query the current status of a
>>>>>>> session slot. A cached reply will return stale information to the
>>>>>>> client, and could result in an infinite loop.
>>>>>>
>>>>>> The pynfs SEQ9f test is now failing with this change. This test:
>>>>>>
>>>>>> - Sends a CREATE_SESSION
>>>>>> - Sends a solo SEQUENCE with sa_cachethis set
>>>>>> - Sends the same operation without changing the slot sequence number
>>>>>>
>>>>>> The test expects the server's response to be NFS4_OK. NFSD now returns
>>>>>> NFS4ERR_SEQ_FALSE_RETRY instead.
>>>>>>
>>>>>> It's possible the test is wrong, but how should it be fixed?
>>>>>>
>>>>>> Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
>>>>>> COMPOUND containing a solo SEQUENCE?
>>>>>>
>>>>>> When reporting a retransmitted solo SEQUENCE, what is the correct status
>>>>>> code?
>>>>>
>>>>> Interesting question....
>>>>> To help with context: you wrote:
>>>>>
>>>>>    However, clients use solo SEQUENCE to query the current status of a
>>>>>    session slot.  A cached reply will return stale information to the
>>>>>    client, and could result in an infinite loop.
>>>>>
>>>>> Could you please expand on that?  What in the reply might be stale, and
>>>>> how might that result in an infinite loop?
>>>>>
>>>>> Could a reply to a replayed singleton SEQUENCE simple always return the
>>>>> current info, rather than cached info?
>>>>
>>>> If a cached reply is returned to the client, the slot sequence number
>>>> doesn't change, and neither do the SEQ4_STATUS flags.
>>>
>>> Why is that a problem?  And importantly: how can it result in an
>>> infinite loop?
>>
>> My remark was "could result in an infinite loop" -- subjunctive, meaning
>> we haven't seen that behavior in this specific case, but there is a
>> risk, if the client has a bug, for example. I can drop that remark, if
>> it vexes.
> 
> I think dropping it would be best.
> 
>>
>> The actual issue at hand is that the client can set the server up for
>> a memory overwrite. If CREATE_SESSION does not request a large enough
>> ca_maxresponsesize_cached, but the server expects to be able to cache
>> SEQUENCE operations anyway, a solo SEQUENCE will cause the server to
>> write into a cache that does not exist.
> 
> OK, this looks like an important issue.  Adding that to the commit
> message would help.
> 
>>
>> Either NFSD needs to unconditionally reserve the slot cache space for
>> solo SEQUENCE, if it intends to unconditionally cache those; or it
>> shouldn't cache them at all.
>>
>> The language of the spec suggests that the authors did not expect
>> servers to cache solo SEQUENCE operations. It states that sa_cachethis
>> refers to caching COMPOUND operations /after/ the SEQUENCE operation.
>>
>>
>>>> The only real recovery in this case is to destroy the session, which
>>>> will remove the cached reply.
>>>
>>> What "case" that needs to be recovered from?
>>
>> When the session slot has become unusable because server and client have
>> different ideas of what the slot sequence number is.
>>
>> The client and server might stop using a slot in that situation.
>> Destroying the session is the only way to get rid of the slot entirely.
>>
>> But IMHO this is a rat hole.
> 
> OK, I think I understand now.
> 
> According to RFC 5661
> 
>   The value of the sa_sequenceid argument relative to the cached
>    sequence ID on the slot falls into one of three cases.
> 
> of which the second case is
> 
>    o  If sa_sequenceid and the cached sequence ID are the same, this is
>       a retry, and the server replies with what is recorded in the reply
>       cache.  The lease is possibly renewed as described below.
> 
> So the server *must* cache the reply for the most recent SEQUENCE in
> each slot.
> 
> The language in that section seems to suggest the "cached sequence ID"
> is stored in the slot's reply cache.
> Our struct nfsd4_slot stores the cached sequence ID not as part of the
> generic reply "sa_data[]" but as a dedicated field.  We could store the
> rest of the SEQUENCE reply in dedicated fields and leave sa_data to
> store the replies to ops 2 onwards.
> 
> The discussion of ca_maxresponsesize_cached always takes about "if
> sa_cachethis is set" it seems to apply only to those ops where that
> is relevant - it isn't relevant for SEQUENCE.
> 
> So I think I would be in favour of always having enough space to cache a
> full SEQUENCE reply.  The decision of whether it goes in sa_data[] or in
> dedicated fields depends on how clean the code looks.

IIUC the protocol requirements are:

+ solo SEQUENCE that fails MUST NOT be cached. That is currently an NFSD
bug, because it unconditionally caches solo SEQUENCE. I have a patch to
be added to this series to address that.

+ solo SEQUENCE when sa_cachethis is false does not need to be cached,
but the protocol does not allow REP_UNCACHED_RETRY. This is the only
reason to cache a solo SEQUENCE reply. Note that the cached response to
a solo SEQUENCE can be constructed from existing fields in the slot. But
I can't see such reconstruction being clean, and it is certainly a
heroic case (ie, not commonly used, if ever).

nfsd4_alloc_slot() sets the slot replay cache size to /zero/ when the
client's ca_maxresponsesize_cached is smaller than ... I think just the
size of a solo SEQUENCE. I thought that was the bug, originally, but
then Tom suggested there's truly no reason to cache a solo SEQUENCE.

So, we could go with:

0. Fix nfsd_is_solo_sequence (that's 2/4 in this series)
1. Fix NFSD to not cache failed solo SEQUENCE
2. Increase the minimum size of the slot's replay cache to fit a
solo SEQUENCE


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-10 13:03               ` Chuck Lever
@ 2025-10-11  0:55                 ` NeilBrown
  2025-10-11 15:30                   ` Chuck Lever
  0 siblings, 1 reply; 18+ messages in thread
From: NeilBrown @ 2025-10-11  0:55 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs

On Sat, 11 Oct 2025, Chuck Lever wrote:
> On 10/9/25 7:29 PM, NeilBrown wrote:
> > On Thu, 09 Oct 2025, Chuck Lever wrote:
> >> On 10/8/25 6:03 PM, NeilBrown wrote:
> >>> On Thu, 09 Oct 2025, Chuck Lever wrote:
> >>>> On 10/7/25 6:12 PM, NeilBrown wrote:
> >>>>> On Wed, 08 Oct 2025, Chuck Lever wrote:
> >>>>>> On 10/7/25 12:04 PM, Chuck Lever wrote:
> >>>>>>>  RFC 8881 Section 2.10.6.1.3 says:
> >>>>>>>
> >>>>>>>> On a per-request basis, the requester can choose to direct the
> >>>>>>>> replier to cache the reply to all operations after the first
> >>>>>>>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
> >>>>>>>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
> >>>>>>> RFC 8881 Section 2.10.6.4 further says:
> >>>>>>>
> >>>>>>>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> >>>>>>>> cache a reply except if an error is returned by the SEQUENCE or
> >>>>>>>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
> >>>>>>> This suggests to me that the spec authors do not expect an NFSv4.1
> >>>>>>> server implementation to ever cache the result of a SEQUENCE
> >>>>>>> operation (except perhaps as part of a successful multi-operation
> >>>>>>> COMPOUND).
> >>>>>>>
> >>>>>>> NFSD attempts to cache the result of solo SEQUENCE operations,
> >>>>>>> however. This is because the protocol does not permit servers to
> >>>>>>> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
> >>>>>>> always caches solo SEQUENCE operations, then it never has occasion
> >>>>>>> to return that status code.
> >>>>>>>
> >>>>>>> However, clients use solo SEQUENCE to query the current status of a
> >>>>>>> session slot. A cached reply will return stale information to the
> >>>>>>> client, and could result in an infinite loop.
> >>>>>>
> >>>>>> The pynfs SEQ9f test is now failing with this change. This test:
> >>>>>>
> >>>>>> - Sends a CREATE_SESSION
> >>>>>> - Sends a solo SEQUENCE with sa_cachethis set
> >>>>>> - Sends the same operation without changing the slot sequence number
> >>>>>>
> >>>>>> The test expects the server's response to be NFS4_OK. NFSD now returns
> >>>>>> NFS4ERR_SEQ_FALSE_RETRY instead.
> >>>>>>
> >>>>>> It's possible the test is wrong, but how should it be fixed?
> >>>>>>
> >>>>>> Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
> >>>>>> COMPOUND containing a solo SEQUENCE?
> >>>>>>
> >>>>>> When reporting a retransmitted solo SEQUENCE, what is the correct status
> >>>>>> code?
> >>>>>
> >>>>> Interesting question....
> >>>>> To help with context: you wrote:
> >>>>>
> >>>>>    However, clients use solo SEQUENCE to query the current status of a
> >>>>>    session slot.  A cached reply will return stale information to the
> >>>>>    client, and could result in an infinite loop.
> >>>>>
> >>>>> Could you please expand on that?  What in the reply might be stale, and
> >>>>> how might that result in an infinite loop?
> >>>>>
> >>>>> Could a reply to a replayed singleton SEQUENCE simple always return the
> >>>>> current info, rather than cached info?
> >>>>
> >>>> If a cached reply is returned to the client, the slot sequence number
> >>>> doesn't change, and neither do the SEQ4_STATUS flags.
> >>>
> >>> Why is that a problem?  And importantly: how can it result in an
> >>> infinite loop?
> >>
> >> My remark was "could result in an infinite loop" -- subjunctive, meaning
> >> we haven't seen that behavior in this specific case, but there is a
> >> risk, if the client has a bug, for example. I can drop that remark, if
> >> it vexes.
> > 
> > I think dropping it would be best.
> > 
> >>
> >> The actual issue at hand is that the client can set the server up for
> >> a memory overwrite. If CREATE_SESSION does not request a large enough
> >> ca_maxresponsesize_cached, but the server expects to be able to cache
> >> SEQUENCE operations anyway, a solo SEQUENCE will cause the server to
> >> write into a cache that does not exist.
> > 
> > OK, this looks like an important issue.  Adding that to the commit
> > message would help.
> > 
> >>
> >> Either NFSD needs to unconditionally reserve the slot cache space for
> >> solo SEQUENCE, if it intends to unconditionally cache those; or it
> >> shouldn't cache them at all.
> >>
> >> The language of the spec suggests that the authors did not expect
> >> servers to cache solo SEQUENCE operations. It states that sa_cachethis
> >> refers to caching COMPOUND operations /after/ the SEQUENCE operation.
> >>
> >>
> >>>> The only real recovery in this case is to destroy the session, which
> >>>> will remove the cached reply.
> >>>
> >>> What "case" that needs to be recovered from?
> >>
> >> When the session slot has become unusable because server and client have
> >> different ideas of what the slot sequence number is.
> >>
> >> The client and server might stop using a slot in that situation.
> >> Destroying the session is the only way to get rid of the slot entirely.
> >>
> >> But IMHO this is a rat hole.
> > 
> > OK, I think I understand now.
> > 
> > According to RFC 5661
> > 
> >   The value of the sa_sequenceid argument relative to the cached
> >    sequence ID on the slot falls into one of three cases.
> > 
> > of which the second case is
> > 
> >    o  If sa_sequenceid and the cached sequence ID are the same, this is
> >       a retry, and the server replies with what is recorded in the reply
> >       cache.  The lease is possibly renewed as described below.
> > 
> > So the server *must* cache the reply for the most recent SEQUENCE in
> > each slot.
> > 
> > The language in that section seems to suggest the "cached sequence ID"
> > is stored in the slot's reply cache.
> > Our struct nfsd4_slot stores the cached sequence ID not as part of the
> > generic reply "sa_data[]" but as a dedicated field.  We could store the
> > rest of the SEQUENCE reply in dedicated fields and leave sa_data to
> > store the replies to ops 2 onwards.
> > 
> > The discussion of ca_maxresponsesize_cached always takes about "if
> > sa_cachethis is set" it seems to apply only to those ops where that
> > is relevant - it isn't relevant for SEQUENCE.
> > 
> > So I think I would be in favour of always having enough space to cache a
> > full SEQUENCE reply.  The decision of whether it goes in sa_data[] or in
> > dedicated fields depends on how clean the code looks.
> 
> IIUC the protocol requirements are:
> 
> + solo SEQUENCE that fails MUST NOT be cached. That is currently an NFSD
> bug, because it unconditionally caches solo SEQUENCE. I have a patch to
> be added to this series to address that.

Agreed.

> 
> + solo SEQUENCE when sa_cachethis is false does not need to be cached,
> but the protocol does not allow REP_UNCACHED_RETRY. This is the only
> reason to cache a solo SEQUENCE reply. Note that the cached response to
> a solo SEQUENCE can be constructed from existing fields in the slot. But
> I can't see such reconstruction being clean, and it is certainly a
> heroic case (ie, not commonly used, if ever).

I disagree.  The text I quoted strongly implies that any SEQUENCE must be
cached so that it can be replayed on a resend.  This is not presented as
optional.
   "the server replies with what is recorded in the reply cache".

The code currently always constructs a new reply using the fields of
'seq' - the request.
This is probably wrong.
When nfsd4_sequence() doesn't detect a resend, it updates 
   ->maxslots
   ->target_maxslots
   ->status_flags

When a replay is detected, those fields aren't updated so the client
just sees what it send.
The nfsd4_slot doesn't have explicit room to store these.  Maybe it should.

NeilBrown

> 
> nfsd4_alloc_slot() sets the slot replay cache size to /zero/ when the
> client's ca_maxresponsesize_cached is smaller than ... I think just the
> size of a solo SEQUENCE. I thought that was the bug, originally, but
> then Tom suggested there's truly no reason to cache a solo SEQUENCE.
> 
> So, we could go with:
> 
> 0. Fix nfsd_is_solo_sequence (that's 2/4 in this series)
> 1. Fix NFSD to not cache failed solo SEQUENCE
> 2. Increase the minimum size of the slot's replay cache to fit a
> solo SEQUENCE
> 
> 
> -- 
> Chuck Lever
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations
  2025-10-11  0:55                 ` NeilBrown
@ 2025-10-11 15:30                   ` Chuck Lever
  0 siblings, 0 replies; 18+ messages in thread
From: Chuck Lever @ 2025-10-11 15:30 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs

On 10/10/25 8:55 PM, NeilBrown wrote:
> On Sat, 11 Oct 2025, Chuck Lever wrote:
>> On 10/9/25 7:29 PM, NeilBrown wrote:
>>> On Thu, 09 Oct 2025, Chuck Lever wrote:
>>>> On 10/8/25 6:03 PM, NeilBrown wrote:
>>>>> On Thu, 09 Oct 2025, Chuck Lever wrote:
>>>>>> On 10/7/25 6:12 PM, NeilBrown wrote:
>>>>>>> On Wed, 08 Oct 2025, Chuck Lever wrote:
>>>>>>>> On 10/7/25 12:04 PM, Chuck Lever wrote:
>>>>>>>>>  RFC 8881 Section 2.10.6.1.3 says:
>>>>>>>>>
>>>>>>>>>> On a per-request basis, the requester can choose to direct the
>>>>>>>>>> replier to cache the reply to all operations after the first
>>>>>>>>>> operation (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or
>>>>>>>>>> csa_cachethis fields of the arguments to SEQUENCE or CB_SEQUENCE.
>>>>>>>>> RFC 8881 Section 2.10.6.4 further says:
>>>>>>>>>
>>>>>>>>>> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
>>>>>>>>>> cache a reply except if an error is returned by the SEQUENCE or
>>>>>>>>>> CB_SEQUENCE operation (see Section 2.10.6.1.2).
>>>>>>>>> This suggests to me that the spec authors do not expect an NFSv4.1
>>>>>>>>> server implementation to ever cache the result of a SEQUENCE
>>>>>>>>> operation (except perhaps as part of a successful multi-operation
>>>>>>>>> COMPOUND).
>>>>>>>>>
>>>>>>>>> NFSD attempts to cache the result of solo SEQUENCE operations,
>>>>>>>>> however. This is because the protocol does not permit servers to
>>>>>>>>> respond to a SEQUENCE with NFS4ERR_RETRY_UNCACHED_REP. If the server
>>>>>>>>> always caches solo SEQUENCE operations, then it never has occasion
>>>>>>>>> to return that status code.
>>>>>>>>>
>>>>>>>>> However, clients use solo SEQUENCE to query the current status of a
>>>>>>>>> session slot. A cached reply will return stale information to the
>>>>>>>>> client, and could result in an infinite loop.
>>>>>>>>
>>>>>>>> The pynfs SEQ9f test is now failing with this change. This test:
>>>>>>>>
>>>>>>>> - Sends a CREATE_SESSION
>>>>>>>> - Sends a solo SEQUENCE with sa_cachethis set
>>>>>>>> - Sends the same operation without changing the slot sequence number
>>>>>>>>
>>>>>>>> The test expects the server's response to be NFS4_OK. NFSD now returns
>>>>>>>> NFS4ERR_SEQ_FALSE_RETRY instead.
>>>>>>>>
>>>>>>>> It's possible the test is wrong, but how should it be fixed?
>>>>>>>>
>>>>>>>> Is it compliant for an NFSv4.1 server to ignore sa_cachethis for a
>>>>>>>> COMPOUND containing a solo SEQUENCE?
>>>>>>>>
>>>>>>>> When reporting a retransmitted solo SEQUENCE, what is the correct status
>>>>>>>> code?
>>>>>>>
>>>>>>> Interesting question....
>>>>>>> To help with context: you wrote:
>>>>>>>
>>>>>>>    However, clients use solo SEQUENCE to query the current status of a
>>>>>>>    session slot.  A cached reply will return stale information to the
>>>>>>>    client, and could result in an infinite loop.
>>>>>>>
>>>>>>> Could you please expand on that?  What in the reply might be stale, and
>>>>>>> how might that result in an infinite loop?
>>>>>>>
>>>>>>> Could a reply to a replayed singleton SEQUENCE simple always return the
>>>>>>> current info, rather than cached info?
>>>>>>
>>>>>> If a cached reply is returned to the client, the slot sequence number
>>>>>> doesn't change, and neither do the SEQ4_STATUS flags.
>>>>>
>>>>> Why is that a problem?  And importantly: how can it result in an
>>>>> infinite loop?
>>>>
>>>> My remark was "could result in an infinite loop" -- subjunctive, meaning
>>>> we haven't seen that behavior in this specific case, but there is a
>>>> risk, if the client has a bug, for example. I can drop that remark, if
>>>> it vexes.
>>>
>>> I think dropping it would be best.
>>>
>>>>
>>>> The actual issue at hand is that the client can set the server up for
>>>> a memory overwrite. If CREATE_SESSION does not request a large enough
>>>> ca_maxresponsesize_cached, but the server expects to be able to cache
>>>> SEQUENCE operations anyway, a solo SEQUENCE will cause the server to
>>>> write into a cache that does not exist.
>>>
>>> OK, this looks like an important issue.  Adding that to the commit
>>> message would help.
>>>
>>>>
>>>> Either NFSD needs to unconditionally reserve the slot cache space for
>>>> solo SEQUENCE, if it intends to unconditionally cache those; or it
>>>> shouldn't cache them at all.
>>>>
>>>> The language of the spec suggests that the authors did not expect
>>>> servers to cache solo SEQUENCE operations. It states that sa_cachethis
>>>> refers to caching COMPOUND operations /after/ the SEQUENCE operation.
>>>>
>>>>
>>>>>> The only real recovery in this case is to destroy the session, which
>>>>>> will remove the cached reply.
>>>>>
>>>>> What "case" that needs to be recovered from?
>>>>
>>>> When the session slot has become unusable because server and client have
>>>> different ideas of what the slot sequence number is.
>>>>
>>>> The client and server might stop using a slot in that situation.
>>>> Destroying the session is the only way to get rid of the slot entirely.
>>>>
>>>> But IMHO this is a rat hole.
>>>
>>> OK, I think I understand now.
>>>
>>> According to RFC 5661
>>>
>>>   The value of the sa_sequenceid argument relative to the cached
>>>    sequence ID on the slot falls into one of three cases.
>>>
>>> of which the second case is
>>>
>>>    o  If sa_sequenceid and the cached sequence ID are the same, this is
>>>       a retry, and the server replies with what is recorded in the reply
>>>       cache.  The lease is possibly renewed as described below.
>>>
>>> So the server *must* cache the reply for the most recent SEQUENCE in
>>> each slot.
>>>
>>> The language in that section seems to suggest the "cached sequence ID"
>>> is stored in the slot's reply cache.
>>> Our struct nfsd4_slot stores the cached sequence ID not as part of the
>>> generic reply "sa_data[]" but as a dedicated field.  We could store the
>>> rest of the SEQUENCE reply in dedicated fields and leave sa_data to
>>> store the replies to ops 2 onwards.
>>>
>>> The discussion of ca_maxresponsesize_cached always takes about "if
>>> sa_cachethis is set" it seems to apply only to those ops where that
>>> is relevant - it isn't relevant for SEQUENCE.
>>>
>>> So I think I would be in favour of always having enough space to cache a
>>> full SEQUENCE reply.  The decision of whether it goes in sa_data[] or in
>>> dedicated fields depends on how clean the code looks.
>>
>> IIUC the protocol requirements are:
>>
>> + solo SEQUENCE that fails MUST NOT be cached. That is currently an NFSD
>> bug, because it unconditionally caches solo SEQUENCE. I have a patch to
>> be added to this series to address that.
> 
> Agreed.
> 
>>
>> + solo SEQUENCE when sa_cachethis is false does not need to be cached,
>> but the protocol does not allow REP_UNCACHED_RETRY. This is the only
>> reason to cache a solo SEQUENCE reply. Note that the cached response to
>> a solo SEQUENCE can be constructed from existing fields in the slot. But
>> I can't see such reconstruction being clean, and it is certainly a
>> heroic case (ie, not commonly used, if ever).
> 
> I disagree.  The text I quoted strongly implies that any SEQUENCE must be
> cached so that it can be replayed on a resend. 

First, RFC 8881 is now authoritative, not RFC 5661, so let's stick with
the former.

The text you quoted refers to the sa_sequenceid field. AFAIUI, that is
a field that could be or should be saved whether sa_cachethis is set or
not.

My question regards only what NFSD saves in the slot's sl_data, the part
of the actual RPC reply message that is saved by
nfsd4_store_cache_entry() when sa_cachethis is set.


> This is not presented as
> optional.
>    "the server replies with what is recorded in the reply cache".

Be careful here. There is no usage of BCP14 terms here such as MUST or
REQUIRED. Therefore the text here is not a normative mandate of any
kind; rather this text is only descriptive. We don't have to speculate
about whether this is or isn't optional; this language is clearly not
normative.


> The code currently always constructs a new reply using the fields of
> 'seq' - the request.
> This is probably wrong.
> When nfsd4_sequence() doesn't detect a resend, it updates 
>    ->maxslots
>    ->target_maxslots
>    ->status_flags
> 
> When a replay is detected, those fields aren't updated so the client
> just sees what it send.
> The nfsd4_slot doesn't have explicit room to store these.  Maybe it should.

A few paragraphs earlier in Section 18.46.3, we have:

> The sa_sequenceid field is the sequence number of the request for the
> reply cache entry (slot). The sr_slotid result MUST equal sa_slotid.
> The sr_sequenceid result MUST equal sa_sequenceid.

Here we have BCP14 terms that are normative requirements that the server
copies the client's values to the reply. I haven't yet found text that
suggests this needs to be different when returning a cached reply.

I'm open to suggestions, though.


> NeilBrown
> 
>>
>> nfsd4_alloc_slot() sets the slot replay cache size to /zero/ when the
>> client's ca_maxresponsesize_cached is smaller than ... I think just the
>> size of a solo SEQUENCE. I thought that was the bug, originally, but
>> then Tom suggested there's truly no reason to cache a solo SEQUENCE.
>>
>> So, we could go with:
>>
>> 0. Fix nfsd_is_solo_sequence (that's 2/4 in this series)
>> 1. Fix NFSD to not cache failed solo SEQUENCE
>> 2. Increase the minimum size of the slot's replay cache to fit a
>> solo SEQUENCE
>>
>>
>> -- 
>> Chuck Lever
>>
> 


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-10-11 15:30 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-07 16:04 [PATCH v2 0/4] Fix unwanted memory overwrites Chuck Lever
2025-10-07 16:04 ` [PATCH v2 1/4] NFSD: Skip close replay processing if XDR encoding fails Chuck Lever
2025-10-07 16:04 ` [PATCH v2 2/4] NFSD: Fix the "is this a solo SEQUENCE" predicate Chuck Lever
2025-10-07 17:18   ` Jeff Layton
2025-10-07 16:04 ` [PATCH v2 3/4] NFSD: Do not cache solo SEQUENCE operations Chuck Lever
2025-10-07 17:19   ` Jeff Layton
2025-10-07 20:05   ` Chuck Lever
2025-10-07 22:12     ` NeilBrown
2025-10-08 13:04       ` Chuck Lever
2025-10-08 22:03         ` NeilBrown
2025-10-09 12:56           ` Chuck Lever
2025-10-09 23:29             ` NeilBrown
2025-10-10 13:03               ` Chuck Lever
2025-10-11  0:55                 ` NeilBrown
2025-10-11 15:30                   ` Chuck Lever
2025-10-07 22:21   ` Calum Mackay
2025-10-07 16:04 ` [PATCH v2 4/4] NFSD: Move nfsd4_cache_this() Chuck Lever
2025-10-07 17:20   ` Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).