* [PATCH v1 1/2] NFSD: Do not cache failed SEQUENCE operations
2025-10-06 18:45 [PATCH v1 0/2] Fix unwanted memory overwrites Chuck Lever
@ 2025-10-06 18:45 ` Chuck Lever
2025-10-06 18:45 ` [PATCH v1 2/2] NFSD: Skip close replay processing if XDR encoding fails Chuck Lever
2025-10-07 12:45 ` [PATCH v1 0/2] Fix unwanted memory overwrites Tom Talpey
2 siblings, 0 replies; 7+ messages in thread
From: Chuck Lever @ 2025-10-06 18:45 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever, rtm
From: Chuck Lever <chuck.lever@oracle.com>
RFC 8881 Section 2.10.6.4 says:
> If sa_cachethis or csa_cachethis is TRUE, then the replier MUST
> cache a reply except if an error is returned by the SEQUENCE or
> CB_SEQUENCE operation (see Section 2.10.6.1.2).
This text also appears in RFC 5661. Further, Section 2.10.6.1.2
says:
> Any time SEQUENCE or CB_SEQUENCE returns an error, the sequence ID
> of the slot MUST NOT change. The replier MUST NOT modify the reply
> cache entry for the slot whenever an error is returned from
> SEQUENCE or CB_SEQUENCE.
NFSD caches the result of solo SEQUENCE operations, but rtm's
reproducer sends two operations in the failing COMPOUND. NFSD should
not attempt to cache the reply.
The logic in nfsd4_is_solo_sequence() was incorrect: it checked the
current operation index, not the total count of operations in the
COMPOUND. If the SEQUENCE operation, which is always operation 1,
fails in a multi-operation compound, resp->opcnt is always 1. Thus
when a SEQUENCE operation fails, nfsd4_is_solo_sequence() always
returns true.
Reported-by: <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/linux-nfs/c3628d57-94ae-48cf-8c9e-49087a28cec9@oracle.com/T/#t
Fixes: 468de9e54a90 ("nfsd41: expand solo sequence check")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/xdr4.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index ee0570cbdd9e..d1837a10b0c2 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -926,7 +926,8 @@ struct nfsd4_compoundres {
static inline bool nfsd4_is_solo_sequence(struct nfsd4_compoundres *resp)
{
struct nfsd4_compoundargs *args = resp->rqstp->rq_argp;
- return resp->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
+
+ return args->opcnt == 1 && args->ops[0].opnum == OP_SEQUENCE;
}
/*
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH v1 2/2] NFSD: Skip close replay processing if XDR encoding fails
2025-10-06 18:45 [PATCH v1 0/2] Fix unwanted memory overwrites Chuck Lever
2025-10-06 18:45 ` [PATCH v1 1/2] NFSD: Do not cache failed SEQUENCE operations Chuck Lever
@ 2025-10-06 18:45 ` Chuck Lever
2025-10-07 14:18 ` Jeff Layton
2025-10-07 12:45 ` [PATCH v1 0/2] Fix unwanted memory overwrites Tom Talpey
2 siblings, 1 reply; 7+ messages in thread
From: Chuck Lever @ 2025-10-06 18:45 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever, rtm
From: Chuck Lever <chuck.lever@oracle.com>
The close replay logic added by commit 9411b1d4c7df ("nfsd4: cleanup
handling of nfsv4.0 closed stateid's") cannot be done if encoding
failed due to a short send buffer; there's no guarantee that the
operation encoder has actually encoded the data that is being copied
to the replay cache.
I think there are deeper problems here. Is stateid sequencing
screwed up if XDR encoding fails? Does NFSv4.1+ even need to care
about this?
Reported-by: <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/linux-nfs/c3628d57-94ae-48cf-8c9e-49087a28cec9@oracle.com/T/#t
Fixes: 9411b1d4c7df ("nfsd4: cleanup handling of nfsv4.0 closed stateid's")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfs4xdr.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 230bf53e39f7..85b773a65670 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -5937,8 +5937,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
*/
warn_on_nonidempotent_op(op);
xdr_truncate_encode(xdr, op_status_offset + XDR_UNIT);
- }
- if (so) {
+ } else if (so) {
int len = xdr->buf->len - (op_status_offset + XDR_UNIT);
so->so_replay.rp_status = op->status;
--
2.51.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH v1 2/2] NFSD: Skip close replay processing if XDR encoding fails
2025-10-06 18:45 ` [PATCH v1 2/2] NFSD: Skip close replay processing if XDR encoding fails Chuck Lever
@ 2025-10-07 14:18 ` Jeff Layton
2025-10-07 14:24 ` Chuck Lever
0 siblings, 1 reply; 7+ messages in thread
From: Jeff Layton @ 2025-10-07 14:18 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever, rtm
On Mon, 2025-10-06 at 14:45 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> The close replay logic added by commit 9411b1d4c7df ("nfsd4: cleanup
> handling of nfsv4.0 closed stateid's") cannot be done if encoding
> failed due to a short send buffer; there's no guarantee that the
> operation encoder has actually encoded the data that is being copied
> to the replay cache.
>
> I think there are deeper problems here. Is stateid sequencing
> screwed up if XDR encoding fails?
>
I would imagine so? We will have certainly morphed the stateids in the
server, but aren't sending the updated info in the reply, right? That
seems like we'll end up in state recovery.
> Does NFSv4.1+ even need to care
> about this?
>
The stateid confusion? Is there anything we can do about that for
either protocol?
> Reported-by: <rtm@csail.mit.edu>
> Closes: https://lore.kernel.org/linux-nfs/c3628d57-94ae-48cf-8c9e-49087a28cec9@oracle.com/T/#t
> Fixes: 9411b1d4c7df ("nfsd4: cleanup handling of nfsv4.0 closed stateid's")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfsd/nfs4xdr.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 230bf53e39f7..85b773a65670 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -5937,8 +5937,7 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
> */
> warn_on_nonidempotent_op(op);
> xdr_truncate_encode(xdr, op_status_offset + XDR_UNIT);
> - }
> - if (so) {
> + } else if (so) {
> int len = xdr->buf->len - (op_status_offset + XDR_UNIT);
>
> so->so_replay.rp_status = op->status;
Reviewed-by: Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH v1 2/2] NFSD: Skip close replay processing if XDR encoding fails
2025-10-07 14:18 ` Jeff Layton
@ 2025-10-07 14:24 ` Chuck Lever
0 siblings, 0 replies; 7+ messages in thread
From: Chuck Lever @ 2025-10-07 14:24 UTC (permalink / raw)
To: Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever, rtm
On 10/7/25 10:18 AM, Jeff Layton wrote:
> On Mon, 2025-10-06 at 14:45 -0400, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> The close replay logic added by commit 9411b1d4c7df ("nfsd4: cleanup
>> handling of nfsv4.0 closed stateid's") cannot be done if encoding
>> failed due to a short send buffer; there's no guarantee that the
>> operation encoder has actually encoded the data that is being copied
>> to the replay cache.
>>
>> I think there are deeper problems here. Is stateid sequencing
>> screwed up if XDR encoding fails?
>>
>
> I would imagine so? We will have certainly morphed the stateids in the
> server, but aren't sending the updated info in the reply, right? That
> seems like we'll end up in state recovery.
Fair enough. But it feels to me like a layering violation to manage
stateid accounting in the server's XDR layer.
>> Does NFSv4.1+ even need to care
>> about this?
>>
>
> The stateid confusion? Is there anything we can do about that for
> either protocol?
I'm wondering if the stateid accounting is also needed by NFSv4.1.
I saw that OPEN_CONFIRM was included in the set of operations that
participate in this bit of magic, and wondered if this is something
that only NFSv4.0 needs to do. But the magic is in the generic NFSv4
processing path, so I guess I should assume that yes, NFSv4.1 needs
this as well.
--
Chuck Lever
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v1 0/2] Fix unwanted memory overwrites
2025-10-06 18:45 [PATCH v1 0/2] Fix unwanted memory overwrites Chuck Lever
2025-10-06 18:45 ` [PATCH v1 1/2] NFSD: Do not cache failed SEQUENCE operations Chuck Lever
2025-10-06 18:45 ` [PATCH v1 2/2] NFSD: Skip close replay processing if XDR encoding fails Chuck Lever
@ 2025-10-07 12:45 ` Tom Talpey
2025-10-07 13:16 ` Chuck Lever
2 siblings, 1 reply; 7+ messages in thread
From: Tom Talpey @ 2025-10-07 12:45 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo
Cc: linux-nfs, Chuck Lever
On 10/6/2025 2:45 PM, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> <rtm@csail.mit.edu> reported some memory overwrites that can be
> triggered by NFS client input. I was able to observe overwrites
> by enabling KASAN and running his reproducer [1].
>
> NFSD caches COMPOUNDs containing only a single SEQUENCE operation
> whether the client requests it to or not, in order to work around a
> deficiency in the NFSv4.1 protocol. However, the predicate that
> identifies solo SEQUENCE operations was incorrect.
I'm not sure why a SEQUENCE should be ever be cached, apart from
recognizing it as one of the operations in a prior request. The
idea from a protocol perspective is that the sequence is just a
ind of clock that ticks once per time it's executed, and it only
ticks when the sender sends something new.
IOW from a responder (server) perspective, caching it seems wrong.
Can you elaborate on your interpretation of RFC8881?
Tom.
> (Based on my reading of RFC 8881, I'm not sure NFSD should cache
> solo SEQUENCE operations that fail, but that is perhaps for a
> different day).
>
> Chuck Lever (2):
> NFSD: Do not cache failed SEQUENCE operations
> NFSD: Skip close replay processing if XDR encoding fails
>
> fs/nfsd/nfs4xdr.c | 3 +--
> fs/nfsd/xdr4.h | 3 ++-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v1 0/2] Fix unwanted memory overwrites
2025-10-07 12:45 ` [PATCH v1 0/2] Fix unwanted memory overwrites Tom Talpey
@ 2025-10-07 13:16 ` Chuck Lever
0 siblings, 0 replies; 7+ messages in thread
From: Chuck Lever @ 2025-10-07 13:16 UTC (permalink / raw)
To: Tom Talpey, NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo
Cc: linux-nfs, Chuck Lever
On 10/7/25 8:45 AM, Tom Talpey wrote:
> On 10/6/2025 2:45 PM, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> <rtm@csail.mit.edu> reported some memory overwrites that can be
>> triggered by NFS client input. I was able to observe overwrites
>> by enabling KASAN and running his reproducer [1].
>>
>> NFSD caches COMPOUNDs containing only a single SEQUENCE operation
>> whether the client requests it to or not, in order to work around a
>> deficiency in the NFSv4.1 protocol. However, the predicate that
>> identifies solo SEQUENCE operations was incorrect.
>
> I'm not sure why a SEQUENCE should be ever be cached, apart from
> recognizing it as one of the operations in a prior request. The
> idea from a protocol perspective is that the sequence is just a
> ind of clock that ticks once per time it's executed, and it only
> ticks when the sender sends something new.
>
> IOW from a responder (server) perspective, caching it seems wrong.
> Can you elaborate on your interpretation of RFC8881?
I'm basing my remarks about the protocol on this old comment that is
near the code repaired by this series.
/*
* The session reply cache only needs to cache replies that the client
* actually asked us to. But it's almost free for us to cache compounds
* consisting of only a SEQUENCE op, so we may as well cache those too.
* Also, the protocol doesn't give us a convenient response in the case
* of a replay of a solo SEQUENCE op that wasn't cached
* (RETRY_UNCACHED_REP can only be returned in the second op of a
* compound).
*/
>> (Based on my reading of RFC 8881, I'm not sure NFSD should cache
>> solo SEQUENCE operations that fail, but that is perhaps for a
>> different day).
The 2009 commit that broke this removed the logic that prevented caching
failed solo SEQUENCE operations. It might be more sensible to simply
revert that commit entirely.
Or, further, if it seems utterly broken to cache even successful solo
SEQUENCE operations, then I would consider removing the solo SEQUENCE
caching logic.
RFC 8881 Section 2.10.6.1.3 opens with:
> On a per-request basis, the requester can choose to direct the replier
> to cache the reply to all operations after the first operation
> (SEQUENCE or CB_SEQUENCE) via the sa_cachethis or csa_cachethis fields
> of the arguments to SEQUENCE or CB_SEQUENCE.
That suggests the spec authors expected servers not to cache a
COMPOUND's SEQUENCE result at all.
--
Chuck Lever
^ permalink raw reply [flat|nested] 7+ messages in thread