From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever <cel@kernel.org>
Cc: NeilBrown <neilb@ownmail.net>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>,
linux-nfs@vger.kernel.org, Chuck Lever <chuck.lever@oracle.com>
Subject: Re: [PATCH] nfsd: use dynamic allocation for oversized NFSv4.0 replay cache
Date: Tue, 24 Feb 2026 14:39:50 -0500 [thread overview]
Message-ID: <887c1ca78b34974160dee3ce7f25d6d077da93ab.camel@kernel.org> (raw)
In-Reply-To: <20260224193354.57849-1-cel@kernel.org>
On Tue, 2026-02-24 at 14:33 -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Commit 1e8e9913672a ("nfsd: fix heap overflow in NFSv4.0 LOCK
> replay cache") capped the replay cache copy at NFSD4_REPLAY_ISIZE
> to prevent a heap overflow, but set rp_buflen to zero when the
> encoded response exceeded the inline buffer. A retransmitted LOCK
> reaching the replay path then produced only a status code with no
> operation body, resulting in a malformed XDR response.
>
> When the encoded response exceeds the 112-byte inline rp_ibuf, a
> buffer is kmalloc'd to hold it. If the allocation fails, rp_buflen
> remains zero, preserving the behavior from the capped-copy fix.
> The buffer is freed when the stateowner is released or when a
> subsequent operation's response fits in the inline buffer.
>
> Fixes: 1e8e9913672a ("nfsd: fix heap overflow in NFSv4.0 LOCK replay cache")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfsd/nfs4state.c | 16 ++++++++++++++++
> fs/nfsd/nfs4xdr.c | 23 ++++++++++++++++-------
> fs/nfsd/state.h | 12 +++++++-----
> 3 files changed, 39 insertions(+), 12 deletions(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index ba49f49bb93b..b4d0e82b2690 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1496,8 +1496,24 @@ release_all_access(struct nfs4_ol_stateid *stp)
> }
> }
>
> +/**
> + * nfs4_replay_free_cache - release dynamically allocated replay buffer
> + * @rp: replay cache to reset
> + *
> + * If @rp->rp_buf points to a kmalloc'd buffer, free it and reset
> + * rp_buf to the inline rp_ibuf. Always zeroes rp_buflen.
> + */
> +void nfs4_replay_free_cache(struct nfs4_replay *rp)
> +{
> + if (rp->rp_buf != rp->rp_ibuf)
> + kfree(rp->rp_buf);
> + rp->rp_buf = rp->rp_ibuf;
> + rp->rp_buflen = 0;
> +}
> +
> static inline void nfs4_free_stateowner(struct nfs4_stateowner *sop)
> {
> + nfs4_replay_free_cache(&sop->so_replay);
> kfree(sop->so_owner.data);
> sop->so_ops->so_free(sop);
> }
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 690f7a3122ec..2a0946c630e1 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -6282,14 +6282,23 @@ nfsd4_encode_operation(struct nfsd4_compoundres *resp, struct nfsd4_op *op)
> int len = xdr->buf->len - (op_status_offset + XDR_UNIT);
>
> so->so_replay.rp_status = op->status;
> - if (len <= NFSD4_REPLAY_ISIZE) {
> - so->so_replay.rp_buflen = len;
> - read_bytes_from_xdr_buf(xdr->buf,
> - op_status_offset + XDR_UNIT,
> - so->so_replay.rp_buf, len);
> - } else {
> - so->so_replay.rp_buflen = 0;
> + if (len > NFSD4_REPLAY_ISIZE) {
> + char *buf = kmalloc(len, GFP_KERNEL);
> +
> + nfs4_replay_free_cache(&so->so_replay);
> + if (buf) {
> + so->so_replay.rp_buf = buf;
> + } else {
> + /* rp_buflen already zeroed; skip caching */
> + goto status;
> + }
> + } else if (so->so_replay.rp_buf != so->so_replay.rp_ibuf) {
> + nfs4_replay_free_cache(&so->so_replay);
> }
> + so->so_replay.rp_buflen = len;
> + read_bytes_from_xdr_buf(xdr->buf,
> + op_status_offset + XDR_UNIT,
> + so->so_replay.rp_buf, len);
> }
> status:
> op->status = nfsd4_map_status(op->status,
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 3159c7b67f50..9b05462da4cc 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -554,10 +554,10 @@ struct nfs4_client_reclaim {
> * ~32(deleg. ace) = 112 bytes
> *
> * Some responses can exceed this. A LOCK denial includes the conflicting
> - * lock owner, which can be up to 1024 bytes (NFS4_OPAQUE_LIMIT). Responses
> - * larger than REPLAY_ISIZE are not cached in rp_ibuf; only rp_status is
> - * saved. Enlarging this constant increases the size of every
> - * nfs4_stateowner.
> + * lock owner, which can be up to 1024 bytes (NFS4_OPAQUE_LIMIT). When a
> + * response exceeds REPLAY_ISIZE, a buffer is dynamically allocated. If
> + * that allocation fails, only rp_status is saved. Enlarging this constant
> + * increases the size of every nfs4_stateowner.
> */
>
> #define NFSD4_REPLAY_ISIZE 112
> @@ -569,12 +569,14 @@ struct nfs4_client_reclaim {
> struct nfs4_replay {
> __be32 rp_status;
> unsigned int rp_buflen;
> - char *rp_buf;
> + char *rp_buf; /* rp_ibuf or kmalloc'd */
> struct knfsd_fh rp_openfh;
> int rp_locked;
> char rp_ibuf[NFSD4_REPLAY_ISIZE];
> };
>
> +extern void nfs4_replay_free_cache(struct nfs4_replay *rp);
> +
> struct nfs4_stateowner;
>
> struct nfs4_stateowner_operations {
Certainly a reasonable approach if we care about full correctness when
dealing with a large lockowner on NFSv4.0. Do we?
--
Jeff Layton <jlayton@kernel.org>
next prev parent reply other threads:[~2026-02-24 19:39 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-24 19:33 [PATCH] nfsd: use dynamic allocation for oversized NFSv4.0 replay cache Chuck Lever
2026-02-24 19:39 ` Jeff Layton [this message]
2026-02-24 19:42 ` Chuck Lever
2026-02-24 19:51 ` Jeff Layton
2026-02-24 19:53 ` Chuck Lever
2026-02-24 19:59 ` Jeff Layton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=887c1ca78b34974160dee3ce7f25d6d077da93ab.camel@kernel.org \
--to=jlayton@kernel.org \
--cc=cel@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=dai.ngo@oracle.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@ownmail.net \
--cc=okorniev@redhat.com \
--cc=tom@talpey.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox