* [RFC PATCH 0/3] Remove the max-ops-per-compound-limit
@ 2025-06-10 16:05 Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 1/3] NFSD: Rename a function parameter Chuck Lever
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Chuck Lever @ 2025-06-10 16:05 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
After some discussion last year, Jeff reasoned that there was no
architectural reason NFSD has to restrict the number of
operations per NFSv4 COMPOUND. This is an experimental series to
explore that idea.
Chuck Lever (3):
NFSD: Rename a function parameter
NFSD: Make nfsd_genl_rqstp::rq_ops array best-effort
NFSD: Remove the cap on number of operations per NFSv4 COMPOUND
fs/nfsd/nfs4proc.c | 14 ++------------
fs/nfsd/nfs4state.c | 1 -
fs/nfsd/nfs4xdr.c | 4 +---
fs/nfsd/nfsctl.c | 31 ++++++++++++++++---------------
fs/nfsd/nfsd.h | 5 +----
fs/nfsd/xdr4.h | 1 -
6 files changed, 20 insertions(+), 36 deletions(-)
--
2.49.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC PATCH 1/3] NFSD: Rename a function parameter
2025-06-10 16:05 [RFC PATCH 0/3] Remove the max-ops-per-compound-limit Chuck Lever
@ 2025-06-10 16:05 ` Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 2/3] NFSD: Make nfsd_genl_rqstp::rq_ops array best-effort Chuck Lever
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Chuck Lever @ 2025-06-10 16:05 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
Clean up: A function parameter called "rqstp" typically refers to an
object of type "struct svc_rqst", so it's confusing when such an
parameter refers to a different struct type with field names that are
very similar to svc_rqst.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfsctl.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 3f3e9f6c4250..b9b2189ce880 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1436,7 +1436,7 @@ unsigned int nfsd_net_id;
static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
struct netlink_callback *cb,
- struct nfsd_genl_rqstp *rqstp)
+ struct nfsd_genl_rqstp *genl_rqstp)
{
void *hdr;
u32 i;
@@ -1446,22 +1446,22 @@ static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
if (!hdr)
return -ENOBUFS;
- if (nla_put_be32(skb, NFSD_A_RPC_STATUS_XID, rqstp->rq_xid) ||
- nla_put_u32(skb, NFSD_A_RPC_STATUS_FLAGS, rqstp->rq_flags) ||
- nla_put_u32(skb, NFSD_A_RPC_STATUS_PROG, rqstp->rq_prog) ||
- nla_put_u32(skb, NFSD_A_RPC_STATUS_PROC, rqstp->rq_proc) ||
- nla_put_u8(skb, NFSD_A_RPC_STATUS_VERSION, rqstp->rq_vers) ||
+ if (nla_put_be32(skb, NFSD_A_RPC_STATUS_XID, genl_rqstp->rq_xid) ||
+ nla_put_u32(skb, NFSD_A_RPC_STATUS_FLAGS, genl_rqstp->rq_flags) ||
+ nla_put_u32(skb, NFSD_A_RPC_STATUS_PROG, genl_rqstp->rq_prog) ||
+ nla_put_u32(skb, NFSD_A_RPC_STATUS_PROC, genl_rqstp->rq_proc) ||
+ nla_put_u8(skb, NFSD_A_RPC_STATUS_VERSION, genl_rqstp->rq_vers) ||
nla_put_s64(skb, NFSD_A_RPC_STATUS_SERVICE_TIME,
- ktime_to_us(rqstp->rq_stime),
+ ktime_to_us(genl_rqstp->rq_stime),
NFSD_A_RPC_STATUS_PAD))
return -ENOBUFS;
- switch (rqstp->rq_saddr.sa_family) {
+ switch (genl_rqstp->rq_saddr.sa_family) {
case AF_INET: {
const struct sockaddr_in *s_in, *d_in;
- s_in = (const struct sockaddr_in *)&rqstp->rq_saddr;
- d_in = (const struct sockaddr_in *)&rqstp->rq_daddr;
+ s_in = (const struct sockaddr_in *)&genl_rqstp->rq_saddr;
+ d_in = (const struct sockaddr_in *)&genl_rqstp->rq_daddr;
if (nla_put_in_addr(skb, NFSD_A_RPC_STATUS_SADDR4,
s_in->sin_addr.s_addr) ||
nla_put_in_addr(skb, NFSD_A_RPC_STATUS_DADDR4,
@@ -1476,8 +1476,8 @@ static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
case AF_INET6: {
const struct sockaddr_in6 *s_in, *d_in;
- s_in = (const struct sockaddr_in6 *)&rqstp->rq_saddr;
- d_in = (const struct sockaddr_in6 *)&rqstp->rq_daddr;
+ s_in = (const struct sockaddr_in6 *)&genl_rqstp->rq_saddr;
+ d_in = (const struct sockaddr_in6 *)&genl_rqstp->rq_daddr;
if (nla_put_in6_addr(skb, NFSD_A_RPC_STATUS_SADDR6,
&s_in->sin6_addr) ||
nla_put_in6_addr(skb, NFSD_A_RPC_STATUS_DADDR6,
@@ -1491,9 +1491,9 @@ static int nfsd_genl_rpc_status_compose_msg(struct sk_buff *skb,
}
}
- for (i = 0; i < rqstp->rq_opcnt; i++)
+ for (i = 0; i < genl_rqstp->rq_opcnt; i++)
if (nla_put_u32(skb, NFSD_A_RPC_STATUS_COMPOUND_OPS,
- rqstp->rq_opnum[i]))
+ genl_rqstp->rq_opnum[i]))
return -ENOBUFS;
genlmsg_end(skb, hdr);
--
2.49.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [RFC PATCH 2/3] NFSD: Make nfsd_genl_rqstp::rq_ops array best-effort
2025-06-10 16:05 [RFC PATCH 0/3] Remove the max-ops-per-compound-limit Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 1/3] NFSD: Rename a function parameter Chuck Lever
@ 2025-06-10 16:05 ` Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 3/3] NFSD: Remove the cap on number of operations per NFSv4 COMPOUND Chuck Lever
2025-06-10 17:28 ` [RFC PATCH 0/3] Remove the max-ops-per-compound-limit Jeff Layton
3 siblings, 0 replies; 7+ messages in thread
From: Chuck Lever @ 2025-06-10 16:05 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
To enable NFSD to handle NFSv4 COMPOUNDs of unrestricted size,
resize the array in struct nfsd_genl_rqstp so it saves only up to
16 operations per COMPOUND.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfsctl.c | 3 ++-
fs/nfsd/nfsd.h | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index b9b2189ce880..1e0ebcc3216c 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1569,7 +1569,8 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
int j;
args = rqstp->rq_argp;
- genl_rqstp.rq_opcnt = args->opcnt;
+ genl_rqstp.rq_opcnt = min_t(u32, args->opcnt,
+ ARRAY_SIZE(genl_rqstp.rq_opnum));
for (j = 0; j < genl_rqstp.rq_opcnt; j++)
genl_rqstp.rq_opnum[j] =
args->ops[j].opnum;
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 1bfd0b4e9af7..570065285e67 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -72,7 +72,7 @@ struct nfsd_genl_rqstp {
/* NFSv4 compound */
u32 rq_opcnt;
- u32 rq_opnum[NFSD_MAX_OPS_PER_COMPOUND];
+ u32 rq_opnum[16];
};
extern struct svc_program nfsd_programs[];
--
2.49.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [RFC PATCH 3/3] NFSD: Remove the cap on number of operations per NFSv4 COMPOUND
2025-06-10 16:05 [RFC PATCH 0/3] Remove the max-ops-per-compound-limit Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 1/3] NFSD: Rename a function parameter Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 2/3] NFSD: Make nfsd_genl_rqstp::rq_ops array best-effort Chuck Lever
@ 2025-06-10 16:05 ` Chuck Lever
2025-06-10 17:01 ` Jeff Layton
2025-06-10 17:28 ` [RFC PATCH 0/3] Remove the max-ops-per-compound-limit Jeff Layton
3 siblings, 1 reply; 7+ messages in thread
From: Chuck Lever @ 2025-06-10 16:05 UTC (permalink / raw)
To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
This limit has always been a sanity check; in nearly all cases a
large COMPOUND is a sign of a malfunctioning client. The only real
limit on COMPOUND size and complexity is the size of NFSD's send
and receive buffers.
However, there are a few cases where a large COMPOUND is sane. For
example, when a client implementation wants to walk down a long file
pathname in a single round trip.
A small risk is that now a client can construct a COMPOUND request
that can keep a single nfsd thread busy for quite some time.
Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfs4proc.c | 14 ++------------
fs/nfsd/nfs4state.c | 1 -
fs/nfsd/nfs4xdr.c | 4 +---
fs/nfsd/nfsd.h | 3 ---
fs/nfsd/xdr4.h | 1 -
5 files changed, 3 insertions(+), 20 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index f13abbb13b38..f4edf222e00e 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2842,20 +2842,10 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
rqstp->rq_lease_breaker = (void **)&cstate->clp;
- trace_nfsd_compound(rqstp, args->tag, args->taglen, args->client_opcnt);
+ trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt);
while (!status && resp->opcnt < args->opcnt) {
op = &args->ops[resp->opcnt++];
- if (unlikely(resp->opcnt == NFSD_MAX_OPS_PER_COMPOUND)) {
- /* If there are still more operations to process,
- * stop here and report NFS4ERR_RESOURCE. */
- if (cstate->minorversion == 0 &&
- args->client_opcnt > resp->opcnt) {
- op->status = nfserr_resource;
- goto encode_op;
- }
- }
-
/*
* The XDR decode routines may have pre-set op->status;
* for example, if there is a miscellaneous XDR error
@@ -2932,7 +2922,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
status = op->status;
}
- trace_nfsd_compound_status(args->client_opcnt, resp->opcnt,
+ trace_nfsd_compound_status(args->opcnt, resp->opcnt,
status, nfsd4_op_name(op->opnum));
nfsd4_cstate_clear_replay(cstate);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index d5694987f86f..4b6ae8e54cd2 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -3872,7 +3872,6 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
ca->headerpadsz = 0;
ca->maxreq_sz = min_t(u32, ca->maxreq_sz, maxrpc);
ca->maxresp_sz = min_t(u32, ca->maxresp_sz, maxrpc);
- ca->maxops = min_t(u32, ca->maxops, NFSD_MAX_OPS_PER_COMPOUND);
ca->maxresp_cached = min_t(u32, ca->maxresp_cached,
NFSD_SLOT_CACHE_SIZE + NFSD_MIN_HDR_SEQ_SZ);
ca->maxreqs = min_t(u32, ca->maxreqs, NFSD_MAX_SLOTS_PER_SESSION);
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 3afcdbed6e14..ea91bad4eee2 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2500,10 +2500,8 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
if (xdr_stream_decode_u32(argp->xdr, &argp->minorversion) < 0)
return false;
- if (xdr_stream_decode_u32(argp->xdr, &argp->client_opcnt) < 0)
+ if (xdr_stream_decode_u32(argp->xdr, &argp->opcnt) < 0)
return false;
- argp->opcnt = min_t(u32, argp->client_opcnt,
- NFSD_MAX_OPS_PER_COMPOUND);
if (argp->opcnt > ARRAY_SIZE(argp->iops)) {
argp->ops = vcalloc(argp->opcnt, sizeof(*argp->ops));
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 570065285e67..54a96042f5ac 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -57,9 +57,6 @@ struct readdir_cd {
__be32 err; /* 0, nfserr, or nfserr_eof */
};
-/* Maximum number of operations per session compound */
-#define NFSD_MAX_OPS_PER_COMPOUND 50
-
struct nfsd_genl_rqstp {
struct sockaddr rq_daddr;
struct sockaddr rq_saddr;
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index aa2a356da784..a23bc56051ca 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -870,7 +870,6 @@ struct nfsd4_compoundargs {
char * tag;
u32 taglen;
u32 minorversion;
- u32 client_opcnt;
u32 opcnt;
bool splice_ok;
struct nfsd4_op *ops;
--
2.49.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH 3/3] NFSD: Remove the cap on number of operations per NFSv4 COMPOUND
2025-06-10 16:05 ` [RFC PATCH 3/3] NFSD: Remove the cap on number of operations per NFSv4 COMPOUND Chuck Lever
@ 2025-06-10 17:01 ` Jeff Layton
2025-06-10 17:07 ` Chuck Lever
0 siblings, 1 reply; 7+ messages in thread
From: Jeff Layton @ 2025-06-10 17:01 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
On Tue, 2025-06-10 at 12:05 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> This limit has always been a sanity check; in nearly all cases a
> large COMPOUND is a sign of a malfunctioning client. The only real
> limit on COMPOUND size and complexity is the size of NFSD's send
> and receive buffers.
>
> However, there are a few cases where a large COMPOUND is sane. For
> example, when a client implementation wants to walk down a long file
> pathname in a single round trip.
>
> A small risk is that now a client can construct a COMPOUND request
> that can keep a single nfsd thread busy for quite some time.
>
You're right about the risk there. I wonder what we could do to
mitigate that?
Maybe get a timestamp at the start of the compound and then check vs.
that after every operation? If the compound is taking longer than a
some timeout, give up and return an error on the next operation?
Also, while I did suggest it, we should consider not removing this
limit altogether, and rather just increase it to something like a max
practical limit:
For instance, we have limits in the channel_attrs for ca_maxrequestsize
and ca_maxresponsesize. What's the smallest operation? If we had a
compound comprised of just those operations, how many would fit?
That would at least act as a sanity check against compounds that are
clearly nonsensical.
> Suggested-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfsd/nfs4proc.c | 14 ++------------
> fs/nfsd/nfs4state.c | 1 -
> fs/nfsd/nfs4xdr.c | 4 +---
> fs/nfsd/nfsd.h | 3 ---
> fs/nfsd/xdr4.h | 1 -
> 5 files changed, 3 insertions(+), 20 deletions(-)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index f13abbb13b38..f4edf222e00e 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -2842,20 +2842,10 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
>
> rqstp->rq_lease_breaker = (void **)&cstate->clp;
>
> - trace_nfsd_compound(rqstp, args->tag, args->taglen, args->client_opcnt);
> + trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt);
> while (!status && resp->opcnt < args->opcnt) {
> op = &args->ops[resp->opcnt++];
>
> - if (unlikely(resp->opcnt == NFSD_MAX_OPS_PER_COMPOUND)) {
> - /* If there are still more operations to process,
> - * stop here and report NFS4ERR_RESOURCE. */
> - if (cstate->minorversion == 0 &&
> - args->client_opcnt > resp->opcnt) {
> - op->status = nfserr_resource;
> - goto encode_op;
> - }
> - }
> -
> /*
> * The XDR decode routines may have pre-set op->status;
> * for example, if there is a miscellaneous XDR error
> @@ -2932,7 +2922,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
> status = op->status;
> }
>
> - trace_nfsd_compound_status(args->client_opcnt, resp->opcnt,
> + trace_nfsd_compound_status(args->opcnt, resp->opcnt,
> status, nfsd4_op_name(op->opnum));
>
> nfsd4_cstate_clear_replay(cstate);
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index d5694987f86f..4b6ae8e54cd2 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3872,7 +3872,6 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
> ca->headerpadsz = 0;
> ca->maxreq_sz = min_t(u32, ca->maxreq_sz, maxrpc);
> ca->maxresp_sz = min_t(u32, ca->maxresp_sz, maxrpc);
> - ca->maxops = min_t(u32, ca->maxops, NFSD_MAX_OPS_PER_COMPOUND);
> ca->maxresp_cached = min_t(u32, ca->maxresp_cached,
> NFSD_SLOT_CACHE_SIZE + NFSD_MIN_HDR_SEQ_SZ);
> ca->maxreqs = min_t(u32, ca->maxreqs, NFSD_MAX_SLOTS_PER_SESSION);
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index 3afcdbed6e14..ea91bad4eee2 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -2500,10 +2500,8 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
>
> if (xdr_stream_decode_u32(argp->xdr, &argp->minorversion) < 0)
> return false;
> - if (xdr_stream_decode_u32(argp->xdr, &argp->client_opcnt) < 0)
> + if (xdr_stream_decode_u32(argp->xdr, &argp->opcnt) < 0)
> return false;
> - argp->opcnt = min_t(u32, argp->client_opcnt,
> - NFSD_MAX_OPS_PER_COMPOUND);
>
> if (argp->opcnt > ARRAY_SIZE(argp->iops)) {
> argp->ops = vcalloc(argp->opcnt, sizeof(*argp->ops));
> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> index 570065285e67..54a96042f5ac 100644
> --- a/fs/nfsd/nfsd.h
> +++ b/fs/nfsd/nfsd.h
> @@ -57,9 +57,6 @@ struct readdir_cd {
> __be32 err; /* 0, nfserr, or nfserr_eof */
> };
>
> -/* Maximum number of operations per session compound */
> -#define NFSD_MAX_OPS_PER_COMPOUND 50
> -
> struct nfsd_genl_rqstp {
> struct sockaddr rq_daddr;
> struct sockaddr rq_saddr;
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index aa2a356da784..a23bc56051ca 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -870,7 +870,6 @@ struct nfsd4_compoundargs {
> char * tag;
> u32 taglen;
> u32 minorversion;
> - u32 client_opcnt;
> u32 opcnt;
> bool splice_ok;
> struct nfsd4_op *ops;
--
Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH 3/3] NFSD: Remove the cap on number of operations per NFSv4 COMPOUND
2025-06-10 17:01 ` Jeff Layton
@ 2025-06-10 17:07 ` Chuck Lever
0 siblings, 0 replies; 7+ messages in thread
From: Chuck Lever @ 2025-06-10 17:07 UTC (permalink / raw)
To: Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
On 6/10/25 1:01 PM, Jeff Layton wrote:
> On Tue, 2025-06-10 at 12:05 -0400, Chuck Lever wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> This limit has always been a sanity check; in nearly all cases a
>> large COMPOUND is a sign of a malfunctioning client. The only real
>> limit on COMPOUND size and complexity is the size of NFSD's send
>> and receive buffers.
>>
>> However, there are a few cases where a large COMPOUND is sane. For
>> example, when a client implementation wants to walk down a long file
>> pathname in a single round trip.
>>
>> A small risk is that now a client can construct a COMPOUND request
>> that can keep a single nfsd thread busy for quite some time.
>>
>
> You're right about the risk there. I wonder what we could do to
> mitigate that?
>
> Maybe get a timestamp at the start of the compound and then check vs.
> that after every operation? If the compound is taking longer than a
> some timeout, give up and return an error on the next operation?
I'm open to thinking about additional guard rails.
The problem with a timeout is that any single operation can take a long
time -- if the underlying media is malfunctioning or if the remote NFS
server for a re-export is unreachable, for example.
> Also, while I did suggest it, we should consider not removing this
> limit altogether, and rather just increase it to something like a max
> practical limit:
>
> For instance, we have limits in the channel_attrs for ca_maxrequestsize
> and ca_maxresponsesize. What's the smallest operation? If we had a
> compound comprised of just those operations, how many would fit?
>
> That would at least act as a sanity check against compounds that are
> clearly nonsensical.
Relying on the size of the COMPOUND itself should be sufficient. If the
whole COMPOUND can't fit in ca_maxrequestsize, that's effectively the
same thing as limiting the number of ops based on the maxrequestsize
value.
>> Suggested-by: Jeff Layton <jlayton@kernel.org>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> fs/nfsd/nfs4proc.c | 14 ++------------
>> fs/nfsd/nfs4state.c | 1 -
>> fs/nfsd/nfs4xdr.c | 4 +---
>> fs/nfsd/nfsd.h | 3 ---
>> fs/nfsd/xdr4.h | 1 -
>> 5 files changed, 3 insertions(+), 20 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>> index f13abbb13b38..f4edf222e00e 100644
>> --- a/fs/nfsd/nfs4proc.c
>> +++ b/fs/nfsd/nfs4proc.c
>> @@ -2842,20 +2842,10 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
>>
>> rqstp->rq_lease_breaker = (void **)&cstate->clp;
>>
>> - trace_nfsd_compound(rqstp, args->tag, args->taglen, args->client_opcnt);
>> + trace_nfsd_compound(rqstp, args->tag, args->taglen, args->opcnt);
>> while (!status && resp->opcnt < args->opcnt) {
>> op = &args->ops[resp->opcnt++];
>>
>> - if (unlikely(resp->opcnt == NFSD_MAX_OPS_PER_COMPOUND)) {
>> - /* If there are still more operations to process,
>> - * stop here and report NFS4ERR_RESOURCE. */
>> - if (cstate->minorversion == 0 &&
>> - args->client_opcnt > resp->opcnt) {
>> - op->status = nfserr_resource;
>> - goto encode_op;
>> - }
>> - }
>> -
>> /*
>> * The XDR decode routines may have pre-set op->status;
>> * for example, if there is a miscellaneous XDR error
>> @@ -2932,7 +2922,7 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
>> status = op->status;
>> }
>>
>> - trace_nfsd_compound_status(args->client_opcnt, resp->opcnt,
>> + trace_nfsd_compound_status(args->opcnt, resp->opcnt,
>> status, nfsd4_op_name(op->opnum));
>>
>> nfsd4_cstate_clear_replay(cstate);
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index d5694987f86f..4b6ae8e54cd2 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -3872,7 +3872,6 @@ static __be32 check_forechannel_attrs(struct nfsd4_channel_attrs *ca, struct nfs
>> ca->headerpadsz = 0;
>> ca->maxreq_sz = min_t(u32, ca->maxreq_sz, maxrpc);
>> ca->maxresp_sz = min_t(u32, ca->maxresp_sz, maxrpc);
>> - ca->maxops = min_t(u32, ca->maxops, NFSD_MAX_OPS_PER_COMPOUND);
>> ca->maxresp_cached = min_t(u32, ca->maxresp_cached,
>> NFSD_SLOT_CACHE_SIZE + NFSD_MIN_HDR_SEQ_SZ);
>> ca->maxreqs = min_t(u32, ca->maxreqs, NFSD_MAX_SLOTS_PER_SESSION);
>> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
>> index 3afcdbed6e14..ea91bad4eee2 100644
>> --- a/fs/nfsd/nfs4xdr.c
>> +++ b/fs/nfsd/nfs4xdr.c
>> @@ -2500,10 +2500,8 @@ nfsd4_decode_compound(struct nfsd4_compoundargs *argp)
>>
>> if (xdr_stream_decode_u32(argp->xdr, &argp->minorversion) < 0)
>> return false;
>> - if (xdr_stream_decode_u32(argp->xdr, &argp->client_opcnt) < 0)
>> + if (xdr_stream_decode_u32(argp->xdr, &argp->opcnt) < 0)
>> return false;
>> - argp->opcnt = min_t(u32, argp->client_opcnt,
>> - NFSD_MAX_OPS_PER_COMPOUND);
>>
>> if (argp->opcnt > ARRAY_SIZE(argp->iops)) {
>> argp->ops = vcalloc(argp->opcnt, sizeof(*argp->ops));
>> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
>> index 570065285e67..54a96042f5ac 100644
>> --- a/fs/nfsd/nfsd.h
>> +++ b/fs/nfsd/nfsd.h
>> @@ -57,9 +57,6 @@ struct readdir_cd {
>> __be32 err; /* 0, nfserr, or nfserr_eof */
>> };
>>
>> -/* Maximum number of operations per session compound */
>> -#define NFSD_MAX_OPS_PER_COMPOUND 50
>> -
>> struct nfsd_genl_rqstp {
>> struct sockaddr rq_daddr;
>> struct sockaddr rq_saddr;
>> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
>> index aa2a356da784..a23bc56051ca 100644
>> --- a/fs/nfsd/xdr4.h
>> +++ b/fs/nfsd/xdr4.h
>> @@ -870,7 +870,6 @@ struct nfsd4_compoundargs {
>> char * tag;
>> u32 taglen;
>> u32 minorversion;
>> - u32 client_opcnt;
>> u32 opcnt;
>> bool splice_ok;
>> struct nfsd4_op *ops;
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH 0/3] Remove the max-ops-per-compound-limit
2025-06-10 16:05 [RFC PATCH 0/3] Remove the max-ops-per-compound-limit Chuck Lever
` (2 preceding siblings ...)
2025-06-10 16:05 ` [RFC PATCH 3/3] NFSD: Remove the cap on number of operations per NFSv4 COMPOUND Chuck Lever
@ 2025-06-10 17:28 ` Jeff Layton
3 siblings, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2025-06-10 17:28 UTC (permalink / raw)
To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
On Tue, 2025-06-10 at 12:05 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> After some discussion last year, Jeff reasoned that there was no
> architectural reason NFSD has to restrict the number of
> operations per NFSv4 COMPOUND. This is an experimental series to
> explore that idea.
>
> Chuck Lever (3):
> NFSD: Rename a function parameter
> NFSD: Make nfsd_genl_rqstp::rq_ops array best-effort
> NFSD: Remove the cap on number of operations per NFSv4 COMPOUND
>
> fs/nfsd/nfs4proc.c | 14 ++------------
> fs/nfsd/nfs4state.c | 1 -
> fs/nfsd/nfs4xdr.c | 4 +---
> fs/nfsd/nfsctl.c | 31 ++++++++++++++++---------------
> fs/nfsd/nfsd.h | 5 +----
> fs/nfsd/xdr4.h | 1 -
> 6 files changed, 20 insertions(+), 36 deletions(-)
Reviewed-by: Jeff Layton <jlayton@kernel.org>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-06-10 17:28 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-10 16:05 [RFC PATCH 0/3] Remove the max-ops-per-compound-limit Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 1/3] NFSD: Rename a function parameter Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 2/3] NFSD: Make nfsd_genl_rqstp::rq_ops array best-effort Chuck Lever
2025-06-10 16:05 ` [RFC PATCH 3/3] NFSD: Remove the cap on number of operations per NFSv4 COMPOUND Chuck Lever
2025-06-10 17:01 ` Jeff Layton
2025-06-10 17:07 ` Chuck Lever
2025-06-10 17:28 ` [RFC PATCH 0/3] Remove the max-ops-per-compound-limit Jeff Layton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox