Linux NFS development
* [RFC PATCH 0/2] Move rq_vec[] and rq_bvec[] out of svc_rqst
@ 2025-04-16 15:28 cel
  2025-04-16 15:28 ` [RFC PATCH 1/2] sunrpc: Replace the rq_bvec array with dynamically-allocated memory cel
  2025-04-16 15:28 ` [RFC PATCH 2/2] sunrpc: Replace the rq_vec " cel
  0 siblings, 2 replies; 6+ messages in thread
From: cel @ 2025-04-16 15:28 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

In order to make RPCSVC_MAXPAYLOAD larger (or variable in size), we
need to do something clever with the payload arrays embedded in
struct svc_rqst. Here's one way of dealing with two of them.

My preference is to keep these arrays allocated all the time because
allocating them on demand increases the risk of a memory allocation
failure during a large I/O. This is a quick-and-dirty approach that
might be replaced once NFSD is converted to use large folios.

The downside of this design choice is that it pins a few pages per
NFSD thread (and that's the current situation already). But note
that because RPCSVC_MAXPAGES is 259, each array is just over a page
in size, making the allocation waste quite a bit of memory beyond
the end of the array due to power-of-2 allocator round-up. This gets
worse as the MAXPAGES value is doubled or quadrupled.
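
To put rough numbers on the round-up described above, here is a minimal
user-space sketch. It assumes 16-byte array entries (4144 / 259, per the
pahole figures in the patches), 4KB pages, and an entry count of "payload
pages plus three", a formula chosen here only because it reproduces the
259 quoted above. All demo_* names are hypothetical and none of this is
kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE	4096u	/* assumption: 4KB pages */
#define DEMO_ENTRY_SIZE	16u	/* assumption: 4144 bytes / 259 entries */

/* Smallest power of two that is >= n, mimicking a power-of-2 size class. */
size_t demo_roundup_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/* Assumed entry count: pages covering the payload, plus three extras. */
size_t demo_maxpages(size_t payload)
{
	return (payload + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE + 3;
}

/* Bytes lost past the end of one array after power-of-2 round-up. */
size_t demo_waste(size_t payload)
{
	size_t bytes = demo_maxpages(payload) * DEMO_ENTRY_SIZE;

	return demo_roundup_pow2(bytes) - bytes;
}
```

Under those assumptions a 1MB payload needs 4144 bytes per array and loses
4048 bytes to round-up; a 2MB payload loses 8144 bytes and a 4MB payload
loses 16336 bytes, per array and per thread.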

I plan to look at rq_pages[] next.

Chuck Lever (2):
  sunrpc: Replace the rq_bvec array with dynamically-allocated memory
  sunrpc: Replace the rq_vec array with dynamically-allocated memory

 fs/nfsd/nfs4proc.c         |  2 +-
 fs/nfsd/vfs.c              |  2 +-
 include/linux/sunrpc/svc.h |  4 ++--
 net/sunrpc/svc.c           | 14 +++++++++++++-
 net/sunrpc/svcsock.c       |  7 +++----
 5 files changed, 20 insertions(+), 9 deletions(-)

-- 
2.49.0



* [RFC PATCH 1/2] sunrpc: Replace the rq_bvec array with dynamically-allocated memory
  2025-04-16 15:28 [RFC PATCH 0/2] Move rq_vec[] and rq_bvec[] out of svc_rqst cel
@ 2025-04-16 15:28 ` cel
  2025-04-16 18:42   ` Jeff Layton
  2025-04-16 15:28 ` [RFC PATCH 2/2] sunrpc: Replace the rq_vec " cel
  1 sibling, 1 reply; 6+ messages in thread
From: cel @ 2025-04-16 15:28 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

As a step towards making NFSD's maximum rsize and wsize variable,
replace the fixed-size rq_bvec[] array in struct svc_rqst with a
chunk of dynamically-allocated memory.

On a system with 8-byte pointers and 4KB pages, pahole reports that
the rq_bvec[] array is 4144 bytes. Replacing it with a single
pointer reduces the size of struct svc_rqst to about 7500 bytes.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc.h | 2 +-
 net/sunrpc/svc.c           | 6 ++++++
 net/sunrpc/svcsock.c       | 7 +++----
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 74658cca0f38..225c385085c3 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -195,7 +195,7 @@ struct svc_rqst {
 
 	struct folio_batch	rq_fbatch;
 	struct kvec		rq_vec[RPCSVC_MAXPAGES]; /* generally useful.. */
-	struct bio_vec		rq_bvec[RPCSVC_MAXPAGES];
+	struct bio_vec		*rq_bvec;
 
 	__be32			rq_xid;		/* transmission id */
 	u32			rq_prog;	/* program number */
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index e7f9c295d13c..db29819716b8 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -673,6 +673,7 @@ static void
 svc_rqst_free(struct svc_rqst *rqstp)
 {
 	folio_batch_release(&rqstp->rq_fbatch);
+	kfree(rqstp->rq_bvec);
 	svc_release_buffer(rqstp);
 	if (rqstp->rq_scratch_page)
 		put_page(rqstp->rq_scratch_page);
@@ -711,6 +712,11 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 	if (!svc_init_buffer(rqstp, serv->sv_max_mesg, node))
 		goto out_enomem;
 
+	rqstp->rq_bvec = kcalloc_node(RPCSVC_MAXPAGES, sizeof(struct bio_vec),
+				      GFP_KERNEL, node);
+	if (!rqstp->rq_bvec)
+		goto out_enomem;
+
 	rqstp->rq_err = -EAGAIN; /* No error yet */
 
 	serv->sv_nrthreads += 1;
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 72e5a01df3d3..671640933f18 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -713,8 +713,7 @@ static int svc_udp_sendto(struct svc_rqst *rqstp)
 	if (svc_xprt_is_dead(xprt))
 		goto out_notconn;
 
-	count = xdr_buf_to_bvec(rqstp->rq_bvec,
-				ARRAY_SIZE(rqstp->rq_bvec), xdr);
+	count = xdr_buf_to_bvec(rqstp->rq_bvec, RPCSVC_MAXPAGES, xdr);
 
 	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
 		      count, rqstp->rq_res.len);
@@ -1219,8 +1218,8 @@ static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp,
 	memcpy(buf, &marker, sizeof(marker));
 	bvec_set_virt(rqstp->rq_bvec, buf, sizeof(marker));
 
-	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1,
-				ARRAY_SIZE(rqstp->rq_bvec) - 1, &rqstp->rq_res);
+	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1, RPCSVC_MAXPAGES - 1,
+				&rqstp->rq_res);
 
 	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
 		      1 + count, sizeof(marker) + rqstp->rq_res.len);
-- 
2.49.0
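
A note on the svc_tcp_sendmsg() hunk: entry 0 of rq_bvec holds the record
marker and the payload is filled starting at rq_bvec + 1, so the capacity
available to the payload is one entry less than the array length. The
sketch below models only that bookkeeping in user space. demo_fill() is a
hypothetical stand-in for a "fill at most max entries" routine, not
xdr_buf_to_bvec() itself, and DEMO_NENTS merely stands in for
RPCSVC_MAXPAGES.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NENTS 259	/* stands in for RPCSVC_MAXPAGES */

struct demo_bvec {
	const void *base;
	size_t len;
};

/* Hypothetical filler: writes min(want, max) entries and returns that count. */
size_t demo_fill(struct demo_bvec *vec, size_t max, size_t want)
{
	size_t i, n = want < max ? want : max;

	for (i = 0; i < n; i++) {
		vec[i].base = NULL;
		vec[i].len = 0;
	}
	return n;
}

/*
 * Slot 0 carries the 4-byte record marker; the payload gets the
 * remaining nents - 1 slots. Returns the total number of entries used.
 */
size_t demo_tcp_fill(struct demo_bvec *vec, size_t nents, size_t want)
{
	static const unsigned char marker[4];

	assert(nents >= 1);
	vec[0].base = marker;
	vec[0].len = sizeof(marker);
	return 1 + demo_fill(vec + 1, nents - 1, want);
}
```

Passing nents rather than nents - 1 to the filler would let a maximal
payload write one entry past the end of the array.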



* [RFC PATCH 2/2] sunrpc: Replace the rq_vec array with dynamically-allocated memory
  2025-04-16 15:28 [RFC PATCH 0/2] Move rq_vec[] and rq_bvec[] out of svc_rqst cel
  2025-04-16 15:28 ` [RFC PATCH 1/2] sunrpc: Replace the rq_bvec array with dynamically-allocated memory cel
@ 2025-04-16 15:28 ` cel
  1 sibling, 0 replies; 6+ messages in thread
From: cel @ 2025-04-16 15:28 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

As a step towards making NFSD's maximum rsize and wsize variable,
replace the fixed-size rq_vec[] array in struct svc_rqst with a
chunk of dynamically-allocated memory.

On a system with 8-byte pointers and 4KB pages, pahole reports that
the rq_vec[] array is 4144 bytes. Replacing it with a single
pointer reduces the size of struct svc_rqst to about 3300 bytes.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4proc.c         | 2 +-
 fs/nfsd/vfs.c              | 2 +-
 include/linux/sunrpc/svc.h | 2 +-
 net/sunrpc/svc.c           | 8 +++++++-
 4 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index b397246dae7b..79ee58202396 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1228,7 +1228,7 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	write->wr_how_written = write->wr_stable_how;
 
 	nvecs = svc_fill_write_vector(rqstp, &write->wr_payload);
-	WARN_ON_ONCE(nvecs > ARRAY_SIZE(rqstp->rq_vec));
+	WARN_ON_ONCE(nvecs > RPCSVC_MAXPAGES);
 
 	status = nfsd_vfs_write(rqstp, &cstate->current_fh, nf,
 				write->wr_offset, rqstp->rq_vec, nvecs, &cnt,
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 9abdc4b75813..ae0901d6db1a 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1094,7 +1094,7 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		++v;
 		base = 0;
 	}
-	WARN_ON_ONCE(v > ARRAY_SIZE(rqstp->rq_vec));
+	WARN_ON_ONCE(v > RPCSVC_MAXPAGES);
 
 	trace_nfsd_read_vector(rqstp, fhp, offset, *count);
 	iov_iter_kvec(&iter, ITER_DEST, rqstp->rq_vec, v, *count);
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 225c385085c3..13b6d0753bc0 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -194,7 +194,7 @@ struct svc_rqst {
 	struct page *		*rq_page_end;  /* one past the last page */
 
 	struct folio_batch	rq_fbatch;
-	struct kvec		rq_vec[RPCSVC_MAXPAGES]; /* generally useful.. */
+	struct kvec		*rq_vec;
 	struct bio_vec		*rq_bvec;
 
 	__be32			rq_xid;		/* transmission id */
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index db29819716b8..8d28aeb74e1b 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -674,6 +674,7 @@ svc_rqst_free(struct svc_rqst *rqstp)
 {
 	folio_batch_release(&rqstp->rq_fbatch);
 	kfree(rqstp->rq_bvec);
+	kfree(rqstp->rq_vec);
 	svc_release_buffer(rqstp);
 	if (rqstp->rq_scratch_page)
 		put_page(rqstp->rq_scratch_page);
@@ -712,6 +713,11 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 	if (!svc_init_buffer(rqstp, serv->sv_max_mesg, node))
 		goto out_enomem;
 
+	rqstp->rq_vec = kcalloc_node(RPCSVC_MAXPAGES, sizeof(struct kvec),
+				     GFP_KERNEL, node);
+	if (!rqstp->rq_vec)
+		goto out_enomem;
+
 	rqstp->rq_bvec = kcalloc_node(RPCSVC_MAXPAGES, sizeof(struct bio_vec),
 				      GFP_KERNEL, node);
 	if (!rqstp->rq_bvec)
@@ -1754,7 +1760,7 @@ unsigned int svc_fill_write_vector(struct svc_rqst *rqstp,
 		++pages;
 	}
 
-	WARN_ON_ONCE(i > ARRAY_SIZE(rqstp->rq_vec));
+	WARN_ON_ONCE(i > RPCSVC_MAXPAGES);
 	return i;
 }
 EXPORT_SYMBOL_GPL(svc_fill_write_vector);
-- 
2.49.0



* Re: [RFC PATCH 1/2] sunrpc: Replace the rq_bvec array with dynamically-allocated memory
  2025-04-16 15:28 ` [RFC PATCH 1/2] sunrpc: Replace the rq_bvec array with dynamically-allocated memory cel
@ 2025-04-16 18:42   ` Jeff Layton
  2025-04-16 18:45     ` Chuck Lever
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff Layton @ 2025-04-16 18:42 UTC (permalink / raw)
  To: cel, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

On Wed, 2025-04-16 at 11:28 -0400, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> As a step towards making NFSD's maximum rsize and wsize variable,
> replace the fixed-size rq_bvec[] array in struct svc_rqst with a
> chunk of dynamically-allocated memory.
> 
> On a system with 8-byte pointers and 4KB pages, pahole reports that
> the rq_bvec[] array is 4144 bytes. Replacing it with a single
> pointer reduces the size of struct svc_rqst to about 7500 bytes.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  include/linux/sunrpc/svc.h | 2 +-
>  net/sunrpc/svc.c           | 6 ++++++
>  net/sunrpc/svcsock.c       | 7 +++----
>  3 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 74658cca0f38..225c385085c3 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -195,7 +195,7 @@ struct svc_rqst {
>  
>  	struct folio_batch	rq_fbatch;
>  	struct kvec		rq_vec[RPCSVC_MAXPAGES]; /* generally useful.. */
> -	struct bio_vec		rq_bvec[RPCSVC_MAXPAGES];
> +	struct bio_vec		*rq_bvec;

It's a reasonable start.

What would also be good to do here is to replace the invocations of
RPCSVC_MAXPAGES that involve this array with a helper function that
returns the length of it.

For now it could just return RPCSVC_MAXPAGES, but eventually you could
add (e.g.) a rqstp->rq_bvec_len field and use that to indicate how many
entries there are in rq_bvec.
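
A minimal user-space sketch of that idea follows. The helper name
demo_rq_bvec_len() and the reduced struct demo_rqst are assumptions made
for illustration; the value 259 is the RPCSVC_MAXPAGES figure from the
cover letter.

```c
#include <assert.h>
#include <stddef.h>

#define RPCSVC_MAXPAGES 259	/* value quoted in the cover letter */

/* Reduced stand-in for struct svc_rqst; only the field discussed here. */
struct demo_rqst {
	void *rq_bvec;
};

/*
 * Hypothetical helper: callers ask for the array length instead of
 * naming the constant. Today it returns the compile-time value; later
 * it could return a per-request field without touching any caller.
 */
size_t demo_rq_bvec_len(const struct demo_rqst *rqstp)
{
	assert(rqstp != NULL);
	return RPCSVC_MAXPAGES;
}
```

Call sites would then pass demo_rq_bvec_len(rqstp), or that value minus
one where the first entry is reserved, as the capacity argument.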

>  
>  	__be32			rq_xid;		/* transmission id */
>  	u32			rq_prog;	/* program number */
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index e7f9c295d13c..db29819716b8 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -673,6 +673,7 @@ static void
>  svc_rqst_free(struct svc_rqst *rqstp)
>  {
>  	folio_batch_release(&rqstp->rq_fbatch);
> +	kfree(rqstp->rq_bvec);
>  	svc_release_buffer(rqstp);
>  	if (rqstp->rq_scratch_page)
>  		put_page(rqstp->rq_scratch_page);
> @@ -711,6 +712,11 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
>  	if (!svc_init_buffer(rqstp, serv->sv_max_mesg, node))
>  		goto out_enomem;
>  
> +	rqstp->rq_bvec = kcalloc_node(RPCSVC_MAXPAGES, sizeof(struct bio_vec),
> +				      GFP_KERNEL, node);
> +	if (!rqstp->rq_bvec)
> +		goto out_enomem;
> +
>  	rqstp->rq_err = -EAGAIN; /* No error yet */
>  
>  	serv->sv_nrthreads += 1;
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 72e5a01df3d3..671640933f18 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -713,8 +713,7 @@ static int svc_udp_sendto(struct svc_rqst *rqstp)
>  	if (svc_xprt_is_dead(xprt))
>  		goto out_notconn;
>  
> -	count = xdr_buf_to_bvec(rqstp->rq_bvec,
> -				ARRAY_SIZE(rqstp->rq_bvec), xdr);
> +	count = xdr_buf_to_bvec(rqstp->rq_bvec, RPCSVC_MAXPAGES, xdr);
>  
>  	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
>  		      count, rqstp->rq_res.len);
> @@ -1219,8 +1218,8 @@ static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp,
>  	memcpy(buf, &marker, sizeof(marker));
>  	bvec_set_virt(rqstp->rq_bvec, buf, sizeof(marker));
>  
> -	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1,
> -				ARRAY_SIZE(rqstp->rq_bvec) - 1, &rqstp->rq_res);
> +	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1, RPCSVC_MAXPAGES - 1,
> +				&rqstp->rq_res);
>  
>  	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
>  		      1 + count, sizeof(marker) + rqstp->rq_res.len);

-- 
Jeff Layton <jlayton@kernel.org>


* Re: [RFC PATCH 1/2] sunrpc: Replace the rq_bvec array with dynamically-allocated memory
  2025-04-16 18:42   ` Jeff Layton
@ 2025-04-16 18:45     ` Chuck Lever
  2025-04-16 18:55       ` Jeff Layton
  0 siblings, 1 reply; 6+ messages in thread
From: Chuck Lever @ 2025-04-16 18:45 UTC (permalink / raw)
  To: Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

On 4/16/25 2:42 PM, Jeff Layton wrote:
> On Wed, 2025-04-16 at 11:28 -0400, cel@kernel.org wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> As a step towards making NFSD's maximum rsize and wsize variable,
>> replace the fixed-size rq_bvec[] array in struct svc_rqst with a
>> chunk of dynamically-allocated memory.
>>
>> On a system with 8-byte pointers and 4KB pages, pahole reports that
>> the rq_bvec[] array is 4144 bytes. Replacing it with a single
>> pointer reduces the size of struct svc_rqst to about 7500 bytes.
>>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>  include/linux/sunrpc/svc.h | 2 +-
>>  net/sunrpc/svc.c           | 6 ++++++
>>  net/sunrpc/svcsock.c       | 7 +++----
>>  3 files changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
>> index 74658cca0f38..225c385085c3 100644
>> --- a/include/linux/sunrpc/svc.h
>> +++ b/include/linux/sunrpc/svc.h
>> @@ -195,7 +195,7 @@ struct svc_rqst {
>>  
>>  	struct folio_batch	rq_fbatch;
>>  	struct kvec		rq_vec[RPCSVC_MAXPAGES]; /* generally useful.. */
>> -	struct bio_vec		rq_bvec[RPCSVC_MAXPAGES];
>> +	struct bio_vec		*rq_bvec;
> 
> It's a reasonable start.
> 
> What would also be good to do here is to replace the invocations of
> RPCSVC_MAXPAGES that involve this array with a helper function that
> returns the length of it.
> 
> For now it could just return RPCSVC_MAXPAGES, but eventually you could
> add (e.g.) a rqstp->rq_bvec_len field and use that to indicate how many
> entries there are in rq_bvec.

rq_vec, rq_pages, and rq_bvec all have the same entry count (plus or
minus one) so only one new field is necessary. There are a few other
places that allocate arrays of size RPCSVC_MAXPAGES that will need
similar treatment.
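
Roughly, and only as a user-space sketch: one count stored in the request
sizes both arrays at setup time. The field name rq_maxpages, the struct
layout, and the demo_* functions are assumptions for illustration;
calloc() and free() stand in for the kernel allocator calls used in the
patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_kvec { void *iov_base; size_t iov_len; };
struct demo_bvec { void *page; unsigned int len, offset; };

/* Reduced stand-in for struct svc_rqst. */
struct shared_rqst {
	size_t rq_maxpages;		/* hypothetical: one count for all arrays */
	struct demo_kvec *rq_vec;
	struct demo_bvec *rq_bvec;
};

/* Allocate both arrays from the single count; returns 0 or -1. */
int demo_rqst_init(struct shared_rqst *rq, size_t maxpages)
{
	assert(rq != NULL);
	rq->rq_maxpages = maxpages;
	rq->rq_vec = calloc(maxpages, sizeof(*rq->rq_vec));
	rq->rq_bvec = calloc(maxpages, sizeof(*rq->rq_bvec));
	if (!rq->rq_vec || !rq->rq_bvec) {
		free(rq->rq_vec);
		free(rq->rq_bvec);
		rq->rq_vec = NULL;
		rq->rq_bvec = NULL;
		return -1;
	}
	return 0;
}

/* Release both arrays; free(NULL) is a no-op, as with kfree(NULL). */
void demo_rqst_free(struct shared_rqst *rq)
{
	free(rq->rq_vec);
	free(rq->rq_bvec);
	rq->rq_vec = NULL;
	rq->rq_bvec = NULL;
}
```

Any plus-or-minus-one adjustment would be applied where each array is
used, not stored as a separate count.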

Stay tuned for v2.


>>  	__be32			rq_xid;		/* transmission id */
>>  	u32			rq_prog;	/* program number */
>> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
>> index e7f9c295d13c..db29819716b8 100644
>> --- a/net/sunrpc/svc.c
>> +++ b/net/sunrpc/svc.c
>> @@ -673,6 +673,7 @@ static void
>>  svc_rqst_free(struct svc_rqst *rqstp)
>>  {
>>  	folio_batch_release(&rqstp->rq_fbatch);
>> +	kfree(rqstp->rq_bvec);
>>  	svc_release_buffer(rqstp);
>>  	if (rqstp->rq_scratch_page)
>>  		put_page(rqstp->rq_scratch_page);
>> @@ -711,6 +712,11 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
>>  	if (!svc_init_buffer(rqstp, serv->sv_max_mesg, node))
>>  		goto out_enomem;
>>  
>> +	rqstp->rq_bvec = kcalloc_node(RPCSVC_MAXPAGES, sizeof(struct bio_vec),
>> +				      GFP_KERNEL, node);
>> +	if (!rqstp->rq_bvec)
>> +		goto out_enomem;
>> +
>>  	rqstp->rq_err = -EAGAIN; /* No error yet */
>>  
>>  	serv->sv_nrthreads += 1;
>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>> index 72e5a01df3d3..671640933f18 100644
>> --- a/net/sunrpc/svcsock.c
>> +++ b/net/sunrpc/svcsock.c
>> @@ -713,8 +713,7 @@ static int svc_udp_sendto(struct svc_rqst *rqstp)
>>  	if (svc_xprt_is_dead(xprt))
>>  		goto out_notconn;
>>  
>> -	count = xdr_buf_to_bvec(rqstp->rq_bvec,
>> -				ARRAY_SIZE(rqstp->rq_bvec), xdr);
>> +	count = xdr_buf_to_bvec(rqstp->rq_bvec, RPCSVC_MAXPAGES, xdr);
>>  
>>  	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
>>  		      count, rqstp->rq_res.len);
>> @@ -1219,8 +1218,8 @@ static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp,
>>  	memcpy(buf, &marker, sizeof(marker));
>>  	bvec_set_virt(rqstp->rq_bvec, buf, sizeof(marker));
>>  
>> -	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1,
>> -				ARRAY_SIZE(rqstp->rq_bvec) - 1, &rqstp->rq_res);
>> +	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1, RPCSVC_MAXPAGES - 1,
>> +				&rqstp->rq_res);
>>  
>>  	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
>>  		      1 + count, sizeof(marker) + rqstp->rq_res.len);
> 

-- 
Chuck Lever



* Re: [RFC PATCH 1/2] sunrpc: Replace the rq_bvec array with dynamically-allocated memory
  2025-04-16 18:45     ` Chuck Lever
@ 2025-04-16 18:55       ` Jeff Layton
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff Layton @ 2025-04-16 18:55 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey; +Cc: linux-nfs

On Wed, 2025-04-16 at 14:45 -0400, Chuck Lever wrote:
> On 4/16/25 2:42 PM, Jeff Layton wrote:
> > On Wed, 2025-04-16 at 11:28 -0400, cel@kernel.org wrote:
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > As a step towards making NFSD's maximum rsize and wsize variable,
> > > replace the fixed-size rq_bvec[] array in struct svc_rqst with a
> > > chunk of dynamically-allocated memory.
> > > 
> > > On a system with 8-byte pointers and 4KB pages, pahole reports that
> > > the rq_bvec[] array is 4144 bytes. Replacing it with a single
> > > pointer reduces the size of struct svc_rqst to about 7500 bytes.
> > > 
> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > ---
> > >  include/linux/sunrpc/svc.h | 2 +-
> > >  net/sunrpc/svc.c           | 6 ++++++
> > >  net/sunrpc/svcsock.c       | 7 +++----
> > >  3 files changed, 10 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> > > index 74658cca0f38..225c385085c3 100644
> > > --- a/include/linux/sunrpc/svc.h
> > > +++ b/include/linux/sunrpc/svc.h
> > > @@ -195,7 +195,7 @@ struct svc_rqst {
> > >  
> > >  	struct folio_batch	rq_fbatch;
> > >  	struct kvec		rq_vec[RPCSVC_MAXPAGES]; /* generally useful.. */
> > > -	struct bio_vec		rq_bvec[RPCSVC_MAXPAGES];
> > > +	struct bio_vec		*rq_bvec;
> > 
> > It's a reasonable start.
> > 
> > What would also be good to do here is to replace the invocations of
> > RPCSVC_MAXPAGES that involve this array with a helper function that
> > returns the length of it.
> > 
> > For now it could just return RPCSVC_MAXPAGES, but eventually you could
> > add (e.g.) a rqstp->rq_bvec_len field and use that to indicate how many
> > entries there are in rq_bvec.
> 
> rq_vec, rq_pages, and rq_bvec all have the same entry count (plus or
> minus one) so only one new field is necessary. There are a few other
> places that allocate arrays of size RPCSVC_MAXPAGES that will need
> similar treatment.
>
> Stay tuned for v2.
> 

Ok. I think I didn't articulate this well. Let me try again:

If you're looking to break the assumption that the length of these
arrays is RPCSVC_MAXPAGES, then the thing to do is to eliminate the
places where we make that assumption.

In particular, the two places where you're adding new RPCSVC_MAXPAGES
invocations would be better replaced with a helper function that we can
change the return value of later.

> 
> > >  	__be32			rq_xid;		/* transmission id */
> > >  	u32			rq_prog;	/* program number */
> > > diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> > > index e7f9c295d13c..db29819716b8 100644
> > > --- a/net/sunrpc/svc.c
> > > +++ b/net/sunrpc/svc.c
> > > @@ -673,6 +673,7 @@ static void
> > >  svc_rqst_free(struct svc_rqst *rqstp)
> > >  {
> > >  	folio_batch_release(&rqstp->rq_fbatch);
> > > +	kfree(rqstp->rq_bvec);
> > >  	svc_release_buffer(rqstp);
> > >  	if (rqstp->rq_scratch_page)
> > >  		put_page(rqstp->rq_scratch_page);
> > > @@ -711,6 +712,11 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
> > >  	if (!svc_init_buffer(rqstp, serv->sv_max_mesg, node))
> > >  		goto out_enomem;
> > >  
> > > +	rqstp->rq_bvec = kcalloc_node(RPCSVC_MAXPAGES, sizeof(struct bio_vec),
> > > +				      GFP_KERNEL, node);
> > > +	if (!rqstp->rq_bvec)
> > > +		goto out_enomem;
> > > +
> > >  	rqstp->rq_err = -EAGAIN; /* No error yet */
> > >  
> > >  	serv->sv_nrthreads += 1;
> > > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> > > index 72e5a01df3d3..671640933f18 100644
> > > --- a/net/sunrpc/svcsock.c
> > > +++ b/net/sunrpc/svcsock.c
> > > @@ -713,8 +713,7 @@ static int svc_udp_sendto(struct svc_rqst *rqstp)
> > >  	if (svc_xprt_is_dead(xprt))
> > >  		goto out_notconn;
> > >  
> > > -	count = xdr_buf_to_bvec(rqstp->rq_bvec,
> > > -				ARRAY_SIZE(rqstp->rq_bvec), xdr);
> > > +	count = xdr_buf_to_bvec(rqstp->rq_bvec, RPCSVC_MAXPAGES, xdr);
> > >  
> > >  	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
> > >  		      count, rqstp->rq_res.len);
> > > @@ -1219,8 +1218,8 @@ static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp,
> > >  	memcpy(buf, &marker, sizeof(marker));
> > >  	bvec_set_virt(rqstp->rq_bvec, buf, sizeof(marker));
> > >  
> > > -	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1,
> > > -				ARRAY_SIZE(rqstp->rq_bvec) - 1, &rqstp->rq_res);
> > > +	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1, RPCSVC_MAXPAGES - 1,
> > > +				&rqstp->rq_res);
> > >  
> > >  	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
> > >  		      1 + count, sizeof(marker) + rqstp->rq_res.len);
> > 
> 

-- 
Jeff Layton <jlayton@kernel.org>

