From: Chuck Lever <cel@kernel.org>
To: NeilBrown <neilb@ownmail.net>, Jeff Layton <jlayton@kernel.org>,
Olga Kornievskaia <okorniev@redhat.com>,
Dai Ngo <dai.ngo@oracle.com>, Tom Talpey <tom@talpey.com>
Cc: <linux-nfs@vger.kernel.org>, Chuck Lever <chuck.lever@oracle.com>
Subject: [PATCH v2 0/6] Optimize NFSD buffer page management
Date: Thu, 26 Feb 2026 09:47:33 -0500
Message-ID: <20260226144739.193129-1-cel@kernel.org>
From: Chuck Lever <chuck.lever@oracle.com>
This series solves two problems. First:

NFSv3 operations have complementary Request and Response sizes.
When a Request message is large, the corresponding Response
message is small, and vice versa. The sum of the two message
sizes is never more than the maximum transport payload size. So
NFSD could get away with maintaining a single array of pages,
split between the RPC Send and Receive buffers.

NFSv4 is not as cut and dried. An NFSv4 client may construct an
NFSv4 COMPOUND that is arbitrarily complex, mixing operations
that can have a large Request size with operations that have a
large Response size. The resulting server-side buffer size
requirement can be larger than the maximum transport payload size.

Therefore we must increase the allocated RPC Call landing zone and
the RPC Reply construction zone to ensure that arbitrary NFSv4
COMPOUNDs can be handled.
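
To illustrate the sizing argument above, here is a minimal userspace
sketch. It is not taken from the patches: PAGE_SZ and the helper names
pages_for(), fits_shared() and fits_separate() are assumptions made up
for this example, and it ignores RPC headers and any slack pages the
real arrays carry.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SZ 4096u	/* assumed page size, for illustration only */

/* Number of whole pages needed to hold a message of the given size. */
static size_t pages_for(size_t bytes)
{
	return (bytes + PAGE_SZ - 1) / PAGE_SZ;
}

/* One shared array: Call and Reply together must fit in max_payload. */
static bool fits_shared(size_t call, size_t reply, size_t max_payload)
{
	return call <= max_payload && reply <= max_payload - call;
}

/* Separate arrays: each direction only has to fit on its own. */
static bool fits_separate(size_t call, size_t reply, size_t max_payload)
{
	return call <= max_payload && reply <= max_payload;
}
```

With a 1MB maximum payload, an NFSv3-style exchange (large Call, small
Reply) passes fits_shared(). A COMPOUND whose Call and Reply are both
close to 1MB fails fits_shared() but passes fits_separate(), at the
cost of roughly twice as many pages (2 * pages_for(max_payload)).
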
Second:

Due to the above, and because NFSD can now handle payload sizes
considerably larger than 1MB, the number of array entries that
alloc_pages_bulk() walks through to reset the rqst page arrays
after each RPC completes has increased dramatically.

But we observe that the mean size of NFS requests remains smaller
than a few pages. If only a few pages are consumed while processing
each RPC, then traversing all of the pages in the page arrays for
refills is wasted effort. The CPU cost of walking these arrays is
noticeable in "perf" captures.

It would be more efficient to keep track of which entries need to
be refilled, since that is likely to be a small number in the most
common case, and use alloc_pages_bulk() to fill only those entries.
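
The idea in the paragraph above can be sketched in userspace as
follows. This is an illustration under stated assumptions, not the
code in this series: struct page_array, take_page() and
refill_consumed() are hypothetical names, malloc() stands in for the
kernel's bulk page allocator, and the sketch assumes that consumed
entries always form a prefix of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SZ  4096	/* assumed page size */
#define NR_PAGES 256	/* hypothetical array length */

/* Hypothetical stand-in for a per-thread page array; not struct svc_rqst. */
struct page_array {
	void *pages[NR_PAGES];
	size_t nr_consumed;	/* slots [0, nr_consumed) may be empty */
};

/* Start with every slot empty and marked as needing a refill. */
static void page_array_init(struct page_array *pa)
{
	for (size_t i = 0; i < NR_PAGES; i++)
		pa->pages[i] = NULL;
	pa->nr_consumed = NR_PAGES;
}

/* Hand the next page to the caller, leaving its slot empty. */
static void *take_page(struct page_array *pa)
{
	assert(pa->nr_consumed < NR_PAGES);
	void *page = pa->pages[pa->nr_consumed];
	pa->pages[pa->nr_consumed++] = NULL;
	return page;
}

/*
 * Refill only the consumed prefix. Returns the number of slots visited.
 * On allocation failure, nr_consumed is left unchanged so the caller
 * can retry later.
 */
static size_t refill_consumed(struct page_array *pa)
{
	size_t i;

	for (i = 0; i < pa->nr_consumed; i++) {
		if (pa->pages[i])
			continue;	/* slot still holds a page */
		pa->pages[i] = malloc(PAGE_SZ);
		if (!pa->pages[i])
			return i;
	}
	pa->nr_consumed = 0;
	return i;
}
```

After the initial fill, an RPC that takes three pages causes the next
refill_consumed() call to visit three slots rather than all NR_PAGES,
which is the saving described above.
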
---
Changes since RFC:
- Clarify a number of comments based on review (NeilBrown)
- Possible NFSv3 waste is still open for discussion

Chuck Lever (6):
  SUNRPC: Tighten bounds checking in svc_rqst_replace_page
  SUNRPC: Allocate a separate Reply page array
  SUNRPC: Handle NULL entries in svc_rqst_release_pages
  svcrdma: preserve rq_next_page in svc_rdma_save_io_pages
  SUNRPC: Track consumed rq_pages entries
  SUNRPC: Optimize rq_respages allocation in svc_alloc_arg

 include/linux/sunrpc/svc.h              | 61 +++++++++++++++----------
 net/sunrpc/svc.c                        | 59 +++++++++++++++++-------
 net/sunrpc/svc_xprt.c                   | 47 +++++++++++++++----
 net/sunrpc/svcsock.c                    |  7 +--
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 15 ++----
 net/sunrpc/xprtrdma/svc_rdma_rw.c       |  1 +
 net/sunrpc/xprtrdma/svc_rdma_sendto.c   |  6 +--
 7 files changed, 125 insertions(+), 71 deletions(-)

--
2.53.0