From: Chuck Lever <chuck.lever@oracle.com>
To: Cedric Blancher <cedric.blancher@gmail.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: Increase RPCSVC_MAXPAYLOAD to 8M?
Date: Wed, 29 Jan 2025 10:02:05 -0500
Message-ID: <9138cbb9-b373-477e-bcc4-5a7cc4e16ed5@oracle.com>
In-Reply-To: <CALXu0Ue+w_P6P_yyVR1y85bKXxkorGrctJ4jiTBctQd8ei1_kw@mail.gmail.com>

On 1/29/25 2:32 AM, Cedric Blancher wrote:
> On Wed, 22 Jan 2025 at 11:07, Cedric Blancher <cedric.blancher@gmail.com> wrote:
>>
>> Good morning!
>>
>> IMO it might be good to increase RPCSVC_MAXPAYLOAD to at least 8M,
>> giving the NFSv4.1 session mechanism some headroom for negotiation.
>> For over a decade the default value is 1M (1*1024*1024u), which IMO
>> causes problems with anything faster than 2500baseT.
>
> The 1MB limit was defined when 10base5/10baseT was the norm, and
> 100baseT (100mbit) was "fast".
>
> Nowadays 1000baseT is the norm, 2500baseT is in premium *laptops*, and
> 10000baseT is fast.
> Just the 1MB limit is now in the way of EVERYTHING, including "large
> send offload" and other acceleration features.
>
> So my suggestion is to increase the buffer to 4MB by default (2*2MB
> hugepages on x86), and allow a tuneable to select up to 16MB.

TL;DR: This has been on the long-term to-do list for NFSD for quite some
time.

We certainly want to support larger COMPOUNDs, but increasing
RPCSVC_MAXPAYLOAD is only the first step.

The biggest obstacle is the rq_pages[] array in struct svc_rqst. Today
it has 259 entries. Quadrupling that would make the array itself
multiple pages in size, and there's one of these for each nfsd thread.

We are working on replacing the use of page arrays with folios, which
would make this infrastructure significantly smaller and faster, but it
depends on folio support in all the kernel APIs that NFSD makes use of.
That situation continues to evolve.

An equivalent issue exists in the Linux NFS client.

Increasing this capability on the server without having a client that
can make use of it doesn't seem wise.

You can try increasing the value of RPCSVC_MAXPAYLOAD yourself and run
some measurements to help make the case (and analyze the operational
costs). I think we need some confidence that increasing the maximum
payload size will not unduly impact small I/O.

Re: a tunable: I'm not sure why someone would want to tune this number
down from the maximum. You can control how much total memory the server
consumes by reducing the number of nfsd threads.
--
Chuck Lever