From: Mike Snitzer <snitzer@kernel.org>
To: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>,
linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>
Subject: Re: A comparison of the new nfsd iomodes (and an experimental one)
Date: Thu, 26 Mar 2026 16:48:46 -0400
Message-ID: <acWbrlvt_dUB9X3R@kernel.org>
In-Reply-To: <33582a86daf135336f6bc0d5260d8de0501abadd.camel@kernel.org>
On Thu, Mar 26, 2026 at 12:35:15PM -0400, Jeff Layton wrote:
> On Thu, 2026-03-26 at 11:30 -0400, Chuck Lever wrote:
> > On 3/26/26 11:23 AM, Jeff Layton wrote:
> > > I've been doing some benchmarking of the new nfsd iomodes, using
> > > different fio-based workloads.
> > >
> > > The results have been interesting, but one thing that stands out is
> > > that RWF_DONTCACHE is absolutely terrible for streaming write
> > > workloads. That prompted me to experiment with a new iomode that added
> > > some optimizations (DONTCACHE_LAZY).
> > >
> > > The results along with Claude's analysis are here:
> > >
> > > https://markdownpastebin.com/?id=387375d00b5443b3a2e37d58a062331f
> > >
> > > He gets a bit out over his skis on the upstream plan, but tl;dr is that
> > > DONTCACHE_LAZY (which is DONTCACHE with some optimizations) outperforms
> > > the other write iomodes.
> >
> > The analysis of the write modes seems plausible. I'm interested to hear
> > what Mike and Jens have to say about that.
Thanks for doing your testing and the summary, but I can't help
feeling that your test isn't coming close to realizing the O_DIRECT
benefits over buffered IO that were covered in the past, e.g.:
https://www.youtube.com/watch?v=tpPFDu9Nuuw
Can Claude be made to watch a YouTube video, summarize what it
learned, and then adapt its test plan accordingly? ;)
Your bandwidth for 1MB sequential IO of 793 MB/s for O_DIRECT and
4,952 MB/s for buffered and dontcache is considerably less than the 72
GB/s offered in Jon's testbed. Your testing isn't exposing the
bottlenecks (contention) of the MM subsystem for buffered IO... I
haven't yet put my finger on _why_ that is.
In Jon Flynn's testing he was using a working set of 312.5% of
available server memory, and the single client test system was using
fio with multiple threads and sync IO to write to 16 different mounts
(one per NVMe of the NFS server) with nconnect=16 and RDMA.
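For reference, one of those 16 per-mount fio jobs might look roughly
like the sketch below. This is an assumption pieced together from the
description above, not Jon's actual job file: the mount path, thread
count, and per-job size are illustrative placeholders (the real sizing
just needs the aggregate working set to be ~312.5% of server memory).

```ini
; Hypothetical sketch of one per-mount fio job from Jon's setup.
; /mnt/nfs1 is one of 16 NFS mounts (one per server NVMe),
; mounted with nconnect=16 over RDMA.
[global]
rw=write
bs=1M
ioengine=sync      ; synchronous IO, per the description above
numjobs=8          ; multiple threads per mount (exact count not stated)
size=32g           ; placeholder; total across mounts must exceed server RAM
directory=/mnt/nfs1

[seq-write]
```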
Raw performance of a single NVMe in Jon's testbed was over 14 GB/s --
he has the ability to drive 16 NVMe in his single NFS server. So Jon's
NFS server has an order of magnitude more capable backend storage.
My big concern is that your testing isn't exposing the MM bottlenecks
of buffered IO... given that, it's not really providing useful results
to compare with O_DIRECT.
Putting that aside: yes, DONTCACHE as-is really isn't helpful; your
lazy variant seems much more useful.
> > One thing I'd like to hear more about is why Claude felt that disabling
> > splice read was beneficial. My own benchmarking in that area has shown
> > that splice read is always a win over not using splice.
> >
>
> Good catch. That turns out to be a mistake in Claude's writeup.
>
> The test scripts left splice reads enabled for buffered reads, and the
> results in the analysis reflect that. I (and it) have no idea why it
> would recommend disabling them, when the testing all left them enabled
> for buffered reads.
Claude had to have picked up on the mutual exclusion with splice_read
for both the NFSD_IO_DONTCACHE and NFSD_IO_DIRECT io modes. So
splice_read is implicitly disabled when testing NFSD_IO_DONTCACHE
(which is buffered IO).
Mike
Thread overview: 4+ messages
2026-03-26 15:23 A comparison of the new nfsd iomodes (and an experimental one) Jeff Layton
2026-03-26 15:30 ` Chuck Lever
2026-03-26 16:35 ` Jeff Layton
2026-03-26 20:48 ` Mike Snitzer [this message]