From: Jeff Layton <jlayton@kernel.org>
To: Mike Snitzer <snitzer@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>,
linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>
Subject: Re: A comparison of the new nfsd iomodes (and an experimental one)
Date: Fri, 27 Mar 2026 07:32:08 -0400
Message-ID: <70e9c23a97d94a3dad5aa7f03f5a22c0950b00bf.camel@kernel.org>
In-Reply-To: <acWbrlvt_dUB9X3R@kernel.org>
On Thu, 2026-03-26 at 16:48 -0400, Mike Snitzer wrote:
> On Thu, Mar 26, 2026 at 12:35:15PM -0400, Jeff Layton wrote:
> > On Thu, 2026-03-26 at 11:30 -0400, Chuck Lever wrote:
> > > On 3/26/26 11:23 AM, Jeff Layton wrote:
> > > > I've been doing some benchmarking of the new nfsd iomodes, using
> > > > different fio-based workloads.
> > > >
> > > > The results have been interesting, but one thing that stands out is
> > > > that RWF_DONTCACHE is absolutely terrible for streaming write
> > > > workloads. That prompted me to experiment with a new iomode that added
> > > > some optimizations (DONTCACHE_LAZY).
> > > >
> > > > The results along with Claude's analysis are here:
> > > >
> > > > https://markdownpastebin.com/?id=387375d00b5443b3a2e37d58a062331f
> > > >
> > > > He gets a bit out over his skis on the upstream plan, but tl;dr is that
> > > > DONTCACHE_LAZY (which is DONTCACHE with some optimizations) outperforms
> > > > the other write iomodes.
> > >
> > > The analysis of the write modes seems plausible. I'm interested to hear
> > > what Mike and Jens have to say about that.
>
> Thanks for doing your testing and the summary, but I cannot help but
> feel like your test isn't coming close to realizing the O_DIRECT
> benefits over buffered that were covered in the past, e.g.:
> https://www.youtube.com/watch?v=tpPFDu9Nuuw
>
> Can Claude be made to watch a youtube video, summarize what it learned
> and then adapt its test-plan accordingly? ;)
>
I'm not sure it can. It's a good question, though. I'll ask it!
> Your bandwidth for 1MB sequential IO of 793 MB/s for O_DIRECT and
> 4,952 MB/s for buffered and dontcache is considerably less than the 72
> GB/s offered in Jon's testbed. Your testing isn't exposing the
> bottlenecks (contention) of the MM subsystem for buffered IO... I've
> not yet put my finger on _why_ that is.
That may very well be, but not everyone has a box as large as the one
you and Jon were working with.
> In Jon Flynn's testing he was using a working set of 312.5% of
> available server memory, and the single client test system was using
> fio with multiple threads and sync IO to write to 16 different mounts
> (one per NVMe of the NFS server) with nconnect=16 and RDMA.
>
This test attempted to simulate a high nconnect count from multiple
clients by running multiple fio processes. I also used the libnfs
backend for fio so I wouldn't need to worry about tuning the kernel
client.
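
For anyone who wants to reproduce something similar, a fio job file
along these lines should work. This is only a sketch of the setup
described above: the server name, export path, sizes, and job count
are placeholders, not the actual test parameters. It uses fio's
libnfs engine (ioengine=nfs), which speaks NFS directly from
userspace and so bypasses the kernel NFS client entirely:

```ini
; Hypothetical sketch -- server, export, and sizes are placeholders.
; ioengine=nfs is fio's libnfs-based engine; each job opens its own
; connection to the server, so no kernel-client tuning is needed.
[global]
ioengine=nfs
nfs_url=nfs://server.example/export
rw=write
bs=1M
size=8G
numjobs=8

[streaming-writers]
```

Multiple independent fio invocations of a file like this approximate
several clients hammering the server at once.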
> Raw performance of a single NVMe in Jon's testbed was over 14 GB/s --
> he has the ability to drive 16 NVMe in his single NFS server. So an
> order of magnitude more capable backend storage in Jon's NFS server.
>
Very true. This box only had a single SSD. I can try to find something
with more storage though for another test.
> My big concern is that your testing isn't exposing the MM bottlenecks
> of buffered IO... given that, it's not really providing useful
> results to compare with O_DIRECT.
>
> Putting that aside, yes, DONTCACHE as-is really isn't helpful; your
> lazy variant seems much more useful.
>
Right. I think that's the big takeaway from this. Ignoring Claude's
rambling about changing default iomodes in the server, RWF_DONTCACHE
just sucks for heavy writes. There is room for improvement there.
The big question I have is whether fixing RWF_DONTCACHE's writeback
behavior would give better results than what you were seeing with
O_DIRECT. Do you guys still have access to that test rig? I can send
you the patch if you want to test this and see how it does.
> > > One thing I'd like to hear more about is why Claude felt that disabling
> > > splice read was beneficial. My own benchmarking in that area has shown
> > > that splice read is always a win over not using splice.
> > >
> >
> > Good catch. That turns out to be a mistake in Claude's writeup.
> >
> > The test scripts left splice reads enabled for buffered reads, and the
> > results in the analysis reflect that. I (and it) have no idea why it
> > would recommend disabling them, when the testing all left them enabled
> > for buffered reads.
>
> Claude had to have picked up on the mutual exclusion with splice_read
> for both NFSD_IO_DONTCACHE and NFSD_IO_DIRECT io modes. So
> splice_read is implicitly disabled when testing NFSD_IO_DONTCACHE
> (which is buffered IO).
Yeah, Claude just got confused on the writeup. The benchmarking itself
was sound, AFAICT. Buffered reads used splice and it was disabled for
the other modes.
--
Jeff Layton <jlayton@kernel.org>
Thread overview: 8+ messages
2026-03-26 15:23 A comparison of the new nfsd iomodes (and an experimental one) Jeff Layton
2026-03-26 15:30 ` Chuck Lever
2026-03-26 16:35 ` Jeff Layton
2026-03-26 20:48 ` Mike Snitzer
2026-03-27 11:32 ` Jeff Layton [this message]
2026-03-27 13:19 ` Chuck Lever
2026-03-27 16:57 ` Mike Snitzer
2026-03-28 12:37 ` Jeff Layton