From: Mike Snitzer <snitzer@kernel.org>
To: Chuck Lever <chuck.lever@oracle.com>, Jeff Layton <jlayton@kernel.org>
Cc: linux-nfs@vger.kernel.org
Subject: [PATCH 1/3] nfsd: avoid using DONTCACHE for misaligned DIO's buffered IO fallback
Date: Tue,  4 Nov 2025 11:42:27 -0500
Message-ID: <20251104164229.43259-2-snitzer@kernel.org>
In-Reply-To: <20251104164229.43259-1-snitzer@kernel.org>

Also use buffered IO (without DONTCACHE) for a misaligned READ that is
smaller than 32K.  But do use DONTCACHE if an entire WRITE is
misaligned; this preserves the intent of NFSD_IO_DIRECT.
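
As a rough illustration of the READ rule above, here is a minimal userspace
sketch. The helper names (align_down, align_up, read_should_fall_back) and
the SMALL_READ_THRESHOLD macro are hypothetical and are not part of the
patch; the logic only mirrors the check this patch adds to
nfsd_direct_read(), assuming the alignment is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's round_down()/round_up().
 * Both assume 'align' is a non-zero power of two. */
static uint64_t align_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* 32K threshold, written the same way as in the patch. */
#define SMALL_READ_THRESHOLD (32u << 10)

/*
 * Return true when a READ should skip the expanded direct IO path and
 * use buffered IO instead: the request is smaller than 32K and either
 * end of it is not aligned to the DIO offset alignment.
 */
static bool read_should_fall_back(uint64_t offset, uint64_t count,
				  uint64_t align)
{
	uint64_t dio_start = align_down(offset, align);
	uint64_t dio_end = align_up(offset + count, align);
	bool misaligned = (offset != dio_start) ||
			  (dio_end != offset + count);

	return count < SMALL_READ_THRESHOLD && misaligned;
}
```

With a 4096-byte alignment, a 4096-byte READ at offset 100 would fall back to
buffered IO under this sketch, while the same READ at offset 4096, or a 64K
READ at offset 100, would still use direct IO.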

The misaligned ends of a misaligned DIO WRITE will use buffered IO
(without DONTCACHE) but the middle DIO-aligned segment will use direct
IO.  This provides ideal performance for streaming misaligned DIO
(e.g. IO500's IOR_HARD) because the read-modify-write (RMW) needed for
the misaligned ends benefits from buffered IO.
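
The split described above can be sketched in userspace as follows. This is
a simplified model that only looks at file offsets; struct write_split,
split_write() and buffered_uses_dontcache() are hypothetical names for
illustration, and the real nfsd_write_dio_iters_init() may apply further
checks that are not modeled here. The alignment is assumed to be a power of
two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of how one WRITE is carved up. */
struct write_split {
	uint64_t prefix_len;	/* buffered: bytes before the aligned middle */
	uint64_t middle_len;	/* direct: DIO-aligned bytes */
	uint64_t suffix_len;	/* buffered: bytes after the aligned middle */
};

/* Split [offset, offset + len) around the largest DIO-aligned middle. */
static struct write_split split_write(uint64_t offset, uint64_t len,
				      uint64_t align)
{
	struct write_split s = { 0, 0, 0 };
	uint64_t end = offset + len;
	uint64_t mid_start = (offset + align - 1) & ~(align - 1);
	uint64_t mid_end = end & ~(align - 1);

	if (mid_start >= mid_end) {
		/* No aligned middle: the entire WRITE is buffered. */
		s.prefix_len = len;
		return s;
	}

	s.prefix_len = mid_start - offset;
	s.middle_len = mid_end - mid_start;
	s.suffix_len = end - mid_end;
	return s;
}

/*
 * Per the description above: buffered IO keeps DONTCACHE only when the
 * entire WRITE is misaligned (no direct IO middle segment exists).
 */
static bool buffered_uses_dontcache(const struct write_split *s)
{
	return s->middle_len == 0;
}
```

For a 10000-byte WRITE at offset 100 with 4096-byte alignment, this model
yields a 3996-byte buffered prefix, a 4096-byte direct middle and a
1908-byte buffered suffix, and the buffered ends do not use DONTCACHE.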

On one capable testbed, this commit improved IOR_HARD WRITE
performance from 0.3433GB/s to 1.26GB/s.

Signed-off-by: Mike Snitzer <snitzer@hammerspace.com>
---
 fs/nfsd/vfs.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 701dd261c252..9403ec8bb2da 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -104,6 +104,7 @@ nfserrno (int errno)
 		{ nfserr_perm, -ENOKEY },
 		{ nfserr_no_grace, -ENOGRACE},
 		{ nfserr_io, -EBADMSG },
+		{ nfserr_eagain, -ENOTBLK },
 	};
 	int	i;
 
@@ -1099,13 +1100,18 @@ nfsd_direct_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	size_t len;
 
 	init_sync_kiocb(&kiocb, nf->nf_file);
-	kiocb.ki_flags |= IOCB_DIRECT;
 
 	/* Read a properly-aligned region of bytes into rq_bvec */
 	dio_start = round_down(offset, nf->nf_dio_read_offset_align);
 	dio_end = round_up((u64)offset + *count, nf->nf_dio_read_offset_align);
 
+	/* Don't use expanded DIO READ for IO less than 32K */
+	if ((*count < (32 << 10)) &&
+	    (((offset - dio_start) > 0) || ((dio_end - (offset + *count)) > 0)))
+		return nfserrno(-ENOTBLK); /* fallback to buffered */
+
 	kiocb.ki_pos = dio_start;
+	kiocb.ki_flags |= IOCB_DIRECT;
 
 	v = 0;
 	total = dio_end - dio_start;
@@ -1184,10 +1190,13 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		break;
 	case NFSD_IO_DIRECT:
 		/* When dio_read_offset_align is zero, dio is not supported */
-		if (nf->nf_dio_read_offset_align && !rqstp->rq_res.page_len)
-			return nfsd_direct_read(rqstp, fhp, nf, offset,
+		if (nf->nf_dio_read_offset_align && !rqstp->rq_res.page_len) {
+			__be32 nfserr = nfsd_direct_read(rqstp, fhp, nf, offset,
 						count, eof);
-		fallthrough;
+			if (nfserr != nfserr_eagain)
+				return nfserr;
+		}
+		break; /* fallback to buffered */
 	case NFSD_IO_DONTCACHE:
 		if (file->f_op->fop_flags & FOP_DONTCACHE)
 			kiocb.ki_flags = IOCB_DONTCACHE;
@@ -1347,6 +1356,15 @@ nfsd_write_dio_iters_init(struct bio_vec *bvec, unsigned int nvecs,
 		++args->nsegs;
 	}
 
+	/*
+	 * Don't use IOCB_DONTCACHE if misaligned DIO WRITE (args->nsegs > 1),
+	 * because it compromises unaligned segments' RMW IO being able to
+	 * benefit from buffered IO (especially important for streaming
+	 * misaligned DIO WRITE performance).
+	 */
+	if (args->nsegs > 1 && (args->flags_buffered & IOCB_DONTCACHE))
+		args->flags_buffered &= ~IOCB_DONTCACHE;
+
 	return;
 
 no_dio:
@@ -1400,7 +1418,7 @@ nfsd_direct_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 
 	/*
 	 * IOCB_DONTCACHE preserves the intent of NFSD_IO_DIRECT when
-	 * writing unaligned segments or handling fallback I/O.
+	 * falling back to buffered IO if entire WRITE is unaligned.
 	 */
 	args.flags_buffered = kiocb->ki_flags;
 	if (args.nf->nf_file->f_op->fop_flags & FOP_DONTCACHE)
-- 
2.44.0

