All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
@ 2025-10-20 16:25 Chuck Lever
  2025-10-20 16:25 ` [PATCH v5 1/4] NFSD: Enable return of an updated stable_how to NFS clients Chuck Lever
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Chuck Lever @ 2025-10-20 16:25 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Following on https://lore.kernel.org/linux-nfs/aPAci7O_XK1ljaum@kernel.org/
this series includes all patches needed to make NFSD Direct WRITE
work, plus an experimental follow-on patch.

Mike, just send along any responses to review comments as patch
snippets and I will apply them as needed.

Changes since v4:
* Split out refactoring nfsd_buffered_write() into a separate patch
* Expand patch description of 1/4
* Don't set IOCB_SYNC flag

Changes since v3:
* Address checkpatch.pl nits in 2/3
* Add an untested patch to mark ingress RDMA Read chunks

Chuck Lever (3):
  NFSD: Enable return of an updated stable_how to NFS clients
  NFSD: Refactor nfsd_vfs_write()
  svcrdma: Mark Read chunks

Mike Snitzer (1):
  NFSD: Implement NFSD_IO_DIRECT for NFS WRITE

 fs/nfsd/debugfs.c                       |   1 +
 fs/nfsd/nfs3proc.c                      |   2 +-
 fs/nfsd/nfs4proc.c                      |   2 +-
 fs/nfsd/nfsproc.c                       |   3 +-
 fs/nfsd/trace.h                         |   1 +
 fs/nfsd/vfs.c                           | 231 +++++++++++++++++++++++-
 fs/nfsd/vfs.h                           |   6 +-
 fs/nfsd/xdr3.h                          |   2 +-
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |   5 +
 9 files changed, 237 insertions(+), 16 deletions(-)

-- 
2.51.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v5 1/4] NFSD: Enable return of an updated stable_how to NFS clients
  2025-10-20 16:25 [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
@ 2025-10-20 16:25 ` Chuck Lever
  2025-10-20 16:25 ` [PATCH v5 2/4] NFSD: Refactor nfsd_vfs_write() Chuck Lever
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2025-10-20 16:25 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

NFSv3 and newer protocols enable clients to perform a two-phase
WRITE. A client requests an UNSTABLE WRITE, which sends dirty data
to the NFS server, but does not persist it. The server replies that
it performed the UNSTABLE WRITE, and the client is then obligated to
follow up with a COMMIT request before it can remove the dirty data
from its own page cache. The COMMIT reply is the client's guarantee
that the written data has been persisted on the server.

The purpose of this protocol design is to enable clients to send
a large amount of data via multiple WRITE requests to a server, and
then wait for persistence just once. The server is able to start
persisting the data as soon as it gets it, to shorten the length of
time the client has to wait for the final COMMIT to complete.

It's also possible for the server to respond to an UNSTABLE WRITE
request in a way that indicates that the data was persisted anyway.
In that case, the client can skip the COMMIT and remove the dirty
data from its memory immediately. NetApp filers, for example, do
this because they have a battery-backed cache and can guarantee that
written data is persisted quickly and immediately.

NFSD has never implemented this kind of promotion. UNSTABLE WRITE
requests are unconditionally treated as UNSTABLE. However, in a
subsequent patch, nfsd_vfs_write() will be able to promote an
UNSTABLE WRITE to be a FILE_SYNC WRITE. This will be because NFSD
will handle some WRITE requests locally with O_DIRECT, which
persists written data immediately. The FILE_SYNC WRITE response
indicates to the client that no follow-up COMMIT is necessary.

This patch prepares for that change by enabling the modified
stable_how value to be passed along to NFSD's WRITE reply encoder.
No behavior change is expected.

Reviewed-by: NeilBrown <neil@brown.name>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs3proc.c |  2 +-
 fs/nfsd/nfs4proc.c |  2 +-
 fs/nfsd/nfsproc.c  |  3 ++-
 fs/nfsd/vfs.c      | 11 ++++++-----
 fs/nfsd/vfs.h      |  6 ++++--
 fs/nfsd/xdr3.h     |  2 +-
 6 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index b6d03e1ef5f7..ad14b34583bb 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -236,7 +236,7 @@ nfsd3_proc_write(struct svc_rqst *rqstp)
 	resp->committed = argp->stable;
 	resp->status = nfsd_write(rqstp, &resp->fh, argp->offset,
 				  &argp->payload, &cnt,
-				  resp->committed, resp->verf);
+				  &resp->committed, resp->verf);
 	resp->count = cnt;
 	resp->status = nfsd3_map_status(resp->status);
 	return rpc_success;
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 7f7e6bb23a90..2222bb283baf 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1285,7 +1285,7 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	write->wr_how_written = write->wr_stable_how;
 	status = nfsd_vfs_write(rqstp, &cstate->current_fh, nf,
 				write->wr_offset, &write->wr_payload,
-				&cnt, write->wr_how_written,
+				&cnt, &write->wr_how_written,
 				(__be32 *)write->wr_verifier.data);
 	nfsd_file_put(nf);
 
diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c
index 8f71f5748c75..706401ed6f8d 100644
--- a/fs/nfsd/nfsproc.c
+++ b/fs/nfsd/nfsproc.c
@@ -251,6 +251,7 @@ nfsd_proc_write(struct svc_rqst *rqstp)
 	struct nfsd_writeargs *argp = rqstp->rq_argp;
 	struct nfsd_attrstat *resp = rqstp->rq_resp;
 	unsigned long cnt = argp->len;
+	u32 committed = NFS_DATA_SYNC;
 
 	dprintk("nfsd: WRITE    %s %u bytes at %d\n",
 		SVCFH_fmt(&argp->fh),
@@ -258,7 +259,7 @@ nfsd_proc_write(struct svc_rqst *rqstp)
 
 	fh_copy(&resp->fh, &argp->fh);
 	resp->status = nfsd_write(rqstp, &resp->fh, argp->offset,
-				  &argp->payload, &cnt, NFS_DATA_SYNC, NULL);
+				  &argp->payload, &cnt, &committed, NULL);
 	if (resp->status == nfs_ok)
 		resp->status = fh_getattr(&resp->fh, &resp->stat);
 	else if (resp->status == nfserr_jukebox)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index f537a7b4ee01..8b2dc7a88aab 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1262,7 +1262,7 @@ static int wait_for_concurrent_writes(struct file *file)
  * @offset: Byte offset of start
  * @payload: xdr_buf containing the write payload
  * @cnt: IN: number of bytes to write, OUT: number of bytes actually written
- * @stable: An NFS stable_how value
+ * @stable_how: IN: Client's requested stable_how, OUT: Actual stable_how
  * @verf: NFS WRITE verifier
  *
  * Upon return, caller must invoke fh_put on @fhp.
@@ -1274,11 +1274,12 @@ __be32
 nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	       struct nfsd_file *nf, loff_t offset,
 	       const struct xdr_buf *payload, unsigned long *cnt,
-	       int stable, __be32 *verf)
+	       u32 *stable_how, __be32 *verf)
 {
 	struct nfsd_net		*nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
 	struct file		*file = nf->nf_file;
 	struct super_block	*sb = file_inode(file)->i_sb;
+	u32			stable = *stable_how;
 	struct kiocb		kiocb;
 	struct svc_export	*exp;
 	struct iov_iter		iter;
@@ -1434,7 +1435,7 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
  * @offset: Byte offset of start
  * @payload: xdr_buf containing the write payload
  * @cnt: IN: number of bytes to write, OUT: number of bytes actually written
- * @stable: An NFS stable_how value
+ * @stable_how: IN: Client's requested stable_how, OUT: Actual stable_how
  * @verf: NFS WRITE verifier
  *
  * Upon return, caller must invoke fh_put on @fhp.
@@ -1444,7 +1445,7 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
  */
 __be32
 nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
-	   const struct xdr_buf *payload, unsigned long *cnt, int stable,
+	   const struct xdr_buf *payload, unsigned long *cnt, u32 *stable_how,
 	   __be32 *verf)
 {
 	struct nfsd_file *nf;
@@ -1457,7 +1458,7 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset,
 		goto out;
 
 	err = nfsd_vfs_write(rqstp, fhp, nf, offset, payload, cnt,
-			     stable, verf);
+			     stable_how, verf);
 	nfsd_file_put(nf);
 out:
 	trace_nfsd_write_done(rqstp, fhp, offset, *cnt);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index fa46f8b5f132..c713ed0b04e0 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -130,11 +130,13 @@ __be32		nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
 				u32 *eof);
 __be32		nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 				loff_t offset, const struct xdr_buf *payload,
-				unsigned long *cnt, int stable, __be32 *verf);
+				unsigned long *cnt, u32 *stable_how,
+				__be32 *verf);
 __be32		nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 				struct nfsd_file *nf, loff_t offset,
 				const struct xdr_buf *payload,
-				unsigned long *cnt, int stable, __be32 *verf);
+				unsigned long *cnt, u32 *stable_how,
+				__be32 *verf);
 __be32		nfsd_readlink(struct svc_rqst *, struct svc_fh *,
 				char *, int *);
 __be32		nfsd_symlink(struct svc_rqst *, struct svc_fh *,
diff --git a/fs/nfsd/xdr3.h b/fs/nfsd/xdr3.h
index 522067b7fd75..c0e443ef3a6b 100644
--- a/fs/nfsd/xdr3.h
+++ b/fs/nfsd/xdr3.h
@@ -152,7 +152,7 @@ struct nfsd3_writeres {
 	__be32			status;
 	struct svc_fh		fh;
 	unsigned long		count;
-	int			committed;
+	u32			committed;
 	__be32			verf[2];
 };
 
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 2/4] NFSD: Refactor nfsd_vfs_write()
  2025-10-20 16:25 [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
  2025-10-20 16:25 ` [PATCH v5 1/4] NFSD: Enable return of an updated stable_how to NFS clients Chuck Lever
@ 2025-10-20 16:25 ` Chuck Lever
  2025-10-20 16:32   ` Mike Snitzer
  2025-10-21 10:51   ` Jeff Layton
  2025-10-20 16:25 ` [PATCH v5 3/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 17+ messages in thread
From: Chuck Lever @ 2025-10-20 16:25 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Extract the common code that is to be used in the buffered and
dontcache I/O modes. This common code will also be used as the
fallback when direct I/O is requested but cannot be used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/vfs.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 8b2dc7a88aab..ae9c41f7374e 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1254,6 +1254,22 @@ static int wait_for_concurrent_writes(struct file *file)
 	return err;
 }
 
+static int
+nfsd_iocb_write(struct svc_rqst *rqstp, struct file *file, unsigned int nvecs,
+		unsigned long *cnt, struct kiocb *kiocb)
+{
+	struct iov_iter iter;
+	int host_err;
+
+	iov_iter_bvec(&iter, ITER_SOURCE, rqstp->rq_bvec, nvecs, *cnt);
+	host_err = vfs_iocb_iter_write(file, kiocb, &iter);
+	if (host_err < 0)
+		return host_err;
+	*cnt = host_err;
+
+	return 0;
+}
+
 /**
  * nfsd_vfs_write - write data to an already-open file
  * @rqstp: RPC execution context
@@ -1282,7 +1298,6 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	u32			stable = *stable_how;
 	struct kiocb		kiocb;
 	struct svc_export	*exp;
-	struct iov_iter		iter;
 	errseq_t		since;
 	__be32			nfserr;
 	int			host_err;
@@ -1319,25 +1334,24 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		kiocb.ki_flags |= IOCB_DSYNC;
 
 	nvecs = xdr_buf_to_bvec(rqstp->rq_bvec, rqstp->rq_maxpages, payload);
-	iov_iter_bvec(&iter, ITER_SOURCE, rqstp->rq_bvec, nvecs, *cnt);
+
 	since = READ_ONCE(file->f_wb_err);
 	if (verf)
 		nfsd_copy_write_verifier(verf, nn);
 
 	switch (nfsd_io_cache_write) {
-	case NFSD_IO_BUFFERED:
-		break;
 	case NFSD_IO_DONTCACHE:
 		if (file->f_op->fop_flags & FOP_DONTCACHE)
 			kiocb.ki_flags |= IOCB_DONTCACHE;
+		fallthrough;
+	case NFSD_IO_BUFFERED:
+		host_err = nfsd_iocb_write(rqstp, file, nvecs, cnt, &kiocb);
 		break;
 	}
-	host_err = vfs_iocb_iter_write(file, &kiocb, &iter);
 	if (host_err < 0) {
 		commit_reset_write_verifier(nn, rqstp, host_err);
 		goto out_nfserr;
 	}
-	*cnt = host_err;
 	nfsd_stats_io_write_add(nn, exp, *cnt);
 	fsnotify_modify(file);
 	host_err = filemap_check_wb_err(file->f_mapping, since);
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 3/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-20 16:25 [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
  2025-10-20 16:25 ` [PATCH v5 1/4] NFSD: Enable return of an updated stable_how to NFS clients Chuck Lever
  2025-10-20 16:25 ` [PATCH v5 2/4] NFSD: Refactor nfsd_vfs_write() Chuck Lever
@ 2025-10-20 16:25 ` Chuck Lever
  2025-10-25  0:10   ` kernel test robot
  2025-10-20 16:25 ` [PATCH v5 4/4] svcrdma: Mark Read chunks Chuck Lever
  2025-10-20 16:33 ` [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Mike Snitzer
  4 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever @ 2025-10-20 16:25 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Mike Snitzer

From: Mike Snitzer <snitzer@kernel.org>

If NFSD_IO_DIRECT is used, split any misaligned WRITE into a start,
middle and end as needed. The large middle extent is DIO-aligned and
the start and/or end are misaligned. Synchronous buffered IO (with
preference towards using DONTCACHE) is used for the misaligned extents
and O_DIRECT is used for the middle DIO-aligned extent.

nfsd_issue_write_dio() promotes @stable_how to NFS_FILE_SYNC, which
allows the client to drop its dirty data and avoid needing an extra
COMMIT operation.

If vfs_iocb_iter_write() returns -ENOTBLK, due to its inability to
invalidate the page cache on behalf of the DIO WRITE, then
nfsd_issue_write_dio() will fall back to using buffered IO.

These changes served as the original starting point for the NFS
client's misaligned O_DIRECT support that landed with
commit c817248fc831 ("nfs/localio: add proper O_DIRECT support for
READ and WRITE"). But NFSD's support is simpler because it currently
doesn't use AIO completion.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/debugfs.c |   1 +
 fs/nfsd/trace.h   |   1 +
 fs/nfsd/vfs.c     | 196 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 198 insertions(+)

diff --git a/fs/nfsd/debugfs.c b/fs/nfsd/debugfs.c
index 00eb1ecef6ac..7f44689e0a53 100644
--- a/fs/nfsd/debugfs.c
+++ b/fs/nfsd/debugfs.c
@@ -108,6 +108,7 @@ static int nfsd_io_cache_write_set(void *data, u64 val)
 	switch (val) {
 	case NFSD_IO_BUFFERED:
 	case NFSD_IO_DONTCACHE:
+	case NFSD_IO_DIRECT:
 		nfsd_io_cache_write = val;
 		break;
 	default:
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index bfd41236aff2..ad74439d0105 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -469,6 +469,7 @@ DEFINE_NFSD_IO_EVENT(read_io_done);
 DEFINE_NFSD_IO_EVENT(read_done);
 DEFINE_NFSD_IO_EVENT(write_start);
 DEFINE_NFSD_IO_EVENT(write_opened);
+DEFINE_NFSD_IO_EVENT(write_direct);
 DEFINE_NFSD_IO_EVENT(write_io_done);
 DEFINE_NFSD_IO_EVENT(write_done);
 DEFINE_NFSD_IO_EVENT(commit_start);
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index ae9c41f7374e..6840fff37649 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1254,6 +1254,109 @@ static int wait_for_concurrent_writes(struct file *file)
 	return err;
 }
 
+struct nfsd_write_dio {
+	ssize_t	start_len;	/* Length for misaligned first extent */
+	ssize_t	middle_len;	/* Length for DIO-aligned middle extent */
+	ssize_t	end_len;	/* Length for misaligned last extent */
+};
+
+static bool
+nfsd_is_write_dio_possible(loff_t offset, unsigned long len,
+			   struct nfsd_file *nf,
+			   struct nfsd_write_dio *write_dio)
+{
+	const u32 dio_blocksize = nf->nf_dio_offset_align;
+	loff_t start_end, orig_end, middle_end;
+
+	if (unlikely(!nf->nf_dio_mem_align || !dio_blocksize))
+		return false;
+	if (unlikely(dio_blocksize > PAGE_SIZE))
+		return false;
+	if (unlikely(len < dio_blocksize))
+		return false;
+
+	start_end = round_up(offset, dio_blocksize);
+	orig_end = offset + len;
+	middle_end = round_down(orig_end, dio_blocksize);
+
+	write_dio->start_len = start_end - offset;
+	write_dio->middle_len = middle_end - start_end;
+	write_dio->end_len = orig_end - middle_end;
+
+	return true;
+}
+
+static bool
+nfsd_iov_iter_aligned_bvec(const struct iov_iter *i, unsigned int addr_mask,
+			   unsigned int len_mask)
+{
+	const struct bio_vec *bvec = i->bvec;
+	size_t skip = i->iov_offset;
+	size_t size = i->count;
+
+	if (size & len_mask)
+		return false;
+	do {
+		size_t len = bvec->bv_len;
+
+		if (len > size)
+			len = size;
+		if ((unsigned long)(bvec->bv_offset + skip) & addr_mask)
+			return false;
+		bvec++;
+		size -= len;
+		skip = 0;
+	} while (size);
+
+	return true;
+}
+
+/*
+ * Setup as many as 3 iov_iter based on extents described by @write_dio.
+ * Returns the number of iov_iter that were setup.
+ */
+static int
+nfsd_setup_write_dio_iters(struct iov_iter **iterp, bool *iter_is_dio_aligned,
+			   struct bio_vec *rq_bvec, unsigned int nvecs,
+			   unsigned long cnt, struct nfsd_write_dio *write_dio,
+			   struct nfsd_file *nf)
+{
+	int n_iters = 0;
+	struct iov_iter *iters = *iterp;
+
+	/* Setup misaligned start? */
+	if (write_dio->start_len) {
+		iov_iter_bvec(&iters[n_iters], ITER_SOURCE, rq_bvec, nvecs, cnt);
+		iters[n_iters].count = write_dio->start_len;
+		iter_is_dio_aligned[n_iters] = false;
+		++n_iters;
+	}
+
+	/* Setup DIO-aligned middle */
+	iov_iter_bvec(&iters[n_iters], ITER_SOURCE, rq_bvec, nvecs, cnt);
+	if (write_dio->start_len)
+		iov_iter_advance(&iters[n_iters], write_dio->start_len);
+	iters[n_iters].count -= write_dio->end_len;
+	iter_is_dio_aligned[n_iters] =
+		nfsd_iov_iter_aligned_bvec(&iters[n_iters],
+					   nf->nf_dio_mem_align - 1,
+					   nf->nf_dio_offset_align - 1);
+	if (unlikely(!iter_is_dio_aligned[n_iters]))
+		return 0; /* no DIO-aligned IO possible */
+	++n_iters;
+
+	/* Setup misaligned end? */
+	if (write_dio->end_len) {
+		iov_iter_bvec(&iters[n_iters], ITER_SOURCE, rq_bvec, nvecs, cnt);
+		iov_iter_advance(&iters[n_iters],
+				 write_dio->start_len + write_dio->middle_len);
+		iter_is_dio_aligned[n_iters] = false;
+		++n_iters;
+	}
+
+	return n_iters;
+}
+
 static int
 nfsd_iocb_write(struct svc_rqst *rqstp, struct file *file, unsigned int nvecs,
 		unsigned long *cnt, struct kiocb *kiocb)
@@ -1270,6 +1373,94 @@ nfsd_iocb_write(struct svc_rqst *rqstp, struct file *file, unsigned int nvecs,
 	return 0;
 }
 
+static int
+nfsd_issue_write_dio(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
+		     u32 *stable_how, unsigned int nvecs, unsigned long *cnt,
+		     struct kiocb *kiocb, struct nfsd_write_dio *write_dio)
+{
+	struct file *file = nf->nf_file;
+	bool iter_is_dio_aligned[3];
+	struct iov_iter iter_stack[3];
+	struct iov_iter *iter = iter_stack;
+	unsigned int n_iters = 0;
+	unsigned long in_count = *cnt;
+	loff_t in_offset = kiocb->ki_pos;
+	ssize_t host_err;
+
+	n_iters = nfsd_setup_write_dio_iters(&iter, iter_is_dio_aligned,
+					     rqstp->rq_bvec, nvecs, *cnt,
+					     write_dio, nf);
+	if (unlikely(!n_iters))
+		return nfsd_iocb_write(rqstp, file, nvecs, cnt, kiocb);
+
+	trace_nfsd_write_direct(rqstp, fhp, in_offset, in_count);
+
+	/*
+	 * Any buffered IO issued here will be misaligned, use
+	 * sync IO to ensure it has completed before returning.
+	 * Also update @stable_how to avoid need for COMMIT.
+	 */
+	kiocb->ki_flags |= IOCB_DSYNC;
+	*stable_how = NFS_FILE_SYNC;
+
+	*cnt = 0;
+	for (int i = 0; i < n_iters; i++) {
+		if (iter_is_dio_aligned[i])
+			kiocb->ki_flags |= IOCB_DIRECT;
+		else
+			kiocb->ki_flags &= ~IOCB_DIRECT;
+
+		host_err = vfs_iocb_iter_write(file, kiocb, &iter[i]);
+		if (host_err < 0) {
+			/*
+			 * VFS will return -ENOTBLK if DIO WRITE fails to
+			 * invalidate the page cache. Retry using buffered IO.
+			 */
+			if (unlikely(host_err == -ENOTBLK)) {
+				kiocb->ki_flags &= ~IOCB_DIRECT;
+				*cnt = in_count;
+				kiocb->ki_pos = in_offset;
+				return nfsd_iocb_write(rqstp, file, nvecs, cnt,
+						       kiocb);
+			} else if (unlikely(host_err == -EINVAL)) {
+				struct inode *inode = d_inode(fhp->fh_dentry);
+
+				pr_info_ratelimited("nfsd: Direct I/O alignment failure on %s/%ld\n",
+						    inode->i_sb->s_id, inode->i_ino);
+				host_err = -ESERVERFAULT;
+			}
+			return host_err;
+		}
+		*cnt += host_err;
+		if (host_err < iter[i].count) /* partial write? */
+			break;
+	}
+
+	return 0;
+}
+
+static noinline_for_stack int
+nfsd_direct_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
+		  struct nfsd_file *nf, u32 *stable_how, unsigned int nvecs,
+		  unsigned long *cnt, struct kiocb *kiocb)
+{
+	struct nfsd_write_dio write_dio;
+
+	/*
+	 * Check if IOCB_DONTCACHE can be used when issuing buffered IO;
+	 * if so, set it to preserve intent of NFSD_IO_DIRECT (it will
+	 * be ignored for any DIO issued here).
+	 */
+	if (nf->nf_file->f_op->fop_flags & FOP_DONTCACHE)
+		kiocb->ki_flags |= IOCB_DONTCACHE;
+
+	if (nfsd_is_write_dio_possible(kiocb->ki_pos, *cnt, nf, &write_dio))
+		return nfsd_issue_write_dio(rqstp, fhp, nf, stable_how, nvecs,
+					    cnt, kiocb, &write_dio);
+
+	return nfsd_iocb_write(rqstp, nf->nf_file, nvecs, cnt, kiocb);
+}
+
 /**
  * nfsd_vfs_write - write data to an already-open file
  * @rqstp: RPC execution context
@@ -1340,6 +1531,11 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		nfsd_copy_write_verifier(verf, nn);
 
 	switch (nfsd_io_cache_write) {
+	case NFSD_IO_DIRECT:
+		host_err = nfsd_direct_write(rqstp, fhp, nf, stable_how,
+					     nvecs, cnt, &kiocb);
+		stable = *stable_how;
+		break;
 	case NFSD_IO_DONTCACHE:
 		if (file->f_op->fop_flags & FOP_DONTCACHE)
 			kiocb.ki_flags |= IOCB_DONTCACHE;
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 4/4] svcrdma: Mark Read chunks
  2025-10-20 16:25 [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
                   ` (2 preceding siblings ...)
  2025-10-20 16:25 ` [PATCH v5 3/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
@ 2025-10-20 16:25 ` Chuck Lever
  2025-10-21 11:07   ` Jeff Layton
  2025-10-20 16:33 ` [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Mike Snitzer
  4 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever @ 2025-10-20 16:25 UTC (permalink / raw)
  To: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

The upper layer may want to know when the receive buffer's .pages
array is guaranteed to contain only an opaque payload. This permits
the upper layer to optimize its buffer handling.

NB: Since svc_rdma_recvfrom.c is under net/, we use the comment
style that is preferred in the networking layer.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index e7e4a39ca6c6..b1a0c72f73de 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -815,6 +815,11 @@ static void svc_rdma_read_complete_one(struct svc_rqst *rqstp,
 	buf->page_len = length;
 	buf->len += length;
 	buf->buflen += length;
+
+	/* Transport guarantees that only the chunk payload
+	 * appears in buf->pages.
+	 */
+	buf->flags |= XDRBUF_READ;
 }
 
 /* Finish constructing the RPC Call message in rqstp::rq_arg.
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 2/4] NFSD: Refactor nfsd_vfs_write()
  2025-10-20 16:25 ` [PATCH v5 2/4] NFSD: Refactor nfsd_vfs_write() Chuck Lever
@ 2025-10-20 16:32   ` Mike Snitzer
  2025-10-21 10:51   ` Jeff Layton
  1 sibling, 0 replies; 17+ messages in thread
From: Mike Snitzer @ 2025-10-20 16:32 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Chuck Lever

On Mon, Oct 20, 2025 at 12:25:44PM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Extract the common code that is to be used in the buffered and
> dontcache I/O modes. This common code will also be used as the
> fallback when direct I/O is requested but cannot be used.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Looks good, but please keep it attributed to me.  Feel free to add
Co-developed-by or something to reflect your assist?

Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-20 16:25 [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
                   ` (3 preceding siblings ...)
  2025-10-20 16:25 ` [PATCH v5 4/4] svcrdma: Mark Read chunks Chuck Lever
@ 2025-10-20 16:33 ` Mike Snitzer
  2025-10-20 16:44   ` Chuck Lever
  4 siblings, 1 reply; 17+ messages in thread
From: Mike Snitzer @ 2025-10-20 16:33 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Chuck Lever

On Mon, Oct 20, 2025 at 12:25:42PM -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Following on https://lore.kernel.org/linux-nfs/aPAci7O_XK1ljaum@kernel.org/
> this series includes all patches needed to make NFSD Direct WRITE
> work, plus an experimental follow-on patch.
> 
> Mike, just send along any responses to review comments as patch
> snippets and I will apply them as needed.
> 
> Changes since v4:
> * Split out refactoring nfsd_buffered_write() into a separate patch
> * Expand patch description of 1/4
> * Don't set IOCB_SYNC flag

Sure thing.  Just a bit concerned about removing IOCB_SYNC in that
we're altering stable_how to be NFS_FILE_SYNC.

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-20 16:33 ` [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Mike Snitzer
@ 2025-10-20 16:44   ` Chuck Lever
  2025-10-20 18:20     ` Chuck Lever
  2025-10-21 11:12     ` Jeff Layton
  0 siblings, 2 replies; 17+ messages in thread
From: Chuck Lever @ 2025-10-20 16:44 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Chuck Lever

On 10/20/25 12:33 PM, Mike Snitzer wrote:
> On Mon, Oct 20, 2025 at 12:25:42PM -0400, Chuck Lever wrote:
> Just a bit concerned about removing IOCB_SYNC in that
> we're altering stable_how to be NFS_FILE_SYNC.
Commit 3f3503adb332 ("NFSD: Use vfs_iocb_iter_write()") introduces
the first use of IOCB_ flags in NFSD's write path, and it uses
IOCB_DSYNC. The patch has Reviewed-by's from Christoph, Neil, and
Jeff.

Should we be concerned that IOCB_DSYNC does not persist time stamp
changes that might be lost during an unplanned server boot?

As a reminder to the thread, Section 3.3.7 of RFC 1813 says:

         If stable is FILE_SYNC, the server must commit the data
         written plus all file system metadata to stable storage
         before returning results.

The text is a bit blurry about whether "file system metadata" means
all of the outstanding metadata changes for every file, or just the
metadata changes for the target file handle.

NFSD has historically treated DATA_SYNC and FILE_SYNC identically,
as the Linux NFS client does not use DATA_SYNC (IIRC).


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-20 16:44   ` Chuck Lever
@ 2025-10-20 18:20     ` Chuck Lever
  2025-10-21 11:12     ` Jeff Layton
  1 sibling, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2025-10-20 18:20 UTC (permalink / raw)
  To: Chuck Lever
  Cc: NeilBrown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Mike Snitzer

On 10/20/25 12:44 PM, Chuck Lever wrote:
> On 10/20/25 12:33 PM, Mike Snitzer wrote:
>> On Mon, Oct 20, 2025 at 12:25:42PM -0400, Chuck Lever wrote:
>> Just a bit concerned about removing IOCB_SYNC in that
>> we're altering stable_how to be NFS_FILE_SYNC.
> Commit 3f3503adb332 ("NFSD: Use vfs_iocb_iter_write()") introduces
> the first use of IOCB_ flags in NFSD's write path, and it uses
> IOCB_DSYNC. The patch has Reviewed-by's from Christoph, Neil, and
> Jeff.
> 
> Should we be concerned that IOCB_DSYNC does not persist time stamp
> changes that might be lost during an unplanned server boot?

One reason to mention 3f3503adb332 is that if we decide IOCB_SYNC is
necessary to correctly implement NFS_FILE_SYNC, then a separate,
backport-able, patch will be needed to replace the flag that was used in
that patch.

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 2/4] NFSD: Refactor nfsd_vfs_write()
  2025-10-20 16:25 ` [PATCH v5 2/4] NFSD: Refactor nfsd_vfs_write() Chuck Lever
  2025-10-20 16:32   ` Mike Snitzer
@ 2025-10-21 10:51   ` Jeff Layton
  1 sibling, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-10-21 10:51 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

On Mon, 2025-10-20 at 12:25 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Extract the common code that is to be used in the buffered and
> dontcache I/O modes. This common code will also be used as the
> fallback when direct I/O is requested but cannot be used.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/vfs.c | 26 ++++++++++++++++++++------
>  1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 8b2dc7a88aab..ae9c41f7374e 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -1254,6 +1254,22 @@ static int wait_for_concurrent_writes(struct file *file)
>  	return err;
>  }
>  
> +static int
> +nfsd_iocb_write(struct svc_rqst *rqstp, struct file *file, unsigned int nvecs,
> +		unsigned long *cnt, struct kiocb *kiocb)
> +{
> +	struct iov_iter iter;
> +	int host_err;
> +
> +	iov_iter_bvec(&iter, ITER_SOURCE, rqstp->rq_bvec, nvecs, *cnt);
> +	host_err = vfs_iocb_iter_write(file, kiocb, &iter);
> +	if (host_err < 0)
> +		return host_err;
> +	*cnt = host_err;
> +
> +	return 0;
> +}
> +
>  /**
>   * nfsd_vfs_write - write data to an already-open file
>   * @rqstp: RPC execution context
> @@ -1282,7 +1298,6 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  	u32			stable = *stable_how;
>  	struct kiocb		kiocb;
>  	struct svc_export	*exp;
> -	struct iov_iter		iter;
>  	errseq_t		since;
>  	__be32			nfserr;
>  	int			host_err;
> @@ -1319,25 +1334,24 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
>  		kiocb.ki_flags |= IOCB_DSYNC;
>  
>  	nvecs = xdr_buf_to_bvec(rqstp->rq_bvec, rqstp->rq_maxpages, payload);
> -	iov_iter_bvec(&iter, ITER_SOURCE, rqstp->rq_bvec, nvecs, *cnt);
> +
>  	since = READ_ONCE(file->f_wb_err);
>  	if (verf)
>  		nfsd_copy_write_verifier(verf, nn);
>  
>  	switch (nfsd_io_cache_write) {
> -	case NFSD_IO_BUFFERED:
> -		break;
>  	case NFSD_IO_DONTCACHE:
>  		if (file->f_op->fop_flags & FOP_DONTCACHE)
>  			kiocb.ki_flags |= IOCB_DONTCACHE;
> +		fallthrough;
> +	case NFSD_IO_BUFFERED:
> +		host_err = nfsd_iocb_write(rqstp, file, nvecs, cnt, &kiocb);
>  		break;
>  	}
> -	host_err = vfs_iocb_iter_write(file, &kiocb, &iter);
>  	if (host_err < 0) {
>  		commit_reset_write_verifier(nn, rqstp, host_err);
>  		goto out_nfserr;
>  	}
> -	*cnt = host_err;
>  	nfsd_stats_io_write_add(nn, exp, *cnt);
>  	fsnotify_modify(file);
>  	host_err = filemap_check_wb_err(file->f_mapping, since);

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 4/4] svcrdma: Mark Read chunks
  2025-10-20 16:25 ` [PATCH v5 4/4] svcrdma: Mark Read chunks Chuck Lever
@ 2025-10-21 11:07   ` Jeff Layton
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2025-10-21 11:07 UTC (permalink / raw)
  To: Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: linux-nfs, Chuck Lever

On Mon, 2025-10-20 at 12:25 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> The upper layer may want to know when the receive buffer's .pages
> array is guaranteed to contain only an opaque payload. This permits
> the upper layer to optimize its buffer handling.
> 
> NB: Since svc_rdma_recvfrom.c is under net/, we use the comment
> style that is preferred in the networking layer.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index e7e4a39ca6c6..b1a0c72f73de 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -815,6 +815,11 @@ static void svc_rdma_read_complete_one(struct svc_rqst *rqstp,
>  	buf->page_len = length;
>  	buf->len += length;
>  	buf->buflen += length;
> +
> +	/* Transport guarantees that only the chunk payload
> +	 * appears in buf->pages.
> +	 */
> +	buf->flags |= XDRBUF_READ;
>  }
>  
>  /* Finish constructing the RPC Call message in rqstp::rq_arg.

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-20 16:44   ` Chuck Lever
  2025-10-20 18:20     ` Chuck Lever
@ 2025-10-21 11:12     ` Jeff Layton
  2025-10-21 13:35       ` Chuck Lever
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2025-10-21 11:12 UTC (permalink / raw)
  To: Chuck Lever, Mike Snitzer
  Cc: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On Mon, 2025-10-20 at 12:44 -0400, Chuck Lever wrote:
> On 10/20/25 12:33 PM, Mike Snitzer wrote:
> > On Mon, Oct 20, 2025 at 12:25:42PM -0400, Chuck Lever wrote:
> > Just a bit concerned about removing IOCB_SYNC in that
> > we're altering stable_how to be NFS_FILE_SYNC.
> Commit 3f3503adb332 ("NFSD: Use vfs_iocb_iter_write()") introduces
> the first use of IOCB_ flags in NFSD's write path, and it uses
> IOCB_DSYNC. The patch has Reviewed-by's from Christoph, Neil, and
> Jeff.
> 
> Should we be concerned that IOCB_DSYNC does not persist time stamp
> changes that might be lost during an unplanned server boot?
> 
> As a reminder to the thread, Section 3.3.7 of RFC 1813 says:
> 
>          If stable is FILE_SYNC, the server must commit the data
>          written plus all file system metadata to stable storage
>          before returning results.
> 
> The text is a bit blurry about whether "file system metadata" means
> all of the outstanding metadata changes for every file, or just the
> metadata changes for the target file handle.
> 
> NFSD has historically treated DATA_SYNC and FILE_SYNC identically,
> as the Linux NFS client does not use DATA_SYNC (IIRC).
> 

Surely it just meant for the one file. FILE_SYNC is only applicable to
WRITE/COMMIT operations and those only deal with a single file at a
time. If the client gets back FILE_SYNC on a write, it should _not_
assume that all outstanding dirty data to all files has been sync'ed.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-21 11:12     ` Jeff Layton
@ 2025-10-21 13:35       ` Chuck Lever
  2025-10-21 13:45         ` Jeff Layton
  2025-10-22 17:00         ` Mike Snitzer
  0 siblings, 2 replies; 17+ messages in thread
From: Chuck Lever @ 2025-10-21 13:35 UTC (permalink / raw)
  To: Jeff Layton, Mike Snitzer
  Cc: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On 10/21/25 7:12 AM, Jeff Layton wrote:
> On Mon, 2025-10-20 at 12:44 -0400, Chuck Lever wrote:
>> On 10/20/25 12:33 PM, Mike Snitzer wrote:
>>> On Mon, Oct 20, 2025 at 12:25:42PM -0400, Chuck Lever wrote:
>>> Just a bit concerned about removing IOCB_SYNC in that
>>> we're altering stable_how to be NFS_FILE_SYNC.
>> Commit 3f3503adb332 ("NFSD: Use vfs_iocb_iter_write()") introduces
>> the first use of IOCB_ flags in NFSD's write path, and it uses
>> IOCB_DSYNC. The patch has Reviewed-by's from Christoph, Neil, and
>> Jeff.
>>
>> Should we be concerned that IOCB_DSYNC does not persist time stamp
>> changes that might be lost during an unplanned server boot?
>>
>> As a reminder to the thread, Section 3.3.7 of RFC 1813 says:
>>
>>          If stable is FILE_SYNC, the server must commit the data
>>          written plus all file system metadata to stable storage
>>          before returning results.
>>
>> The text is a bit blurry about whether "file system metadata" means
>> all of the outstanding metadata changes for every file, or just the
>> metadata changes for the target file handle.
>>
>> NFSD has historically treated DATA_SYNC and FILE_SYNC identically,
>> as the Linux NFS client does not use DATA_SYNC (IIRC).
>>
> 
> Surely it just meant for the one file.

Well yes that is the traditional understanding. I'm merely pointing out
that the actual text is not quite as specific as what we've come to
understand.


> FILE_SYNC is only applicable to
> WRITE/COMMIT operations and those only deal with a single file at a
> time.

True but you may recall that NFSD's COMMIT used to ignore the range
arguments and flush the whole file. Some file systems used to flush
all dirty data in this case, IIRC.

There's always been a bit of a mismatch between the spec and what NFSD
has implemented.


> If the client gets back FILE_SYNC on a write, it should _not_
> assume that all outstanding dirty data to all files has been sync'ed.

Agreed.

But back to Mike's point.

- The spec says NFS_DATA_SYNC means persist file data.

- The spec says NFS_FILE_SYNC means persist file data and file
  attributes.

- After consulting with the section describing COMMIT, I think that
  COMMIT is supposed to persist both file data and attributes.

And my reading of the code in fs/nfsd/vfs.c is that NFSD does the
equivalent of NFS_DATA_SYNC in all of these cases, and has done for
as long as I cared to chase the commit log.

Moveover, commit 3f3503adb332 did not introduce this behavior.

Previous to that commit, nfsd_vfs_write() passed RWF_SYNC to
vfs_iter_write(). This API uses kiocb_set_rw_flags() to convert the RWF
flag into an IOCB flag. kiocb_set_rw_flags does this:

        kiocb_flags |= (__force int) (flags & RWF_SUPPORTED);
        if (flags & RWF_SYNC)
                kiocb_flags |= IOCB_DSYNC;

And that's where I copied IOCB_DSYNC from. The use of RWF_SYNC was
introduced in 2016 by commit 24368aad47dc ("nfsd: use RWF_SYNC").

So we've tacitly agreed to let NFSD fall short of the specs in this
regard for some time. However I don't believe this is documented
anywhere.

Based on this reasoning, IOCB_DSYNC is historically correct for the
DIRECT WRITE path and its fallbacks. I'm guessing that an O_DIRECT WRITE
is going to persist the written data but won't persist file attribute
changes either.

I'm open to making NFSD adhere more strictly to the spec language, but
I bet there will be a performance impact. Maybe that impact will be
unnoticeable on modern storage devices.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-21 13:35       ` Chuck Lever
@ 2025-10-21 13:45         ` Jeff Layton
  2025-10-21 13:53           ` Chuck Lever
  2025-10-22 17:00         ` Mike Snitzer
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2025-10-21 13:45 UTC (permalink / raw)
  To: Chuck Lever, Mike Snitzer
  Cc: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On Tue, 2025-10-21 at 09:35 -0400, Chuck Lever wrote:
> On 10/21/25 7:12 AM, Jeff Layton wrote:
> > On Mon, 2025-10-20 at 12:44 -0400, Chuck Lever wrote:
> > > On 10/20/25 12:33 PM, Mike Snitzer wrote:
> > > > On Mon, Oct 20, 2025 at 12:25:42PM -0400, Chuck Lever wrote:
> > > > Just a bit concerned about removing IOCB_SYNC in that
> > > > we're altering stable_how to be NFS_FILE_SYNC.
> > > Commit 3f3503adb332 ("NFSD: Use vfs_iocb_iter_write()") introduces
> > > the first use of IOCB_ flags in NFSD's write path, and it uses
> > > IOCB_DSYNC. The patch has Reviewed-by's from Christoph, Neil, and
> > > Jeff.
> > > 
> > > Should we be concerned that IOCB_DSYNC does not persist time stamp
> > > changes that might be lost during an unplanned server boot?
> > > 
> > > As a reminder to the thread, Section 3.3.7 of RFC 1813 says:
> > > 
> > >          If stable is FILE_SYNC, the server must commit the data
> > >          written plus all file system metadata to stable storage
> > >          before returning results.
> > > 
> > > The text is a bit blurry about whether "file system metadata" means
> > > all of the outstanding metadata changes for every file, or just the
> > > metadata changes for the target file handle.
> > > 
> > > NFSD has historically treated DATA_SYNC and FILE_SYNC identically,
> > > as the Linux NFS client does not use DATA_SYNC (IIRC).
> > > 
> > 
> > Surely it just meant for the one file.
> 
> Well yes that is the traditional understanding. I'm merely pointing out
> that the actual text is not quite as specific as what we've come to
> understand.
> 
> 
> > FILE_SYNC is only applicable to
> > WRITE/COMMIT operations and those only deal with a single file at a
> > time.
> 
> True but you may recall that NFSD's COMMIT used to ignore the range
> arguments and flush the whole file. Some file systems used to flush
> all dirty data in this case, IIRC.
> 
> There's always been a bit of a mismatch between the spec and what NFSD
> has implemented.
> 
> 
> > If the client gets back FILE_SYNC on a write, it should _not_
> > assume that all outstanding dirty data to all files has been sync'ed.
> 
> Agreed.
> 
> But back to Mike's point.
> 
> - The spec says NFS_DATA_SYNC means persist file data.
> 
> - The spec says NFS_FILE_SYNC means persist file data and file
>   attributes.
> 
> - After consulting with the section describing COMMIT, I think that
>   COMMIT is supposed to persist both file data and attributes.
> 
> And my reading of the code in fs/nfsd/vfs.c is that NFSD does the
> equivalent of NFS_DATA_SYNC in all of these cases, and has done for
> as long as I cared to chase the commit log.
> 
> Moveover, commit 3f3503adb332 did not introduce this behavior.
> 
> Previous to that commit, nfsd_vfs_write() passed RWF_SYNC to
> vfs_iter_write(). This API uses kiocb_set_rw_flags() to convert the RWF
> flag into an IOCB flag. kiocb_set_rw_flags does this:
> 
>         kiocb_flags |= (__force int) (flags & RWF_SUPPORTED);
>         if (flags & RWF_SYNC)
>                 kiocb_flags |= IOCB_DSYNC;
> 
> And that's where I copied IOCB_DSYNC from. The use of RWF_SYNC was
> introduced in 2016 by commit 24368aad47dc ("nfsd: use RWF_SYNC").
> 
> So we've tacitly agreed to let NFSD fall short of the specs in this
> regard for some time. However I don't believe this is documented
> anywhere.
> 
> Based on this reasoning, IOCB_DSYNC is historically correct for the
> DIRECT WRITE path and its fallbacks. I'm guessing that an O_DIRECT WRITE
> is going to persist the written data but won't persist file attribute
> changes either.
> 

I think that's the case, generally.

> I'm open to making NFSD adhere more strictly to the spec language, but
> I bet there will be a performance impact. Maybe that impact will be
> unnoticeable on modern storage devices.
> 

Ok, I had missed the context that we had been doing this all along
anyway. In that case, IOCB_DSYNC seems like it's probably acceptable.
We likely have bigger cache coherency problems outstanding than
potential timestamp rollbacks anyway.

Alternately, we could just return NFS_DATA_SYNC, but then we'd have to
deal with follow-on commits.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-21 13:45         ` Jeff Layton
@ 2025-10-21 13:53           ` Chuck Lever
  0 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2025-10-21 13:53 UTC (permalink / raw)
  To: Jeff Layton, Mike Snitzer
  Cc: NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
	Chuck Lever

On 10/21/25 9:45 AM, Jeff Layton wrote:
> On Tue, 2025-10-21 at 09:35 -0400, Chuck Lever wrote:
>> On 10/21/25 7:12 AM, Jeff Layton wrote:
>>> On Mon, 2025-10-20 at 12:44 -0400, Chuck Lever wrote:
>>>> On 10/20/25 12:33 PM, Mike Snitzer wrote:
>>>>> On Mon, Oct 20, 2025 at 12:25:42PM -0400, Chuck Lever wrote:
>>>>> Just a bit concerned about removing IOCB_SYNC in that
>>>>> we're altering stable_how to be NFS_FILE_SYNC.
>>>> Commit 3f3503adb332 ("NFSD: Use vfs_iocb_iter_write()") introduces
>>>> the first use of IOCB_ flags in NFSD's write path, and it uses
>>>> IOCB_DSYNC. The patch has Reviewed-by's from Christoph, Neil, and
>>>> Jeff.
>>>>
>>>> Should we be concerned that IOCB_DSYNC does not persist time stamp
>>>> changes that might be lost during an unplanned server boot?
>>>>
>>>> As a reminder to the thread, Section 3.3.7 of RFC 1813 says:
>>>>
>>>>          If stable is FILE_SYNC, the server must commit the data
>>>>          written plus all file system metadata to stable storage
>>>>          before returning results.
>>>>
>>>> The text is a bit blurry about whether "file system metadata" means
>>>> all of the outstanding metadata changes for every file, or just the
>>>> metadata changes for the target file handle.
>>>>
>>>> NFSD has historically treated DATA_SYNC and FILE_SYNC identically,
>>>> as the Linux NFS client does not use DATA_SYNC (IIRC).
>>>>
>>>
>>> Surely it just meant for the one file.
>>
>> Well yes that is the traditional understanding. I'm merely pointing out
>> that the actual text is not quite as specific as what we've come to
>> understand.
>>
>>
>>> FILE_SYNC is only applicable to
>>> WRITE/COMMIT operations and those only deal with a single file at a
>>> time.
>>
>> True but you may recall that NFSD's COMMIT used to ignore the range
>> arguments and flush the whole file. Some file systems used to flush
>> all dirty data in this case, IIRC.
>>
>> There's always been a bit of a mismatch between the spec and what NFSD
>> has implemented.
>>
>>
>>> If the client gets back FILE_SYNC on a write, it should _not_
>>> assume that all outstanding dirty data to all files has been sync'ed.
>>
>> Agreed.
>>
>> But back to Mike's point.
>>
>> - The spec says NFS_DATA_SYNC means persist file data.
>>
>> - The spec says NFS_FILE_SYNC means persist file data and file
>>   attributes.
>>
>> - After consulting with the section describing COMMIT, I think that
>>   COMMIT is supposed to persist both file data and attributes.
>>
>> And my reading of the code in fs/nfsd/vfs.c is that NFSD does the
>> equivalent of NFS_DATA_SYNC in all of these cases, and has done for
>> as long as I cared to chase the commit log.
>>
>> Moveover, commit 3f3503adb332 did not introduce this behavior.
>>
>> Previous to that commit, nfsd_vfs_write() passed RWF_SYNC to
>> vfs_iter_write(). This API uses kiocb_set_rw_flags() to convert the RWF
>> flag into an IOCB flag. kiocb_set_rw_flags does this:
>>
>>         kiocb_flags |= (__force int) (flags & RWF_SUPPORTED);
>>         if (flags & RWF_SYNC)
>>                 kiocb_flags |= IOCB_DSYNC;
>>
>> And that's where I copied IOCB_DSYNC from. The use of RWF_SYNC was
>> introduced in 2016 by commit 24368aad47dc ("nfsd: use RWF_SYNC").
>>
>> So we've tacitly agreed to let NFSD fall short of the specs in this
>> regard for some time. However I don't believe this is documented
>> anywhere.
>>
>> Based on this reasoning, IOCB_DSYNC is historically correct for the
>> DIRECT WRITE path and its fallbacks. I'm guessing that an O_DIRECT WRITE
>> is going to persist the written data but won't persist file attribute
>> changes either.
>>
> 
> I think that's the case, generally.
> 
>> I'm open to making NFSD adhere more strictly to the spec language, but
>> I bet there will be a performance impact. Maybe that impact will be
>> unnoticeable on modern storage devices.
>>
> 
> Ok, I had missed the context that we had been doing this all along
> anyway. In that case, IOCB_DSYNC seems like it's probably acceptable.
> We likely have bigger cache coherency problems outstanding than
> potential timestamp rollbacks anyway.
> 
> Alternately, we could just return NFS_DATA_SYNC, but then we'd have to
> deal with follow-on commits.

But what's the point of asking the client to send us a COMMIT if we're
still not going to sync the file attributes? ;-)


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-21 13:35       ` Chuck Lever
  2025-10-21 13:45         ` Jeff Layton
@ 2025-10-22 17:00         ` Mike Snitzer
  1 sibling, 0 replies; 17+ messages in thread
From: Mike Snitzer @ 2025-10-22 17:00 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, NeilBrown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-nfs, Chuck Lever, jonathan.flynn

On Tue, Oct 21, 2025 at 09:35:32AM -0400, Chuck Lever wrote:
> On 10/21/25 7:12 AM, Jeff Layton wrote:
> > On Mon, 2025-10-20 at 12:44 -0400, Chuck Lever wrote:
> >> On 10/20/25 12:33 PM, Mike Snitzer wrote:
> >>> On Mon, Oct 20, 2025 at 12:25:42PM -0400, Chuck Lever wrote:
> >>> Just a bit concerned about removing IOCB_SYNC in that
> >>> we're altering stable_how to be NFS_FILE_SYNC.
> >> Commit 3f3503adb332 ("NFSD: Use vfs_iocb_iter_write()") introduces
> >> the first use of IOCB_ flags in NFSD's write path, and it uses
> >> IOCB_DSYNC. The patch has Reviewed-by's from Christoph, Neil, and
> >> Jeff.
> >>
> >> Should we be concerned that IOCB_DSYNC does not persist time stamp
> >> changes that might be lost during an unplanned server boot?
> >>
> >> As a reminder to the thread, Section 3.3.7 of RFC 1813 says:
> >>
> >>          If stable is FILE_SYNC, the server must commit the data
> >>          written plus all file system metadata to stable storage
> >>          before returning results.
> >>
> >> The text is a bit blurry about whether "file system metadata" means
> >> all of the outstanding metadata changes for every file, or just the
> >> metadata changes for the target file handle.
> >>
> >> NFSD has historically treated DATA_SYNC and FILE_SYNC identically,
> >> as the Linux NFS client does not use DATA_SYNC (IIRC).
> >>
> > 
> > Surely it just meant for the one file.
> 
> Well yes that is the traditional understanding. I'm merely pointing out
> that the actual text is not quite as specific as what we've come to
> understand.
> 
> 
> > FILE_SYNC is only applicable to
> > WRITE/COMMIT operations and those only deal with a single file at a
> > time.
> 
> True but you may recall that NFSD's COMMIT used to ignore the range
> arguments and flush the whole file. Some file systems used to flush
> all dirty data in this case, IIRC.
> 
> There's always been a bit of a mismatch between the spec and what NFSD
> has implemented.
> 
> 
> > If the client gets back FILE_SYNC on a write, it should _not_
> > assume that all outstanding dirty data to all files has been sync'ed.
> 
> Agreed.
> 
> But back to Mike's point.
> 
> - The spec says NFS_DATA_SYNC means persist file data.
> 
> - The spec says NFS_FILE_SYNC means persist file data and file
>   attributes.
> 
> - After consulting with the section describing COMMIT, I think that
>   COMMIT is supposed to persist both file data and attributes.
> 
> And my reading of the code in fs/nfsd/vfs.c is that NFSD does the
> equivalent of NFS_DATA_SYNC in all of these cases, and has done for
> as long as I cared to chase the commit log.
> 
> Moveover, commit 3f3503adb332 did not introduce this behavior.
> 
> Previous to that commit, nfsd_vfs_write() passed RWF_SYNC to
> vfs_iter_write(). This API uses kiocb_set_rw_flags() to convert the RWF
> flag into an IOCB flag. kiocb_set_rw_flags does this:
> 
>         kiocb_flags |= (__force int) (flags & RWF_SUPPORTED);
>         if (flags & RWF_SYNC)
>                 kiocb_flags |= IOCB_DSYNC;
> 
> And that's where I copied IOCB_DSYNC from. The use of RWF_SYNC was
> introduced in 2016 by commit 24368aad47dc ("nfsd: use RWF_SYNC").
> 
> So we've tacitly agreed to let NFSD fall short of the specs in this
> regard for some time. However I don't believe this is documented
> anywhere.
> 
> Based on this reasoning, IOCB_DSYNC is historically correct for the
> DIRECT WRITE path and its fallbacks. I'm guessing that an O_DIRECT WRITE
> is going to persist the written data but won't persist file attribute
> changes either.
> 
> I'm open to making NFSD adhere more strictly to the spec language, but
> I bet there will be a performance impact. Maybe that impact will be
> unnoticeable on modern storage devices.

I was able to get NFSD Direct's use of IOCB_DSYNC|IOCB_SYNC tested
with Jonathan Flynn last week, there was no performance difference on
his modern test cluster (8 "enterprise" NVMe devices in the server).

Was very comforting to learn IOCB_SYNC didn't cause any performance
loss.

Mike

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 3/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
  2025-10-20 16:25 ` [PATCH v5 3/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
@ 2025-10-25  0:10   ` kernel test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-10-25  0:10 UTC (permalink / raw)
  To: Chuck Lever; +Cc: llvm, oe-kbuild-all

Hi Chuck,

kernel test robot noticed the following build errors:

[auto build test ERROR on v6.18-rc2]
[also build test ERROR on linus/master next-20251024]
[cannot apply to trondmy-nfs/linux-next brauner-vfs/vfs.all]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chuck-Lever/NFSD-Enable-return-of-an-updated-stable_how-to-NFS-clients/20251021-002911
base:   v6.18-rc2
patch link:    https://lore.kernel.org/r/20251020162546.5066-4-cel%40kernel.org
patch subject: [PATCH v5 3/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20251025/202510250840.m9CSsDgA-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251025/202510250840.m9CSsDgA-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510250840.m9CSsDgA-lkp@intel.com/

All errors (new ones prefixed by >>):

>> fs/nfsd/vfs.c:1448:7: error: use of undeclared identifier 'NFSD_IO_DIRECT'
    1448 |         case NFSD_IO_DIRECT:
         |              ^
   1 error generated.
--
>> fs/nfsd/debugfs.c:109:7: error: use of undeclared identifier 'NFSD_IO_DIRECT'
     109 |         case NFSD_IO_DIRECT:
         |              ^
   1 error generated.


vim +/NFSD_IO_DIRECT +1448 fs/nfsd/vfs.c

  1377	
  1378	/**
  1379	 * nfsd_vfs_write - write data to an already-open file
  1380	 * @rqstp: RPC execution context
  1381	 * @fhp: File handle of file to write into
  1382	 * @nf: An open file matching @fhp
  1383	 * @offset: Byte offset of start
  1384	 * @payload: xdr_buf containing the write payload
  1385	 * @cnt: IN: number of bytes to write, OUT: number of bytes actually written
  1386	 * @stable_how: IN: Client's requested stable_how, OUT: Actual stable_how
  1387	 * @verf: NFS WRITE verifier
  1388	 *
  1389	 * Upon return, caller must invoke fh_put on @fhp.
  1390	 *
  1391	 * Return values:
  1392	 *   An nfsstat value in network byte order.
  1393	 */
  1394	__be32
  1395	nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
  1396		       struct nfsd_file *nf, loff_t offset,
  1397		       const struct xdr_buf *payload, unsigned long *cnt,
  1398		       u32 *stable_how, __be32 *verf)
  1399	{
  1400		struct nfsd_net		*nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
  1401		struct file		*file = nf->nf_file;
  1402		struct super_block	*sb = file_inode(file)->i_sb;
  1403		u32			stable = *stable_how;
  1404		struct kiocb		kiocb;
  1405		struct svc_export	*exp;
  1406		errseq_t		since;
  1407		__be32			nfserr;
  1408		int			host_err;
  1409		unsigned long		exp_op_flags = 0;
  1410		unsigned int		pflags = current->flags;
  1411		bool			restore_flags = false;
  1412		unsigned int		nvecs;
  1413	
  1414		trace_nfsd_write_opened(rqstp, fhp, offset, *cnt);
  1415	
  1416		if (sb->s_export_op)
  1417			exp_op_flags = sb->s_export_op->flags;
  1418	
  1419		if (test_bit(RQ_LOCAL, &rqstp->rq_flags) &&
  1420		    !(exp_op_flags & EXPORT_OP_REMOTE_FS)) {
  1421			/*
  1422			 * We want throttling in balance_dirty_pages()
  1423			 * and shrink_inactive_list() to only consider
  1424			 * the backingdev we are writing to, so that nfs to
  1425			 * localhost doesn't cause nfsd to lock up due to all
  1426			 * the client's dirty pages or its congested queue.
  1427			 */
  1428			current->flags |= PF_LOCAL_THROTTLE;
  1429			restore_flags = true;
  1430		}
  1431	
  1432		exp = fhp->fh_export;
  1433	
  1434		if (!EX_ISSYNC(exp))
  1435			stable = NFS_UNSTABLE;
  1436		init_sync_kiocb(&kiocb, file);
  1437		kiocb.ki_pos = offset;
  1438		if (stable && !fhp->fh_use_wgather)
  1439			kiocb.ki_flags |= IOCB_DSYNC;
  1440	
  1441		nvecs = xdr_buf_to_bvec(rqstp->rq_bvec, rqstp->rq_maxpages, payload);
  1442	
  1443		since = READ_ONCE(file->f_wb_err);
  1444		if (verf)
  1445			nfsd_copy_write_verifier(verf, nn);
  1446	
  1447		switch (nfsd_io_cache_write) {
> 1448		case NFSD_IO_DIRECT:
  1449			host_err = nfsd_direct_write(rqstp, fhp, nf, stable_how,
  1450						     nvecs, cnt, &kiocb);
  1451			stable = *stable_how;
  1452			break;
  1453		case NFSD_IO_DONTCACHE:
  1454			if (file->f_op->fop_flags & FOP_DONTCACHE)
  1455				kiocb.ki_flags |= IOCB_DONTCACHE;
  1456			fallthrough;
  1457		case NFSD_IO_BUFFERED:
  1458			host_err = nfsd_iocb_write(rqstp, file, nvecs, cnt, &kiocb);
  1459			break;
  1460		}
  1461		if (host_err < 0) {
  1462			commit_reset_write_verifier(nn, rqstp, host_err);
  1463			goto out_nfserr;
  1464		}
  1465		nfsd_stats_io_write_add(nn, exp, *cnt);
  1466		fsnotify_modify(file);
  1467		host_err = filemap_check_wb_err(file->f_mapping, since);
  1468		if (host_err < 0)
  1469			goto out_nfserr;
  1470	
  1471		if (stable && fhp->fh_use_wgather) {
  1472			host_err = wait_for_concurrent_writes(file);
  1473			if (host_err < 0)
  1474				commit_reset_write_verifier(nn, rqstp, host_err);
  1475		}
  1476	
  1477	out_nfserr:
  1478		if (host_err >= 0) {
  1479			trace_nfsd_write_io_done(rqstp, fhp, offset, *cnt);
  1480			nfserr = nfs_ok;
  1481		} else {
  1482			trace_nfsd_write_err(rqstp, fhp, offset, host_err);
  1483			nfserr = nfserrno(host_err);
  1484		}
  1485		if (restore_flags)
  1486			current_restore_flags(pflags, PF_LOCAL_THROTTLE);
  1487		return nfserr;
  1488	}
  1489	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-10-25  0:11 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-20 16:25 [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
2025-10-20 16:25 ` [PATCH v5 1/4] NFSD: Enable return of an updated stable_how to NFS clients Chuck Lever
2025-10-20 16:25 ` [PATCH v5 2/4] NFSD: Refactor nfsd_vfs_write() Chuck Lever
2025-10-20 16:32   ` Mike Snitzer
2025-10-21 10:51   ` Jeff Layton
2025-10-20 16:25 ` [PATCH v5 3/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Chuck Lever
2025-10-25  0:10   ` kernel test robot
2025-10-20 16:25 ` [PATCH v5 4/4] svcrdma: Mark Read chunks Chuck Lever
2025-10-21 11:07   ` Jeff Layton
2025-10-20 16:33 ` [PATCH v5 0/4] NFSD: Implement NFSD_IO_DIRECT for NFS WRITE Mike Snitzer
2025-10-20 16:44   ` Chuck Lever
2025-10-20 18:20     ` Chuck Lever
2025-10-21 11:12     ` Jeff Layton
2025-10-21 13:35       ` Chuck Lever
2025-10-21 13:45         ` Jeff Layton
2025-10-21 13:53           ` Chuck Lever
2025-10-22 17:00         ` Mike Snitzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.