* [PATCH v9 0/9] NFSD: add "NFSD DIRECT" and "NFSD DONTCACHE" IO modes
@ 2025-09-03 20:51 Mike Snitzer
From: Mike Snitzer @ 2025-09-03 20:51 UTC
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
Hi,
Some workloads benefit from NFSD avoiding the page cache, particularly
those with a working set that is significantly larger than available
system memory. This patchset introduces _optional_ support for
configuring NFSD to use O_DIRECT or DONTCACHE for READ and WRITE.
NFSD's default of using the page cache is left unchanged.
This code has proven to work well during my testing. Any suggestions
for further refinement are welcome.
Thanks,
Mike
Changes since v8:
- Remove a few WARN_ON_ONCE from the misaligned DIO READ and WRITE paths
- pr_warn_ratelimited if EINVAL on misaligned READ and WRITE paths
- handle DIO WRITE -ENOTBLK return by falling back to using buffered IO.
- fix misaligned DIO READ to not use a start_extra_page
- use /end/ of rq_pages for front_pad page
- fix checkpatch warning about 'unsigned' in nfsd_iov_iter_aligned_bvec
- fix NFSD debugfs interfaces to no longer use UNSPECIFIED state,
explicitly default to NFSD_IO_BUFFERED
- add Documentation/filesystems/nfs/nfsd-io-modes.rst
Mike Snitzer (9):
NFSD: filecache: add STATX_DIOALIGN and STATX_DIO_READ_ALIGN support
NFSD: pass nfsd_file to nfsd_iter_read()
NFSD: add io_cache_read controls to debugfs interface
NFSD: add io_cache_write controls to debugfs interface
NFSD: issue READs using O_DIRECT even if IO is misaligned
NFSD: issue WRITEs using O_DIRECT even if IO is misaligned
NFSD: add nfsd_analyze_read_dio and nfsd_analyze_write_dio trace events
NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst
NFSD: use /end/ of rq_pages for misaligned DIO READ's start_extra page
.../filesystems/nfs/nfsd-io-modes.rst | 144 ++++++
fs/nfsd/debugfs.c | 98 +++++
fs/nfsd/filecache.c | 32 ++
fs/nfsd/filecache.h | 4 +
fs/nfsd/nfs4xdr.c | 8 +-
fs/nfsd/nfsd.h | 10 +
fs/nfsd/nfsfh.c | 4 +
fs/nfsd/trace.h | 61 +++
fs/nfsd/vfs.c | 409 +++++++++++++++++-
fs/nfsd/vfs.h | 2 +-
include/linux/sunrpc/svc.h | 5 +-
11 files changed, 759 insertions(+), 18 deletions(-)
create mode 100644 Documentation/filesystems/nfs/nfsd-io-modes.rst
--
2.44.0
* [PATCH v9 1/9] NFSD: filecache: add STATX_DIOALIGN and STATX_DIO_READ_ALIGN support
From: Mike Snitzer @ 2025-09-03 20:51 UTC
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
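Use STATX_DIOALIGN and STATX_DIO_READ_ALIGN to get the DIO alignment
attributes from the underlying filesystem and store them in the
associated nfsd_file. This is done when the nfsd_file is first opened
for a regular file. If the filesystem does not report
STATX_DIO_READ_ALIGN, the READ offset alignment falls back to the
general DIO offset alignment.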
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/filecache.c | 32 ++++++++++++++++++++++++++++++++
fs/nfsd/filecache.h | 4 ++++
fs/nfsd/nfsfh.c | 4 ++++
3 files changed, 40 insertions(+)
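
The fallback rule used when caching the alignment attributes can be
sketched in standalone C. This is only an illustration: the demo_*
names and the DEMO_HAS_* bits are hypothetical stand-ins for struct
kstat, struct nfsd_file and the STATX_* result bits, not kernel
definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the statx result bits (not the kernel's values). */
#define DEMO_HAS_DIOALIGN       (1u << 0)
#define DEMO_HAS_DIO_READ_ALIGN (1u << 1)

/* Hypothetical subset of what a getattr call would report. */
struct demo_stat {
	uint32_t result_mask;
	uint32_t dio_mem_align;
	uint32_t dio_offset_align;
	uint32_t dio_read_offset_align;
};

/* Hypothetical per-file cache of the three alignment values. */
struct demo_file_align {
	uint32_t mem_align;
	uint32_t offset_align;
	uint32_t read_offset_align;
};

static struct demo_file_align demo_cache_dio_align(const struct demo_stat *st)
{
	/* All zero means "DIO alignment unknown". */
	struct demo_file_align fa = { 0, 0, 0 };

	if (st->result_mask & DEMO_HAS_DIOALIGN) {
		fa.mem_align = st->dio_mem_align;
		fa.offset_align = st->dio_offset_align;
	}
	/* Prefer the READ-specific alignment, else reuse the general one. */
	if (st->result_mask & DEMO_HAS_DIO_READ_ALIGN)
		fa.read_offset_align = st->dio_read_offset_align;
	else
		fa.read_offset_align = fa.offset_align;

	return fa;
}
```

With neither bit reported, all three values stay zero, which later
code can treat as "direct IO not usable for this file".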
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 8581c131338b8..5447dba6c5da0 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -231,6 +231,9 @@ nfsd_file_alloc(struct net *net, struct inode *inode, unsigned char need,
refcount_set(&nf->nf_ref, 1);
nf->nf_may = need;
nf->nf_mark = NULL;
+ nf->nf_dio_mem_align = 0;
+ nf->nf_dio_offset_align = 0;
+ nf->nf_dio_read_offset_align = 0;
return nf;
}
@@ -1048,6 +1051,33 @@ nfsd_file_is_cached(struct inode *inode)
return ret;
}
+static __be32
+nfsd_file_getattr(const struct svc_fh *fhp, struct nfsd_file *nf)
+{
+ struct inode *inode = file_inode(nf->nf_file);
+ struct kstat stat;
+ __be32 status;
+
+ /* Currently only need to get DIO alignment info for regular files */
+ if (!S_ISREG(inode->i_mode))
+ return nfs_ok;
+
+ status = fh_getattr(fhp, &stat);
+ if (status != nfs_ok)
+ return status;
+
+ if (stat.result_mask & STATX_DIOALIGN) {
+ nf->nf_dio_mem_align = stat.dio_mem_align;
+ nf->nf_dio_offset_align = stat.dio_offset_align;
+ }
+ if (stat.result_mask & STATX_DIO_READ_ALIGN)
+ nf->nf_dio_read_offset_align = stat.dio_read_offset_align;
+ else
+ nf->nf_dio_read_offset_align = nf->nf_dio_offset_align;
+
+ return status;
+}
+
static __be32
nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
struct svc_cred *cred,
@@ -1166,6 +1196,8 @@ nfsd_file_do_acquire(struct svc_rqst *rqstp, struct net *net,
}
status = nfserrno(ret);
trace_nfsd_file_open(nf, status);
+ if (status == nfs_ok)
+ status = nfsd_file_getattr(fhp, nf);
}
} else
status = nfserr_jukebox;
diff --git a/fs/nfsd/filecache.h b/fs/nfsd/filecache.h
index 24ddf60e8434a..e3d6ca2b60308 100644
--- a/fs/nfsd/filecache.h
+++ b/fs/nfsd/filecache.h
@@ -54,6 +54,10 @@ struct nfsd_file {
struct list_head nf_gc;
struct rcu_head nf_rcu;
ktime_t nf_birthtime;
+
+ u32 nf_dio_mem_align;
+ u32 nf_dio_offset_align;
+ u32 nf_dio_read_offset_align;
};
int nfsd_file_cache_init(void);
diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
index f4a3cc9e31e05..bdba2ba828a6a 100644
--- a/fs/nfsd/nfsfh.c
+++ b/fs/nfsd/nfsfh.c
@@ -677,8 +677,12 @@ __be32 fh_getattr(const struct svc_fh *fhp, struct kstat *stat)
.mnt = fhp->fh_export->ex_path.mnt,
.dentry = fhp->fh_dentry,
};
+ struct inode *inode = d_inode(p.dentry);
u32 request_mask = STATX_BASIC_STATS;
+ if (S_ISREG(inode->i_mode))
+ request_mask |= (STATX_DIOALIGN | STATX_DIO_READ_ALIGN);
+
if (fhp->fh_maxsize == NFS4_FHSIZE)
request_mask |= (STATX_BTIME | STATX_CHANGE_COOKIE);
--
2.44.0
* [PATCH v9 2/9] NFSD: pass nfsd_file to nfsd_iter_read()
From: Mike Snitzer @ 2025-09-03 20:51 UTC
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
Prepare nfsd_iter_read() to use the DIO alignment stored in nfsd_file.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
---
fs/nfsd/nfs4xdr.c | 8 ++++----
fs/nfsd/vfs.c | 7 ++++---
fs/nfsd/vfs.h | 2 +-
3 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 7d19925f46e45..d519f4156cfad 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -4464,7 +4464,7 @@ static __be32 nfsd4_encode_splice_read(
static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
struct nfsd4_read *read,
- struct file *file, unsigned long maxcount)
+ unsigned long maxcount)
{
struct xdr_stream *xdr = resp->xdr;
unsigned int base = xdr->buf->page_len & ~PAGE_MASK;
@@ -4475,7 +4475,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
if (xdr_reserve_space_vec(xdr, maxcount) < 0)
return nfserr_resource;
- nfserr = nfsd_iter_read(resp->rqstp, read->rd_fhp, file,
+ nfserr = nfsd_iter_read(resp->rqstp, read->rd_fhp, read->rd_nf,
read->rd_offset, &maxcount, base,
&read->rd_eof);
read->rd_length = maxcount;
@@ -4522,7 +4522,7 @@ nfsd4_encode_read(struct nfsd4_compoundres *resp, __be32 nfserr,
if (file->f_op->splice_read && splice_ok)
nfserr = nfsd4_encode_splice_read(resp, read, file, maxcount);
else
- nfserr = nfsd4_encode_readv(resp, read, file, maxcount);
+ nfserr = nfsd4_encode_readv(resp, read, maxcount);
if (nfserr) {
xdr_truncate_encode(xdr, eof_offset);
return nfserr;
@@ -5418,7 +5418,7 @@ nfsd4_encode_read_plus_data(struct nfsd4_compoundres *resp,
if (file->f_op->splice_read && splice_ok)
nfserr = nfsd4_encode_splice_read(resp, read, file, maxcount);
else
- nfserr = nfsd4_encode_readv(resp, read, file, maxcount);
+ nfserr = nfsd4_encode_readv(resp, read, maxcount);
if (nfserr)
return nfserr;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 0c0f25b2c8e38..79439ad93880a 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1075,7 +1075,7 @@ __be32 nfsd_splice_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
* nfsd_iter_read - Perform a VFS read using an iterator
* @rqstp: RPC transaction context
* @fhp: file handle of file to be read
- * @file: opened struct file of file to be read
+ * @nf: opened struct nfsd_file of file to be read
* @offset: starting byte offset
* @count: IN: requested number of bytes; OUT: number of bytes read
* @base: offset in first page of read buffer
@@ -1088,9 +1088,10 @@ __be32 nfsd_splice_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
* returned.
*/
__be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
- struct file *file, loff_t offset, unsigned long *count,
+ struct nfsd_file *nf, loff_t offset, unsigned long *count,
unsigned int base, u32 *eof)
{
+ struct file *file = nf->nf_file;
unsigned long v, total;
struct iov_iter iter;
struct kiocb kiocb;
@@ -1312,7 +1313,7 @@ __be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
if (file->f_op->splice_read && nfsd_read_splice_ok(rqstp))
err = nfsd_splice_read(rqstp, fhp, file, offset, count, eof);
else
- err = nfsd_iter_read(rqstp, fhp, file, offset, count, 0, eof);
+ err = nfsd_iter_read(rqstp, fhp, nf, offset, count, 0, eof);
nfsd_file_put(nf);
trace_nfsd_read_done(rqstp, fhp, offset, *count);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 0c0292611c6de..fa46f8b5f1320 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -121,7 +121,7 @@ __be32 nfsd_splice_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned long *count,
u32 *eof);
__be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
- struct file *file, loff_t offset,
+ struct nfsd_file *nf, loff_t offset,
unsigned long *count, unsigned int base,
u32 *eof);
bool nfsd_read_splice_ok(struct svc_rqst *rqstp);
--
2.44.0
* [PATCH v9 3/9] NFSD: add io_cache_read controls to debugfs interface
From: Mike Snitzer @ 2025-09-03 20:51 UTC
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
Add 'io_cache_read' to NFSD's debugfs interface so that any data read
by NFSD will be one of:
- cached using page cache (NFSD_IO_BUFFERED=0)
- cached but removed from the page cache upon completion
(NFSD_IO_DONTCACHE=1)
- not cached (NFSD_IO_DIRECT=2)
io_cache_read may be set by writing to:
/sys/kernel/debug/nfsd/io_cache_read
The default value for io_cache_read reflects NFSD's current default IO
mode (NFSD_IO_BUFFERED=0).
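If NFSD_IO_DONTCACHE is selected (by writing 1), FOP_DONTCACHE must be
advertised as supported by the underlying filesystem (e.g. XFS),
otherwise all IO flagged with RWF_DONTCACHE will fail with
-EOPNOTSUPP.
If NFSD_IO_DIRECT is selected (by writing 2), the IO must be aligned
relative to the underlying block device's logical_block_size. Also the
memory buffer used to store the read must be aligned relative to the
underlying block device's dma_alignment. A READ that does not meet
these requirements is issued using buffered IO instead.
Selecting NFSD_IO_DONTCACHE or NFSD_IO_DIRECT also disables splice
read, and re-enabling splice read resets io_cache_read to
NFSD_IO_BUFFERED.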
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
fs/nfsd/debugfs.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/nfsd.h | 9 ++++++++
fs/nfsd/vfs.c | 17 ++++++++++++++
3 files changed, 82 insertions(+)
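
The coupling between io_cache_read and disable-splice-read in the two
debugfs setters can be modelled in standalone C. This is a sketch
only: struct demo_knobs and the demo_* functions are hypothetical
stand-ins for the two globals and their setters, kept in one struct so
the interaction is easy to follow.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same numeric values as the NFSD_IO_* modes described above. */
enum demo_io_mode {
	DEMO_IO_BUFFERED = 0,
	DEMO_IO_DONTCACHE = 1,
	DEMO_IO_DIRECT = 2,
};

/* Hypothetical container for the two knobs. */
struct demo_knobs {
	bool disable_splice_read;
	uint64_t io_cache_read;
};

static int demo_set_io_cache_read(struct demo_knobs *k, uint64_t val)
{
	switch (val) {
	case DEMO_IO_BUFFERED:
		k->io_cache_read = DEMO_IO_BUFFERED;
		return 0;
	case DEMO_IO_DONTCACHE:
	case DEMO_IO_DIRECT:
		/* Non-buffered READ modes cannot coexist with splice read. */
		k->disable_splice_read = true;
		k->io_cache_read = val;
		return 0;
	default:
		/* Unknown mode: reject and leave the knobs untouched. */
		return -EINVAL;
	}
}

static int demo_set_disable_splice_read(struct demo_knobs *k, uint64_t val)
{
	k->disable_splice_read = val > 0;
	/* Re-enabling splice read forces READs back to buffered IO. */
	if (!k->disable_splice_read)
		k->io_cache_read = DEMO_IO_BUFFERED;
	return 0;
}
```

Note that in this model, as in the patch, returning to buffered mode
does not re-enable splice read; that remains a separate step.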
diff --git a/fs/nfsd/debugfs.c b/fs/nfsd/debugfs.c
index 84b0c8b559dc9..dd1dc28a53784 100644
--- a/fs/nfsd/debugfs.c
+++ b/fs/nfsd/debugfs.c
@@ -27,11 +27,64 @@ static int nfsd_dsr_get(void *data, u64 *val)
static int nfsd_dsr_set(void *data, u64 val)
{
nfsd_disable_splice_read = (val > 0) ? true : false;
+ if (!nfsd_disable_splice_read) {
+ /*
+ * Cannot use NFSD_IO_DONTCACHE or NFSD_IO_DIRECT
+ * if splice_read is enabled.
+ */
+ nfsd_io_cache_read = NFSD_IO_BUFFERED;
+ }
return 0;
}
DEFINE_DEBUGFS_ATTRIBUTE(nfsd_dsr_fops, nfsd_dsr_get, nfsd_dsr_set, "%llu\n");
+/*
+ * /sys/kernel/debug/nfsd/io_cache_read
+ *
+ * Contents:
+ * %0: NFS READ will use buffered IO
+ * %1: NFS READ will use dontcache (buffered IO w/ dropbehind)
+ * %2: NFS READ will use direct IO
+ *
+ * This setting takes immediate effect for all NFS versions,
+ * all exports, and in all NFSD net namespaces.
+ */
+
+static int nfsd_io_cache_read_get(void *data, u64 *val)
+{
+ *val = nfsd_io_cache_read;
+ return 0;
+}
+
+static int nfsd_io_cache_read_set(void *data, u64 val)
+{
+ int ret = 0;
+
+ switch (val) {
+ case NFSD_IO_BUFFERED:
+ nfsd_io_cache_read = NFSD_IO_BUFFERED;
+ break;
+ case NFSD_IO_DONTCACHE:
+ case NFSD_IO_DIRECT:
+ /*
+ * Must disable splice_read when enabling
+ * NFSD_IO_DONTCACHE or NFSD_IO_DIRECT.
+ */
+ nfsd_disable_splice_read = true;
+ nfsd_io_cache_read = val;
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(nfsd_io_cache_read_fops, nfsd_io_cache_read_get,
+ nfsd_io_cache_read_set, "%llu\n");
+
void nfsd_debugfs_exit(void)
{
debugfs_remove_recursive(nfsd_top_dir);
@@ -44,4 +97,7 @@ void nfsd_debugfs_init(void)
debugfs_create_file("disable-splice-read", S_IWUSR | S_IRUGO,
nfsd_top_dir, NULL, &nfsd_dsr_fops);
+
+ debugfs_create_file("io_cache_read", S_IWUSR | S_IRUGO,
+ nfsd_top_dir, NULL, &nfsd_io_cache_read_fops);
}
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 1cd0bed57bc2f..41cb7c7feff3e 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -153,6 +153,15 @@ static inline void nfsd_debugfs_exit(void) {}
extern bool nfsd_disable_splice_read __read_mostly;
+enum {
+ /* Any new NFSD_IO enum value must be added at the end */
+ NFSD_IO_BUFFERED,
+ NFSD_IO_DONTCACHE,
+ NFSD_IO_DIRECT,
+};
+
+extern u64 nfsd_io_cache_read __read_mostly;
+
extern int nfsd_max_blksize;
static inline int nfsd_v4client(struct svc_rqst *rq)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 79439ad93880a..21441745df69a 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -49,6 +49,7 @@
#define NFSDDBG_FACILITY NFSDDBG_FILEOP
bool nfsd_disable_splice_read __read_mostly;
+u64 nfsd_io_cache_read __read_mostly = NFSD_IO_BUFFERED;
/**
* nfserrno - Map Linux errnos to NFS errnos
@@ -1099,6 +1100,22 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
size_t len;
init_sync_kiocb(&kiocb, file);
+
+ switch (nfsd_io_cache_read) {
+ case NFSD_IO_DIRECT:
+ /* Verify ondisk and memory DIO alignment */
+ if (nf->nf_dio_mem_align && nf->nf_dio_read_offset_align &&
+ (((offset | *count) & (nf->nf_dio_read_offset_align - 1)) == 0) &&
+ (base & (nf->nf_dio_mem_align - 1)) == 0)
+ kiocb.ki_flags = IOCB_DIRECT;
+ break;
+ case NFSD_IO_DONTCACHE:
+ kiocb.ki_flags = IOCB_DONTCACHE;
+ break;
+ case NFSD_IO_BUFFERED:
+ break;
+ }
+
kiocb.ki_pos = offset;
v = 0;
--
2.44.0
* [PATCH v9 4/9] NFSD: add io_cache_write controls to debugfs interface
From: Mike Snitzer @ 2025-09-03 20:51 UTC
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
Add 'io_cache_write' to NFSD's debugfs interface so that any data
written by NFSD will be one of:
- cached using page cache (NFSD_IO_BUFFERED=0)
- cached but removed from the page cache upon completion
(NFSD_IO_DONTCACHE=1)
- not cached (NFSD_IO_DIRECT=2)
io_cache_write may be set by writing to:
/sys/kernel/debug/nfsd/io_cache_write
The default value for io_cache_write reflects NFSD's current default
IO mode (NFSD_IO_BUFFERED=0).
If NFSD_IO_DONTCACHE is selected (by writing 1), FOP_DONTCACHE must be
advertised as supported by the underlying filesystem (e.g. XFS),
otherwise all IO flagged with RWF_DONTCACHE will fail with
-EOPNOTSUPP.
If NFSD_IO_DIRECT is selected (by writing 2), the IO must be aligned
relative to the underlying block device's logical_block_size. Also the
memory buffer used to store the WRITE payload must be aligned relative
to the underlying block device's dma_alignment. A WRITE whose offset
or length is not aligned is issued using buffered IO instead.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
fs/nfsd/debugfs.c | 42 ++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/nfsd.h | 1 +
fs/nfsd/vfs.c | 15 +++++++++++++++
3 files changed, 58 insertions(+)
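
The offset/length test that gates IOCB_DIRECT on the WRITE path can be
sketched as a standalone predicate. This is an illustration only:
demo_dio_write_aligned() is a hypothetical name, and the mask trick
assumes the alignment values are powers of two.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when a WRITE at @offset of @len bytes may be issued with
 * direct IO. A zero alignment means the filesystem reported no DIO
 * support for the file. Assumes @offset_align is a power of two.
 */
static bool demo_dio_write_aligned(uint64_t offset, uint64_t len,
				   uint32_t mem_align, uint32_t offset_align)
{
	if (!mem_align || !offset_align)
		return false;
	/* OR-ing offset and len checks both against the mask at once. */
	return ((offset | len) & ((uint64_t)offset_align - 1)) == 0;
}
```

For example, with a 512-byte offset alignment a WRITE at offset 4096
of 8192 bytes qualifies, while the same WRITE with a length of 8000
bytes does not and would stay on the buffered path.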
diff --git a/fs/nfsd/debugfs.c b/fs/nfsd/debugfs.c
index dd1dc28a53784..173032a04cdec 100644
--- a/fs/nfsd/debugfs.c
+++ b/fs/nfsd/debugfs.c
@@ -85,6 +85,45 @@ static int nfsd_io_cache_read_set(void *data, u64 val)
DEFINE_DEBUGFS_ATTRIBUTE(nfsd_io_cache_read_fops, nfsd_io_cache_read_get,
nfsd_io_cache_read_set, "%llu\n");
+/*
+ * /sys/kernel/debug/nfsd/io_cache_write
+ *
+ * Contents:
+ * %0: NFS WRITE will use buffered IO
+ * %1: NFS WRITE will use dontcache (buffered IO w/ dropbehind)
+ * %2: NFS WRITE will use direct IO
+ *
+ * This setting takes immediate effect for all NFS versions,
+ * all exports, and in all NFSD net namespaces.
+ */
+
+static int nfsd_io_cache_write_get(void *data, u64 *val)
+{
+ *val = nfsd_io_cache_write;
+ return 0;
+}
+
+static int nfsd_io_cache_write_set(void *data, u64 val)
+{
+ int ret = 0;
+
+ switch (val) {
+ case NFSD_IO_BUFFERED:
+ case NFSD_IO_DONTCACHE:
+ case NFSD_IO_DIRECT:
+ nfsd_io_cache_write = val;
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+ return ret;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(nfsd_io_cache_write_fops, nfsd_io_cache_write_get,
+ nfsd_io_cache_write_set, "%llu\n");
+
void nfsd_debugfs_exit(void)
{
debugfs_remove_recursive(nfsd_top_dir);
@@ -100,4 +139,7 @@ void nfsd_debugfs_init(void)
debugfs_create_file("io_cache_read", S_IWUSR | S_IRUGO,
nfsd_top_dir, NULL, &nfsd_io_cache_read_fops);
+
+ debugfs_create_file("io_cache_write", S_IWUSR | S_IRUGO,
+ nfsd_top_dir, NULL, &nfsd_io_cache_write_fops);
}
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index 41cb7c7feff3e..c491eb258ecd3 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -161,6 +161,7 @@ enum {
};
extern u64 nfsd_io_cache_read __read_mostly;
+extern u64 nfsd_io_cache_write __read_mostly;
extern int nfsd_max_blksize;
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 21441745df69a..358d10a0665f6 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -50,6 +50,7 @@
bool nfsd_disable_splice_read __read_mostly;
u64 nfsd_io_cache_read __read_mostly = NFSD_IO_BUFFERED;
+u64 nfsd_io_cache_write __read_mostly = NFSD_IO_BUFFERED;
/**
* nfserrno - Map Linux errnos to NFS errnos
@@ -1241,6 +1242,20 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
since = READ_ONCE(file->f_wb_err);
if (verf)
nfsd_copy_write_verifier(verf, nn);
+
+ switch (nfsd_io_cache_write) {
+ case NFSD_IO_DIRECT:
+ /* direct I/O must be aligned to device logical sector size */
+ if (nf->nf_dio_mem_align && nf->nf_dio_offset_align &&
+ (((offset | *cnt) & (nf->nf_dio_offset_align-1)) == 0))
+ kiocb.ki_flags |= IOCB_DIRECT;
+ break;
+ case NFSD_IO_DONTCACHE:
+ kiocb.ki_flags |= IOCB_DONTCACHE;
+ break;
+ case NFSD_IO_BUFFERED:
+ break;
+ }
host_err = vfs_iocb_iter_write(file, &kiocb, &iter);
if (host_err < 0) {
commit_reset_write_verifier(nn, rqstp, host_err);
--
2.44.0
* [PATCH v9 5/9] NFSD: issue READs using O_DIRECT even if IO is misaligned
From: Mike Snitzer @ 2025-09-03 20:51 UTC
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
If NFSD_IO_DIRECT is used, expand any misaligned READ to the next
DIO-aligned block (on either end of the READ). The expanded READ is
then checked for proper offset/len alignment (logical_block_size) and
memory alignment (dma_alignment) before it is issued.
Any misaligned READ smaller than 32K is not expanded to be
DIO-aligned (this heuristic just avoids excess work, like allocating
start_extra_page, for smaller IO that generally already performs
well using buffered IO).
Suggested-by: Jeff Layton <jlayton@kernel.org>
Suggested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
fs/nfsd/vfs.c | 184 +++++++++++++++++++++++++++++++++++--
include/linux/sunrpc/svc.h | 5 +-
2 files changed, 178 insertions(+), 11 deletions(-)
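
The expansion arithmetic and the trimming of the result can be worked
through in standalone C. This is a sketch under assumptions: the
demo_* names are hypothetical, the block size is assumed to be a power
of two, and the memory-alignment and PAGE_SIZE checks from the patch
are left out to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the expanded READ extent. */
struct demo_read_dio {
	int64_t start;		/* expanded, DIO-aligned start offset */
	int64_t end;		/* expanded, DIO-aligned end offset */
	uint64_t start_extra;	/* bytes read in front of the request */
	uint64_t end_extra;	/* bytes read past the request */
};

/* Misaligned READs below this size (in bytes) are not expanded. */
#define DEMO_READ_DIO_MIN (32u << 10)

static int64_t demo_round_down(int64_t x, uint32_t align)
{
	return x & ~((int64_t)align - 1);
}

static int64_t demo_round_up(int64_t x, uint32_t align)
{
	return demo_round_down(x + align - 1, align);
}

static bool demo_expand_read(int64_t offset, uint64_t len,
			     uint32_t blocksize, struct demo_read_dio *rd)
{
	int64_t orig_end = offset + (int64_t)len;

	if (!blocksize || len < blocksize)
		return false;

	rd->start = demo_round_down(offset, blocksize);
	rd->end = demo_round_up(orig_end, blocksize);
	rd->start_extra = (uint64_t)(offset - rd->start);
	rd->end_extra = (uint64_t)(rd->end - orig_end);

	/* Small misaligned READs are left to buffered IO. */
	if ((rd->start_extra || rd->end_extra) && len < DEMO_READ_DIO_MIN) {
		*rd = (struct demo_read_dio){ 0 };
		return false;
	}
	return true;
}

/* Map the byte count of the expanded READ back to the original request. */
static int64_t demo_trim_read(const struct demo_read_dio *rd,
			      int64_t bytes_read, uint64_t bytes_expected)
{
	if (bytes_read < 0)
		return bytes_read;	/* pass errors through unchanged */
	if ((uint64_t)bytes_read < rd->start_extra)
		return 0;		/* short read: none of the requested data */
	bytes_read -= (int64_t)rd->start_extra;
	if ((uint64_t)bytes_read > bytes_expected)
		bytes_read = (int64_t)bytes_expected;
	return bytes_read;
}
```

As a worked case with a 4096-byte block size: a READ of 40000 bytes at
offset 5000 expands to the range 4096..45056, with 904 extra bytes in
front and 56 at the end. If the expanded READ returns all 40960 bytes,
trimming reports 40000 bytes to the client.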
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 358d10a0665f6..96ae86419dc80 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -19,6 +19,7 @@
#include <linux/splice.h>
#include <linux/falloc.h>
#include <linux/fcntl.h>
+#include <linux/math.h>
#include <linux/namei.h>
#include <linux/delay.h>
#include <linux/fsnotify.h>
@@ -1073,6 +1074,137 @@ __be32 nfsd_splice_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
return nfsd_finish_read(rqstp, fhp, file, offset, count, eof, host_err);
}
+struct nfsd_read_dio {
+ loff_t start;
+ loff_t end;
+ unsigned long start_extra;
+ unsigned long end_extra;
+};
+
+static void init_nfsd_read_dio(struct nfsd_read_dio *read_dio)
+{
+ memset(read_dio, 0, sizeof(*read_dio));
+}
+
+#define NFSD_READ_DIO_MIN_KB (32 << 10)
+
+static bool nfsd_analyze_read_dio(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ struct nfsd_file *nf, loff_t offset,
+ unsigned long len, unsigned int base,
+ struct nfsd_read_dio *read_dio)
+{
+ const u32 dio_blocksize = nf->nf_dio_read_offset_align;
+ loff_t middle_end, orig_end = offset + len;
+
+ if (unlikely(!nf->nf_dio_mem_align || !dio_blocksize))
+ return false;
+ if (unlikely(dio_blocksize > PAGE_SIZE))
+ return false;
+ if (unlikely(len < dio_blocksize))
+ return false;
+
+ /* Return early if IO is irreparably misaligned (base not aligned).
+ * Ondisk alignment is implied by the following code that expands
+ * misaligned IO to have a DIO-aligned offset and len.
+ */
+ if ((base & (nf->nf_dio_mem_align-1)) != 0)
+ return false;
+
+ init_nfsd_read_dio(read_dio);
+
+ read_dio->start = round_down(offset, dio_blocksize);
+ read_dio->end = round_up(orig_end, dio_blocksize);
+ read_dio->start_extra = offset - read_dio->start;
+ read_dio->end_extra = read_dio->end - orig_end;
+
+ /*
+ * Any misaligned READ less than NFSD_READ_DIO_MIN_KB won't be expanded
+ * to be DIO-aligned (this heuristic avoids excess work, for smaller IO
+ * that can generally already perform well using buffered IO).
+ */
+ if ((read_dio->start_extra || read_dio->end_extra) &&
+ (len < NFSD_READ_DIO_MIN_KB)) {
+ init_nfsd_read_dio(read_dio);
+ return false;
+ }
+
+ return true;
+}
+
+static ssize_t nfsd_complete_misaligned_read_dio(struct svc_rqst *rqstp,
+ struct nfsd_read_dio *read_dio,
+ ssize_t bytes_read,
+ unsigned long bytes_expected,
+ loff_t *offset,
+ unsigned long *rq_bvec_numpages)
+{
+ ssize_t host_err = bytes_read;
+ loff_t v;
+
+ if (!read_dio->start_extra && !read_dio->end_extra)
+ return host_err;
+
+ /* If nfsd_analyze_read_dio() found start_extra (front-pad) page needed it
+ * must be removed from rqstp->rq_bvec[] to avoid returning unwanted data.
+ */
+ if (read_dio->start_extra) {
+ *rq_bvec_numpages -= 1;
+ v = *rq_bvec_numpages;
+ memmove(rqstp->rq_bvec, rqstp->rq_bvec + 1,
+ v * sizeof(struct bio_vec));
+ }
+ /* Eliminate any end_extra bytes from the last page */
+ v = *rq_bvec_numpages;
+ rqstp->rq_bvec[v].bv_len -= read_dio->end_extra;
+
+ if (host_err < 0) {
+ /* Underlying FS will return -EINVAL if DIO is misaligned. */
+ if (unlikely(host_err == -EINVAL))
+ pr_warn_ratelimited("%s: unexpected host_err=%zd\n",
+ __func__, host_err);
+ return host_err;
+ }
+
+ /* nfsd_analyze_read_dio() may have expanded the start and end,
+ * if so adjust returned read size to reflect original extent.
+ */
+ *offset += read_dio->start_extra;
+ if (likely(host_err >= read_dio->start_extra)) {
+ host_err -= read_dio->start_extra;
+ if (host_err > bytes_expected)
+ host_err = bytes_expected;
+ } else {
+ /* Short read that didn't read any of requested data */
+ host_err = 0;
+ }
+
+ return host_err;
+}
+
+static bool nfsd_iov_iter_aligned_bvec(const struct iov_iter *i,
+ unsigned int addr_mask, unsigned int len_mask)
+{
+ const struct bio_vec *bvec = i->bvec;
+ size_t skip = i->iov_offset;
+ size_t size = i->count;
+
+ if (size & len_mask)
+ return false;
+ do {
+ size_t len = bvec->bv_len;
+
+ if (len > size)
+ len = size;
+ if ((unsigned long)(bvec->bv_offset + skip) & addr_mask)
+ return false;
+ bvec++;
+ size -= len;
+ skip = 0;
+ } while (size);
+
+ return true;
+}
+
/**
* nfsd_iter_read - Perform a VFS read using an iterator
* @rqstp: RPC transaction context
@@ -1094,7 +1226,8 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
unsigned int base, u32 *eof)
{
struct file *file = nf->nf_file;
- unsigned long v, total;
+ unsigned long v, total, in_count = *count;
+ struct nfsd_read_dio read_dio;
struct iov_iter iter;
struct kiocb kiocb;
ssize_t host_err;
@@ -1102,13 +1235,34 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
init_sync_kiocb(&kiocb, file);
+ v = 0;
+ total = in_count;
+
switch (nfsd_io_cache_read) {
case NFSD_IO_DIRECT:
- /* Verify ondisk and memory DIO alignment */
- if (nf->nf_dio_mem_align && nf->nf_dio_read_offset_align &&
- (((offset | *count) & (nf->nf_dio_read_offset_align - 1)) == 0) &&
- (base & (nf->nf_dio_mem_align - 1)) == 0)
- kiocb.ki_flags = IOCB_DIRECT;
+ /*
+ * If NFSD_IO_DIRECT enabled, expand any misaligned READ to
+ * the next DIO-aligned block (on either end of the READ).
+ */
+ if (nfsd_analyze_read_dio(rqstp, fhp, nf, offset,
+ in_count, base, &read_dio)) {
+ /* trace_nfsd_read_vector() will reflect larger
+ * DIO-aligned READ.
+ */
+ offset = read_dio.start;
+ in_count = read_dio.end - offset;
+ total = in_count;
+
+ kiocb.ki_flags |= IOCB_DIRECT;
+ if (read_dio.start_extra) {
+ len = read_dio.start_extra;
+ bvec_set_page(&rqstp->rq_bvec[v],
+ *(rqstp->rq_next_page++),
+ len, PAGE_SIZE - len);
+ total -= len;
+ ++v;
+ }
+ }
break;
case NFSD_IO_DONTCACHE:
kiocb.ki_flags = IOCB_DONTCACHE;
@@ -1119,8 +1273,6 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
kiocb.ki_pos = offset;
- v = 0;
- total = *count;
while (total) {
len = min_t(size_t, total, PAGE_SIZE - base);
bvec_set_page(&rqstp->rq_bvec[v], *(rqstp->rq_next_page++),
@@ -1131,9 +1283,21 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
}
WARN_ON_ONCE(v > rqstp->rq_maxpages);
- trace_nfsd_read_vector(rqstp, fhp, offset, *count);
- iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, *count);
+ trace_nfsd_read_vector(rqstp, fhp, offset, in_count);
+ iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, in_count);
+
+ if ((kiocb.ki_flags & IOCB_DIRECT) &&
+ !nfsd_iov_iter_aligned_bvec(&iter, nf->nf_dio_mem_align-1,
+ nf->nf_dio_read_offset_align-1))
+ kiocb.ki_flags &= ~IOCB_DIRECT;
+
host_err = vfs_iocb_iter_read(file, &kiocb, &iter);
+
+ if (in_count != *count) {
+ /* misaligned DIO expanded read to be DIO-aligned */
+ host_err = nfsd_complete_misaligned_read_dio(rqstp, &read_dio,
+ host_err, *count, &offset, &v);
+ }
return nfsd_finish_read(rqstp, fhp, file, offset, count, eof, host_err);
}
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index e64ab444e0a7f..190c2667500e2 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -163,10 +163,13 @@ extern u32 svc_max_payload(const struct svc_rqst *rqstp);
* pages, one for the request, and one for the reply.
* nfsd_splice_actor() might need an extra page when a READ payload
* is not page-aligned.
+ * nfsd_iter_read() might need two extra pages when a READ payload
+ * is not DIO-aligned -- but nfsd_iter_read() and nfsd_splice_actor()
+ * are mutually exclusive (so reuse page reserved for nfsd_splice_actor).
*/
static inline unsigned long svc_serv_maxpages(const struct svc_serv *serv)
{
- return DIV_ROUND_UP(serv->sv_max_mesg, PAGE_SIZE) + 2 + 1;
+ return DIV_ROUND_UP(serv->sv_max_mesg, PAGE_SIZE) + 2 + 1 + 1;
}
/*
--
2.44.0
* [PATCH v9 6/9] NFSD: issue WRITEs using O_DIRECT even if IO is misaligned
From: Mike Snitzer @ 2025-09-03 20:51 UTC
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
If NFSD_IO_DIRECT is used, split any misaligned WRITE into a start,
middle and end as needed. The large middle extent is DIO-aligned and
the start and/or end are misaligned. Buffered IO is used for the
misaligned extents and O_DIRECT is used for the middle DIO-aligned
extent.
If vfs_iocb_iter_write() returns -ENOTBLK, due to its inability to
invalidate the page cache on behalf of the DIO WRITE, then
nfsd_issue_write_dio() will fall back to using buffered IO.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
fs/nfsd/vfs.c | 192 +++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 183 insertions(+), 9 deletions(-)
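
The start/middle/end split of a misaligned WRITE can be worked through
in standalone C. This is a sketch under assumptions: the demo_* names
are hypothetical, the block size is assumed to be a power of two, and
the already-aligned case falls out of the same arithmetic here,
whereas the patch handles it with an explicit early return.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the three WRITE extents. */
struct demo_write_dio {
	int64_t middle_offset;	/* start of the DIO-aligned middle */
	int64_t end_offset;	/* start of the misaligned tail */
	int64_t start_len;	/* misaligned head, buffered IO */
	int64_t middle_len;	/* DIO-aligned middle, direct IO */
	int64_t end_len;	/* misaligned tail, buffered IO */
};

static bool demo_split_write(int64_t offset, uint64_t len,
			     uint32_t blocksize, struct demo_write_dio *wd)
{
	int64_t mask, orig_end, start_end, middle_end;

	if (!blocksize || len < blocksize)
		return false;

	memset(wd, 0, sizeof(*wd));

	mask = (int64_t)blocksize - 1;
	orig_end = offset + (int64_t)len;
	start_end = (offset + mask) & ~mask;	/* round offset up */
	middle_end = orig_end & ~mask;		/* round end down */

	wd->start_len = start_end - offset;
	wd->middle_offset = start_end;
	wd->middle_len = middle_end - start_end;
	wd->end_offset = middle_end;
	wd->end_len = orig_end - middle_end;
	return true;
}
```

As a worked case with a 4096-byte block size: a WRITE of 40000 bytes
at offset 5000 splits into a 3192-byte head (5000..8192), a 32768-byte
aligned middle (8192..40960) and a 4040-byte tail (40960..45000). The
three lengths always sum to the original length.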
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 96ae86419dc80..c163afe13ab35 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1338,6 +1338,183 @@ static int wait_for_concurrent_writes(struct file *file)
return err;
}
+struct nfsd_write_dio {
+ loff_t middle_offset; /* Offset for start of DIO-aligned middle */
+ loff_t end_offset; /* Offset for start of DIO-aligned end */
+ ssize_t start_len; /* Length for misaligned first extent */
+ ssize_t middle_len; /* Length for DIO-aligned middle extent */
+ ssize_t end_len; /* Length for misaligned last extent */
+};
+
+static bool
+nfsd_analyze_write_dio(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ struct nfsd_file *nf, loff_t offset,
+ unsigned long len, struct nfsd_write_dio *write_dio)
+{
+ const u32 dio_blocksize = nf->nf_dio_offset_align;
+ loff_t orig_end, middle_end, start_end, start_offset = offset;
+ ssize_t start_len = len;
+
+ if (unlikely(!nf->nf_dio_mem_align || !dio_blocksize))
+ return false;
+ if (unlikely(dio_blocksize > PAGE_SIZE))
+ return false;
+ if (unlikely(len < dio_blocksize))
+ return false;
+
+ memset(write_dio, 0, sizeof(*write_dio));
+
+ if (((offset | len) & (dio_blocksize-1)) == 0) {
+ /* already DIO-aligned, no misaligned head or tail */
+ write_dio->middle_offset = offset;
+ write_dio->middle_len = len;
+ /* clear these for the benefit of trace_nfsd_analyze_write_dio */
+ start_offset = 0;
+ start_len = 0;
+ return true;
+ }
+
+ start_end = round_up(offset, dio_blocksize);
+ start_len = start_end - offset;
+ orig_end = offset + len;
+ middle_end = round_down(orig_end, dio_blocksize);
+
+ write_dio->start_len = start_len;
+ write_dio->middle_offset = start_end;
+ write_dio->middle_len = middle_end - start_end;
+ write_dio->end_offset = middle_end;
+ write_dio->end_len = orig_end - middle_end;
+
+ return true;
+}
+
+/*
+ * Setup as many as 3 iov_iter based on extents described by @write_dio.
+ * @iterp: pointer to pointer to onstack array of 3 iov_iter structs from caller.
+ * @iter_is_dio_aligned: pointer to onstack array of 3 bools from caller.
+ * @rq_bvec: backing bio_vec used to setup all 3 iov_iter permutations.
+ * @nvecs: number of segments in @rq_bvec
+ * @cnt: size of the request in bytes
+ * @write_dio: nfsd_write_dio struct that describes start, middle and end extents.
+ *
+ * Returns the number of iov_iter that were setup.
+ */
+static int
+nfsd_setup_write_dio_iters(struct iov_iter **iterp, bool *iter_is_dio_aligned,
+ struct bio_vec *rq_bvec, unsigned int nvecs,
+ unsigned long cnt, struct nfsd_write_dio *write_dio)
+{
+ int n_iters = 0;
+ struct iov_iter *iters = *iterp;
+
+ /* Setup misaligned start? */
+ if (write_dio->start_len) {
+ iter_is_dio_aligned[n_iters] = false;
+ iov_iter_bvec(&iters[n_iters], ITER_SOURCE, rq_bvec, nvecs, cnt);
+ iters[n_iters].count = write_dio->start_len;
+ ++n_iters;
+ }
+
+ /* Setup DIO-aligned middle */
+ iter_is_dio_aligned[n_iters] = true;
+ iov_iter_bvec(&iters[n_iters], ITER_SOURCE, rq_bvec, nvecs, cnt);
+ if (write_dio->start_len)
+ iov_iter_advance(&iters[n_iters], write_dio->start_len);
+ iters[n_iters].count -= write_dio->end_len;
+ ++n_iters;
+
+ /* Setup misaligned end? */
+ if (write_dio->end_len) {
+ iter_is_dio_aligned[n_iters] = false;
+ iov_iter_bvec(&iters[n_iters], ITER_SOURCE, rq_bvec, nvecs, cnt);
+ iov_iter_advance(&iters[n_iters],
+ write_dio->start_len + write_dio->middle_len);
+ ++n_iters;
+ }
+
+ return n_iters;
+}
+
+static int
+nfsd_issue_write_buffered(struct svc_rqst *rqstp, struct file *file,
+ unsigned int nvecs, unsigned long *cnt,
+ struct kiocb *kiocb)
+{
+ struct iov_iter iter;
+ int host_err;
+
+ iov_iter_bvec(&iter, ITER_SOURCE, rqstp->rq_bvec, nvecs, *cnt);
+ host_err = vfs_iocb_iter_write(file, kiocb, &iter);
+ if (host_err < 0)
+ return host_err;
+ *cnt = host_err;
+
+ return 0;
+}
+
+static noinline int
+nfsd_issue_write_dio(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ struct nfsd_file *nf, loff_t offset,
+ unsigned int nvecs, unsigned long *cnt,
+ struct kiocb *kiocb)
+{
+ struct nfsd_write_dio write_dio;
+ struct file *file = nf->nf_file;
+
+ /* Any buffered IO issued here will be misaligned, use
+ * IOCB_SYNC to ensure it has completed before returning.
+ */
+ kiocb->ki_flags |= IOCB_SYNC;
+
+ if (!nfsd_analyze_write_dio(rqstp, fhp, nf, offset, *cnt, &write_dio))
+ return nfsd_issue_write_buffered(rqstp, file, nvecs, cnt, kiocb);
+ else {
+ bool iter_is_dio_aligned[3];
+ struct iov_iter iter_stack[3];
+ struct iov_iter *iter = iter_stack;
+ unsigned int n_iters = 0;
+ unsigned long in_count = *cnt;
+ loff_t in_offset = kiocb->ki_pos;
+ ssize_t host_err;
+
+ n_iters = nfsd_setup_write_dio_iters(&iter, iter_is_dio_aligned,
+ rqstp->rq_bvec, nvecs, *cnt, &write_dio);
+ *cnt = 0;
+ for (int i = 0; i < n_iters; i++) {
+ if (iter_is_dio_aligned[i] &&
+ nfsd_iov_iter_aligned_bvec(&iter[i], nf->nf_dio_mem_align-1,
+ nf->nf_dio_offset_align-1))
+ kiocb->ki_flags |= IOCB_DIRECT;
+ else
+ kiocb->ki_flags &= ~IOCB_DIRECT;
+
+ host_err = vfs_iocb_iter_write(file, kiocb, &iter[i]);
+ if (host_err < 0) {
+ /* VFS will return -ENOTBLK if DIO WRITE fails to
+ * invalidate the page cache. Retry using buffered IO.
+ */
+ if (unlikely(host_err == -ENOTBLK)) {
+ kiocb->ki_flags &= ~IOCB_DIRECT;
+ *cnt = in_count;
+ kiocb->ki_pos = in_offset;
+ return nfsd_issue_write_buffered(rqstp, file,
+ nvecs, cnt, kiocb);
+ }
+ /* Underlying FS will return -EINVAL if DIO is misaligned. */
+ if (unlikely(host_err == -EINVAL))
+ pr_warn_ratelimited("%s: unexpected host_err=%zd\n",
+ __func__, host_err);
+ return host_err;
+ }
+ *cnt += host_err;
+ if (host_err < iter[i].count) /* partial write? */
+ return *cnt;
+ }
+ }
+
+ return 0;
+}
+
/**
* nfsd_vfs_write - write data to an already-open file
* @rqstp: RPC execution context
@@ -1365,7 +1542,6 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
struct super_block *sb = file_inode(file)->i_sb;
struct kiocb kiocb;
struct svc_export *exp;
- struct iov_iter iter;
errseq_t since;
__be32 nfserr;
int host_err;
@@ -1402,30 +1578,28 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp,
kiocb.ki_flags |= IOCB_DSYNC;
nvecs = xdr_buf_to_bvec(rqstp->rq_bvec, rqstp->rq_maxpages, payload);
- iov_iter_bvec(&iter, ITER_SOURCE, rqstp->rq_bvec, nvecs, *cnt);
+
since = READ_ONCE(file->f_wb_err);
if (verf)
nfsd_copy_write_verifier(verf, nn);
switch (nfsd_io_cache_write) {
case NFSD_IO_DIRECT:
- /* direct I/O must be aligned to device logical sector size */
- if (nf->nf_dio_mem_align && nf->nf_dio_offset_align &&
- (((offset | *cnt) & (nf->nf_dio_offset_align-1)) == 0))
- kiocb.ki_flags |= IOCB_DIRECT;
+ host_err = nfsd_issue_write_dio(rqstp, fhp, nf, offset,
+ nvecs, cnt, &kiocb);
break;
case NFSD_IO_DONTCACHE:
kiocb.ki_flags |= IOCB_DONTCACHE;
- break;
+ fallthrough; /* must call nfsd_issue_write_buffered */
case NFSD_IO_BUFFERED:
+ host_err = nfsd_issue_write_buffered(rqstp, file,
+ nvecs, cnt, &kiocb);
break;
}
- host_err = vfs_iocb_iter_write(file, &kiocb, &iter);
if (host_err < 0) {
commit_reset_write_verifier(nn, rqstp, host_err);
goto out_nfserr;
}
- *cnt = host_err;
nfsd_stats_io_write_add(nn, exp, *cnt);
fsnotify_modify(file);
host_err = filemap_check_wb_err(file->f_mapping, since);
--
2.44.0
* [PATCH v9 7/9] NFSD: add nfsd_analyze_read_dio and nfsd_analyze_write_dio trace events
From: Mike Snitzer @ 2025-09-03 20:51 UTC (permalink / raw)
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
Add EVENT_CLASS nfsd_analyze_dio_class and use it to create
nfsd_analyze_read_dio and nfsd_analyze_write_dio trace events.
The nfsd_analyze_read_dio trace event shows how NFSD expands any
misaligned READ to the next DIO-aligned block (on either end of the
original READ, as needed).
This combination of trace events is useful for READs:
echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_read_vector/enable
echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_analyze_read_dio/enable
echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_read_io_done/enable
echo 1 > /sys/kernel/tracing/events/xfs/xfs_file_direct_read/enable
Which for this dd command:
dd if=/mnt/share1/test of=/dev/null bs=47008 count=2 iflag=direct
Results in:
nfsd-23908[010] ..... 10375.141640: nfsd_analyze_read_dio: xid=0x82c5923b fh_hash=0x857ca4fc offset=0 len=47008 start=0+0 middle=0+47008 end=47008+96
nfsd-23908[010] ..... 10375.141642: nfsd_read_vector: xid=0x82c5923b fh_hash=0x857ca4fc offset=0 len=47104
nfsd-23908[010] ..... 10375.141643: xfs_file_direct_read: dev 259:2 ino 0xc00116 disize 0x226e0 pos 0x0 bytecount 0xb800
nfsd-23908[010] ..... 10375.141773: nfsd_read_io_done: xid=0x82c5923b fh_hash=0x857ca4fc offset=0 len=47008
nfsd-23908[010] ..... 10375.142063: nfsd_analyze_read_dio: xid=0x83c5923b fh_hash=0x857ca4fc offset=47008 len=47008 start=46592+416 middle=47008+47008 end=94016+192
nfsd-23908[010] ..... 10375.142064: nfsd_read_vector: xid=0x83c5923b fh_hash=0x857ca4fc offset=46592 len=47616
nfsd-23908[010] ..... 10375.142065: xfs_file_direct_read: dev 259:2 ino 0xc00116 disize 0x226e0 pos 0xb600 bytecount 0xba00
nfsd-23908[010] ..... 10375.142103: nfsd_read_io_done: xid=0x83c5923b fh_hash=0x857ca4fc offset=47008 len=47008
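
The numbers in the READ trace above can be reproduced with a little
arithmetic. The following is a userspace sketch, not the kernel
function: struct read_dio_sketch and expand_read() are hypothetical
names, and a DIO block size of 512 bytes is an assumption inferred from
the trace output. Note that the trace event prints the end of the
original READ (end - end_extra) in its "end=" field, whereas the sketch
stores the expanded end.

```c
#include <stdint.h>

/* Hypothetical mirror of the values traced by nfsd_analyze_read_dio. */
struct read_dio_sketch {
	uint64_t start;       /* DIO-aligned start of the expanded READ */
	uint64_t start_extra; /* bytes read before the requested offset */
	uint64_t end;         /* DIO-aligned end of the expanded READ */
	uint64_t end_extra;   /* bytes read past the requested end */
};

/* Expand [offset, offset + len) outward to block boundaries.
 * @blocksize must be a power of two.
 */
static struct read_dio_sketch expand_read(uint64_t offset, uint64_t len,
					  uint64_t blocksize)
{
	struct read_dio_sketch r;
	uint64_t mask = blocksize - 1;
	uint64_t orig_end = offset + len;

	r.start = offset & ~mask;               /* round down */
	r.start_extra = offset - r.start;
	r.end = (orig_end + mask) & ~mask;      /* round up */
	r.end_extra = r.end - orig_end;
	return r;
}
```

For offset=47008 len=47008 this gives start=46592 with 416 extra bytes
in front and 192 extra bytes at the end, so the expanded READ is 47616
bytes long, matching the nfsd_read_vector and xfs_file_direct_read
lines above.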
The nfsd_analyze_write_dio trace event shows how NFSD splits a given
misaligned WRITE into a mix of misaligned extent(s) and a DIO-aligned
extent.
This combination of trace events is useful for WRITEs:
echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_write_opened/enable
echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_analyze_write_dio/enable
echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_write_io_done/enable
echo 1 > /sys/kernel/tracing/events/xfs/xfs_file_direct_write/enable
Which for this dd command:
dd if=/dev/zero of=/mnt/share1/test bs=47008 count=2 oflag=direct
Results in:
nfsd-23908[010] ..... 10374.902333: nfsd_write_opened: xid=0x7fc5923b fh_hash=0x857ca4fc offset=0 len=47008
nfsd-23908[010] ..... 10374.902335: nfsd_analyze_write_dio: xid=0x7fc5923b fh_hash=0x857ca4fc offset=0 len=47008 start=0+0 middle=0+46592 end=46592+416
nfsd-23908[010] ..... 10374.902343: xfs_file_direct_write: dev 259:2 ino 0xc00116 disize 0x0 pos 0x0 bytecount 0xb600
nfsd-23908[010] ..... 10374.902697: nfsd_write_io_done: xid=0x7fc5923b fh_hash=0x857ca4fc offset=0 len=47008
nfsd-23908[010] ..... 10374.902925: nfsd_write_opened: xid=0x80c5923b fh_hash=0x857ca4fc offset=47008 len=47008
nfsd-23908[010] ..... 10374.902926: nfsd_analyze_write_dio: xid=0x80c5923b fh_hash=0x857ca4fc offset=47008 len=47008 start=47008+96 middle=47104+46592 end=93696+320
nfsd-23908[010] ..... 10374.903010: xfs_file_direct_write: dev 259:2 ino 0xc00116 disize 0xb800 pos 0xb800 bytecount 0xb600
nfsd-23908[010] ..... 10374.903239: nfsd_write_io_done: xid=0x80c5923b fh_hash=0x857ca4fc offset=47008 len=47008
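
The split shown in the WRITE trace above follows from the same kind of
arithmetic. This is a userspace sketch modelled on
nfsd_analyze_write_dio(), not the kernel function itself: struct
write_split_sketch and split_write() are hypothetical names, a DIO
block size of 512 bytes is an assumption inferred from the trace
output, and the caller is assumed to have already checked that len is
at least one block.

```c
#include <stdint.h>

/* Hypothetical mirror of the extents traced by nfsd_analyze_write_dio. */
struct write_split_sketch {
	uint64_t start_len;     /* misaligned head, written with buffered IO */
	uint64_t middle_offset; /* start of the DIO-aligned middle */
	uint64_t middle_len;    /* length of the DIO-aligned middle */
	uint64_t end_offset;    /* start of the misaligned tail */
	uint64_t end_len;       /* misaligned tail, written with buffered IO */
};

/* Split [offset, offset + len) into head, aligned middle and tail.
 * @blocksize must be a power of two and @len must be >= @blocksize.
 */
static struct write_split_sketch split_write(uint64_t offset, uint64_t len,
					     uint64_t blocksize)
{
	struct write_split_sketch w;
	uint64_t mask = blocksize - 1;
	uint64_t orig_end = offset + len;
	uint64_t start_end = (offset + mask) & ~mask; /* round up */
	uint64_t middle_end = orig_end & ~mask;       /* round down */

	w.start_len = start_end - offset;
	w.middle_offset = start_end;
	w.middle_len = middle_end - start_end;
	w.end_offset = middle_end;
	w.end_len = orig_end - middle_end;
	return w;
}
```

For offset=47008 len=47008 this yields a 96 byte head, a 46592 byte
middle at offset 47104 and a 320 byte tail at offset 93696, which are
the values reported by the second nfsd_analyze_write_dio line above.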
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
fs/nfsd/trace.h | 61 +++++++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/vfs.c | 13 +++++++++--
2 files changed, 72 insertions(+), 2 deletions(-)
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index a664fdf1161e9..cd757d2c52c84 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -473,6 +473,67 @@ DEFINE_NFSD_IO_EVENT(write_done);
DEFINE_NFSD_IO_EVENT(commit_start);
DEFINE_NFSD_IO_EVENT(commit_done);
+DECLARE_EVENT_CLASS(nfsd_analyze_dio_class,
+ TP_PROTO(struct svc_rqst *rqstp,
+ struct svc_fh *fhp,
+ u64 offset,
+ u32 len,
+ loff_t start,
+ ssize_t start_len,
+ loff_t middle,
+ ssize_t middle_len,
+ loff_t end,
+ ssize_t end_len),
+ TP_ARGS(rqstp, fhp, offset, len, start, start_len, middle, middle_len, end, end_len),
+ TP_STRUCT__entry(
+ __field(u32, xid)
+ __field(u32, fh_hash)
+ __field(u64, offset)
+ __field(u32, len)
+ __field(loff_t, start)
+ __field(ssize_t, start_len)
+ __field(loff_t, middle)
+ __field(ssize_t, middle_len)
+ __field(loff_t, end)
+ __field(ssize_t, end_len)
+ ),
+ TP_fast_assign(
+ __entry->xid = be32_to_cpu(rqstp->rq_xid);
+ __entry->fh_hash = knfsd_fh_hash(&fhp->fh_handle);
+ __entry->offset = offset;
+ __entry->len = len;
+ __entry->start = start;
+ __entry->start_len = start_len;
+ __entry->middle = middle;
+ __entry->middle_len = middle_len;
+ __entry->end = end;
+ __entry->end_len = end_len;
+ ),
+ TP_printk("xid=0x%08x fh_hash=0x%08x offset=%llu len=%u start=%llu+%zd middle=%llu+%zd end=%llu+%zd",
+ __entry->xid, __entry->fh_hash,
+ __entry->offset, __entry->len,
+ __entry->start, __entry->start_len,
+ __entry->middle, __entry->middle_len,
+ __entry->end, __entry->end_len)
+)
+
+#define DEFINE_NFSD_ANALYZE_DIO_EVENT(name) \
+DEFINE_EVENT(nfsd_analyze_dio_class, nfsd_analyze_##name##_dio, \
+ TP_PROTO(struct svc_rqst *rqstp, \
+ struct svc_fh *fhp, \
+ u64 offset, \
+ u32 len, \
+ loff_t start, \
+ ssize_t start_len, \
+ loff_t middle, \
+ ssize_t middle_len, \
+ loff_t end, \
+ ssize_t end_len), \
+ TP_ARGS(rqstp, fhp, offset, len, start, start_len, middle, middle_len, end, end_len))
+
+DEFINE_NFSD_ANALYZE_DIO_EVENT(read);
+DEFINE_NFSD_ANALYZE_DIO_EVENT(write);
+
DECLARE_EVENT_CLASS(nfsd_err_class,
TP_PROTO(struct svc_rqst *rqstp,
struct svc_fh *fhp,
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index c163afe13ab35..5b3c6072b6f5c 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1128,6 +1128,12 @@ static bool nfsd_analyze_read_dio(struct svc_rqst *rqstp, struct svc_fh *fhp,
return false;
}
+ /* Show original offset and count, and how it was expanded for DIO */
+ middle_end = read_dio->end - read_dio->end_extra;
+ trace_nfsd_analyze_read_dio(rqstp, fhp, offset, len,
+ read_dio->start, read_dio->start_extra,
+ offset, (middle_end - offset),
+ middle_end, read_dio->end_extra);
return true;
}
@@ -1371,7 +1377,7 @@ nfsd_analyze_write_dio(struct svc_rqst *rqstp, struct svc_fh *fhp,
/* clear these for the benefit of trace_nfsd_analyze_write_dio */
start_offset = 0;
start_len = 0;
- return true;
+ goto out;
}
start_end = round_up(offset, dio_blocksize);
@@ -1384,7 +1390,10 @@ nfsd_analyze_write_dio(struct svc_rqst *rqstp, struct svc_fh *fhp,
write_dio->middle_len = middle_end - start_end;
write_dio->end_offset = middle_end;
write_dio->end_len = orig_end - middle_end;
-
+out:
+ trace_nfsd_analyze_write_dio(rqstp, fhp, offset, len, start_offset, start_len,
+ write_dio->middle_offset, write_dio->middle_len,
+ write_dio->end_offset, write_dio->end_len);
return true;
}
--
2.44.0
* [PATCH v9 8/9] NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst
From: Mike Snitzer @ 2025-09-03 20:51 UTC (permalink / raw)
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
This document details the NFSD IO modes that are configurable using
NFSD's experimental debugfs interfaces:
/sys/kernel/debug/nfsd/io_cache_read
/sys/kernel/debug/nfsd/io_cache_write
This document will evolve as NFSD's interfaces do (e.g. if/when NFSD's
debugfs interfaces are replaced with per-export controls).
Future updates will provide more specific guidance and howto
information to help others use and evaluate NFSD's IO modes:
BUFFERED, DONTCACHE and DIRECT.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
.../filesystems/nfs/nfsd-io-modes.rst | 144 ++++++++++++++++++
1 file changed, 144 insertions(+)
create mode 100644 Documentation/filesystems/nfs/nfsd-io-modes.rst
diff --git a/Documentation/filesystems/nfs/nfsd-io-modes.rst b/Documentation/filesystems/nfs/nfsd-io-modes.rst
new file mode 100644
index 0000000000000..4863885c70352
--- /dev/null
+++ b/Documentation/filesystems/nfs/nfsd-io-modes.rst
@@ -0,0 +1,144 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============
+NFSD IO MODES
+=============
+
+Overview
+========
+
+NFSD has historically always used buffered IO when servicing READ and
+WRITE operations. BUFFERED is NFSD's default IO mode, but it is possible
+to override that default to use either DONTCACHE or DIRECT IO modes.
+
+Experimental NFSD debugfs interfaces are available to allow the NFSD IO
+mode used for READ and WRITE to be configured independently. See both:
+- /sys/kernel/debug/nfsd/io_cache_read
+- /sys/kernel/debug/nfsd/io_cache_write
+
+The default value for both io_cache_read and io_cache_write reflects
+NFSD's default IO mode (which is NFSD_IO_BUFFERED=0).
+
+Based on the configured settings, NFSD's IO will either be:
+- cached using page cache (NFSD_IO_BUFFERED=0)
+- cached but removed from the page cache upon completion
+ (NFSD_IO_DONTCACHE=1).
+- not cached (NFSD_IO_DIRECT=2)
+
+To set an NFSD IO mode, write a supported value (0, 1 or 2) to the
+corresponding IO operation's debugfs interface, e.g.:
+ echo 2 > /sys/kernel/debug/nfsd/io_cache_read
+
+To check which IO mode NFSD is using for READ or WRITE, simply read the
+corresponding IO operation's debugfs interface, e.g.:
+ cat /sys/kernel/debug/nfsd/io_cache_read
+
+NFSD DONTCACHE
+==============
+
+DONTCACHE offers a hybrid approach to servicing IO that aims to offer
+the benefits of using DIRECT IO without any of the strict alignment
+requirements that DIRECT IO imposes. To achieve this, buffered IO is used
+but the IO is flagged to "drop behind" (meaning associated pages are
+dropped from the page cache) when IO completes.
+
+DONTCACHE aims to avoid what has proven to be a fairly significant
+limitation of Linux's memory management subsystem if/when large amounts of
+data are infrequently accessed (e.g. read once _or_ written once but not
+read until much later). Such use-cases are particularly problematic
+because the page cache will eventually become a bottleneck to servicing
+new IO requests.
+
+For more context, please see these Linux commit headers:
+- Overview: 9ad6344568cc3 ("mm/filemap: change filemap_create_folio()
+ to take a struct kiocb")
+- for READ: 8026e49bff9b1 ("mm/filemap: add read support for
+ RWF_DONTCACHE")
+- for WRITE: 974c5e6139db3 ("xfs: flag as supporting FOP_DONTCACHE")
+
+If NFSD_IO_DONTCACHE is specified by writing 1 to NFSD's debugfs
+interfaces, FOP_DONTCACHE must be advertised as supported by the
+underlying filesystem (e.g. XFS), otherwise all IO flagged with
+RWF_DONTCACHE will fail with -EOPNOTSUPP.
+
+NFSD DIRECT
+===========
+
+DIRECT IO doesn't make use of the page cache; as such, it is able to
+avoid the Linux memory management's page reclaim scalability problems
+without resorting to the hybrid use of page cache that DONTCACHE does.
+
+Some workloads benefit from NFSD avoiding the page cache, particularly
+those with a working set that is significantly larger than available
+system memory. The pathological worst-case workload that NFSD DIRECT has
+proven to help most is: NFS client issuing large sequential IO to a file
+that is 2-3 times larger than the NFS server's available system memory.
+
+The performance win associated with using NFSD DIRECT was previously
+discussed on linux-nfs, see:
+https://lore.kernel.org/linux-nfs/aEslwqa9iMeZjjlV@kernel.org/
+But in summary:
+- NFSD DIRECT can significantly reduce memory requirements
+- NFSD DIRECT can reduce CPU load by avoiding costly page reclaim work
+- NFSD DIRECT can offer more deterministic IO performance
+
+As always, your mileage may vary and so it is important to carefully
+consider if/when it is beneficial to make use of NFSD DIRECT. When
+assessing comparative performance of your workload please be sure to log
+relevant performance metrics during testing (e.g. memory usage, cpu
+usage, IO performance). Using perf to collect perf data that may be used
+to generate a "flamegraph" for work Linux must perform on behalf of your
+test is a really meaningful way to compare the relative health of the
+system and how switching NFSD's IO mode changes what is observed.
+
+If NFSD_IO_DIRECT is specified by writing 2 to NFSD's debugfs
+interfaces, ideally the IO will be aligned relative to the underlying
+block device's logical_block_size. Also the memory buffer used to store
+the READ or WRITE payload must be aligned relative to the underlying
+block device's dma_alignment.
+
+But NFSD DIRECT does handle misaligned IO in terms of O_DIRECT as best
+it can:
+
+Misaligned READ:
+ If NFSD_IO_DIRECT is used, expand any misaligned READ to the next
+ DIO-aligned block (on either end of the READ). The expanded READ is
+ verified to have proper offset/len (logical_block_size) and
+ dma_alignment checking.
+
+ Any misaligned READ that is less than 32K won't be expanded to be
+ DIO-aligned (this heuristic just avoids excess work, like allocating
+ start_extra_page, for smaller IO that can generally already perform
+ well using buffered IO).
+
+Misaligned WRITE:
+ If NFSD_IO_DIRECT is used, split any misaligned WRITE into a start,
+ middle and end as needed. The large middle extent is DIO-aligned and
+ the start and/or end are misaligned. Buffered IO is used for the
+ misaligned extents and O_DIRECT is used for the middle DIO-aligned
+ extent.
+
+ If vfs_iocb_iter_write() returns -ENOTBLK, due to its inability to
+ invalidate the page cache on behalf of the DIO WRITE, then
+ nfsd_issue_write_dio() will fall back to using buffered IO.
+
+Tracing:
+ The nfsd_analyze_read_dio trace event shows how NFSD expands any
+ misaligned READ to the next DIO-aligned block (on either end of the
+ original READ, as needed).
+
+ This combination of trace events is useful for READs:
+ echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_read_vector/enable
+ echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_analyze_read_dio/enable
+ echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_read_io_done/enable
+ echo 1 > /sys/kernel/tracing/events/xfs/xfs_file_direct_read/enable
+
+ The nfsd_analyze_write_dio trace event shows how NFSD splits a given
+ misaligned WRITE into a mix of misaligned extent(s) and a DIO-aligned
+ extent.
+
+ This combination of trace events is useful for WRITEs:
+ echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_write_opened/enable
+ echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_analyze_write_dio/enable
+ echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_write_io_done/enable
+ echo 1 > /sys/kernel/tracing/events/xfs/xfs_file_direct_write/enable
--
2.44.0
* [PATCH v9 9/9] NFSD: use /end/ of rq_pages for misaligned DIO READ's start_extra page
From: Mike Snitzer @ 2025-09-03 20:51 UTC (permalink / raw)
To: Chuck Lever, Jeff Layton; +Cc: linux-nfs
This commit works around what seems like a flexfiles+rpcrdma bug, and
Chuck Lever clarified that this shouldn't be needed:
"Yes, the extra page needs to come from rq_pages. But I don't see
why it should come from the /end/ of rq_pages."
However, when using NFSD DIRECT for READ and NFS 4.2 client with pNFS
flexfiles (and client gets a layout to use a v3 DS) over RDMA it is
easy to see data mismatch when NFSD handles a misaligned DIO READ. If
the same misaligned DIO READ is issued directly to the v3 DS over RDMA
(so flexfiles is _not_ used) then no data mismatch occurs.
Therefore, until this bug can be found, NFSD must use a 'start_extra'
page from rq_pages that follows the NFS client's requested READ payload
(RDMA memory) if/when expanding the misaligned READ requires reading an
extra partial page at the start of the READ so that it is DIO-aligned.
Otherwise, if the 'start_extra' page is taken from the beginning of
rq_pages, the pNFS flexfiles client will see data mismatch corruption.
The data mismatch was found, and this workaround of using the end of
rq_pages was verified, using the 'dt' utility:
dt of=/mnt/share1/dt_a.test passes=1 bs=47008 count=2 \
iotype=sequential pattern=iot onerr=abort oncerr=abort
see: https://github.com/RobinTMiller/dt.git
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
---
fs/nfsd/vfs.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 5b3c6072b6f5c..e9ddeec3c9a32 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1263,7 +1263,7 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
if (read_dio.start_extra) {
len = read_dio.start_extra;
bvec_set_page(&rqstp->rq_bvec[v],
- *(rqstp->rq_next_page++),
+ NULL, /* set below */
len, PAGE_SIZE - len);
total -= len;
++v;
@@ -1288,6 +1288,11 @@ __be32 nfsd_iter_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
base = 0;
}
WARN_ON_ONCE(v > rqstp->rq_maxpages);
+ /* FIXME: having the start_extra page come from the end of
+ * rq_pages[] works around what seems to be a flexfiles+rpcrdma bug.
+ */
+ if ((kiocb.ki_flags & IOCB_DIRECT) && read_dio.start_extra)
+ rqstp->rq_bvec[0].bv_page = *(rqstp->rq_next_page++);
trace_nfsd_read_vector(rqstp, fhp, offset, in_count);
iov_iter_bvec(&iter, ITER_DEST, rqstp->rq_bvec, v, in_count);
--
2.44.0