* [PATCH v3 0/1] Mount option rdirplus extension @ 2025-03-13 14:42 Benjamin Coddington 2025-03-13 14:43 ` [PATCH v3 1/1] NFS: Extend rdirplus mount option with "force|none" Benjamin Coddington 2025-03-13 15:18 ` [PATCH] nfs(5): Add new rdirplus functionality, clarify Benjamin Coddington 0 siblings, 2 replies; 5+ messages in thread From: Benjamin Coddington @ 2025-03-13 14:42 UTC (permalink / raw) To: Trond Myklebust, Anna Schumaker; +Cc: linux-nfs v2 - Instead of adding "force_rdirplus", extend rdirplus=force v3 - Fix broken bitwise operator ------------- After these two changes: 1a34c8c9a49e NFS: Support larger readdir buffers and 85aa8ddc3818 NFS: Trigger the "ls -l" readdir heuristic sooner .. we've disadvantaged/regressed some users workloads where the user is forced to use NFSv3 (such that we cannot prime the dcache with d_type on READDIR as we do for NFSv4), and where the workload is a search through a large directory tree. Previously, the smaller buffers and/or the larger heuristic size worked to their advantage to optimize the client toward the READDIRPLUS path. We've been able to relieve some pain temporarily by putting a knob into sysfs to allow them to set the operation to one of: - force READDIRPLUS - disable READDIRPLUS - use the default heurstic That's exactly the solution they're looking for and we have most of that in mount options today with the "rdirplus" and "nordirplus". Missing only is the option to force the client to always use READDIR plus. What follows is a patch to do that by extending the existing rdirplus mount option to take values. I'll follow this up with a man page update to nfs-utils. The global heuristic is always going to be wrong for some workloads and we cannnot depend on something like a mount-wide option to always work well for mixed workloads on the same mount. A realistic long-term solution involves allowing the applications to signal their intended behavior. So, I am going to look at plumbing in a posix_fadvise() flag for READDIRPLUS. There's been some interest expressed already for other filesystems[1]. I plan to send along patches for this and face the discussions. However, that work will take some time to filter up into the utilities and into distros. In the meantime, I hope we can add this simpler mount option to help the folks that need the old behaviors in recent kernels. [1]: https://lore.kernel.org/linux-fsdevel/CAOQ4uxhBWV3DfqaE=reuPjh8w92wwujA6Abj=Gt0YvapR4m_1g@mail.gmail.com/ Benjamin Coddington (1): NFS: Extend rdirplus mount option with "force|none" fs/nfs/dir.c | 2 ++ fs/nfs/fs_context.c | 32 ++++++++++++++++++++++++++++---- fs/nfs/super.c | 1 + include/linux/nfs_fs_sb.h | 1 + 4 files changed, 32 insertions(+), 4 deletions(-) base-commit: 2408a807bfc3f738850ef5ad5e3fd59d66168996 -- 2.47.0 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v3 1/1] NFS: Extend rdirplus mount option with "force|none" 2025-03-13 14:42 [PATCH v3 0/1] Mount option rdirplus extension Benjamin Coddington @ 2025-03-13 14:43 ` Benjamin Coddington 2025-03-13 16:34 ` Trond Myklebust 2025-03-13 15:18 ` [PATCH] nfs(5): Add new rdirplus functionality, clarify Benjamin Coddington 1 sibling, 1 reply; 5+ messages in thread From: Benjamin Coddington @ 2025-03-13 14:43 UTC (permalink / raw) To: Trond Myklebust, Anna Schumaker; +Cc: linux-nfs There are certain users that wish to force the NFS client to choose READDIRPLUS over READDIR for a particular mount. Update the "rdirplus" mount option to optionally accept values. For "rdirplus=force", the NFS client will always attempt to use READDDIRPLUS. The setting of "rdirplus=none" is aliased to the existing "nordirplus". Signed-off-by: Benjamin Coddington <bcodding@redhat.com> --- fs/nfs/dir.c | 2 ++ fs/nfs/fs_context.c | 32 ++++++++++++++++++++++++++++---- fs/nfs/super.c | 1 + include/linux/nfs_fs_sb.h | 1 + 4 files changed, 32 insertions(+), 4 deletions(-) diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 2b04038b0e40..c9de0e474cf5 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -666,6 +666,8 @@ static bool nfs_use_readdirplus(struct inode *dir, struct dir_context *ctx, { if (!nfs_server_capable(dir, NFS_CAP_READDIRPLUS)) return false; + if (NFS_SERVER(dir)->flags && NFS_MOUNT_FORCE_RDIRPLUS) + return true; if (ctx->pos == 0 || cache_hits + cache_misses > NFS_READDIR_CACHE_USAGE_THRESHOLD) return true; diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c index b069385eea17..1cabba1231d6 100644 --- a/fs/nfs/fs_context.c +++ b/fs/nfs/fs_context.c @@ -72,6 +72,8 @@ enum nfs_param { Opt_posix, Opt_proto, Opt_rdirplus, + Opt_rdirplus_none, + Opt_rdirplus_force, Opt_rdma, Opt_resvport, Opt_retrans, @@ -174,7 +176,8 @@ static const struct fs_parameter_spec nfs_fs_parameters[] = { fsparam_u32 ("port", Opt_port), fsparam_flag_no("posix", Opt_posix), fsparam_string("proto", Opt_proto), - fsparam_flag_no("rdirplus", Opt_rdirplus), + fsparam_flag_no("rdirplus", Opt_rdirplus), // rdirplus|nordirplus + fsparam_string("rdirplus", Opt_rdirplus), // rdirplus=... fsparam_flag ("rdma", Opt_rdma), fsparam_flag_no("resvport", Opt_resvport), fsparam_u32 ("retrans", Opt_retrans), @@ -288,6 +291,12 @@ static const struct constant_table nfs_xprtsec_policies[] = { {} }; +static const struct constant_table nfs_rdirplus_tokens[] = { + { "none", Opt_rdirplus_none }, + { "force", Opt_rdirplus_force }, + {} +}; + /* * Sanity-check a server address provided by the mount command. * @@ -636,10 +645,25 @@ static int nfs_fs_context_parse_param(struct fs_context *fc, ctx->flags &= ~NFS_MOUNT_NOACL; break; case Opt_rdirplus: - if (result.negated) + if (result.negated) { + ctx->flags &= ~NFS_MOUNT_FORCE_RDIRPLUS; ctx->flags |= NFS_MOUNT_NORDIRPLUS; - else - ctx->flags &= ~NFS_MOUNT_NORDIRPLUS; + } else if (!param->string) { + ctx->flags &= ~(NFS_MOUNT_NORDIRPLUS | NFS_MOUNT_FORCE_RDIRPLUS); + } else { + switch (lookup_constant(nfs_rdirplus_tokens, param->string, -1)) { + case Opt_rdirplus_none: + ctx->flags &= ~NFS_MOUNT_FORCE_RDIRPLUS; + ctx->flags |= NFS_MOUNT_NORDIRPLUS; + break; + case Opt_rdirplus_force: + ctx->flags &= ~NFS_MOUNT_NORDIRPLUS; + ctx->flags |= NFS_MOUNT_FORCE_RDIRPLUS; + break; + default: + goto out_invalid_value; + } + } break; case Opt_sharecache: if (result.negated) diff --git a/fs/nfs/super.c b/fs/nfs/super.c index aeb715b4a690..9a747b224a9d 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -456,6 +456,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss, { NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" }, { NFS_MOUNT_UNSHARED, ",nosharecache", "" }, { NFS_MOUNT_NORESVPORT, ",noresvport", "" }, + { NFS_MOUNT_FORCE_RDIRPLUS, ",rdirplus=force", "" }, { 0, NULL, NULL } }; const struct proc_nfs_info *nfs_infop; diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index f00bfcee7120..3774b2235a1e 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -167,6 +167,7 @@ struct nfs_server { #define NFS_MOUNT_TRUNK_DISCOVERY 0x04000000 #define NFS_MOUNT_SHUTDOWN 0x08000000 #define NFS_MOUNT_NO_ALIGNWRITE 0x10000000 +#define NFS_MOUNT_FORCE_RDIRPLUS 0x20000000 unsigned int fattr_valid; /* Valid attributes */ unsigned int caps; /* server capabilities */ -- 2.47.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v3 1/1] NFS: Extend rdirplus mount option with "force|none" 2025-03-13 14:43 ` [PATCH v3 1/1] NFS: Extend rdirplus mount option with "force|none" Benjamin Coddington @ 2025-03-13 16:34 ` Trond Myklebust 2025-03-13 16:51 ` Benjamin Coddington 0 siblings, 1 reply; 5+ messages in thread From: Trond Myklebust @ 2025-03-13 16:34 UTC (permalink / raw) To: anna@kernel.org, bcodding@redhat.com; +Cc: linux-nfs@vger.kernel.org On Thu, 2025-03-13 at 10:43 -0400, Benjamin Coddington wrote: > There are certain users that wish to force the NFS client to choose > READDIRPLUS over READDIR for a particular mount. Update the > "rdirplus" mount > option to optionally accept values. For "rdirplus=force", the NFS > client > will always attempt to use READDDIRPLUS. The setting of > "rdirplus=none" is > aliased to the existing "nordirplus". > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com> > --- > fs/nfs/dir.c | 2 ++ > fs/nfs/fs_context.c | 32 ++++++++++++++++++++++++++++---- > fs/nfs/super.c | 1 + > include/linux/nfs_fs_sb.h | 1 + > 4 files changed, 32 insertions(+), 4 deletions(-) > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c > index 2b04038b0e40..c9de0e474cf5 100644 > --- a/fs/nfs/dir.c > +++ b/fs/nfs/dir.c > @@ -666,6 +666,8 @@ static bool nfs_use_readdirplus(struct inode > *dir, struct dir_context *ctx, > { > if (!nfs_server_capable(dir, NFS_CAP_READDIRPLUS)) > return false; > + if (NFS_SERVER(dir)->flags && NFS_MOUNT_FORCE_RDIRPLUS) Bitwise and? > + return true; > if (ctx->pos == 0 || > cache_hits + cache_misses > > NFS_READDIR_CACHE_USAGE_THRESHOLD) > return true; > diff --git a/fs/nfs/fs_context.c b/fs/nfs/fs_context.c > index b069385eea17..1cabba1231d6 100644 > --- a/fs/nfs/fs_context.c > +++ b/fs/nfs/fs_context.c > @@ -72,6 +72,8 @@ enum nfs_param { > Opt_posix, > Opt_proto, > Opt_rdirplus, > + Opt_rdirplus_none, > + Opt_rdirplus_force, > Opt_rdma, > Opt_resvport, > Opt_retrans, > @@ -174,7 +176,8 @@ static const struct fs_parameter_spec > nfs_fs_parameters[] = { > fsparam_u32 ("port", Opt_port), > fsparam_flag_no("posix", Opt_posix), > fsparam_string("proto", Opt_proto), > - fsparam_flag_no("rdirplus", Opt_rdirplus), > + fsparam_flag_no("rdirplus", Opt_rdirplus), // > rdirplus|nordirplus > + fsparam_string("rdirplus", Opt_rdirplus), // rdirplus=... > fsparam_flag ("rdma", Opt_rdma), > fsparam_flag_no("resvport", Opt_resvport), > fsparam_u32 ("retrans", Opt_retrans), > @@ -288,6 +291,12 @@ static const struct constant_table > nfs_xprtsec_policies[] = { > {} > }; > > +static const struct constant_table nfs_rdirplus_tokens[] = { > + { "none", Opt_rdirplus_none }, > + { "force", Opt_rdirplus_force }, > + {} > +}; > + > /* > * Sanity-check a server address provided by the mount command. > * > @@ -636,10 +645,25 @@ static int nfs_fs_context_parse_param(struct > fs_context *fc, > ctx->flags &= ~NFS_MOUNT_NOACL; > break; > case Opt_rdirplus: > - if (result.negated) > + if (result.negated) { > + ctx->flags &= ~NFS_MOUNT_FORCE_RDIRPLUS; > ctx->flags |= NFS_MOUNT_NORDIRPLUS; > - else > - ctx->flags &= ~NFS_MOUNT_NORDIRPLUS; > + } else if (!param->string) { > + ctx->flags &= ~(NFS_MOUNT_NORDIRPLUS | > NFS_MOUNT_FORCE_RDIRPLUS); > + } else { > + switch (lookup_constant(nfs_rdirplus_tokens, > param->string, -1)) { > + case Opt_rdirplus_none: > + ctx->flags &= > ~NFS_MOUNT_FORCE_RDIRPLUS; > + ctx->flags |= NFS_MOUNT_NORDIRPLUS; > + break; > + case Opt_rdirplus_force: > + ctx->flags &= ~NFS_MOUNT_NORDIRPLUS; > + ctx->flags |= > NFS_MOUNT_FORCE_RDIRPLUS; > + break; > + default: > + goto out_invalid_value; > + } > + } > break; > case Opt_sharecache: > if (result.negated) > diff --git a/fs/nfs/super.c b/fs/nfs/super.c > index aeb715b4a690..9a747b224a9d 100644 > --- a/fs/nfs/super.c > +++ b/fs/nfs/super.c > @@ -456,6 +456,7 @@ static void nfs_show_mount_options(struct > seq_file *m, struct nfs_server *nfss, > { NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" }, > { NFS_MOUNT_UNSHARED, ",nosharecache", "" }, > { NFS_MOUNT_NORESVPORT, ",noresvport", "" }, > + { NFS_MOUNT_FORCE_RDIRPLUS, ",rdirplus=force", "" }, Put the above together with the other rdirplus options? > { 0, NULL, NULL } > }; > const struct proc_nfs_info *nfs_infop; > diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h > index f00bfcee7120..3774b2235a1e 100644 > --- a/include/linux/nfs_fs_sb.h > +++ b/include/linux/nfs_fs_sb.h > @@ -167,6 +167,7 @@ struct nfs_server { > #define NFS_MOUNT_TRUNK_DISCOVERY 0x04000000 > #define NFS_MOUNT_SHUTDOWN 0x08000000 > #define NFS_MOUNT_NO_ALIGNWRITE 0x10000000 > +#define NFS_MOUNT_FORCE_RDIRPLUS 0x20000000 > > unsigned int fattr_valid; /* Valid attributes > */ > unsigned int caps; /* server > capabilities */ -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v3 1/1] NFS: Extend rdirplus mount option with "force|none" 2025-03-13 16:34 ` Trond Myklebust @ 2025-03-13 16:51 ` Benjamin Coddington 0 siblings, 0 replies; 5+ messages in thread From: Benjamin Coddington @ 2025-03-13 16:51 UTC (permalink / raw) To: Trond Myklebust; +Cc: anna, linux-nfs On 13 Mar 2025, at 12:34, Trond Myklebust wrote: > On Thu, 2025-03-13 at 10:43 -0400, Benjamin Coddington wrote: >> There are certain users that wish to force the NFS client to choose >> READDIRPLUS over READDIR for a particular mount. Update the >> "rdirplus" mount >> option to optionally accept values. For "rdirplus=force", the NFS >> client >> will always attempt to use READDDIRPLUS. The setting of >> "rdirplus=none" is >> aliased to the existing "nordirplus". >> >> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> >> --- >> fs/nfs/dir.c | 2 ++ >> fs/nfs/fs_context.c | 32 ++++++++++++++++++++++++++++---- >> fs/nfs/super.c | 1 + >> include/linux/nfs_fs_sb.h | 1 + >> 4 files changed, 32 insertions(+), 4 deletions(-) >> >> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c >> index 2b04038b0e40..c9de0e474cf5 100644 >> --- a/fs/nfs/dir.c >> +++ b/fs/nfs/dir.c >> @@ -666,6 +666,8 @@ static bool nfs_use_readdirplus(struct inode >> *dir, struct dir_context *ctx, >> { >> if (!nfs_server_capable(dir, NFS_CAP_READDIRPLUS)) >> return false; >> + if (NFS_SERVER(dir)->flags && NFS_MOUNT_FORCE_RDIRPLUS) > > Bitwise and? Oh crap - my tests are junk. Ben ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] nfs(5): Add new rdirplus functionality, clarify 2025-03-13 14:42 [PATCH v3 0/1] Mount option rdirplus extension Benjamin Coddington 2025-03-13 14:43 ` [PATCH v3 1/1] NFS: Extend rdirplus mount option with "force|none" Benjamin Coddington @ 2025-03-13 15:18 ` Benjamin Coddington 1 sibling, 0 replies; 5+ messages in thread From: Benjamin Coddington @ 2025-03-13 15:18 UTC (permalink / raw) To: steved; +Cc: linux-nfs The proposed kernel [patch][1] will modify the rdirplus mount option to accept optional string values of "none" and "force". Update the man page to reflect these changes and clarify the current client's behavior for the default. [1]: https://lore.kernel.org/linux-nfs/8c33cd92be52255b0dd0a7489c9e5cc35434ec95.1741876784.git.bcodding@redhat.com/T/#u Signed-off-by: Benjamin Coddington <bcodding@redhat.com> --- utils/mount/nfs.man | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man index eab4692a87de..91fc508de280 100644 --- a/utils/mount/nfs.man +++ b/utils/mount/nfs.man @@ -434,11 +434,16 @@ option may also be used by some pNFS drivers to decide how many connections to set up to the data servers. .TP 1.5i .BR rdirplus " / " nordirplus -Selects whether to use NFS v3 or v4 READDIRPLUS requests. -If this option is not specified, the NFS client uses READDIRPLUS requests -on NFS v3 or v4 mounts to read small directories. -Some applications perform better if the client uses only READDIR requests -for all directories. +Selects whether to use NFS v3 or v4 READDIRPLUS requests. If this option is +not specified, the NFS client uses a heuristic to optimize performance by +choosing READDIR vs READDIRPLUS based on how often the calling process uses +the additional attributes returned from READDIRPLUS. Some applications +perform better if the client uses only READDIR requests for all directories. +.TP 1.5i +.BR rdirplus={none|force} +If set to "force", the NFS client always attempts to use READDIRPLUS +requests. If set to "none", the behavior is the same as +.B nordirplus. .TP 1.5i .BI retry= n The number of minutes that the -- 2.47.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-03-13 16:51 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-03-13 14:42 [PATCH v3 0/1] Mount option rdirplus extension Benjamin Coddington 2025-03-13 14:43 ` [PATCH v3 1/1] NFS: Extend rdirplus mount option with "force|none" Benjamin Coddington 2025-03-13 16:34 ` Trond Myklebust 2025-03-13 16:51 ` Benjamin Coddington 2025-03-13 15:18 ` [PATCH] nfs(5): Add new rdirplus functionality, clarify Benjamin Coddington
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox