Linux filesystem development
* [PATCH 0/7] Fixes for vfs/vfs-7.2.casefold
@ 2026-05-15 15:35 Chuck Lever
  2026-05-15 15:35 ` [PATCH 1/7] tools headers UAPI: Sync case-sensitivity flags from linux/fs.h Chuck Lever
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Address review comments for the series in the vfs/vfs-7.2.casefold
branch.

Chuck Lever (7):
  tools headers UAPI: Sync case-sensitivity flags from linux/fs.h
  nfs: Avoid transient zeroed case capability bits during probe
  nfs: Skip pathconf probe when neither field is consumed
  fs: Clarify FS_CASEFOLD_FL semantics in UAPI header
  nfsd: Use kernel credentials for case-info probe
  nfsd: Map -ESTALE from case probe to NFS3ERR_STALE
  nfsd: Cap case-folding probe cost across READDIR entries

 fs/nfs/client.c                               | 29 ++++++----
 fs/nfsd/nfs3proc.c                            |  3 +
 fs/nfsd/nfs4xdr.c                             | 55 ++++++++++++++++---
 fs/nfsd/vfs.c                                 |  4 +-
 fs/nfsd/xdr4.h                                | 14 +++++
 include/uapi/linux/fs.h                       | 11 +++-
 .../perf/trace/beauty/include/uapi/linux/fs.h |  7 +++
 7 files changed, 98 insertions(+), 25 deletions(-)

-- 
2.54.0



* [PATCH 1/7] tools headers UAPI: Sync case-sensitivity flags from linux/fs.h
  2026-05-15 15:35 [PATCH 0/7] Fixes for vfs/vfs-7.2.casefold Chuck Lever
@ 2026-05-15 15:35 ` Chuck Lever
  2026-05-15 15:35 ` [PATCH 2/7] nfs: Avoid transient zeroed case capability bits during probe Chuck Lever
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever, sashiko-bot

From: Chuck Lever <chuck.lever@oracle.com>

The case-sensitivity series adds FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING to include/uapi/linux/fs.h, and
tools/perf/check-headers.sh would warn about the resulting drift
in the perf beauty copy.  Pick up only those two flags (and the
surrounding comment block) so the series does not introduce new
drift of its own.

This is not a full sync.  The perf copy is also missing the
FS_IOC_SHUTDOWN block added by commit 1f662195dbc0 ("fs: add
generic FS_IOC_SHUTDOWN definitions").  Because
tools/perf/check-headers.sh emits a single warning per file, that
warning will remain active until the older drift is picked up
too; closing it is left to a separate sync outside this series.
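
For readers who want to see how the two flags combine, the sketch
below decodes an fsx_xflags word into the semantics described in the
comment block. It is illustrative only: the helper and struct names
are hypothetical, and the flag values are copied from the hunk in
this patch and guarded in case the installed header already has them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the hunk in this patch; guarded in case the
 * installed <linux/fs.h> already provides them. */
#ifndef FS_XFLAG_CASEFOLD
#define FS_XFLAG_CASEFOLD		0x00040000
#endif
#ifndef FS_XFLAG_CASENONPRESERVING
#define FS_XFLAG_CASENONPRESERVING	0x00080000
#endif

/* Hypothetical helper type, not part of any kernel or libc API. */
struct demo_case_semantics {
	bool case_insensitive;
	bool case_preserving;
};

/* Neither bit set means POSIX semantics: sensitive and preserving. */
struct demo_case_semantics demo_decode_case_xflags(uint32_t fsx_xflags)
{
	struct demo_case_semantics s;

	s.case_insensitive = (fsx_xflags & FS_XFLAG_CASEFOLD) != 0;
	s.case_preserving = (fsx_xflags & FS_XFLAG_CASENONPRESERVING) == 0;
	return s;
}
```

A caller would pass the fsx_xflags field obtained from
FS_IOC_FSGETXATTR; both flags are read-only, so there is no
corresponding setter.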

Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=2
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 tools/perf/trace/beauty/include/uapi/linux/fs.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/trace/beauty/include/uapi/linux/fs.h b/tools/perf/trace/beauty/include/uapi/linux/fs.h
index 70b2b661f42c..2fa003575e8b 100644
--- a/tools/perf/trace/beauty/include/uapi/linux/fs.h
+++ b/tools/perf/trace/beauty/include/uapi/linux/fs.h
@@ -254,6 +254,13 @@ struct file_attr {
 #define FS_XFLAG_DAX		0x00008000	/* use DAX for IO */
 #define FS_XFLAG_COWEXTSIZE	0x00010000	/* CoW extent size allocator hint */
 #define FS_XFLAG_VERITY		0x00020000	/* fs-verity enabled */
+/*
+ * Case handling flags (read-only, cannot be set via ioctl).
+ * Default (neither set) indicates POSIX semantics: case-sensitive
+ * lookups and case-preserving storage.
+ */
+#define FS_XFLAG_CASEFOLD	0x00040000	/* case-insensitive lookups */
+#define FS_XFLAG_CASENONPRESERVING 0x00080000	/* case not preserved */
 #define FS_XFLAG_HASATTR	0x80000000	/* no DIFLAG for this	*/
 
 /* the read-only stuff doesn't really belong here, but any other place is
-- 
2.54.0



* [PATCH 2/7] nfs: Avoid transient zeroed case capability bits during probe
  2026-05-15 15:35 [PATCH 0/7] Fixes for vfs/vfs-7.2.casefold Chuck Lever
  2026-05-15 15:35 ` [PATCH 1/7] tools headers UAPI: Sync case-sensitivity flags from linux/fs.h Chuck Lever
@ 2026-05-15 15:35 ` Chuck Lever
  2026-05-15 15:35 ` [PATCH 3/7] nfs: Skip pathconf probe when neither field is consumed Chuck Lever
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever, sashiko-bot

From: Chuck Lever <chuck.lever@oracle.com>

nfs_probe_fsinfo() clears NFS_CAP_CASE_INSENSITIVE and
NFS_CAP_CASE_NONPRESERVING ahead of the synchronous pathconf RPC and
sets them again only after the reply arrives. The code path is gated
by clp->rpc_ops->version < 4 and is therefore reached on NFSv2/v3
remount via nfs_reconfigure(), which calls nfs_probe_server() against
a live mount. Concurrent readers walking server->caps can observe the
cleared state for the duration of the round-trip and report the wrong
case-sensitivity attributes.

Compute the post-probe capability mask on the stack and assign it to
server->caps in a single store so readers see either the stale value
or the freshly computed one, never an intermediate zero. Preserve the
original behaviour of dropping the bits when the pathconf RPC itself
fails.
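
The pattern can be sketched in userspace as follows. This is a
simplified illustration rather than the kernel code: the struct,
the DEMO_CAP_* bits, and the helper name are hypothetical, and the
failure case is folded into the same single store instead of the
separate branch the patch keeps.

```c
#include <stdbool.h>

/* Hypothetical capability bits standing in for NFS_CAP_*. */
#define DEMO_CAP_CASE_INSENSITIVE	(1U << 0)
#define DEMO_CAP_CASE_NONPRESERVING	(1U << 1)
#define DEMO_CAP_OTHER			(1U << 2)

struct demo_server {
	unsigned int caps;
};

/*
 * Build the new mask in a local variable and publish it with one
 * store, so server->caps never holds the cleared intermediate value.
 */
void demo_update_case_caps(struct demo_server *server, bool probe_ok,
			   bool case_insensitive, bool case_preserving)
{
	unsigned int caps = server->caps;

	caps &= ~(DEMO_CAP_CASE_INSENSITIVE | DEMO_CAP_CASE_NONPRESERVING);
	if (probe_ok) {
		if (case_insensitive)
			caps |= DEMO_CAP_CASE_INSENSITIVE;
		if (!case_preserving)
			caps |= DEMO_CAP_CASE_NONPRESERVING;
	}
	server->caps = caps;
}
```

Unrelated capability bits pass through untouched, and a failed probe
leaves both case bits clear, matching the behaviour described above.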

The analogous transient zero on the NFSv4 path lives in
nfs4_server_capabilities() and is left for a separate fix.

Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=10
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/client.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 3db2f18315b8..28b66bb0dd33 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -937,20 +937,23 @@ static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, str
 	pathinfo.fattr = fattr;
 	nfs_fattr_init(fattr);
 
-	/* Clear before probing so a failed RPC does not retain stale bits. */
-	if (clp->rpc_ops->version < 4)
-		server->caps &= ~(NFS_CAP_CASE_INSENSITIVE |
-				  NFS_CAP_CASE_NONPRESERVING);
-
 	if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0) {
 		if (server->namelen == 0)
 			server->namelen = pathinfo.max_namelen;
 		if (clp->rpc_ops->version < 4) {
+			unsigned int caps = server->caps;
+
+			caps &= ~(NFS_CAP_CASE_INSENSITIVE |
+				  NFS_CAP_CASE_NONPRESERVING);
 			if (pathinfo.case_insensitive)
-				server->caps |= NFS_CAP_CASE_INSENSITIVE;
+				caps |= NFS_CAP_CASE_INSENSITIVE;
 			if (!pathinfo.case_preserving)
-				server->caps |= NFS_CAP_CASE_NONPRESERVING;
+				caps |= NFS_CAP_CASE_NONPRESERVING;
+			server->caps = caps;
 		}
+	} else if (clp->rpc_ops->version < 4) {
+		server->caps &= ~(NFS_CAP_CASE_INSENSITIVE |
+				  NFS_CAP_CASE_NONPRESERVING);
 	}
 
 	if (clp->rpc_ops->discover_trunking != NULL &&
-- 
2.54.0



* [PATCH 3/7] nfs: Skip pathconf probe when neither field is consumed
  2026-05-15 15:35 [PATCH 0/7] Fixes for vfs/vfs-7.2.casefold Chuck Lever
  2026-05-15 15:35 ` [PATCH 1/7] tools headers UAPI: Sync case-sensitivity flags from linux/fs.h Chuck Lever
  2026-05-15 15:35 ` [PATCH 2/7] nfs: Avoid transient zeroed case capability bits during probe Chuck Lever
@ 2026-05-15 15:35 ` Chuck Lever
  2026-05-15 15:35 ` [PATCH 4/7] fs: Clarify FS_CASEFOLD_FL semantics in UAPI header Chuck Lever
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever, sashiko-bot

From: Chuck Lever <chuck.lever@oracle.com>

The PATHCONF RPC issued from nfs_probe_fsinfo() supplies two pieces of
information: max_namelen, used only when server->namelen has not been
pinned by mount options, and the case_insensitive / case_preserving
fields, used only by the NFSv2/NFSv3 path. NFSv4 receives its case
sensitivity caps from the FATTR4_CASE_* attributes during the
set_capabilities probe, and a non-zero server->namelen short-circuits
the only other field of interest.

When both conditions hold (NFSv4 with namelen pinned), the pathconf
reply is discarded in full but the round-trip is still on the mount
critical path. Gate the call on version < 4 || namelen == 0 so that
mounts which cannot benefit from the reply do not pay for it.
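
The gate reduces to a two-input predicate, sketched here with a
hypothetical helper name and plain integers in place of the
nfs_server and rpc_ops fields:

```c
#include <stdbool.h>

/*
 * Returns true when a PATHCONF reply could still be consumed:
 * NFSv2/v3 needs the case fields, and any version needs max_namelen
 * while namelen is still unset (zero).
 */
bool demo_pathconf_needed(unsigned int nfs_version, unsigned int namelen)
{
	return nfs_version < 4 || namelen == 0;
}
```

Only the NFSv4-with-pinned-namelen combination evaluates to false,
which is the one case where the reply would be discarded in full.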

Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=10
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/client.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 28b66bb0dd33..73b95318ba48 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -937,23 +937,25 @@ static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, str
 	pathinfo.fattr = fattr;
 	nfs_fattr_init(fattr);
 
-	if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0) {
-		if (server->namelen == 0)
-			server->namelen = pathinfo.max_namelen;
-		if (clp->rpc_ops->version < 4) {
-			unsigned int caps = server->caps;
+	if (clp->rpc_ops->version < 4 || server->namelen == 0) {
+		if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0) {
+			if (server->namelen == 0)
+				server->namelen = pathinfo.max_namelen;
+			if (clp->rpc_ops->version < 4) {
+				unsigned int caps = server->caps;
 
-			caps &= ~(NFS_CAP_CASE_INSENSITIVE |
-				  NFS_CAP_CASE_NONPRESERVING);
-			if (pathinfo.case_insensitive)
-				caps |= NFS_CAP_CASE_INSENSITIVE;
-			if (!pathinfo.case_preserving)
-				caps |= NFS_CAP_CASE_NONPRESERVING;
-			server->caps = caps;
+				caps &= ~(NFS_CAP_CASE_INSENSITIVE |
+					  NFS_CAP_CASE_NONPRESERVING);
+				if (pathinfo.case_insensitive)
+					caps |= NFS_CAP_CASE_INSENSITIVE;
+				if (!pathinfo.case_preserving)
+					caps |= NFS_CAP_CASE_NONPRESERVING;
+				server->caps = caps;
+			}
+		} else if (clp->rpc_ops->version < 4) {
+			server->caps &= ~(NFS_CAP_CASE_INSENSITIVE |
+					  NFS_CAP_CASE_NONPRESERVING);
 		}
-	} else if (clp->rpc_ops->version < 4) {
-		server->caps &= ~(NFS_CAP_CASE_INSENSITIVE |
-				  NFS_CAP_CASE_NONPRESERVING);
 	}
 
 	if (clp->rpc_ops->discover_trunking != NULL &&
-- 
2.54.0



* [PATCH 4/7] fs: Clarify FS_CASEFOLD_FL semantics in UAPI header
  2026-05-15 15:35 [PATCH 0/7] Fixes for vfs/vfs-7.2.casefold Chuck Lever
                   ` (2 preceding siblings ...)
  2026-05-15 15:35 ` [PATCH 3/7] nfs: Skip pathconf probe when neither field is consumed Chuck Lever
@ 2026-05-15 15:35 ` Chuck Lever
  2026-05-15 15:35 ` [PATCH 5/7] nfsd: Use kernel credentials for case-info probe Chuck Lever
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever, sashiko-bot

From: Chuck Lever <chuck.lever@oracle.com>

The existing one-liner "Folder is case insensitive" leaves the
impression that FS_CASEFOLD_FL is reserved for directories.
That impression is wrong: filesystems that derive
case-insensitivity from mount or volume state report the bit on
non-directory inodes via i_op->fileattr_get, so userspace
inspecting FS_IOC_GETFLAGS can see it on any inode type.

Replace the one-liner with a block comment that names directories
as the typical case, records that non-directory inodes may also
report the bit, and notes FS_XFLAG_CASEFOLD as the read-only
companion exposed through FS_IOC_FSGETXATTR.
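
From userspace, a check that follows this guidance might look like
the sketch below. The helper name is hypothetical; it queries
FS_IOC_GETFLAGS on whatever inode the path names and deliberately
does not test for a directory first.

```c
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if FS_CASEFOLD_FL is set on @path,
 * 0 if it is clear, and -1 if the inode cannot be opened or the
 * filesystem does not implement FS_IOC_GETFLAGS.
 */
int demo_inode_is_casefold(const char *path)
{
	int flags = 0;
	int fd, ret;

	/* O_NONBLOCK keeps the open from stalling on a FIFO. */
	fd = open(path, O_RDONLY | O_NONBLOCK);
	if (fd < 0)
		return -1;
	ret = ioctl(fd, FS_IOC_GETFLAGS, &flags);
	close(fd);
	if (ret < 0)
		return -1;
	return (flags & FS_CASEFOLD_FL) ? 1 : 0;
}
```

The same call works for regular files and directories alike; which
inode types actually report the bit depends on the filesystem.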

Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=3
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/uapi/linux/fs.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 2ea4c81df08f..bd87262f2e34 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -395,7 +395,16 @@ struct file_attr {
 #define FS_DAX_FL			0x02000000 /* Inode is DAX */
 #define FS_INLINE_DATA_FL		0x10000000 /* Reserved for ext4 */
 #define FS_PROJINHERIT_FL		0x20000000 /* Create with parents projid */
-#define FS_CASEFOLD_FL			0x40000000 /* Folder is case insensitive */
+/*
+ * FS_CASEFOLD_FL indicates case-insensitive name lookup. The
+ * bit is most often reported on directories, where it controls
+ * lookups of entries within. Filesystems that derive
+ * case-insensitivity from mount or volume state may also report
+ * it on non-directory inodes; userspace must not assume the bit
+ * is directory-only. FS_XFLAG_CASEFOLD reports the same
+ * information read-only via FS_IOC_FSGETXATTR.
+ */
+#define FS_CASEFOLD_FL			0x40000000
 #define FS_RESERVED_FL			0x80000000 /* reserved for ext2 lib */
 
 #define FS_FL_USER_VISIBLE		0x0003DFFF /* User visible flags */
-- 
2.54.0



* [PATCH 5/7] nfsd: Use kernel credentials for case-info probe
  2026-05-15 15:35 [PATCH 0/7] Fixes for vfs/vfs-7.2.casefold Chuck Lever
                   ` (3 preceding siblings ...)
  2026-05-15 15:35 ` [PATCH 4/7] fs: Clarify FS_CASEFOLD_FL semantics in UAPI header Chuck Lever
@ 2026-05-15 15:35 ` Chuck Lever
  2026-05-15 15:35 ` [PATCH 6/7] nfsd: Map -ESTALE from case probe to NFS3ERR_STALE Chuck Lever
  2026-05-15 15:35 ` [PATCH 7/7] nfsd: Cap case-folding probe cost across READDIR entries Chuck Lever
  6 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever, sashiko-bot

From: Chuck Lever <chuck.lever@oracle.com>

nfsd_get_case_info() takes prepare_creds() and overrides fsuid/fsgid
to GLOBAL_ROOT, intending to escape per-client policy on the parent
directory. prepare_creds() copies the calling task's full credential,
including the LSM security label, so only the DAC identity is
neutralized. With labeled NFS, where the active LSM context has been
mapped to the client, security_inode_file_getattr() can still deny the
probe with -EACCES even though the case-folding property the caller
wants is structural and identical for every client. The docblock
already states the intent ("the probe runs with kernel credentials"),
which the implementation does not deliver.

prepare_kernel_cred(&init_task) constructs a credential from
init_task's identity and security label, the kernel's own unconfined
context. Use it instead and drop the redundant fsuid/fsgid overrides
that init_task already supplies. The probe now matches the docblock,
LSM denials on the parent disappear, and the call sites that map an
unexpected error to NFS3ERR_SERVERFAULT or fail an NFSv4 GETATTR
outright stop seeing -EACCES from this path.

Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=14
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/vfs.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 85ff418127c7..ba97e287c007 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -2943,13 +2943,11 @@ nfsd_get_case_info(struct dentry *dentry, bool *case_insensitive,
 		put = true;
 	}
 
-	probe = prepare_creds();
+	probe = prepare_kernel_cred(&init_task);
 	if (!probe) {
 		err = -ENOMEM;
 		goto out;
 	}
-	probe->fsuid = GLOBAL_ROOT_UID;
-	probe->fsgid = GLOBAL_ROOT_GID;
 	saved = override_creds(probe);
 
 	err = vfs_fileattr_get(cd, &fa);
-- 
2.54.0



* [PATCH 6/7] nfsd: Map -ESTALE from case probe to NFS3ERR_STALE
  2026-05-15 15:35 [PATCH 0/7] Fixes for vfs/vfs-7.2.casefold Chuck Lever
                   ` (4 preceding siblings ...)
  2026-05-15 15:35 ` [PATCH 5/7] nfsd: Use kernel credentials for case-info probe Chuck Lever
@ 2026-05-15 15:35 ` Chuck Lever
  2026-05-15 15:35 ` [PATCH 7/7] nfsd: Cap case-folding probe cost across READDIR entries Chuck Lever
  6 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever, sashiko-bot

From: Chuck Lever <chuck.lever@oracle.com>

The PATHCONF switch in nfsd3_proc_pathconf() recognizes -EOPNOTSUPP
(filesystem does not expose case state) and maps -EACCES / -EPERM to
nfserr_stale, but lets every other errno fall through to
nfserr_serverfault. -ESTALE escapes the same way even though RFC 1813
lists NFS3ERR_STALE as a permitted PATHCONF status, so a probe of an
NFS-backed re-export whose parent dentry has been invalidated returns
SERVERFAULT and tells the client the server is broken when the handle
itself simply went stale.

Add an explicit -ESTALE arm that maps to nfserr_stale.
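
The resulting mapping can be sketched as a standalone function. The
enum and helper names are hypothetical, the status values are the
nfsstat3 numbers from RFC 1813, and treating -EOPNOTSUPP as success
with default values is an assumption about the surrounding code that
this hunk does not show.

```c
#include <errno.h>

/* nfsstat3 values from RFC 1813; only the ones used here. */
enum demo_nfsstat3 {
	DEMO_NFS3_OK = 0,
	DEMO_NFS3ERR_STALE = 70,
	DEMO_NFS3ERR_SERVERFAULT = 10006,
};

/* Map a negative errno from the case probe to a PATHCONF status. */
enum demo_nfsstat3 demo_map_case_probe_error(int err)
{
	switch (err) {
	case 0:
	case -EOPNOTSUPP:	/* assumption: reply carries defaults */
		return DEMO_NFS3_OK;
	case -EACCES:
	case -EPERM:
	case -ESTALE:		/* the arm added by this patch */
		return DEMO_NFS3ERR_STALE;
	default:
		return DEMO_NFS3ERR_SERVERFAULT;
	}
}
```

With the extra arm, only errors that are not recognized at all still
reach the SERVERFAULT default.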

Fixes: a8de9c3b40e4 ("nfsd: Report export case-folding via NFSv3 PATHCONF")
Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=13
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs3proc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 12b9172c6be1..aeda7a802bdf 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -745,6 +745,9 @@ nfsd3_proc_pathconf(struct svc_rqst *rqstp)
 			 */
 			resp->status = nfserr_stale;
 			break;
+		case -ESTALE:
+			resp->status = nfserr_stale;
+			break;
 		default:
 			resp->status = nfserr_serverfault;
 			break;
-- 
2.54.0



* [PATCH 7/7] nfsd: Cap case-folding probe cost across READDIR entries
  2026-05-15 15:35 [PATCH 0/7] Fixes for vfs/vfs-7.2.casefold Chuck Lever
                   ` (5 preceding siblings ...)
  2026-05-15 15:35 ` [PATCH 6/7] nfsd: Map -ESTALE from case probe to NFS3ERR_STALE Chuck Lever
@ 2026-05-15 15:35 ` Chuck Lever
  6 siblings, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2026-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, Chuck Lever, sashiko-bot

From: Chuck Lever <chuck.lever@oracle.com>

NFSv4 READDIR carries a per-entry attrmask. When the attrmask
includes FATTR4_CASE_INSENSITIVE or FATTR4_CASE_PRESERVING,
nfsd4_encode_fattr4() resolves each non-directory child's case
attributes by calling nfsd_get_case_info(), which dget_parent()s
back to the directory being read and re-runs the cred swap and LSM
probe per child. The encoder amplifies a single answer into one
prepare_kernel_cred() allocation, two LSM hooks, and one put_cred()
RCU callback for every non-directory entry.

No mainstream NFSv4 client has been observed to populate a READDIR
attrmask with these attributes; the Linux client queries them only
via SERVER_CAPS at mount time. The exposure is therefore to test
clients exploring corner cases and to hostile clients that submit
an attrmask designed to multiply server work by rd_dircount.

Probe the directory being read once and cache the result on
struct nfsd4_readdir for use by every non-directory child. The
probe targets the readdir filehandle's dentry, which is held for
the duration of the request, rather than dget_parent() of a
child's locklessly-acquired dentry; the latter could be moved out
of the directory by a concurrent rename and report attributes
from an unrelated parent. Directory entries continue to be
queried individually, because casefold-capable filesystems (ext4,
f2fs) report case state per directory. The other callers of
nfsd4_encode_fattr4() (single GETATTR, the buffer wrapper) pass
NULL for the cache pointer and behave as before.
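
The caching scheme can be modelled in userspace roughly as follows.
All names are hypothetical, the probe is a stand-in that only counts
its invocations, and error handling is omitted for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Per-request cache, loosely modelled on nfsd_case_attrs_cache. */
struct demo_case_cache {
	bool valid;
	bool insensitive;
	bool preserving;
};

/* Demo-only counter of how often the costly probe ran. */
unsigned int demo_probe_calls;

/* Stand-in for the probe; pretends the directory is casefolded. */
void demo_probe_case_info(bool *insensitive, bool *preserving)
{
	demo_probe_calls++;
	*insensitive = true;
	*preserving = true;
}

/*
 * Non-directory entries share one cached probe of the directory being
 * read; directory entries, and callers that pass no cache, probe
 * directly.
 */
void demo_entry_case_info(struct demo_case_cache *cache, bool entry_is_dir,
			  bool *insensitive, bool *preserving)
{
	if (cache != NULL && !entry_is_dir) {
		if (!cache->valid) {
			demo_probe_case_info(&cache->insensitive,
					     &cache->preserving);
			cache->valid = true;
		}
		*insensitive = cache->insensitive;
		*preserving = cache->preserving;
	} else {
		demo_probe_case_info(insensitive, preserving);
	}
}
```

Encoding any number of non-directory entries with the same cache
costs one probe in this model, while each directory entry and each
cache-less caller still pays for its own.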

Reported-by: sashiko-bot <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com?part=14
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4xdr.c | 55 +++++++++++++++++++++++++++++++++++++++--------
 fs/nfsd/xdr4.h    | 14 ++++++++++++
 2 files changed, 60 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 319007b79d49..20355dc3f1d1 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3883,13 +3883,16 @@ static const nfsd4_enc_attr nfsd4_enc_fattr4_encode_ops[] = {
 
 /*
  * Note: @fhp can be NULL; in this case, we might have to compose the filehandle
- * ourselves.
+ * ourselves. @case_cache is NULL for callers that encode a single dentry
+ * (GETATTR, the buffer wrapper); READDIR passes a per-request cache so
+ * non-directory children share the parent's case-folding probe result.
  */
 static __be32
 nfsd4_encode_fattr4(struct svc_rqst *rqstp, struct xdr_stream *xdr,
 		    struct svc_fh *fhp, struct svc_export *exp,
 		    struct dentry *dentry, const u32 *bmval,
-		    int ignore_crossmnt)
+		    int ignore_crossmnt,
+		    struct nfsd_case_attrs_cache *case_cache)
 {
 	DECLARE_BITMAP(attr_bitmap, ARRAY_SIZE(nfsd4_enc_fattr4_encode_ops));
 	struct nfs4_delegation *dp = NULL;
@@ -3999,9 +4002,17 @@ nfsd4_encode_fattr4(struct svc_rqst *rqstp, struct xdr_stream *xdr,
 		args.fhp = fhp;
 	if (attrmask[0] & (FATTR4_WORD0_CASE_INSENSITIVE |
 			   FATTR4_WORD0_CASE_PRESERVING)) {
-		err = nfsd_get_case_info(dentry, &args.case_insensitive,
-					 &args.case_preserving);
 		/*
+		 * In a batched encoder (READDIR) every non-directory
+		 * child shares the same case-folding answer, so the
+		 * directory being read is probed once and the result is
+		 * cached. The probe targets case_cache->dir, the held
+		 * readdir filehandle's dentry, instead of the child's
+		 * locklessly-acquired dentry, which a concurrent rename
+		 * could move under an unrelated parent. Directory
+		 * entries are queried directly because casefold-capable
+		 * filesystems answer per directory.
+		 *
 		 * Per RFC 8881 Section 18.7.3, an attribute advertised
 		 * in SUPPORTED_ATTRS must come back with a value or the
 		 * GETATTR must fail. nfsd_get_case_info() fills POSIX
@@ -4011,8 +4022,24 @@ nfsd4_encode_fattr4(struct svc_rqst *rqstp, struct xdr_stream *xdr,
 		 * advertises. Other errors fail the operation as the
 		 * spec requires.
 		 */
-		if (err && err != -EOPNOTSUPP)
-			goto out_nfserr;
+		if (case_cache && !d_is_dir(dentry)) {
+			if (!case_cache->valid) {
+				err = nfsd_get_case_info(case_cache->dir,
+							 &case_cache->insensitive,
+							 &case_cache->preserving);
+				if (err && err != -EOPNOTSUPP)
+					goto out_nfserr;
+				case_cache->valid = true;
+			}
+			args.case_insensitive = case_cache->insensitive;
+			args.case_preserving = case_cache->preserving;
+		} else {
+			err = nfsd_get_case_info(dentry,
+						 &args.case_insensitive,
+						 &args.case_preserving);
+			if (err && err != -EOPNOTSUPP)
+				goto out_nfserr;
+		}
 	}
 
 	if (attrmask[0] & FATTR4_WORD0_ACL) {
@@ -4170,7 +4197,7 @@ __be32 nfsd4_encode_fattr_to_buf(__be32 **p, int words,
 
 	svcxdr_init_encode_from_buffer(&xdr, &dummy, *p, words << 2);
 	ret = nfsd4_encode_fattr4(rqstp, &xdr, fhp, exp, dentry, bmval,
-				  ignore_crossmnt);
+				  ignore_crossmnt, NULL);
 	*p = xdr.p;
 	return ret;
 }
@@ -4208,6 +4235,7 @@ nfsd4_encode_entry4_fattr(struct nfsd4_readdir *cd, const char *name,
 	struct dentry *dentry;
 	__be32 nfserr;
 	int ignore_crossmnt = 0;
+	bool crossed = false;
 
 	dentry = lookup_one_positive_unlocked(&nop_mnt_idmap,
 					      &QSTR_LEN(name, namlen),
@@ -4244,11 +4272,18 @@ nfsd4_encode_entry4_fattr(struct nfsd4_readdir *cd, const char *name,
 		nfserr = check_nfsd_access(exp, cd->rd_rqstp, false);
 		if (nfserr)
 			goto out_put;
+		crossed = true;
 
 	}
 out_encode:
+	/*
+	 * A crossed entry no longer shares a parent with the directory
+	 * being read, so it must neither consume nor populate the
+	 * per-readdir case-folding cache.
+	 */
 	nfserr = nfsd4_encode_fattr4(cd->rd_rqstp, cd->xdr, NULL, exp, dentry,
-				     cd->rd_bmval, ignore_crossmnt);
+				     cd->rd_bmval, ignore_crossmnt,
+				     crossed ? NULL : &cd->rd_case_cache);
 out_put:
 	dput(dentry);
 	exp_put(exp);
@@ -4495,7 +4530,7 @@ nfsd4_encode_getattr(struct nfsd4_compoundres *resp, __be32 nfserr,
 
 	/* obj_attributes */
 	return nfsd4_encode_fattr4(resp->rqstp, xdr, fhp, fhp->fh_export,
-				   fhp->fh_dentry, getattr->ga_bmval, 0);
+				   fhp->fh_dentry, getattr->ga_bmval, 0, NULL);
 }
 
 static __be32
@@ -5022,6 +5057,8 @@ static __be32 nfsd4_encode_dirlist4(struct xdr_stream *xdr,
 	readdir->rd_maxcount = maxcount;
 	readdir->common.err = 0;
 	readdir->cookie_offset = 0;
+	readdir->rd_case_cache.dir = readdir->rd_fhp->fh_dentry;
+	readdir->rd_case_cache.valid = false;
 	offset = readdir->rd_cookie;
 	status = nfsd_readdir(readdir->rd_rqstp, readdir->rd_fhp, &offset,
 			      &readdir->common, nfsd4_encode_entry4);
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 417e9ad9fbb3..615797df218f 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -432,6 +432,19 @@ struct nfsd4_read {
 	u32			rd_eof;             /* response */
 };
 
+/*
+ * Cache the case-folding properties of @dir so a batched encoder
+ * (e.g., READDIR) does not re-probe per child. @dir is the
+ * directory being read, held by the request, so it is stable
+ * against rename for the duration of the cache's lifetime.
+ */
+struct nfsd_case_attrs_cache {
+	struct dentry	*dir;
+	bool		valid;
+	bool		insensitive;
+	bool		preserving;
+};
+
 struct nfsd4_readdir {
 	u64		rd_cookie;          /* request */
 	nfs4_verifier	rd_verf;            /* request */
@@ -444,6 +457,7 @@ struct nfsd4_readdir {
 	struct readdir_cd	common;
 	struct xdr_stream	*xdr;
 	int			cookie_offset;
+	struct nfsd_case_attrs_cache rd_case_cache;
 };
 
 struct nfsd4_release_lockowner {
-- 
2.54.0

