* [RFC PATCH] fs: Plumb case sensitivity bits into statx
@ 2025-09-25 15:11 Chuck Lever
2025-09-25 15:50 ` Amir Goldstein
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Chuck Lever @ 2025-09-25 15:11 UTC (permalink / raw)
To: linux-fsdevel, linux-nfs; +Cc: Chuck Lever, Jeff Layton, Volker Lendecke
From: Chuck Lever <chuck.lever@oracle.com>
Both the NFSv3 and NFSv4 protocols enable NFS clients to query NFS
servers about the case sensitivity and case preservation behaviors
of shared file systems. Today, the Linux NFSD implementation
unconditionally returns "the export is case sensitive and case
preserving".
However, a few Linux in-tree file system types appear to have some
ability to handle case-folded filenames. Some of our users would
like to exploit that functionality from their non-POSIX NFS clients.
Enable upper layers such as NFSD to retrieve case sensitivity
information from file systems by adding a statx API for this
purpose. Introduce a sample producer and a sample consumer for this
information.
If this mechanism seems sensible, a future patch might add a similar
field to the user-space-visible statx structure. User-space file
servers already use a variety of APIs to acquire this information.
Suggested-by: Jeff Layton <jlayton@kernel.org>
Cc: Volker Lendecke <Volker.Lendecke@sernet.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/fat/file.c | 5 +++++
fs/nfsd/nfs3proc.c | 35 +++++++++++++++++++++++++++--------
include/linux/stat.h | 1 +
include/uapi/linux/stat.h | 15 +++++++++++++++
4 files changed, 48 insertions(+), 8 deletions(-)
I'm certain this RFC patch has a number of problems, but it should
serve as a discussion point.
diff --git a/fs/fat/file.c b/fs/fat/file.c
index 4fc49a614fb8..8572e36d8f27 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -413,6 +413,11 @@ int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
stat->result_mask |= STATX_BTIME;
stat->btime = MSDOS_I(inode)->i_crtime;
}
+ if (request_mask & STATX_CASE_INFO) {
+ stat->result_mask |= STATX_CASE_INFO;
+ /* STATX_CASE_PRESERVING is cleared */
+ stat->case_info = statx_case_ascii;
+ }
return 0;
}
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index b6d03e1ef5f7..b319d1c4385c 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -697,6 +697,31 @@ nfsd3_proc_fsinfo(struct svc_rqst *rqstp)
return rpc_success;
}
+static __be32
+nfsd3_proc_case(struct svc_fh *fhp, struct nfsd3_pathconfres *resp)
+{
+ struct path p = {
+ .mnt = fhp->fh_export->ex_path.mnt,
+ .dentry = fhp->fh_dentry,
+ };
+ u32 request_mask = STATX_CASE_INFO;
+ struct kstat stat;
+ __be32 nfserr;
+
+ nfserr = nfserrno(vfs_getattr(&p, &stat, request_mask,
+ AT_STATX_SYNC_AS_STAT));
+ if (nfserr != nfs_ok)
+ return nfserr;
+ if (!(stat.result_mask & STATX_CASE_INFO))
+ return nfs_ok;
+
+ resp->p_case_insensitive =
+ stat.case_info & STATX_CASE_FOLDING_TYPE ? 0 : 1;
+ resp->p_case_preserving =
+ stat.case_info & STATX_CASE_PRESERVING ? 1 : 0;
+ return nfs_ok;
+}
+
/*
* Get pathconf info for the specified file
*/
@@ -722,17 +747,11 @@ nfsd3_proc_pathconf(struct svc_rqst *rqstp)
if (resp->status == nfs_ok) {
struct super_block *sb = argp->fh.fh_dentry->d_sb;
- /* Note that we don't care for remote fs's here */
- switch (sb->s_magic) {
- case EXT2_SUPER_MAGIC:
+ if (sb->s_magic == EXT2_SUPER_MAGIC) {
resp->p_link_max = EXT2_LINK_MAX;
resp->p_name_max = EXT2_NAME_LEN;
- break;
- case MSDOS_SUPER_MAGIC:
- resp->p_case_insensitive = 1;
- resp->p_case_preserving = 0;
- break;
}
+ resp->status = nfsd3_proc_case(&argp->fh, resp);
}
fh_put(&argp->fh);
diff --git a/include/linux/stat.h b/include/linux/stat.h
index e3d00e7bb26d..abb47cbb233a 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -59,6 +59,7 @@ struct kstat {
u32 atomic_write_unit_max;
u32 atomic_write_unit_max_opt;
u32 atomic_write_segments_max;
+ u32 case_info;
};
/* These definitions are internal to the kernel for now. Mainly used by nfsd. */
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index 1686861aae20..e929b30d64b6 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -219,6 +219,7 @@ struct statx {
#define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
#define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
#define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
+#define STATX_CASE_INFO 0x00040000U /* Want/got case folding info */
#define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
@@ -257,4 +258,18 @@ struct statx {
#define STATX_ATTR_WRITE_ATOMIC 0x00400000 /* File supports atomic write operations */
+/*
+ * File system support for case folding is available via a bitmap.
+ */
+#define STATX_CASE_PRESERVING 0x80000000 /* File name case is preserved */
+
+/* Values stored in the low-order byte of .case_info */
+enum {
+ statx_case_sensitive = 0,
+ statx_case_ascii,
+ statx_case_utf8,
+ statx_case_utf16,
+};
+#define STATX_CASE_FOLDING_TYPE 0x000000ff
+
#endif /* _UAPI_LINUX_STAT_H */
--
2.51.0
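
For discussion, here is a minimal user-space sketch of how a consumer might decode the proposed .case_info word into the two PATHCONF booleans. The constants are copied from the uapi hunk above; struct case_reply and decode_case_info() are hypothetical names used only for illustration. The sketch assumes that any non-zero folding type means "case insensitive", since statx_case_sensitive is 0. Note that the ternary in the nfsd3_proc_case() hunk ("? 0 : 1") appears to produce the opposite result for FAT's statx_case_ascii, whereas the removed MSDOS_SUPER_MAGIC case reported p_case_insensitive = 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the proposed include/uapi/linux/stat.h hunk. */
#define STATX_CASE_PRESERVING   0x80000000U
#define STATX_CASE_FOLDING_TYPE 0x000000ffU

enum {
	statx_case_sensitive = 0,
	statx_case_ascii,
	statx_case_utf8,
	statx_case_utf16,
};

/* Hypothetical stand-in for the two case fields of struct nfsd3_pathconfres. */
struct case_reply {
	bool case_insensitive;
	bool case_preserving;
};

/* Assumption: a non-zero folding type means names are matched case-insensitively. */
static struct case_reply decode_case_info(uint32_t case_info)
{
	struct case_reply r;

	r.case_insensitive =
		(case_info & STATX_CASE_FOLDING_TYPE) != statx_case_sensitive;
	r.case_preserving = (case_info & STATX_CASE_PRESERVING) != 0;
	return r;
}
```

With this reading, the value FAT reports in the patch (statx_case_ascii, preserving bit clear) decodes to "case insensitive, not case preserving", which matches what the old switch statement returned for MSDOS_SUPER_MAGIC.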
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-09-25 15:11 [RFC PATCH] fs: Plumb case sensitivity bits into statx Chuck Lever
@ 2025-09-25 15:50 ` Amir Goldstein
2025-10-03 15:24 ` Gabriel Krisman Bertazi
2025-09-26 4:20 ` Christoph Hellwig
2025-09-26 10:00 ` Jeff Layton
2 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2025-09-25 15:50 UTC (permalink / raw)
To: Chuck Lever
Cc: linux-fsdevel, linux-nfs, Chuck Lever, Jeff Layton,
Volker Lendecke, Gabriel Krisman Bertazi, CIFS
On Thu, Sep 25, 2025 at 5:21 PM Chuck Lever <cel@kernel.org> wrote:
>
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Both the NFSv3 and NFSv4 protocols enable NFS clients to query NFS
> servers about the case sensitivity and case preservation behaviors
> of shared file systems. Today, the Linux NFSD implementation
> unconditionally returns "the export is case sensitive and case
> preserving".
>
> However, a few Linux in-tree file system types appear to have some
> ability to handle case-folded filenames. Some of our users would
> like to exploit that functionality from their non-POSIX NFS clients.
>
> Enable upper layers such as NFSD to retrieve case sensitivity
> information from file systems by adding a statx API for this
> purpose. Introduce a sample producer and a sample consumer for this
> information.
>
> If this mechanism seems sensible, a future patch might add a similar
> field to the user-space-visible statx structure. User-space file
> servers already use a variety of APIs to acquire this information.
>
> Suggested-by: Jeff Layton <jlayton@kernel.org>
> Cc: Volker Lendecke <Volker.Lendecke@sernet.de>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/fat/file.c | 5 +++++
> fs/nfsd/nfs3proc.c | 35 +++++++++++++++++++++++++++--------
> include/linux/stat.h | 1 +
> include/uapi/linux/stat.h | 15 +++++++++++++++
> 4 files changed, 48 insertions(+), 8 deletions(-)
>
> I'm certain this RFC patch has a number of problems, but it should
> serve as a discussion point.
>
>
> diff --git a/fs/fat/file.c b/fs/fat/file.c
> index 4fc49a614fb8..8572e36d8f27 100644
> --- a/fs/fat/file.c
> +++ b/fs/fat/file.c
> @@ -413,6 +413,11 @@ int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
> stat->result_mask |= STATX_BTIME;
> stat->btime = MSDOS_I(inode)->i_crtime;
> }
> + if (request_mask & STATX_CASE_INFO) {
> + stat->result_mask |= STATX_CASE_INFO;
> + /* STATX_CASE_PRESERVING is cleared */
> + stat->case_info = statx_case_ascii;
> + }
>
> return 0;
> }
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index b6d03e1ef5f7..b319d1c4385c 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -697,6 +697,31 @@ nfsd3_proc_fsinfo(struct svc_rqst *rqstp)
> return rpc_success;
> }
>
> +static __be32
> +nfsd3_proc_case(struct svc_fh *fhp, struct nfsd3_pathconfres *resp)
> +{
> + struct path p = {
> + .mnt = fhp->fh_export->ex_path.mnt,
> + .dentry = fhp->fh_dentry,
> + };
> + u32 request_mask = STATX_CASE_INFO;
> + struct kstat stat;
> + __be32 nfserr;
> +
> + nfserr = nfserrno(vfs_getattr(&p, &stat, request_mask,
> + AT_STATX_SYNC_AS_STAT));
> + if (nfserr != nfs_ok)
> + return nfserr;
> + if (!(stat.result_mask & STATX_CASE_INFO))
> + return nfs_ok;
> +
> + resp->p_case_insensitive =
> + stat.case_info & STATX_CASE_FOLDING_TYPE ? 0 : 1;
> + resp->p_case_preserving =
> + stat.case_info & STATX_CASE_PRESERVING ? 1 : 0;
> + return nfs_ok;
> +}
> +
> /*
> * Get pathconf info for the specified file
> */
> @@ -722,17 +747,11 @@ nfsd3_proc_pathconf(struct svc_rqst *rqstp)
> if (resp->status == nfs_ok) {
> struct super_block *sb = argp->fh.fh_dentry->d_sb;
>
> - /* Note that we don't care for remote fs's here */
> - switch (sb->s_magic) {
> - case EXT2_SUPER_MAGIC:
> + if (sb->s_magic == EXT2_SUPER_MAGIC) {
> resp->p_link_max = EXT2_LINK_MAX;
> resp->p_name_max = EXT2_NAME_LEN;
> - break;
> - case MSDOS_SUPER_MAGIC:
> - resp->p_case_insensitive = 1;
> - resp->p_case_preserving = 0;
> - break;
> }
> + resp->status = nfsd3_proc_case(&argp->fh, resp);
> }
>
> fh_put(&argp->fh);
> diff --git a/include/linux/stat.h b/include/linux/stat.h
> index e3d00e7bb26d..abb47cbb233a 100644
> --- a/include/linux/stat.h
> +++ b/include/linux/stat.h
> @@ -59,6 +59,7 @@ struct kstat {
> u32 atomic_write_unit_max;
> u32 atomic_write_unit_max_opt;
> u32 atomic_write_segments_max;
> + u32 case_info;
> };
>
> /* These definitions are internal to the kernel for now. Mainly used by nfsd. */
> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> index 1686861aae20..e929b30d64b6 100644
> --- a/include/uapi/linux/stat.h
> +++ b/include/uapi/linux/stat.h
> @@ -219,6 +219,7 @@ struct statx {
> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
> #define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
> +#define STATX_CASE_INFO 0x00040000U /* Want/got case folding info */
>
> #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
>
> @@ -257,4 +258,18 @@ struct statx {
> #define STATX_ATTR_WRITE_ATOMIC 0x00400000 /* File supports atomic write operations */
>
>
> +/*
> + * File system support for case folding is available via a bitmap.
> + */
> +#define STATX_CASE_PRESERVING 0x80000000 /* File name case is preserved */
> +
> +/* Values stored in the low-order byte of .case_info */
> +enum {
> + statx_case_sensitive = 0,
> + statx_case_ascii,
> + statx_case_utf8,
> + statx_case_utf16,
> +};
> +#define STATX_CASE_FOLDING_TYPE 0x000000ff
> +
> #endif /* _UAPI_LINUX_STAT_H */
> --
> 2.51.0
>
CC unicode maintainer and SMB list.
Thanks,
Amir.
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-09-25 15:11 [RFC PATCH] fs: Plumb case sensitivity bits into statx Chuck Lever
2025-09-25 15:50 ` Amir Goldstein
@ 2025-09-26 4:20 ` Christoph Hellwig
2025-09-26 13:02 ` Chuck Lever
2025-09-26 10:00 ` Jeff Layton
2 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2025-09-26 4:20 UTC (permalink / raw)
To: Chuck Lever
Cc: linux-fsdevel, linux-nfs, Chuck Lever, Jeff Layton,
Volker Lendecke
On Thu, Sep 25, 2025 at 11:11:40AM -0400, Chuck Lever wrote:
> + if (request_mask & STATX_CASE_INFO) {
> + stat->result_mask |= STATX_CASE_INFO;
> + /* STATX_CASE_PRESERVING is cleared */
> + stat->case_info = statx_case_ascii;
FAT uses code pages specified on the command line for its case
insensitivity handling, which covers much more than ASCII.
> +/* Values stored in the low-order byte of .case_info */
> +enum {
> + statx_case_sensitive = 0,
> + statx_case_ascii,
> + statx_case_utf8,
> + statx_case_utf16,
> +};
What are these supposed to mean? ASCII, utf8 and utf16 are all
encodings and not case folding algorithms. While the folding is obvious
for ASCII, it is not for unicode and there are all kinds of different
variants. Also I don't know of any file system using utf16 encoding,
and even if one did, it would interact with the VFS and nfsd using
utf8. Note that the 16-bit ucs-2 encoding used by Windows file systems
is a different thing from unicode encodings like utf16.
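
To make the encoding-versus-folding distinction concrete, the following is a small sketch of what "ASCII folding" can mean for UTF-8 encoded names. It is not taken from any in-tree file system; ascii_fold() and ascii_casefold_equal() are hypothetical helpers. Only the bytes 'A'..'Z' are folded, so every byte of a multi-byte UTF-8 sequence passes through unchanged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Fold one byte using ASCII rules only; bytes >= 0x80 pass through. */
static unsigned char ascii_fold(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical helper: compare two NUL-terminated names under ASCII folding. */
static bool ascii_casefold_equal(const char *a, const char *b)
{
	size_t i;

	for (i = 0; ; i++) {
		unsigned char ca = ascii_fold((unsigned char)a[i]);
		unsigned char cb = ascii_fold((unsigned char)b[i]);

		if (ca != cb)
			return false;
		if (ca == '\0')
			return true;
	}
}
```

Under this rule "README.TXT" and "readme.txt" match, but the UTF-8 names "\xc3\x89t\xc3\xa9" (upper-case E acute, t, lower-case e acute) and "\xc3\xa9t\xc3\xa9" do not, even though a Unicode-aware folding would treat them as equal. Reporting "utf8" therefore says nothing about which of these behaviors a caller should expect.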
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-09-25 15:11 [RFC PATCH] fs: Plumb case sensitivity bits into statx Chuck Lever
2025-09-25 15:50 ` Amir Goldstein
2025-09-26 4:20 ` Christoph Hellwig
@ 2025-09-26 10:00 ` Jeff Layton
2025-09-26 13:05 ` Chuck Lever
2 siblings, 1 reply; 20+ messages in thread
From: Jeff Layton @ 2025-09-26 10:00 UTC (permalink / raw)
To: Chuck Lever, linux-fsdevel, linux-nfs; +Cc: Chuck Lever, Volker Lendecke
On Thu, 2025-09-25 at 11:11 -0400, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Both the NFSv3 and NFSv4 protocols enable NFS clients to query NFS
> servers about the case sensitivity and case preservation behaviors
> of shared file systems. Today, the Linux NFSD implementation
> unconditionally returns "the export is case sensitive and case
> preserving".
>
> However, a few Linux in-tree file system types appear to have some
> ability to handle case-folded filenames. Some of our users would
> like to exploit that functionality from their non-POSIX NFS clients.
>
> Enable upper layers such as NFSD to retrieve case sensitivity
> information from file systems by adding a statx API for this
> purpose. Introduce a sample producer and a sample consumer for this
> information.
>
> If this mechanism seems sensible, a future patch might add a similar
> field to the user-space-visible statx structure. User-space file
> servers already use a variety of APIs to acquire this information.
>
> Suggested-by: Jeff Layton <jlayton@kernel.org>
> Cc: Volker Lendecke <Volker.Lendecke@sernet.de>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/fat/file.c | 5 +++++
> fs/nfsd/nfs3proc.c | 35 +++++++++++++++++++++++++++--------
> include/linux/stat.h | 1 +
> include/uapi/linux/stat.h | 15 +++++++++++++++
> 4 files changed, 48 insertions(+), 8 deletions(-)
>
> I'm certain this RFC patch has a number of problems, but it should
> serve as a discussion point.
>
>
> diff --git a/fs/fat/file.c b/fs/fat/file.c
> index 4fc49a614fb8..8572e36d8f27 100644
> --- a/fs/fat/file.c
> +++ b/fs/fat/file.c
> @@ -413,6 +413,11 @@ int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
> stat->result_mask |= STATX_BTIME;
> stat->btime = MSDOS_I(inode)->i_crtime;
> }
> + if (request_mask & STATX_CASE_INFO) {
> + stat->result_mask |= STATX_CASE_INFO;
> + /* STATX_CASE_PRESERVING is cleared */
> + stat->case_info = statx_case_ascii;
> + }
>
> return 0;
> }
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index b6d03e1ef5f7..b319d1c4385c 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -697,6 +697,31 @@ nfsd3_proc_fsinfo(struct svc_rqst *rqstp)
> return rpc_success;
> }
>
> +static __be32
> +nfsd3_proc_case(struct svc_fh *fhp, struct nfsd3_pathconfres *resp)
> +{
> + struct path p = {
> + .mnt = fhp->fh_export->ex_path.mnt,
> + .dentry = fhp->fh_dentry,
> + };
> + u32 request_mask = STATX_CASE_INFO;
> + struct kstat stat;
> + __be32 nfserr;
> +
> + nfserr = nfserrno(vfs_getattr(&p, &stat, request_mask,
> + AT_STATX_SYNC_AS_STAT));
> + if (nfserr != nfs_ok)
> + return nfserr;
> + if (!(stat.result_mask & STATX_CASE_INFO))
> + return nfs_ok;
> +
> + resp->p_case_insensitive =
> + stat.case_info & STATX_CASE_FOLDING_TYPE ? 0 : 1;
> + resp->p_case_preserving =
> + stat.case_info & STATX_CASE_PRESERVING ? 1 : 0;
> + return nfs_ok;
> +}
> +
> /*
> * Get pathconf info for the specified file
> */
> @@ -722,17 +747,11 @@ nfsd3_proc_pathconf(struct svc_rqst *rqstp)
> if (resp->status == nfs_ok) {
> struct super_block *sb = argp->fh.fh_dentry->d_sb;
>
> - /* Note that we don't care for remote fs's here */
> - switch (sb->s_magic) {
> - case EXT2_SUPER_MAGIC:
> + if (sb->s_magic == EXT2_SUPER_MAGIC) {
> resp->p_link_max = EXT2_LINK_MAX;
> resp->p_name_max = EXT2_NAME_LEN;
> - break;
> - case MSDOS_SUPER_MAGIC:
> - resp->p_case_insensitive = 1;
> - resp->p_case_preserving = 0;
> - break;
> }
> + resp->status = nfsd3_proc_case(&argp->fh, resp);
> }
>
> fh_put(&argp->fh);
> diff --git a/include/linux/stat.h b/include/linux/stat.h
> index e3d00e7bb26d..abb47cbb233a 100644
> --- a/include/linux/stat.h
> +++ b/include/linux/stat.h
> @@ -59,6 +59,7 @@ struct kstat {
> u32 atomic_write_unit_max;
> u32 atomic_write_unit_max_opt;
> u32 atomic_write_segments_max;
> + u32 case_info;
> };
>
> /* These definitions are internal to the kernel for now. Mainly used by nfsd. */
> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> index 1686861aae20..e929b30d64b6 100644
> --- a/include/uapi/linux/stat.h
> +++ b/include/uapi/linux/stat.h
> @@ -219,6 +219,7 @@ struct statx {
> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
> #define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
> +#define STATX_CASE_INFO 0x00040000U /* Want/got case folding info */
>
>
Do you intend to expose this new attribute to userland? If not, then it
should probably snuggle up next to STATX_CHANGE_COOKIE. If so, then you
need to claim a field for it in struct statx, and populate it.
I'll also note that you're not using a lot of the bits in this field,
and struct statx has a 16-bit spare field next to stx_mode. You may
want to rework this to use a 16 bit field and expose it to userland
there.
>
> #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
>
> @@ -257,4 +258,18 @@ struct statx {
> #define STATX_ATTR_WRITE_ATOMIC 0x00400000 /* File supports atomic write operations */
>
>
> +/*
> + * File system support for case folding is available via a bitmap.
> + */
> +#define STATX_CASE_PRESERVING 0x80000000 /* File name case is preserved */
> +
> +/* Values stored in the low-order byte of .case_info */
> +enum {
> + statx_case_sensitive = 0,
> + statx_case_ascii,
> + statx_case_utf8,
> + statx_case_utf16,
> +};
These all need _very_ clear definitions, especially if you're exposing
this to userland. The devil is in the details when it comes to
casefolding and there is not a lot of consistency across filesystems.
The callers need to know what they can expect when they see that these
bits are set.
> +#define STATX_CASE_FOLDING_TYPE 0x000000ff
> +
> #endif /* _UAPI_LINUX_STAT_H */
--
Jeff Layton <jlayton@kernel.org>
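
As a rough illustration of the 16-bit suggestion above, the sketch below squeezes the proposed 32-bit .case_info into a u16 by keeping the folding type in the low byte and moving the "preserving" flag to bit 15. The CASE16_* names and case_info_to_u16() are hypothetical and not part of the patch; the 32-bit masks are the ones from the uapi hunk.

```c
#include <stdint.h>

/* Masks from the proposed uapi hunk (32-bit .case_info layout). */
#define STATX_CASE_PRESERVING   0x80000000U
#define STATX_CASE_FOLDING_TYPE 0x000000ffU

/* Hypothetical 16-bit layout: folding type in bits 0-7, flag in bit 15. */
#define CASE16_FOLDING_TYPE 0x00ffU
#define CASE16_PRESERVING   0x8000U

static uint16_t case_info_to_u16(uint32_t case_info)
{
	uint16_t v = (uint16_t)(case_info & STATX_CASE_FOLDING_TYPE);

	if (case_info & STATX_CASE_PRESERVING)
		v |= CASE16_PRESERVING;
	return v;
}
```

This leaves bits 8-14 unused, so a 16-bit field would still have some room if more flags are wanted later.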
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-09-26 4:20 ` Christoph Hellwig
@ 2025-09-26 13:02 ` Chuck Lever
0 siblings, 0 replies; 20+ messages in thread
From: Chuck Lever @ 2025-09-26 13:02 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-fsdevel, linux-nfs, Chuck Lever, Jeff Layton,
Volker Lendecke
On 9/25/25 9:20 PM, Christoph Hellwig wrote:
> On Thu, Sep 25, 2025 at 11:11:40AM -0400, Chuck Lever wrote:
>> + if (request_mask & STATX_CASE_INFO) {
>> + stat->result_mask |= STATX_CASE_INFO;
>> + /* STATX_CASE_PRESERVING is cleared */
>> + stat->case_info = statx_case_ascii;
>
> FAT uses code pages specified on the command line for its case
> insensitivity handling, which covers much more than ASCII.
I noticed that a mount option controls whether the filename encoding is
UTF-8. Clearly this will need more logic to set the correct bit.
>> +/* Values stored in the low-order byte of .case_info */
>> +enum {
>> + statx_case_sensitive = 0,
>> + statx_case_ascii,
>> + statx_case_utf8,
>> + statx_case_utf16,
>> +};
>
> What are these supposed to mean?
For the moment, the meaning is unclear because I simply wanted to
demonstrate that more than one behavior can be reported. Someone with
greater expertise than mine can help refine the specific semantics.
> ASCII, utf8 and utf16 are all
> encodings and not case folding algorithms. While the folding is obvious
> for ASCII, it is not for unicode and there are all kinds of different
> variants.
Fair enough... this is the right spot to report those variants. Or we
can decide that is inconsequential or impossible and simply reduce this
to a single "filename case is {in}sensitive" bit.
> Also I don't know of any file system using utf16 encoding,
> and even if one did, it would interact with the VFS and nfsd using
> utf8. Note that the 16-bit ucs-2 encoding used by Windows file systems
> is a different thing from unicode encodings like utf16.
Sure, UTF16 can be dropped or replaced.
--
Chuck Lever
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-09-26 10:00 ` Jeff Layton
@ 2025-09-26 13:05 ` Chuck Lever
0 siblings, 0 replies; 20+ messages in thread
From: Chuck Lever @ 2025-09-26 13:05 UTC (permalink / raw)
To: Jeff Layton, linux-fsdevel, linux-nfs; +Cc: Chuck Lever, Volker Lendecke
On 9/26/25 3:00 AM, Jeff Layton wrote:
>> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
>> index 1686861aae20..e929b30d64b6 100644
>> --- a/include/uapi/linux/stat.h
>> +++ b/include/uapi/linux/stat.h
>> @@ -219,6 +219,7 @@ struct statx {
>> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
>> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
>> #define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
>> +#define STATX_CASE_INFO 0x00040000U /* Want/got case folding info */
>>
>>
> Do you intend to expose this new attribute to userland? If not, then it
> should probably snuggle up next to STATX_CHANGE_COOKIE. If so, then you
> need to claim a field for it in struct statx, and populate it.
As I mentioned in the patch description, exposing to user space can be
done as a next step if we decide to pursue this proposal further. Yes,
that is something I have in mind.
--
Chuck Lever
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-09-25 15:50 ` Amir Goldstein
@ 2025-10-03 15:24 ` Gabriel Krisman Bertazi
2025-10-03 15:34 ` Chuck Lever
2025-10-03 17:19 ` Steve French
0 siblings, 2 replies; 20+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-10-03 15:24 UTC (permalink / raw)
To: Amir Goldstein
Cc: Chuck Lever, linux-fsdevel, linux-nfs, Chuck Lever, Jeff Layton,
Volker Lendecke, CIFS
Amir Goldstein <amir73il@gmail.com> writes:
> On Thu, Sep 25, 2025 at 5:21 PM Chuck Lever <cel@kernel.org> wrote:
>>
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> Both the NFSv3 and NFSv4 protocols enable NFS clients to query NFS
>> servers about the case sensitivity and case preservation behaviors
>> of shared file systems. Today, the Linux NFSD implementation
>> unconditionally returns "the export is case sensitive and case
>> preserving".
>>
>> However, a few Linux in-tree file system types appear to have some
>> ability to handle case-folded filenames. Some of our users would
>> like to exploit that functionality from their non-POSIX NFS clients.
>>
>> Enable upper layers such as NFSD to retrieve case sensitivity
>> information from file systems by adding a statx API for this
>> purpose. Introduce a sample producer and a sample consumer for this
>> information.
>>
>> If this mechanism seems sensible, a future patch might add a similar
>> field to the user-space-visible statx structure. User-space file
>> servers already use a variety of APIs to acquire this information.
>>
>> Suggested-by: Jeff Layton <jlayton@kernel.org>
>> Cc: Volker Lendecke <Volker.Lendecke@sernet.de>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> fs/fat/file.c | 5 +++++
>> fs/nfsd/nfs3proc.c | 35 +++++++++++++++++++++++++++--------
>> include/linux/stat.h | 1 +
>> include/uapi/linux/stat.h | 15 +++++++++++++++
>> 4 files changed, 48 insertions(+), 8 deletions(-)
>>
>> I'm certain this RFC patch has a number of problems, but it should
>> serve as a discussion point.
>>
>>
>> diff --git a/fs/fat/file.c b/fs/fat/file.c
>> index 4fc49a614fb8..8572e36d8f27 100644
>> --- a/fs/fat/file.c
>> +++ b/fs/fat/file.c
>> @@ -413,6 +413,11 @@ int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
>> stat->result_mask |= STATX_BTIME;
>> stat->btime = MSDOS_I(inode)->i_crtime;
>> }
>> + if (request_mask & STATX_CASE_INFO) {
>> + stat->result_mask |= STATX_CASE_INFO;
>> + /* STATX_CASE_PRESERVING is cleared */
>> + stat->case_info = statx_case_ascii;
>> + }
>>
>> return 0;
>> }
>> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
>> index b6d03e1ef5f7..b319d1c4385c 100644
>> --- a/fs/nfsd/nfs3proc.c
>> +++ b/fs/nfsd/nfs3proc.c
>> @@ -697,6 +697,31 @@ nfsd3_proc_fsinfo(struct svc_rqst *rqstp)
>> return rpc_success;
>> }
>>
>> +static __be32
>> +nfsd3_proc_case(struct svc_fh *fhp, struct nfsd3_pathconfres *resp)
>> +{
>> + struct path p = {
>> + .mnt = fhp->fh_export->ex_path.mnt,
>> + .dentry = fhp->fh_dentry,
>> + };
>> + u32 request_mask = STATX_CASE_INFO;
>> + struct kstat stat;
>> + __be32 nfserr;
>> +
>> + nfserr = nfserrno(vfs_getattr(&p, &stat, request_mask,
>> + AT_STATX_SYNC_AS_STAT));
>> + if (nfserr != nfs_ok)
>> + return nfserr;
>> + if (!(stat.result_mask & STATX_CASE_INFO))
>> + return nfs_ok;
>> +
>> + resp->p_case_insensitive =
>> + stat.case_info & STATX_CASE_FOLDING_TYPE ? 0 : 1;
>> + resp->p_case_preserving =
>> + stat.case_info & STATX_CASE_PRESERVING ? 1 : 0;
>> + return nfs_ok;
>> +}
>> +
>> /*
>> * Get pathconf info for the specified file
>> */
>> @@ -722,17 +747,11 @@ nfsd3_proc_pathconf(struct svc_rqst *rqstp)
>> if (resp->status == nfs_ok) {
>> struct super_block *sb = argp->fh.fh_dentry->d_sb;
>>
>> - /* Note that we don't care for remote fs's here */
>> - switch (sb->s_magic) {
>> - case EXT2_SUPER_MAGIC:
>> + if (sb->s_magic == EXT2_SUPER_MAGIC) {
>> resp->p_link_max = EXT2_LINK_MAX;
>> resp->p_name_max = EXT2_NAME_LEN;
>> - break;
>> - case MSDOS_SUPER_MAGIC:
>> - resp->p_case_insensitive = 1;
>> - resp->p_case_preserving = 0;
>> - break;
>> }
>> + resp->status = nfsd3_proc_case(&argp->fh, resp);
>> }
>>
>> fh_put(&argp->fh);
>> diff --git a/include/linux/stat.h b/include/linux/stat.h
>> index e3d00e7bb26d..abb47cbb233a 100644
>> --- a/include/linux/stat.h
>> +++ b/include/linux/stat.h
>> @@ -59,6 +59,7 @@ struct kstat {
>> u32 atomic_write_unit_max;
>> u32 atomic_write_unit_max_opt;
>> u32 atomic_write_segments_max;
>> + u32 case_info;
>> };
>>
>> /* These definitions are internal to the kernel for now. Mainly used by nfsd. */
>> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
>> index 1686861aae20..e929b30d64b6 100644
>> --- a/include/uapi/linux/stat.h
>> +++ b/include/uapi/linux/stat.h
>> @@ -219,6 +219,7 @@ struct statx {
>> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
>> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
>> #define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
>> +#define STATX_CASE_INFO 0x00040000U /* Want/got case folding info */
>>
>> #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
>>
>> @@ -257,4 +258,18 @@ struct statx {
>> #define STATX_ATTR_WRITE_ATOMIC 0x00400000 /* File supports atomic write operations */
>>
>>
>> +/*
>> + * File system support for case folding is available via a bitmap.
>> + */
>> +#define STATX_CASE_PRESERVING 0x80000000 /* File name case is preserved */
>> +
>> +/* Values stored in the low-order byte of .case_info */
>> +enum {
>> + statx_case_sensitive = 0,
>> + statx_case_ascii,
>> + statx_case_utf8,
>> + statx_case_utf16,
>> +};
>> +#define STATX_CASE_FOLDING_TYPE 0x000000ff
Does the protocol care about unicode version? For userspace, it would
be very relevant to expose it, as well as other details such as
decomposition type.
>> +
>> #endif /* _UAPI_LINUX_STAT_H */
>> --
>> 2.51.0
>>
>
> CC unicode maintainer and SMB list.
Thanks for the CC, Amir!
>
> Thanks,
> Amir.
--
Gabriel Krisman Bertazi
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-03 15:24 ` Gabriel Krisman Bertazi
@ 2025-10-03 15:34 ` Chuck Lever
2025-10-03 20:43 ` Gabriel Krisman Bertazi
2025-10-03 17:19 ` Steve French
1 sibling, 1 reply; 20+ messages in thread
From: Chuck Lever @ 2025-10-03 15:34 UTC (permalink / raw)
To: Gabriel Krisman Bertazi, Amir Goldstein
Cc: linux-fsdevel, linux-nfs, Chuck Lever, Jeff Layton,
Volker Lendecke, CIFS
On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
> Amir Goldstein <amir73il@gmail.com> writes:
>
>> On Thu, Sep 25, 2025 at 5:21 PM Chuck Lever <cel@kernel.org> wrote:
>>> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
>>> index 1686861aae20..e929b30d64b6 100644
>>> --- a/include/uapi/linux/stat.h
>>> +++ b/include/uapi/linux/stat.h
>>> @@ -219,6 +219,7 @@ struct statx {
>>> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
>>> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
>>> #define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
>>> +#define STATX_CASE_INFO 0x00040000U /* Want/got case folding info */
>>>
>>> #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
>>>
>>> @@ -257,4 +258,18 @@ struct statx {
>>> #define STATX_ATTR_WRITE_ATOMIC 0x00400000 /* File supports atomic write operations */
>>>
>>>
>>> +/*
>>> + * File system support for case folding is available via a bitmap.
>>> + */
>>> +#define STATX_CASE_PRESERVING 0x80000000 /* File name case is preserved */
>>> +
>>> +/* Values stored in the low-order byte of .case_info */
>>> +enum {
>>> + statx_case_sensitive = 0,
>>> + statx_case_ascii,
>>> + statx_case_utf8,
>>> + statx_case_utf16,
>>> +};
>>> +#define STATX_CASE_FOLDING_TYPE 0x000000ff
>
> Does the protocol care about unicode version? For userspace, it would
> be very relevant to expose it, as well as other details such as
> decomposition type.
For the purposes of indicating case sensitivity and preservation, the
NFS protocol does not currently care about unicode version.
But this is a very flexible proposal right now. Please recommend what
you'd like to see here. I hope I've given enough leeway that a unicode
version could be provided for other API consumers.
(As I mentioned to Jeff, there is no user space statx component in the
current proposal, but it should get one if it is agreed that's useful to
include).
--
Chuck Lever
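
To show the kind of leeway meant here, the sketch below places a Unicode major/minor version in two of the bytes that the proposed layout leaves unused between the folding type and the preserving bit. Everything except the two STATX_CASE_* masks is hypothetical: the shift values, case_info_pack() and the accessor names are assumptions for illustration, not a proposed ABI.

```c
#include <stdint.h>

/* Masks from the proposed uapi hunk. */
#define STATX_CASE_PRESERVING   0x80000000U
#define STATX_CASE_FOLDING_TYPE 0x000000ffU

/* Hypothetical: Unicode minor version in bits 8-15, major in bits 16-23. */
#define CASE_UNICODE_MINOR_SHIFT 8
#define CASE_UNICODE_MAJOR_SHIFT 16
#define CASE_UNICODE_FIELD_MASK  0xffU

static uint32_t case_info_pack(uint32_t folding, int preserving,
			       uint32_t major, uint32_t minor)
{
	uint32_t v = folding & STATX_CASE_FOLDING_TYPE;

	v |= (minor & CASE_UNICODE_FIELD_MASK) << CASE_UNICODE_MINOR_SHIFT;
	v |= (major & CASE_UNICODE_FIELD_MASK) << CASE_UNICODE_MAJOR_SHIFT;
	if (preserving)
		v |= STATX_CASE_PRESERVING;
	return v;
}

static uint32_t case_info_unicode_major(uint32_t v)
{
	return (v >> CASE_UNICODE_MAJOR_SHIFT) & CASE_UNICODE_FIELD_MASK;
}

static uint32_t case_info_unicode_minor(uint32_t v)
{
	return (v >> CASE_UNICODE_MINOR_SHIFT) & CASE_UNICODE_FIELD_MASK;
}
```

A consumer such as NFSD that only needs the sensitivity and preservation answers could keep masking with STATX_CASE_FOLDING_TYPE and STATX_CASE_PRESERVING and ignore the extra bytes. Details such as decomposition type would need further bits or a separate field.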
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-03 15:24 ` Gabriel Krisman Bertazi
2025-10-03 15:34 ` Chuck Lever
@ 2025-10-03 17:19 ` Steve French
1 sibling, 0 replies; 20+ messages in thread
From: Steve French @ 2025-10-03 17:19 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: Amir Goldstein, Chuck Lever, linux-fsdevel, linux-nfs,
Chuck Lever, Jeff Layton, Volker Lendecke, CIFS
On Fri, Oct 3, 2025 at 10:52 AM Gabriel Krisman Bertazi
<gabriel@krisman.be> wrote:
>
> Amir Goldstein <amir73il@gmail.com> writes:
>
> > On Thu, Sep 25, 2025 at 5:21 PM Chuck Lever <cel@kernel.org> wrote:
> >>
> >> From: Chuck Lever <chuck.lever@oracle.com>
> >>
> >> Both the NFSv3 and NFSv4 protocols enable NFS clients to query NFS
> >> servers about the case sensitivity and case preservation behaviors
> >> of shared file systems. Today, the Linux NFSD implementation
> >> unconditionally returns "the export is case sensitive and case
> >> preserving".
> >>
> >> However, a few Linux in-tree file system types appear to have some
> >> ability to handle case-folded filenames. Some of our users would
> >> like to exploit that functionality from their non-POSIX NFS clients.
> >>
> >> Enable upper layers such as NFSD to retrieve case sensitivity
> >> information from file systems by adding a statx API for this
> >> purpose. Introduce a sample producer and a sample consumer for this
> >> information.
> >>
> >> If this mechanism seems sensible, a future patch might add a similar
> >> field to the user-space-visible statx structure. User-space file
> >> servers already use a variety of APIs to acquire this information.
> >>
> >> Suggested-by: Jeff Layton <jlayton@kernel.org>
> >> Cc: Volker Lendecke <Volker.Lendecke@sernet.de>
> >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> >> ---
> >> fs/fat/file.c | 5 +++++
> >> fs/nfsd/nfs3proc.c | 35 +++++++++++++++++++++++++++--------
> >> include/linux/stat.h | 1 +
> >> include/uapi/linux/stat.h | 15 +++++++++++++++
> >> 4 files changed, 48 insertions(+), 8 deletions(-)
> >>
> >> I'm certain this RFC patch has a number of problems, but it should
> >> serve as a discussion point.
> >>
> >>
> >> diff --git a/fs/fat/file.c b/fs/fat/file.c
> >> index 4fc49a614fb8..8572e36d8f27 100644
> >> --- a/fs/fat/file.c
> >> +++ b/fs/fat/file.c
> >> @@ -413,6 +413,11 @@ int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
> >> stat->result_mask |= STATX_BTIME;
> >> stat->btime = MSDOS_I(inode)->i_crtime;
> >> }
> >> + if (request_mask & STATX_CASE_INFO) {
> >> + stat->result_mask |= STATX_CASE_INFO;
> >> + /* STATX_CASE_PRESERVING is cleared */
> >> + stat->case_info = statx_case_ascii;
> >> + }
> >>
> >> return 0;
> >> }
> >> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> >> index b6d03e1ef5f7..b319d1c4385c 100644
> >> --- a/fs/nfsd/nfs3proc.c
> >> +++ b/fs/nfsd/nfs3proc.c
> >> @@ -697,6 +697,31 @@ nfsd3_proc_fsinfo(struct svc_rqst *rqstp)
> >> return rpc_success;
> >> }
> >>
> >> +static __be32
> >> +nfsd3_proc_case(struct svc_fh *fhp, struct nfsd3_pathconfres *resp)
> >> +{
> >> + struct path p = {
> >> + .mnt = fhp->fh_export->ex_path.mnt,
> >> + .dentry = fhp->fh_dentry,
> >> + };
> >> + u32 request_mask = STATX_CASE_INFO;
> >> + struct kstat stat;
> >> + __be32 nfserr;
> >> +
> >> + nfserr = nfserrno(vfs_getattr(&p, &stat, request_mask,
> >> + AT_STATX_SYNC_AS_STAT));
> >> + if (nfserr != nfs_ok)
> >> + return nfserr;
> >> + if (!(stat.result_mask & STATX_CASE_INFO))
> >> + return nfs_ok;
> >> +
> >> + resp->p_case_insensitive =
> >> + stat.case_info & STATX_CASE_FOLDING_TYPE ? 0 : 1;
> >> + resp->p_case_preserving =
> >> + stat.case_info & STATX_CASE_PRESERVING ? 1 : 0;
> >> + return nfs_ok;
> >> +}
> >> +
> >> /*
> >> * Get pathconf info for the specified file
> >> */
> >> @@ -722,17 +747,11 @@ nfsd3_proc_pathconf(struct svc_rqst *rqstp)
> >> if (resp->status == nfs_ok) {
> >> struct super_block *sb = argp->fh.fh_dentry->d_sb;
> >>
> >> - /* Note that we don't care for remote fs's here */
> >> - switch (sb->s_magic) {
> >> - case EXT2_SUPER_MAGIC:
> >> + if (sb->s_magic == EXT2_SUPER_MAGIC) {
> >> resp->p_link_max = EXT2_LINK_MAX;
> >> resp->p_name_max = EXT2_NAME_LEN;
> >> - break;
> >> - case MSDOS_SUPER_MAGIC:
> >> - resp->p_case_insensitive = 1;
> >> - resp->p_case_preserving = 0;
> >> - break;
> >> }
> >> + resp->status = nfsd3_proc_case(&argp->fh, resp);
> >> }
> >>
> >> fh_put(&argp->fh);
> >> diff --git a/include/linux/stat.h b/include/linux/stat.h
> >> index e3d00e7bb26d..abb47cbb233a 100644
> >> --- a/include/linux/stat.h
> >> +++ b/include/linux/stat.h
> >> @@ -59,6 +59,7 @@ struct kstat {
> >> u32 atomic_write_unit_max;
> >> u32 atomic_write_unit_max_opt;
> >> u32 atomic_write_segments_max;
> >> + u32 case_info;
> >> };
> >>
> >> /* These definitions are internal to the kernel for now. Mainly used by nfsd. */
> >> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> >> index 1686861aae20..e929b30d64b6 100644
> >> --- a/include/uapi/linux/stat.h
> >> +++ b/include/uapi/linux/stat.h
> >> @@ -219,6 +219,7 @@ struct statx {
> >> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
> >> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
> >> #define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
> >> +#define STATX_CASE_INFO 0x00040000U /* Want/got case folding info */
> >>
> >> #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
> >>
> >> @@ -257,4 +258,18 @@ struct statx {
> >> #define STATX_ATTR_WRITE_ATOMIC 0x00400000 /* File supports atomic write operations */
> >>
> >>
> >> +/*
> >> + * File system support for case folding is available via a bitmap.
> >> + */
> >> +#define STATX_CASE_PRESERVING 0x80000000 /* File name case is preserved */
> >> +
> >> +/* Values stored in the low-order byte of .case_info */
> >> +enum {
> >> + statx_case_sensitive = 0,
> >> + statx_case_ascii,
> >> + statx_case_utf8,
> >> + statx_case_utf16,
> >> +};
> >> +#define STATX_CASE_FOLDING_TYPE 0x000000ff
>
> Does the protocol care about unicode version? For userspace, it would
> be very relevant to expose it, as well as other details such as
> decomposition type.
The (SMB2/SMB3/SMB3.1.1) protocol specification documentation refers
to https://www.unicode.org/versions/Unicode5.0.0/ and states
"Unless otherwise specified, all textual strings MUST be in Unicode
version 5.0 format, as specified in [UNICODE], using the 16-bit
Unicode Transformation Format (UTF-16) form of the encoding."
--
Thanks,
Steve
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-03 15:34 ` Chuck Lever
@ 2025-10-03 20:43 ` Gabriel Krisman Bertazi
2025-10-03 21:05 ` Chuck Lever
0 siblings, 1 reply; 20+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-10-03 20:43 UTC (permalink / raw)
To: Chuck Lever
Cc: Amir Goldstein, linux-fsdevel, linux-nfs, Chuck Lever,
Jeff Layton, Volker Lendecke, CIFS
Chuck Lever <cel@kernel.org> writes:
> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
>> Amir Goldstein <amir73il@gmail.com> writes:
>>
>>> On Thu, Sep 25, 2025 at 5:21 PM Chuck Lever <cel@kernel.org> wrote:
>
>>>> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
>>>> index 1686861aae20..e929b30d64b6 100644
>>>> --- a/include/uapi/linux/stat.h
>>>> +++ b/include/uapi/linux/stat.h
>>>> @@ -219,6 +219,7 @@ struct statx {
>>>> #define STATX_SUBVOL 0x00008000U /* Want/got stx_subvol */
>>>> #define STATX_WRITE_ATOMIC 0x00010000U /* Want/got atomic_write_* fields */
>>>> #define STATX_DIO_READ_ALIGN 0x00020000U /* Want/got dio read alignment info */
>>>> +#define STATX_CASE_INFO 0x00040000U /* Want/got case folding info */
>>>>
>>>> #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */
>>>>
>>>> @@ -257,4 +258,18 @@ struct statx {
>>>> #define STATX_ATTR_WRITE_ATOMIC 0x00400000 /* File supports atomic write operations */
>>>>
>>>>
>>>> +/*
>>>> + * File system support for case folding is available via a bitmap.
>>>> + */
>>>> +#define STATX_CASE_PRESERVING 0x80000000 /* File name case is preserved */
>>>> +
>>>> +/* Values stored in the low-order byte of .case_info */
>>>> +enum {
>>>> + statx_case_sensitive = 0,
>>>> + statx_case_ascii,
>>>> + statx_case_utf8,
>>>> + statx_case_utf16,
>>>> +};
>>>> +#define STATX_CASE_FOLDING_TYPE 0x000000ff
>>
>> Does the protocol care about unicode version? For userspace, it would
>> be very relevant to expose it, as well as other details such as
>> decomposition type.
>
> For the purposes of indicating case sensitivity and preservation, the
> NFS protocol does not currently care about unicode version.
>
> But this is a very flexible proposal right now. Please recommend what
> you'd like to see here. I hope I've given enough leeway that a unicode
> version could be provided for other API consumers.
But also, encoding version information is filesystem-wide, so it would
fit statfs.
For filesystems using fs/unicode/, we'd want to expose the unicode
version, from sb->s_encoding->version and the sb->s_encoding_flags.
The tuple (version,flags) defines the casefolding behavior.
Currently, it is written to the kernel log at mount-time, but that is
easily lost/wrapped.
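As a rough illustration of the (version, flags) tuple, the sketch below formats such a pair for display. It assumes the version is packed as major << 16 | minor << 8 | revision, which is my understanding of what fs/unicode does, and it assumes strict mode is bit 0 of the flags. struct fs_encoding_info, ENC_VERSION() and format_encoding() are hypothetical names, not existing kernel or UAPI symbols.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed packing: major << 16 | minor << 8 | revision. */
#define ENC_VERSION(maj, min, rev) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(rev))

/* Assumed flag bit for strict mode; illustrative only. */
#define ENC_STRICT_MODE_FL (1u << 0)

/* Hypothetical filesystem-wide record, not an existing kernel type. */
struct fs_encoding_info {
	uint32_t version; /* packed unicode version */
	uint32_t flags;   /* encoding behavior flags */
};

/* Writes e.g. "utf8-12.1.0 (strict)" into buf; returns snprintf()'s result. */
static int format_encoding(const struct fs_encoding_info *info,
			   char *buf, size_t len)
{
	assert(info != NULL && buf != NULL);
	return snprintf(buf, len, "utf8-%u.%u.%u%s",
			(unsigned int)((info->version >> 16) & 0xff),
			(unsigned int)((info->version >> 8) & 0xff),
			(unsigned int)(info->version & 0xff),
			(info->flags & ENC_STRICT_MODE_FL) ? " (strict)" : "");
}
```

The point of the sketch is only that both fields are needed together: two filesystems with the same version but different flags can fold names differently.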
> (As I mentioned to Jeff, there is no user space statx component in the
> current proposal, but it should get one if it is agreed that's useful to
> include).
I believe it is useful to expose it to userspace, simply because it
modifies the behavior of the filesystem. An application like Steam can
poke it to decide whether it needs to enable compatibility alternatives
to in-kernel case-folding, instead of assuming the encoding and testing
if "chattr +F" works.
It is not a critical feature, because mkfs for all case-insensitive
filesystems only ever supported one unicode version and strict mode is
rarely used. But if we ever update unicode or provide more flavors of
casefolding for compatibility with other filesystems, which was
requested in the past, userspace would need to have a way to retrieve
that information.
--
Gabriel Krisman Bertazi
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-03 20:43 ` Gabriel Krisman Bertazi
@ 2025-10-03 21:05 ` Chuck Lever
2025-10-03 21:11 ` ronnie sahlberg
2025-10-03 21:15 ` Gabriel Krisman Bertazi
0 siblings, 2 replies; 20+ messages in thread
From: Chuck Lever @ 2025-10-03 21:05 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: Amir Goldstein, linux-fsdevel, linux-nfs, Chuck Lever,
Jeff Layton, Volker Lendecke, CIFS
On 10/3/25 4:43 PM, Gabriel Krisman Bertazi wrote:
> Chuck Lever <cel@kernel.org> writes:
>
>> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
>>> Does the protocol care about unicode version? For userspace, it would
>>> be very relevant to expose it, as well as other details such as
>>> decomposition type.
>>
>> For the purposes of indicating case sensitivity and preservation, the
>> NFS protocol does not currently care about unicode version.
>>
>> But this is a very flexible proposal right now. Please recommend what
>> you'd like to see here. I hope I've given enough leeway that a unicode
>> version could be provided for other API consumers.
>
> But also, encoding version information is filesystem-wide, so it would
> fit statfs.
ext4 appears to have the ability to set the case folding behavior
on each directory, that's why I started with statx.
--
Chuck Lever
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-03 21:05 ` Chuck Lever
@ 2025-10-03 21:11 ` ronnie sahlberg
2025-10-03 21:15 ` Gabriel Krisman Bertazi
1 sibling, 0 replies; 20+ messages in thread
From: ronnie sahlberg @ 2025-10-03 21:11 UTC (permalink / raw)
To: Chuck Lever
Cc: Gabriel Krisman Bertazi, Amir Goldstein, linux-fsdevel, linux-nfs,
Chuck Lever, Jeff Layton, Volker Lendecke, CIFS
On Sat, 4 Oct 2025 at 07:05, Chuck Lever <cel@kernel.org> wrote:
>
> On 10/3/25 4:43 PM, Gabriel Krisman Bertazi wrote:
> > Chuck Lever <cel@kernel.org> writes:
> >
> >> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
>
> >>> Does the protocol care about unicode version? For userspace, it would
> >>> be very relevant to expose it, as well as other details such as
> >>> decomposition type.
> >>
> >> For the purposes of indicating case sensitivity and preservation, the
> >> NFS protocol does not currently care about unicode version.
> >>
> >> But this is a very flexible proposal right now. Please recommend what
> >> you'd like to see here. I hope I've given enough leeway that a unicode
> >> version could be provided for other API consumers.
> >
> > But also, encoding version information is filesystem-wide, so it would
> > fit statfs.
>
> ext4 appears to have the ability to set the case folding behavior
> on each directory, that's why I started with statx.
I think that is effectively also the case for cifs.
Perhaps not for normal directories but when following DFS links or reparse
points you can traverse between servers/shares that have different
case handling.
So I think statx would make sense here too.
>
>
> --
> Chuck Lever
>
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-03 21:05 ` Chuck Lever
2025-10-03 21:11 ` ronnie sahlberg
@ 2025-10-03 21:15 ` Gabriel Krisman Bertazi
2025-10-04 17:27 ` Chuck Lever
2025-10-06 11:19 ` Christian Brauner
1 sibling, 2 replies; 20+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-10-03 21:15 UTC (permalink / raw)
To: Chuck Lever
Cc: Amir Goldstein, linux-fsdevel, linux-nfs, Chuck Lever,
Jeff Layton, Volker Lendecke, CIFS
Chuck Lever <cel@kernel.org> writes:
> On 10/3/25 4:43 PM, Gabriel Krisman Bertazi wrote:
>> Chuck Lever <cel@kernel.org> writes:
>>
>>> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
>
>>>> Does the protocol care about unicode version? For userspace, it would
>>>> be very relevant to expose it, as well as other details such as
>>>> decomposition type.
>>>
>>> For the purposes of indicating case sensitivity and preservation, the
>>> NFS protocol does not currently care about unicode version.
>>>
>>> But this is a very flexible proposal right now. Please recommend what
>>> you'd like to see here. I hope I've given enough leeway that a unicode
>>> version could be provided for other API consumers.
>>
>> But also, encoding version information is filesystem-wide, so it would
>> fit statfs.
>
> ext4 appears to have the ability to set the case folding behavior
> on each directory, that's why I started with statx.
Yes. casefold is set per directory, but the unicode version and
casefolding semantics used by those casefolded directories are defined
for the entire filesystem.
--
Gabriel Krisman Bertazi
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-03 21:15 ` Gabriel Krisman Bertazi
@ 2025-10-04 17:27 ` Chuck Lever
2025-10-06 11:19 ` Christian Brauner
1 sibling, 0 replies; 20+ messages in thread
From: Chuck Lever @ 2025-10-04 17:27 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: Amir Goldstein, linux-fsdevel, linux-nfs, Chuck Lever,
Jeff Layton, Volker Lendecke, CIFS
On 10/3/25 5:15 PM, Gabriel Krisman Bertazi wrote:
> Chuck Lever <cel@kernel.org> writes:
>
>> On 10/3/25 4:43 PM, Gabriel Krisman Bertazi wrote:
>>> Chuck Lever <cel@kernel.org> writes:
>>>
>>>> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
>>
>>>>> Does the protocol care about unicode version? For userspace, it would
>>>>> be very relevant to expose it, as well as other details such as
>>>>> decomposition type.
>>>>
>>>> For the purposes of indicating case sensitivity and preservation, the
>>>> NFS protocol does not currently care about unicode version.
>>>>
>>>> But this is a very flexible proposal right now. Please recommend what
>>>> you'd like to see here. I hope I've given enough leeway that a unicode
>>>> version could be provided for other API consumers.
>>>
>>> But also, encoding version information is filesystem-wide, so it would
>>> fit statfs.
>>
>> ext4 appears to have the ability to set the case folding behavior
>> on each directory, that's why I started with statx.
>
> Yes. casefold is set per directory, but the unicode version and
> casefolding semantics used by those casefolded directories are defined
> for the entire filesystem.
>
Got it. That keeps the proposed statx changes simple. Let me look at how
extensible the statfs API is. Actually that falls a little outside of
the mission to support NFS's needs, so perhaps that should be a separate
effort? Let me know what you think.
--
Chuck Lever
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-03 21:15 ` Gabriel Krisman Bertazi
2025-10-04 17:27 ` Chuck Lever
@ 2025-10-06 11:19 ` Christian Brauner
2025-10-07 17:18 ` Gabriel Krisman Bertazi
1 sibling, 1 reply; 20+ messages in thread
From: Christian Brauner @ 2025-10-06 11:19 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: Chuck Lever, Amir Goldstein, linux-fsdevel, linux-nfs,
Chuck Lever, Jeff Layton, Volker Lendecke, CIFS
On Fri, Oct 03, 2025 at 05:15:24PM -0400, Gabriel Krisman Bertazi wrote:
> Chuck Lever <cel@kernel.org> writes:
>
> > On 10/3/25 4:43 PM, Gabriel Krisman Bertazi wrote:
> >> Chuck Lever <cel@kernel.org> writes:
> >>
> >>> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
> >
> >>>> Does the protocol care about unicode version? For userspace, it would
> >>>> be very relevant to expose it, as well as other details such as
> >>>> decomposition type.
> >>>
> >>> For the purposes of indicating case sensitivity and preservation, the
> >>> NFS protocol does not currently care about unicode version.
> >>>
> >>> But this is a very flexible proposal right now. Please recommend what
> >>> you'd like to see here. I hope I've given enough leeway that a unicode
> >>> version could be provided for other API consumers.
> >>
> >> But also, encoding version information is filesystem-wide, so it would
> >> fit statfs.
> >
> > ext4 appears to have the ability to set the case folding behavior
> > on each directory, that's why I started with statx.
>
> Yes. casefold is set per directory, but the unicode version and
> casefolding semantics used by those casefolded directories are defined
> for the entire filesystem.
I'm not too fond of wasting statx() space for this. Couldn't this be
exposed via the new file_getattr() system call?:
/*
* Variable size structure for file_[sg]et_attr().
*
* Note. This is alternative to the structure 'struct file_kattr'/'struct fsxattr'.
* As this structure is passed to/from userspace with its size, this can
* be versioned based on the size.
*/
struct file_attr {
__u64 fa_xflags; /* xflags field value (get/set) */
__u32 fa_extsize; /* extsize field value (get/set)*/
__u32 fa_nextents; /* nextents field value (get) */
__u32 fa_projid; /* project identifier (get/set) */
__u32 fa_cowextsize; /* CoW extsize field value (get/set) */
};
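The "versioned based on the size" idea can be sketched in plain C as follows. This is only a user-space model of the technique, loosely in the spirit of the kernel's size-checked struct copy helpers; struct file_attr_v1, its fa_case_info and fa_pad fields, and copy_attr_to_caller() are all hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout as quoted above (24 bytes). */
struct file_attr_v0 {
	uint64_t fa_xflags;
	uint32_t fa_extsize;
	uint32_t fa_nextents;
	uint32_t fa_projid;
	uint32_t fa_cowextsize;
};

/* Hypothetical extension: fa_case_info and fa_pad are NOT real fields. */
struct file_attr_v1 {
	uint64_t fa_xflags;
	uint32_t fa_extsize;
	uint32_t fa_nextents;
	uint32_t fa_projid;
	uint32_t fa_cowextsize;
	uint32_t fa_case_info; /* illustrative: case folding info */
	uint32_t fa_pad;       /* illustrative: keeps the size a multiple of 8 */
};

/*
 * Model of the copy-out step: copy only as many bytes as both sides know
 * about, and zero any tail the caller asked for beyond that. Returns the
 * number of bytes copied from the newer structure.
 */
static size_t copy_attr_to_caller(void *ubuf, size_t usize,
				  const struct file_attr_v1 *kattr)
{
	size_t n = usize < sizeof(*kattr) ? usize : sizeof(*kattr);

	assert(ubuf != NULL && kattr != NULL);
	memcpy(ubuf, kattr, n);
	if (usize > n)
		memset((char *)ubuf + n, 0, usize - n);
	return n;
}
```

An old caller passing sizeof(struct file_attr_v0) still gets the original five fields unchanged, while a caller built against the larger layout also sees the new field, which is why appending to this structure costs less than claiming statx() space.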
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-06 11:19 ` Christian Brauner
@ 2025-10-07 17:18 ` Gabriel Krisman Bertazi
2025-10-10 11:11 ` Christian Brauner
0 siblings, 1 reply; 20+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-10-07 17:18 UTC (permalink / raw)
To: Christian Brauner
Cc: Chuck Lever, Amir Goldstein, linux-fsdevel, linux-nfs,
Chuck Lever, Jeff Layton, Volker Lendecke, CIFS
Christian Brauner <brauner@kernel.org> writes:
> On Fri, Oct 03, 2025 at 05:15:24PM -0400, Gabriel Krisman Bertazi wrote:
>> Chuck Lever <cel@kernel.org> writes:
>>
>> > On 10/3/25 4:43 PM, Gabriel Krisman Bertazi wrote:
>> >> Chuck Lever <cel@kernel.org> writes:
>> >>
>> >>> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
>> >
>> >>>> Does the protocol care about unicode version? For userspace, it would
>> >>>> be very relevant to expose it, as well as other details such as
>> >>>> decomposition type.
>> >>>
>> >>> For the purposes of indicating case sensitivity and preservation, the
>> >>> NFS protocol does not currently care about unicode version.
>> >>>
>> >>> But this is a very flexible proposal right now. Please recommend what
>> >>> you'd like to see here. I hope I've given enough leeway that a unicode
>> >>> version could be provided for other API consumers.
>> >>
>> >> But also, encoding version information is filesystem-wide, so it would
>> >> fit statfs.
>> >
>> > ext4 appears to have the ability to set the case folding behavior
>> > on each directory, that's why I started with statx.
>>
>> Yes. casefold is set per directory, but the unicode version and
>> casefolding semantics used by those casefolded directories are defined
>> for the entire filesystem.
>
> I'm not too fond of wasting statx() space for this. Couldn't this be
> exposed via the new file_getattr() system call?:
Do you mean exposing the unicode version and flags to userspace? If so,
yes, for sure, it can fit in file_get_attr. It was never exposed
before, so there is no user expectation about it!
Thanks,
--
Gabriel Krisman Bertazi
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-07 17:18 ` Gabriel Krisman Bertazi
@ 2025-10-10 11:11 ` Christian Brauner
2025-10-10 12:43 ` Chuck Lever
2025-10-10 14:49 ` Darrick J. Wong
0 siblings, 2 replies; 20+ messages in thread
From: Christian Brauner @ 2025-10-10 11:11 UTC (permalink / raw)
To: Gabriel Krisman Bertazi
Cc: Chuck Lever, Amir Goldstein, linux-fsdevel, linux-nfs,
Chuck Lever, Jeff Layton, Volker Lendecke, CIFS
On Tue, Oct 07, 2025 at 01:18:32PM -0400, Gabriel Krisman Bertazi wrote:
> Christian Brauner <brauner@kernel.org> writes:
>
> > On Fri, Oct 03, 2025 at 05:15:24PM -0400, Gabriel Krisman Bertazi wrote:
> >> Chuck Lever <cel@kernel.org> writes:
> >>
> >> > On 10/3/25 4:43 PM, Gabriel Krisman Bertazi wrote:
> >> >> Chuck Lever <cel@kernel.org> writes:
> >> >>
> >> >>> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
> >> >
> >> >>>> Does the protocol care about unicode version? For userspace, it would
> >> >>>> be very relevant to expose it, as well as other details such as
> >> >>>> decomposition type.
> >> >>>
> >> >>> For the purposes of indicating case sensitivity and preservation, the
> >> >>> NFS protocol does not currently care about unicode version.
> >> >>>
> >> >>> But this is a very flexible proposal right now. Please recommend what
> >> >>> you'd like to see here. I hope I've given enough leeway that a unicode
> >> >>> version could be provided for other API consumers.
> >> >>
> >> >> But also, encoding version information is filesystem-wide, so it would
> >> >> fit statfs.
> >> >
> >> > ext4 appears to have the ability to set the case folding behavior
> >> > on each directory, that's why I started with statx.
> >>
> >> Yes. casefold is set per directory, but the unicode version and
> >> casefolding semantics used by those casefolded directories are defined
> >> for the entire filesystem.
> >
> > I'm not too fond of wasting statx() space for this. Couldn't this be
> > exposed via the new file_getattr() system call?:
>
> Do you mean exposing the unicode version and flags to userspace? If so,
> yes, for sure, it can fit in file_get_attr. It was never exposed
> before, so there is no user expectation about it!
Imho it would fit better there than statx(). If this becomes really
super common then we can also later decide to additionally expose it via
statx() but for now I think it'd be better to move this into the new
file_attr()* apis.
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-10 11:11 ` Christian Brauner
@ 2025-10-10 12:43 ` Chuck Lever
2025-10-10 14:49 ` Darrick J. Wong
1 sibling, 0 replies; 20+ messages in thread
From: Chuck Lever @ 2025-10-10 12:43 UTC (permalink / raw)
To: Christian Brauner, Gabriel Krisman Bertazi
Cc: Amir Goldstein, linux-fsdevel, linux-nfs, Chuck Lever,
Jeff Layton, Volker Lendecke, CIFS
On 10/10/25 7:11 AM, Christian Brauner wrote:
>>> I'm not too fond of wasting statx() space for this. Couldn't this be
>>> exposed via the new file_getattr() system call?:
>> Do you mean exposing the unicode version and flags to userspace? If so,
>> yes, for sure, it can fit in file_get_attr. It was never exposed
>> before, so there is no user expectation about it!
> Imho it would fit better there than statx(). If this becomes really
> super common then we can also later decide to additionally expose it via
> statx() but for now I think it'd be better to move this into the new
> file_attr()* apis.
Christian, I'm still not clear what you mean by "this". Do you mean only
the unicode version? Or do you mean both the unicode version *and* the
case sensitivity/preservation flags?
--
Chuck Lever
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-10 11:11 ` Christian Brauner
2025-10-10 12:43 ` Chuck Lever
@ 2025-10-10 14:49 ` Darrick J. Wong
2025-10-10 19:06 ` Gabriel Krisman Bertazi
1 sibling, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2025-10-10 14:49 UTC (permalink / raw)
To: Christian Brauner
Cc: Gabriel Krisman Bertazi, Chuck Lever, Amir Goldstein,
linux-fsdevel, linux-nfs, Chuck Lever, Jeff Layton,
Volker Lendecke, CIFS
On Fri, Oct 10, 2025 at 01:11:32PM +0200, Christian Brauner wrote:
> On Tue, Oct 07, 2025 at 01:18:32PM -0400, Gabriel Krisman Bertazi wrote:
> > Christian Brauner <brauner@kernel.org> writes:
> >
> > > On Fri, Oct 03, 2025 at 05:15:24PM -0400, Gabriel Krisman Bertazi wrote:
> > >> Chuck Lever <cel@kernel.org> writes:
> > >>
> > >> > On 10/3/25 4:43 PM, Gabriel Krisman Bertazi wrote:
> > >> >> Chuck Lever <cel@kernel.org> writes:
> > >> >>
> > >> >>> On 10/3/25 11:24 AM, Gabriel Krisman Bertazi wrote:
> > >> >
> > >> >>>> Does the protocol care about unicode version? For userspace, it would
> > >> >>>> be very relevant to expose it, as well as other details such as
> > >> >>>> decomposition type.
> > >> >>>
> > >> >>> For the purposes of indicating case sensitivity and preservation, the
> > >> >>> NFS protocol does not currently care about unicode version.
> > >> >>>
> > >> >>> But this is a very flexible proposal right now. Please recommend what
> > >> >>> you'd like to see here. I hope I've given enough leeway that a unicode
> > >> >>> version could be provided for other API consumers.
> > >> >>
> > >> >> But also, encoding version information is filesystem-wide, so it would
> > >> >> fit statfs.
> > >> >
> > >> > ext4 appears to have the ability to set the case folding behavior
> > >> > on each directory, that's why I started with statx.
> > >>
> > >> Yes. casefold is set per directory, but the unicode version and
> > >> casefolding semantics used by those casefolded directories are defined
> > >> for the entire filesystem.
> > >
> > > I'm not too fond of wasting statx() space for this. Couldn't this be
> > > exposed via the new file_getattr() system call?:
> >
> > Do you mean exposing the unicode version and flags to userspace? If so,
> > yes, for sure, it can fit in file_get_attr. It was never exposed
> > before, so there is no user expectation about it!
>
> Imho it would fit better there than statx(). If this becomes really
> super common then we can also later decide to additionally expose it via
> statx() but for now I think it'd be better to move this into the new
> file_attr()* apis.
n00b question here: Can you enable (or disable) casefolding and the
folding scheme used? My guess is that one ought to be able to do that
either (a) on an empty directory or (b) by reindexing the entire
directory if the filesystem supports that kind of thing? But hey, it's
not like xfs supports any of that. ;)
--D
* Re: [RFC PATCH] fs: Plumb case sensitivity bits into statx
2025-10-10 14:49 ` Darrick J. Wong
@ 2025-10-10 19:06 ` Gabriel Krisman Bertazi
0 siblings, 0 replies; 20+ messages in thread
From: Gabriel Krisman Bertazi @ 2025-10-10 19:06 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Christian Brauner, Chuck Lever, Amir Goldstein, linux-fsdevel,
linux-nfs, Chuck Lever, Jeff Layton, Volker Lendecke, CIFS
"Darrick J. Wong" <djwong@kernel.org> writes:
> n00b question here: Can you enable (or disable) casefolding and the
> folding scheme used? My guess is that one ought to be able to do that
> either (a) on an empty directory or (b) by reindexing the entire
> directory if the filesystem supports that kind of thing? But hey, it's
> not like xfs supports any of that. ;)
We only support enabling/disabling on an empty directory.
Disabling casefolding on an already populated directory would be easier
to do - just re-index, as you said. But to enable it, you'd need to
handle cases where you have two different files that now have the "same"
name (differing only by case). Then, which one you'll get is quite
unpredictable (perhaps depending on the order the dirents appear on disk, etc).
So we just don't allow it.
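The collision problem can be illustrated with a small check over a directory's existing names. This is only a sketch: it uses ASCII strcasecmp() as a stand-in for the real utf8 casefold comparison, and count_casefold_collisions() is a hypothetical helper, not something the kernel or e2fsprogs provides.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/*
 * Counts pairs of names that would become "the same" name if the
 * directory were switched to case-insensitive lookup. ASCII-only
 * stand-in for the filesystem's actual casefold comparison.
 */
static size_t count_casefold_collisions(const char *const *names, size_t n)
{
	size_t i, j, collisions = 0;

	assert(names != NULL || n == 0);
	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (strcasecmp(names[i], names[j]) == 0)
				collisions++;
	return collisions;
}
```

A directory holding both "Makefile" and "makefile" yields one collision, and there is no principled way to pick a winner, which is why enabling casefold is restricted to empty directories.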
--
Gabriel Krisman Bertazi
end of thread (~2025-10-10 19:27 UTC)
Thread overview: 20+ messages
2025-09-25 15:11 [RFC PATCH] fs: Plumb case sensitivity bits into statx Chuck Lever
2025-09-25 15:50 ` Amir Goldstein
2025-10-03 15:24 ` Gabriel Krisman Bertazi
2025-10-03 15:34 ` Chuck Lever
2025-10-03 20:43 ` Gabriel Krisman Bertazi
2025-10-03 21:05 ` Chuck Lever
2025-10-03 21:11 ` ronnie sahlberg
2025-10-03 21:15 ` Gabriel Krisman Bertazi
2025-10-04 17:27 ` Chuck Lever
2025-10-06 11:19 ` Christian Brauner
2025-10-07 17:18 ` Gabriel Krisman Bertazi
2025-10-10 11:11 ` Christian Brauner
2025-10-10 12:43 ` Chuck Lever
2025-10-10 14:49 ` Darrick J. Wong
2025-10-10 19:06 ` Gabriel Krisman Bertazi
2025-10-03 17:19 ` Steve French
2025-09-26 4:20 ` Christoph Hellwig
2025-09-26 13:02 ` Chuck Lever
2025-09-26 10:00 ` Jeff Layton
2025-09-26 13:05 ` Chuck Lever