* Re: [PATCH v3 1/9] kernel/api: introduce kernel API specification framework
From: Sasha Levin @ 2026-05-05 7:45 UTC
To: Nicolas Schier
Cc: Sasha Levin, Nathan Chancellor, linux-api, linux-kernel,
linux-doc, linux-fsdevel, linux-kbuild, linux-kselftest,
workflows, tools, x86, Thomas Gleixner, Paul E . McKenney,
Greg Kroah-Hartman, Jonathan Corbet, Dmitry Vyukov, Randy Dunlap,
Cyril Hrubis, Kees Cook, Jake Edge, David Laight, Askar Safin,
Gabriele Paoloni, Mauro Carvalho Chehab, Christian Brauner,
Alexander Viro, Andrew Morton, Masahiro Yamada, Shuah Khan,
Ingo Molnar, Arnd Bergmann
In-Reply-To: <afIykLLPj7m0fcsX@levanger>
On Wed, Apr 29, 2026 at 06:32:16PM +0200, Nicolas Schier wrote:
> On Sun, Apr 26, 2026 at 11:37:45PM -0400, Nathan Chancellor wrote:
> > On Fri, 24 Apr 2026 12:51:21 -0400, Sasha Levin <sashal@kernel.org> wrote:
> > > diff --git a/kernel/Makefile b/kernel/Makefile
> > > [...]
> > > +obj-$(CONFIG_KAPI_SPEC) += api/
> > > +# Ensure api/ is always cleaned even when CONFIG_KAPI_SPEC is not set
> > > +obj- += api/
> >
> > If $(CONFIG_KAPI_SPEC) is not set, shouldn't
> >
> > obj-$(CONFIG_KAPI_SPEC) += api/
> >
> > evaluate to
> >
> > obj- += api/
> >
> > anyway? Why the duplication? Is this the only place in the kernel where
> > this would be needed?
>
> Yes, this is definitely not needed, as obj- is always evaluated during
> 'make clean'; cf. scripts/Makefile.clean [1].
>
> Kind regards
> Nicolas
>
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/Makefile.clean?h=v7.1-rc1#n30
Thanks for the pointer!
The redundant "obj- += api/" and the accompanying comment are dropped in v4.
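For readers less familiar with kbuild, the point made by Nathan and Nicolas can be illustrated with a toy model. This is a minimal sketch in Python, not kbuild's actual implementation: the helper names expand_obj_line and clean_subdirs are hypothetical, and the assumption that the clean pass looks at obj-y, obj-m and obj- alike is taken from Nicolas's description of scripts/Makefile.clean.

```python
# Toy model of how a kbuild line such as "obj-$(CONFIG_FOO) += dir/" is
# expanded. Illustration only; this is not how kbuild is implemented.


def expand_obj_line(config, symbol, entry, variables=None):
    """Append `entry` to the variable named "obj-<value of symbol>".

    An unset symbol expands to the empty string, so the target variable
    becomes plain "obj-".
    """
    variables = {} if variables is None else variables
    value = config.get(symbol, "")  # unset symbol -> empty string
    variables.setdefault("obj-" + value, []).append(entry)
    return variables


def clean_subdirs(variables):
    """Directories a clean pass would descend into in this model.

    Assumption: clean considers obj-y, obj-m and obj- alike, as described
    for scripts/Makefile.clean in the thread.
    """
    entries = []
    for name in ("obj-y", "obj-m", "obj-"):
        entries.extend(variables.get(name, []))
    return sorted({e.rstrip("/") for e in entries if e.endswith("/")})


enabled = expand_obj_line({"CONFIG_KAPI_SPEC": "y"}, "CONFIG_KAPI_SPEC", "api/")
disabled = expand_obj_line({}, "CONFIG_KAPI_SPEC", "api/")

print(enabled)                  # {'obj-y': ['api/']}
print(disabled)                 # {'obj-': ['api/']}
print(clean_subdirs(disabled))  # ['api']
```

In this model the directory is reached by the clean pass whether or not the symbol is set, which is why the explicit "obj- += api/" line adds nothing.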
--
Thanks,
Sasha
* Re: [PATCH v3 1/9] kernel/api: introduce kernel API specification framework
From: Sasha Levin @ 2026-05-05 7:45 UTC
To: Nicolas Schier
Cc: Sasha Levin, Nathan Chancellor, linux-api, linux-kernel,
linux-doc, linux-fsdevel, linux-kbuild, linux-kselftest,
workflows, tools, x86, Thomas Gleixner, Paul E . McKenney,
Greg Kroah-Hartman, Jonathan Corbet, Dmitry Vyukov, Randy Dunlap,
Cyril Hrubis, Kees Cook, Jake Edge, David Laight, Askar Safin,
Gabriele Paoloni, Mauro Carvalho Chehab, Christian Brauner,
Alexander Viro, Andrew Morton, Masahiro Yamada, Shuah Khan,
Ingo Molnar, Arnd Bergmann
In-Reply-To: <afI1LMB_vNMWYU7o@levanger>
On Wed, Apr 29, 2026 at 06:43:24PM +0200, Nicolas Schier wrote:
> On Fri, Apr 24, 2026 at 12:51:21PM -0400, Sasha Levin wrote:
> > diff --git a/kernel/api/Makefile b/kernel/api/Makefile
> > new file mode 100644
> > index 0000000000000..c0a13fc590e4a
> > --- /dev/null
> > +++ b/kernel/api/Makefile
> > @@ -0,0 +1,14 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Makefile for the Kernel API Specification Framework
> > +#
> > +
> > +# Core API specification framework
> > +obj-$(CONFIG_KAPI_SPEC) += kernel_api_spec.o
>
> Bike-shedding: I'd use 'obj-y' here, to state clearly that
> kernel_api_spec.c is the core part in the kernel/api/ subdir. If
> CONFIG_KAPI_SPEC is unset, the subdir will not be entered at all.
Agreed, switched to "obj-y" in v4. The subdir gate moves up to
kernel/Makefile and the entry inside kernel/api/Makefile becomes
unconditional. The other two entries (KAPI_SPEC_DEBUGFS,
KAPI_KUNIT_TEST) keep their own conditional guards since they are
optional sub-features.
--
Thanks,
Sasha
* Re: [PATCH v3 2/9] kernel/api: enable kerneldoc-based API specifications
From: Sasha Levin @ 2026-05-05 7:45 UTC
To: Nicolas Schier
Cc: Sasha Levin, Nathan Chancellor, linux-api, linux-kernel,
linux-doc, linux-fsdevel, linux-kbuild, linux-kselftest,
workflows, tools, x86, Thomas Gleixner, Paul E . McKenney,
Greg Kroah-Hartman, Jonathan Corbet, Dmitry Vyukov, Randy Dunlap,
Cyril Hrubis, Kees Cook, Jake Edge, David Laight, Askar Safin,
Gabriele Paoloni, Mauro Carvalho Chehab, Christian Brauner,
Alexander Viro, Andrew Morton, Masahiro Yamada, Shuah Khan,
Ingo Molnar, Arnd Bergmann
In-Reply-To: <afNrbm8URHlClZ-8@levanger>
On Thu, Apr 30, 2026 at 04:47:10PM +0200, Nicolas Schier wrote:
> On Fri, Apr 24, 2026 at 12:51:22PM -0400, Sasha Levin wrote:
> > +# Generate API spec headers from kernel-doc comments
> > +ifeq ($(CONFIG_KAPI_SPEC),y)
> > +# Function to check if a file has API specifications
> > +has-apispec = $(shell grep -qE '^\s*\*\s*context-flags:' $(src)/$(1) 2>/dev/null && echo $(1))
> > +
> > +# Get base names without directory prefix
> > +c-objs-base := $(notdir $(real-obj-y) $(real-obj-m))
> > +# Filter to only .o files with corresponding .c source files
> > +c-files := $(foreach o,$(c-objs-base),$(if $(wildcard $(src)/$(o:.o=.c)),$(o:.o=.c)))
>
> Looks to me as if the two lines above are redundant, since 'find'
> (below) will find all files gathered in $(c-files).
Right, those two lines are dropped in v4. The replacement uses the
kbuild-derived file list described below, so neither set survives.
> > +# Also check for any additional .c files that contain API specs but are included
> > +extra-c-files := $(shell find $(src) -maxdepth 1 -name "*.c" -exec grep -l '^\s*\*\s*\(long-desc\|context-flags\|state-trans\):' {} \; 2>/dev/null | xargs -r basename -a)
> > +# Combine both lists and remove duplicates
> > +all-c-files := $(sort $(c-files) $(extra-c-files))
> > +# Only include files that actually have API specifications
> > +apispec-files := $(foreach f,$(all-c-files),$(call has-apispec,$(f)))
> > +# Generate apispec targets with proper directory prefix
> > +apispec-y := $(addprefix $(obj)/,$(apispec-files:.c=.apispec.h))
>
> The goal is to find any relevant C file in $(src)/ (but not deeper below)
> that holds KAPI documentation, right?
>
> I do not like the find call, as it picks up anything. Might it make
> sense to evaluate $(obj-) along with $(obj-y) and $(obj-m) to pick up
> all C files that are referenced in kbuild?
>
>
>
> # in top definition block -- before 'include $(kbuild-file)' et al.
> obj- :=
>
> # below the definitions of real-obj-{y,m}
> real-obj-any := $(call real-search, $(obj-y) $(obj-m) $(obj-), .o, -objs -y -m -)
>
> has-apispec = $(shell grep -lE '^\s*\*\s*context-flags:' $(1) 2>/dev/null)
> apispec-y := $(patsubst $(src)/%.c, $(obj)/%.apispec.h, $(call has-apispec,
> $(patsubst $(obj)/%.o, $(src)/%.c, $(real-obj-any))))
>
> #...
>
> # Source files that include their own apispec.h need to depend on it
> $(apispec-y:.apispec.h=.o): $(obj)/%.o: $(obj)/%.apispec.h
>
> (untested)
Thanks, the kbuild-driven approach is much cleaner. v4 takes your sketch
with two adjustments:
1. obj-m is already addprefix'd with $(obj)/ by line 116 of
   Makefile.build at the point where this block runs, so calling
   real-search again on the mixed list double-prefixes the module
   entries (giving $(src)/$(obj)/foo.c). v4 uses the existing
   $(real-obj-y)/$(real-obj-m) and strips the prefix in patsubst
   instead:
       apispec-c-files := $(call has-apispec, \
           $(patsubst $(obj)/%.o,$(src)/%.c, \
               $(filter-out %/built-in.a,$(real-obj-y) $(real-obj-m))))
       apispec-y := $(patsubst $(src)/%.c,$(obj)/%.apispec.h,$(apispec-c-files))
2. The has-apispec grep needs to match the same set of keys that
   tools/lib/python/kdoc/kdoc_apispec.py actually parses, which is
   "contexts:", "context-flags:" and "context:" interchangeably (see
   _get_section calls around line 866 of kdoc_apispec.py). The original
   grep for "context-flags:" matched zero files in the tree (every
   instrumented file uses "contexts:"), which is the latent bug behind
   the build failure you saw. v4 widens the regex:
       has-apispec = $(shell grep -lE \
           '^[[:space:]]*\*[[:space:]]*(contexts|context-flags|context):' \
           $(1) 2>/dev/null)
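The effect of the two adjustments can be checked outside of make. The following is a rough sketch in Python rather than the actual v4 code: Python's re module has no POSIX character classes, so \s stands in for [[:space:]]; the kernel-doc fragment is hypothetical apart from the "contexts:" key; and apispec_header is a made-up helper that collapses the two patsubst steps into one.

```python
import re

# Python's re has no POSIX classes, so \s stands in for [[:space:]].
OLD_PATTERN = re.compile(r"^\s*\*\s*context-flags:", re.MULTILINE)
NEW_PATTERN = re.compile(
    r"^\s*\*\s*(contexts|context-flags|context):", re.MULTILINE
)

# Hypothetical kernel-doc fragment. Only the "contexts:" key is taken from
# the thread (fs/open.c is said to declare "contexts: process, sleepable").
SAMPLE_DOC = """\
/**
 * example_call - hypothetical documented function
 * contexts: process, sleepable
 */
"""


def has_apispec(text, pattern):
    """Return True if any line of `text` carries a key the pattern accepts."""
    return pattern.search(text) is not None


def apispec_header(obj_path, obj_dir):
    """Map $(obj)/foo.o to $(obj)/foo.apispec.h; None for anything else."""
    prefix = obj_dir.rstrip("/") + "/"
    if not (obj_path.startswith(prefix) and obj_path.endswith(".o")):
        return None
    stem = obj_path[len(prefix):-len(".o")]
    return prefix + stem + ".apispec.h"


print(has_apispec(SAMPLE_DOC, OLD_PATTERN))  # False
print(has_apispec(SAMPLE_DOC, NEW_PATTERN))  # True
print(apispec_header("fs/open.o", "fs"))     # fs/open.apispec.h
```

With the old pattern the sample is skipped, so no header target would be generated for it; the widened pattern picks it up.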
> > diff --git a/scripts/Makefile.clean b/scripts/Makefile.clean
> > index 6ead00ec7313b..f78dbbe637f27 100644
> > --- a/scripts/Makefile.clean
> > +++ b/scripts/Makefile.clean
> > @@ -35,6 +35,9 @@ __clean-files := $(filter-out $(no-clean-files), $(__clean-files))
> >
> > __clean-files := $(wildcard $(addprefix $(obj)/, $(__clean-files)))
> >
> > +# Also clean generated apispec headers (computed dynamically in Makefile.build)
> > +__clean-files += $(wildcard $(obj)/*.apispec.h)
>
> We have a list of wildcard clean patterns in top-level Makefile
> (line 2114 ff.); please add '*.apispec.h' there instead.
Will fix.
> When I apply the series on top of v7.1, compilation fails with
>
> ../fs/open.c:2148:10: fatal error: open.apispec.h: No such file or directory
> ../fs/read_write.c:2519:10: fatal error: read_write.apispec.h: No such file or directory
This is the symptom of (2) above. fs/open.c and fs/read_write.c only
declare "contexts: process, sleepable" (no "context-flags:" anywhere in
the tree, confirmed via grep), so apispec-files was always empty and no
*.apispec.h ever got generated. With CONFIG_KAPI_SPEC=y the
"#if IS_ENABLED(CONFIG_KAPI_SPEC)" guarded include then fails. Also
reproducible on v7.0; thanks for catching it.
Thanks for the review and the kbuild sketch!
--
Thanks,
Sasha
* Re: [PATCH] crypto: af_alg - Document the deprecation of AF_ALG
From: Herbert Xu @ 2026-05-05 9:31 UTC
To: Eric Biggers
Cc: linux-crypto, linux-doc, linux-api, linux-kernel, netdev,
Linus Torvalds
In-Reply-To: <20260430011544.31823-1-ebiggers@kernel.org>
On Wed, Apr 29, 2026 at 06:15:44PM -0700, Eric Biggers wrote:
> AF_ALG is almost completely unnecessary, and it exposes a massive attack
> surface that hasn't been standing up to modern vulnerability discovery
> tools. The latest one even has its own website, providing a small
> Python script that reliably roots most Linux distros: https://copy.fail/
>
> This isn't sustainable, especially as LLMs have accelerated the rate the
> vulnerabilities are coming in. The effort that is being put into this
> thing is vastly disproportional to the few programs that actually use
> it, and those programs would be better served by userspace code anyway.
>
> These issues have been noted in many mailing list discussions already.
> But until now they haven't been reflected in the documentation or
> kconfig menu itself, and the vulnerabilities are still coming in.
>
> Let's go ahead and document the deprecation.
>
> This isn't intended to change anything overnight. After all, most Linux
> distros won't be able to disable the kconfig options quite yet, mainly
> because of iwd. But this should create a bit more impetus for these
> userspace programs to be fixed, and the documentation update should also
> help prevent more users from appearing.
>
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>
> This patch is targeting crypto/master
>
> Documentation/crypto/userspace-if.rst | 82 ++++++++++++++++++++-------
> crypto/Kconfig | 69 ++++++++++++++++------
> 2 files changed, 113 insertions(+), 38 deletions(-)
Patch applied. Thanks.
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH] crypto: af_alg - Document the deprecation of AF_ALG
From: Andy Lutomirski @ 2026-05-05 23:17 UTC
To: Eric Biggers
Cc: linux-crypto, Herbert Xu, linux-doc, linux-api, linux-kernel,
netdev, Linus Torvalds
In-Reply-To: <20260430011544.31823-1-ebiggers@kernel.org>
> On Apr 29, 2026, at 6:19 PM, Eric Biggers <ebiggers@kernel.org> wrote:
>
> AF_ALG is almost completely unnecessary, and it exposes a massive attack
> surface that hasn't been standing up to modern vulnerability discovery
> tools. The latest one even has its own website, providing a small
> Python script that reliably roots most Linux distros: https://copy.fail/
How about adding a configuration option, defaulted on, that requires
capable(CAP_SYS_ADMIN) to create the socket (and maybe also to bind /
connect it). And a sysctl to allow the administrator to override this
in the unlikely event that it’s needed.
IIRC cryptsetup used to and maybe even still does require these
sockets sometimes and this would let it keep working. And there's all
the FIPS stuff downthread.
>
> This isn't sustainable, especially as LLMs have accelerated the rate the
> vulnerabilities are coming in. The effort that is being put into this
> thing is vastly disproportional to the few programs that actually use
> it, and those programs would be better served by userspace code anyway.
>
> These issues have been noted in many mailing list discussions already.
> But until now they haven't been reflected in the documentation or
> kconfig menu itself, and the vulnerabilities are still coming in.
>
> Let's go ahead and document the deprecation.
>
> This isn't intended to change anything overnight. After all, most Linux
> distros won't be able to disable the kconfig options quite yet, mainly
> because of iwd. But this should create a bit more impetus for these
> userspace programs to be fixed, and the documentation update should also
> help prevent more users from appearing.
>
> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
> ---
>
> This patch is targeting crypto/master
>
> Documentation/crypto/userspace-if.rst | 82 ++++++++++++++++++++-------
> crypto/Kconfig | 69 ++++++++++++++++------
> 2 files changed, 113 insertions(+), 38 deletions(-)
>
> diff --git a/Documentation/crypto/userspace-if.rst b/Documentation/crypto/userspace-if.rst
> index 021759198fe7..c39f5c79a5b7 100644
> --- a/Documentation/crypto/userspace-if.rst
> +++ b/Documentation/crypto/userspace-if.rst
> @@ -2,30 +2,72 @@ User Space Interface
> ====================
>
> Introduction
> ------------
>
> -The concepts of the kernel crypto API visible to kernel space is fully
> -applicable to the user space interface as well. Therefore, the kernel
> -crypto API high level discussion for the in-kernel use cases applies
> -here as well.
> -
> -The major difference, however, is that user space can only act as a
> -consumer and never as a provider of a transformation or cipher
> -algorithm.
> -
> -The following covers the user space interface exported by the kernel
> -crypto API. A working example of this description is libkcapi that can
> -be obtained from [1]. That library can be used by user space
> -applications that require cryptographic services from the kernel.
> -
> -Some details of the in-kernel kernel crypto API aspects do not apply to
> -user space, however. This includes the difference between synchronous
> -and asynchronous invocations. The user space API call is fully
> -synchronous.
> -
> -[1] https://www.chronox.de/libkcapi/index.html
> +AF_ALG provides unprivileged userspace programs access to arbitrary hash,
> +symmetric cipher, AEAD, and RNG algorithms that are implemented in kernel-mode
> +code.
> +
> +AF_ALG is insecure and is deprecated. Originally added to the kernel in 2010,
> +most kernel developers now consider it to be a mistake.
> +
> +AF_ALG continues to be supported only for backwards compatibility. On systems
> +where no programs using AF_ALG remain, the support for it should be disabled by
> +disabling ``CONFIG_CRYPTO_USER_API_*``.
> +
> +Deprecation
> +-----------
> +
> +AF_ALG was originally intended to provide userspace programs access to crypto
> +accelerators that they wouldn't otherwise have access to.
> +
> +However, that capability turned out to not be useful on very many systems. More
> +significantly, the actual implementation exposes a vastly greater amount of
> +functionality than that. It actually provides access to all software algorithms.
> +
> +This includes arbitrary compositions of different algorithms created via a
> +complex template system, as well as algorithms that only make sense as internal
> +implementation details of other algorithms. It also includes full zero-copy
> +support, which is difficult for the kernel to implement securely.
> +
> +Ultimately, these algorithms are just math computations. They use the same
> +instructions that userspace programs already have access to, just accessed in a
> +much more convoluted and less efficient way.
> +
> +Indeed, userspace code is nearly always what is being used anyway. These same
> +algorithms are widely implemented in userspace crypto libraries.
> +
> +Meanwhile, AF_ALG hasn't been withstanding modern vulnerability discovery tools
> +such as syzbot and large language models. It receives a steady stream of CVEs.
> +Some of the examples include:
> +
> +- CVE-2026-31677
> +- CVE-2026-31431 (https://copy.fail)
> +- CVE-2025-38079
> +- CVE-2025-37808
> +- CVE-2024-26824
> +- CVE-2022-48781
> +- CVE-2019-8912
> +- CVE-2018-14619
> +- CVE-2017-18075
> +- CVE-2017-17806
> +- CVE-2017-17805
> +- CVE-2016-10147
> +- CVE-2015-8970
> +- CVE-2015-3331
> +- CVE-2014-9644
> +- CVE-2013-7421
> +- CVE-2011-4081
> +
> +It is recommended that, whenever possible, userspace programs be migrated to
> +userspace crypto code (which again, is what is normally used anyway) and
> +``CONFIG_CRYPTO_USER_API_*`` be disabled. On systems that use SELinux, SELinux
> +can also be used to restrict the use of AF_ALG to trusted programs.
> +
> +The remainder of this documentation provides the historical documentation for
> +the deprecated AF_ALG interface.
>
> User Space API General Remarks
> ------------------------------
>
> The kernel crypto API is accessible from user space. Currently, the
> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index 103d1f58cb7c..6cd1c478d4be 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -1278,48 +1278,72 @@ config CRYPTO_DF80090A
> tristate
> select CRYPTO_AES
> select CRYPTO_CTR
>
> endmenu
> -menu "Userspace interface"
> +menu "Userspace interface (deprecated)"
>
> config CRYPTO_USER_API
> tristate
>
> config CRYPTO_USER_API_HASH
> - tristate "Hash algorithms"
> + tristate "Hash algorithms (deprecated)"
> depends on NET
> select CRYPTO_HASH
> select CRYPTO_USER_API
> help
> - Enable the userspace interface for hash algorithms.
> + Enable the AF_ALG userspace interface for hash algorithms. This
> + provides unprivileged userspace programs access to arbitrary hash
> + algorithms implemented in the kernel's privileged execution context.
>
> - See Documentation/crypto/userspace-if.rst and
> - https://www.chronox.de/libkcapi/html/index.html
> + This interface is deprecated and is supported only for backwards
> + compatibility. It regularly has vulnerabilities, and the capabilities
> + it provides are redundant with userspace crypto libraries.
> +
> + Enable this only if needed for support for a program that hasn't yet
> + been converted to userspace crypto, for example iwd.
> +
> + See also Documentation/crypto/userspace-if.rst
>
> config CRYPTO_USER_API_SKCIPHER
> - tristate "Symmetric key cipher algorithms"
> + tristate "Symmetric key cipher algorithms (deprecated)"
> depends on NET
> select CRYPTO_SKCIPHER
> select CRYPTO_USER_API
> help
> - Enable the userspace interface for symmetric key cipher algorithms.
> + Enable the AF_ALG userspace interface for symmetric key algorithms.
> + This provides unprivileged userspace programs access to arbitrary
> + symmetric key algorithms implemented in the kernel's privileged
> + execution context.
> +
> + This interface is deprecated and is supported only for backwards
> + compatibility. It regularly has vulnerabilities, and the capabilities
> + it provides are redundant with userspace crypto libraries.
> +
> + Enable this only if needed for support for a program that hasn't yet
> + been converted to userspace crypto, for example iwd, or cryptsetup
> + with certain algorithms.
>
> - See Documentation/crypto/userspace-if.rst and
> - https://www.chronox.de/libkcapi/html/index.html
> + See also Documentation/crypto/userspace-if.rst
>
> config CRYPTO_USER_API_RNG
> - tristate "RNG (random number generator) algorithms"
> + tristate "Random number generation algorithms (deprecated)"
> depends on NET
> select CRYPTO_RNG
> select CRYPTO_USER_API
> help
> - Enable the userspace interface for RNG (random number generator)
> - algorithms.
> + Enable the AF_ALG userspace interface for random number generation
> + (RNG) algorithms. This provides unprivileged userspace programs
> + access to arbitrary RNG algorithms implemented in the kernel's
> + privileged execution context.
>
> - See Documentation/crypto/userspace-if.rst and
> - https://www.chronox.de/libkcapi/html/index.html
> + This interface is deprecated and is supported only for backwards
> + compatibility. It regularly has vulnerabilities, and the capabilities
> + it provides are redundant with userspace crypto libraries as well as
> + the normal kernel RNG (e.g., /dev/urandom and getrandom(2)).
> +
> + See also Documentation/crypto/userspace-if.rst
>
> config CRYPTO_USER_API_RNG_CAVP
> bool "Enable CAVP testing of DRBG"
> depends on CRYPTO_USER_API_RNG && CRYPTO_DRBG
> help
> @@ -1330,20 +1354,29 @@ config CRYPTO_USER_API_RNG_CAVP
>
> This should only be enabled for CAVP testing. You should say
> no unless you know what this is.
>
> config CRYPTO_USER_API_AEAD
> - tristate "AEAD cipher algorithms"
> + tristate "AEAD cipher algorithms (deprecated)"
> depends on NET
> select CRYPTO_AEAD
> select CRYPTO_SKCIPHER
> select CRYPTO_USER_API
> help
> - Enable the userspace interface for AEAD cipher algorithms.
> + Enable the AF_ALG userspace interface for authenticated encryption
> + with associated data (AEAD) algorithms. This provides unprivileged
> + userspace programs access to arbitrary AEAD algorithms implemented in
> + the kernel's privileged execution context.
> +
> + This interface is deprecated and is supported only for backwards
> + compatibility. It regularly has vulnerabilities, and the capabilities
> + it provides are redundant with userspace crypto libraries.
> +
> + Enable this only if needed for support for a program that hasn't yet
> + been converted to userspace crypto, for example iwd.
>
> - See Documentation/crypto/userspace-if.rst and
> - https://www.chronox.de/libkcapi/html/index.html
> + See also Documentation/crypto/userspace-if.rst
>
> config CRYPTO_USER_API_ENABLE_OBSOLETE
> bool "Obsolete cryptographic algorithms"
> depends on CRYPTO_USER_API
> default y
>
> base-commit: 57b8e2d666a31fa201432d58f5fe3469a0dd83ba
> --
> 2.54.0
>
>
* Re: [PATCH] crypto: af_alg - Document the deprecation of AF_ALG
From: Eric Biggers @ 2026-05-06 0:17 UTC
To: Andy Lutomirski
Cc: linux-crypto, Herbert Xu, linux-doc, linux-api, linux-kernel,
netdev, Linus Torvalds
In-Reply-To: <CALCETrVqG+1yErRJjkxvJrf=A+Vu84HTR4Bx1Pcd8G1C0PJcMA@mail.gmail.com>
On Tue, May 05, 2026 at 04:17:18PM -0700, Andy Lutomirski wrote:
> > On Apr 29, 2026, at 6:19 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > AF_ALG is almost completely unnecessary, and it exposes a massive attack
> > surface that hasn't been standing up to modern vulnerability discovery
> > tools. The latest one even has its own website, providing a small
> > Python script that reliably roots most Linux distros: https://copy.fail/
>
> How about adding a configuration option, defaulted on, that requires
> capable(CAP_SYS_ADMIN) to create the socket (and maybe also to bind /
> connect it). And a sysctl to allow the administrator to override this
> in the unlikely event that it’s needed.
>
> IIRC cryptsetup used to and maybe even still does require these
> sockets sometimes and this would let it keep working. And there's all
> the FIPS stuff downthread.
Yes, I'd like to add a default-on requirement to hold a capability in
the initial user namespace. We're trying to figure out the details.
It sounds like iwd runs with CAP_NET_ADMIN, not necessarily
CAP_SYS_ADMIN. So it may need to be:
has_capability_noaudit(current, CAP_NET_ADMIN) || capable(CAP_SYS_ADMIN)
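As a rough illustration of that policy, here is a userspace model in Python. It is only a sketch under stated assumptions: the real check would live in the kernel and would be evaluated against the initial user namespace, which a plain bitmask cannot express; af_alg_allowed and its restrict parameter (standing in for the sysctl override suggested earlier in the thread) are hypothetical names. The capability numbers are the ones defined in <linux/capability.h>.

```python
# Capability bit numbers as defined in <linux/capability.h>.
CAP_NET_ADMIN = 12
CAP_SYS_ADMIN = 21


def has_cap(cap_eff, cap):
    """Return True if bit `cap` is set in the effective capability mask."""
    return bool((cap_eff >> cap) & 1)


def af_alg_allowed(cap_eff, restrict=True):
    """Model of the discussed policy (hypothetical helper).

    restrict=False models an administrator lifting the restriction through
    a sysctl. User namespaces are deliberately not modelled here.
    """
    if not restrict:
        return True
    return has_cap(cap_eff, CAP_NET_ADMIN) or has_cap(cap_eff, CAP_SYS_ADMIN)


def parse_cap_eff(status_text):
    """Extract the CapEff mask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


unprivileged = parse_cap_eff("Name:\tdemo\nCapEff:\t0000000000000000\n")
net_admin_only = 1 << CAP_NET_ADMIN

print(af_alg_allowed(unprivileged))                  # False
print(af_alg_allowed(net_admin_only))                # True
print(af_alg_allowed(unprivileged, restrict=False))  # True
```

A process holding only CAP_NET_ADMIN, as iwd is described as doing, passes in this model, while an ordinary unprivileged process does not unless the restriction is lifted.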
iwd is being discussed in the thread
https://lore.kernel.org/linux-crypto/bcbbef00-5881-421b-8892-7be6c04b832d@gmail.com/
cryptsetup is normally run with CAP_SYS_ADMIN, but not always (e.g.,
'cryptsetup benchmark'). It might be acceptable for users to add sudo
in the exceptional cases. cryptsetup is being discussed in the thread
https://lore.kernel.org/linux-crypto/5dd3be22-13fb-41fb-b469-1ae6472200b1@gmail.com/
bluez needs investigation.
- Eric
* Re: [PATCH] crypto: af_alg - Document the deprecation of AF_ALG
From: Jeff Barnes @ 2026-05-06 14:42 UTC
To: Andy Lutomirski
Cc: Eric Biggers, linux-crypto@vger.kernel.org, Herbert Xu,
linux-doc@vger.kernel.org, linux-api@vger.kernel.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
Linus Torvalds
In-Reply-To: <CALCETrVqG+1yErRJjkxvJrf=A+Vu84HTR4Bx1Pcd8G1C0PJcMA@mail.gmail.com>
Hi,
On May 5 2026, at 7:17 pm, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Apr 29, 2026, at 6:19 PM, Eric Biggers <ebiggers@kernel.org> wrote:
>>
>> AF_ALG is almost completely unnecessary, and it exposes a massive attack
>> surface that hasn't been standing up to modern vulnerability discovery
>> tools. The latest one even has its own website, providing a small
>> Python script that reliably roots most Linux distros: https://copy.fail/
>
> How about adding a configuration option, defaulted on, that requires
> capable(CAP_SYS_ADMIN) to create the socket (and maybe also to bind /
> connect it). And a sysctl to allow the administrator to override this
> in the unlikely event that it’s needed.
>
> IIRC cryptsetup used to and maybe even still does require these
> sockets sometimes and this would let it keep working. And there's all
> the FIPS stuff downthread.
Apologies in advance for the long-winded answer.
The "FIPS stuff" centers on using sha512hmac -> libkcapi -> AF_ALG for
verifying integrity. The early‑boot sha512hmac check that some
distributions use (typically from initramfs) sits at an awkward
intersection of multiple standards, and it may help to clarify where it
actually fits and where it doesn't.
From a standards perspective, FIPS 140‑3 requires a cryptographic module
to perform self‑integrity verification using an approved algorithm and
to prevent the module from entering an operational state on failure. In
the Linux kernel, the cryptographic module is the kernel crypto
subsystem, and these requirements are met by the kernel’s internal
power‑up self‑tests (KATs, etc.) on the crypto code and critical data as
loaded into memory.
FIPS 199 / SP 800‑53 (e.g., SI‑7) impose system‑level integrity
requirements (for Moderate impact systems), i.e., that unauthorized
modification of critical components is prevented or detected and that
failures result in a protective action. These controls are explicitly
technology‑agnostic and are not limited to cryptographic‑module self‑tests.
The sha512hmac check is not the FIPS 140‑3 cryptographic‑module
self‑integrity test. Instead, it has historically been used as a system
integrity control that provides auditors with assurance that the kernel
image containing the cryptographic module has not been modified prior to
execution, and that a failure will halt the boot.
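To make the shape of that check concrete, here is a minimal sketch of a keyed-hash integrity test in Python. It uses Python's own userspace hmac and hashlib modules, not libkcapi or AF_ALG, so it shows only the logic of the check and not the validated-crypto path that the historical mechanism relied on. The key, the image bytes and the boot_decision helper are all hypothetical.

```python
import hashlib
import hmac


def hmac_sha512_hex(key, data):
    """Return the HMAC-SHA-512 of `data` under `key` as a hex string."""
    return hmac.new(key, data, hashlib.sha512).hexdigest()


def verify_image(key, image, expected_hex):
    """Compare the computed HMAC with the recorded one in constant time."""
    actual = hmac_sha512_hex(key, image)
    return hmac.compare_digest(actual, expected_hex)


def boot_decision(key, image, expected_hex):
    """Model the protective action: halt on any integrity mismatch."""
    return "continue" if verify_image(key, image, expected_hex) else "halt"


# Hypothetical inputs; a real check would read the kernel image and the
# recorded HMAC from files shipped alongside it.
KEY = b"hypothetical-integrity-key"
IMAGE = b"pretend kernel image bytes"
RECORDED = hmac_sha512_hex(KEY, IMAGE)

print(boot_decision(KEY, IMAGE, RECORDED))                # continue
print(boot_decision(KEY, IMAGE + b"tampered", RECORDED))  # halt
```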
Although FIPS 140‑3 does not mandate an HMAC over the kernel image, the
early‑boot HMAC became an accepted evidence pattern for satisfying
system‑integrity expectations (FIPS 199 / SI‑7) alongside a kernel
crypto validation. This is why it is often perceived as “required” for
FIPS submissions, even though it is not normatively required by
FIPS 140‑3 itself.
With the deprecation/removal of AF_ALG for this use case, there is no
longer a supported way to perform an early‑boot, userspace‑driven HMAC
using validated kernel crypto without introducing circular dependencies
(e.g., relying on userspace crypto before crypto self‑tests complete).
As a result, there is no drop‑in replacement for sha512hmac that
preserves all of its historical properties.
This is a new development that challenges a long‑standing assumption:
that system‑integrity evidence and cryptographic‑module self‑integrity
can be cleanly separated while still being demonstrated by a single
early‑boot mechanism. That assumption no longer holds given the proposed
kernel interfaces.
A more accurate decomposition (and one that aligns with the intent of
the standards) is to separate integrity enforcement by system phase.
1. Secure Boot (or equivalent platform verification) ensures that a
modified kernel image is not executed at all. This satisfies the
requirement that critical components are not loaded in a modified state
and that integrity failure results in a protective action (boot prevention).
2. IMA (with appraisal and enforcement) ensures that modified
executables, modules, or firmware cannot be loaded or executed once the
kernel is running.
3. Kernel crypto self‑tests continue to satisfy FIPS 140‑3
self‑integrity requirements independently of the above.
Taken together, Secure Boot + IMA provide continuous system‑integrity
enforcement without re‑introducing early‑boot HMACs or AF_ALG
dependencies, while keeping cryptographic‑module self‑integrity
correctly scoped to the kernel crypto subsystem.
The transition away from sha512hmac is therefore not a removal of
integrity enforcement, but a shift from a single, early‑boot mechanism
to a phased integrity model that better reflects the separation of
concerns already present in the standards — even though this separation
was previously masked by the hacky HMAC approach.
This change will require updated documentation and auditor education,
but it reflects the current technical reality and avoids perpetuating an
interface that no longer has a sustainable implementation path.
>
>
>>
>> This isn't sustainable, especially as LLMs have accelerated the rate the
>> vulnerabilities are coming in. The effort that is being put into this
>> thing is vastly disproportional to the few programs that actually use
>> it, and those programs would be better served by userspace code anyway.
>>
>> These issues have been noted in many mailing list discussions already.
>> But until now they haven't been reflected in the documentation or
>> kconfig menu itself, and the vulnerabilities are still coming in.
>>
>> Let's go ahead and document the deprecation.
>>
>> This isn't intended to change anything overnight. After all, most Linux
>> distros won't be able to disable the kconfig options quite yet, mainly
>> because of iwd. But this should create a bit more impetus for these
>> userspace programs to be fixed, and the documentation update should also
>> help prevent more users from appearing.
>>
>> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
>> ---
>>
>> This patch is targeting crypto/master
>>
>> Documentation/crypto/userspace-if.rst | 82 ++++++++++++++++++++-------
>> crypto/Kconfig | 69 ++++++++++++++++------
>> 2 files changed, 113 insertions(+), 38 deletions(-)
>>
>> diff --git a/Documentation/crypto/userspace-if.rst b/Documentation/crypto/userspace-if.rst
>> index 021759198fe7..c39f5c79a5b7 100644
>> --- a/Documentation/crypto/userspace-if.rst
>> +++ b/Documentation/crypto/userspace-if.rst
>> @@ -2,30 +2,72 @@ User Space Interface
>> ====================
>>
>> Introduction
>> ------------
>>
>> -The concepts of the kernel crypto API visible to kernel space is fully
>> -applicable to the user space interface as well. Therefore, the kernel
>> -crypto API high level discussion for the in-kernel use cases applies
>> -here as well.
>> -
>> -The major difference, however, is that user space can only act as a
>> -consumer and never as a provider of a transformation or cipher
>> -algorithm.
>> -
>> -The following covers the user space interface exported by the kernel
>> -crypto API. A working example of this description is libkcapi that can
>> -be obtained from [1]. That library can be used by user space
>> -applications that require cryptographic services from the kernel.
>> -
>> -Some details of the in-kernel kernel crypto API aspects do not apply to
>> -user space, however. This includes the difference between synchronous
>> -and asynchronous invocations. The user space API call is fully
>> -synchronous.
>> -
>> -[1] https://www.chronox.de/libkcapi/index.html
>> +AF_ALG provides unprivileged userspace programs access to arbitrary hash,
>> +symmetric cipher, AEAD, and RNG algorithms that are implemented in kernel-mode
>> +code.
>> +
>> +AF_ALG is insecure and is deprecated. Originally added to the kernel in 2010,
>> +most kernel developers now consider it to be a mistake.
>> +
>> +AF_ALG continues to be supported only for backwards compatibility. On systems
>> +where no programs using AF_ALG remain, the support for it should be disabled by
>> +disabling ``CONFIG_CRYPTO_USER_API_*``.
>> +
>> +Deprecation
>> +-----------
>> +
>> +AF_ALG was originally intended to provide userspace programs access to crypto
>> +accelerators that they wouldn't otherwise have access to.
>> +
>> +However, that capability turned out to not be useful on very many systems. More
>> +significantly, the actual implementation exposes a vastly greater amount of
>> +functionality than that. It actually provides access to all software algorithms.
>> +
>> +This includes arbitrary compositions of different algorithms created via a
>> +complex template system, as well as algorithms that only make sense as internal
>> +implementation details of other algorithms. It also includes full zero-copy
>> +support, which is difficult for the kernel to implement securely.
>> +
>> +Ultimately, these algorithms are just math computations. They use the same
>> +instructions that userspace programs already have access to, just accessed in a
>> +much more convoluted and less efficient way.
>> +
>> +Indeed, userspace code is nearly always what is being used anyway. These same
>> +algorithms are widely implemented in userspace crypto libraries.
>> +
>> +Meanwhile, AF_ALG hasn't been withstanding modern vulnerability discovery tools
>> +such as syzbot and large language models. It receives a steady stream of CVEs.
>> +Some of the examples include:
>> +
>> +- CVE-2026-31677
>> +- CVE-2026-31431 (https://copy.fail)
>> +- CVE-2025-38079
>> +- CVE-2025-37808
>> +- CVE-2024-26824
>> +- CVE-2022-48781
>> +- CVE-2019-8912
>> +- CVE-2018-14619
>> +- CVE-2017-18075
>> +- CVE-2017-17806
>> +- CVE-2017-17805
>> +- CVE-2016-10147
>> +- CVE-2015-8970
>> +- CVE-2015-3331
>> +- CVE-2014-9644
>> +- CVE-2013-7421
>> +- CVE-2011-4081
>> +
>> +It is recommended that, whenever possible, userspace programs be migrated to
>> +userspace crypto code (which again, is what is normally used anyway) and
>> +``CONFIG_CRYPTO_USER_API_*`` be disabled. On systems that use SELinux, SELinux
>> +can also be used to restrict the use of AF_ALG to trusted programs.
>> +
>> +The remainder of this documentation provides the historical documentation for
>> +the deprecated AF_ALG interface.
>>
>> User Space API General Remarks
>> ------------------------------
>>
>> The kernel crypto API is accessible from user space. Currently, the
>> diff --git a/crypto/Kconfig b/crypto/Kconfig
>> index 103d1f58cb7c..6cd1c478d4be 100644
>> --- a/crypto/Kconfig
>> +++ b/crypto/Kconfig
>> @@ -1278,48 +1278,72 @@ config CRYPTO_DF80090A
>> tristate
>> select CRYPTO_AES
>> select CRYPTO_CTR
>>
>> endmenu
>> -menu "Userspace interface"
>> +menu "Userspace interface (deprecated)"
>>
>> config CRYPTO_USER_API
>> tristate
>>
>> config CRYPTO_USER_API_HASH
>> - tristate "Hash algorithms"
>> + tristate "Hash algorithms (deprecated)"
>> depends on NET
>> select CRYPTO_HASH
>> select CRYPTO_USER_API
>> help
>> - Enable the userspace interface for hash algorithms.
>> + Enable the AF_ALG userspace interface for hash algorithms. This
>> + provides unprivileged userspace programs access to arbitrary hash
>> + algorithms implemented in the kernel's privileged execution context.
>>
>> - See Documentation/crypto/userspace-if.rst and
>> - https://www.chronox.de/libkcapi/html/index.html
>> + This interface is deprecated and is supported only for backwards
>> + compatibility. It regularly has vulnerabilities, and the capabilities
>> + it provides are redundant with userspace crypto libraries.
>> +
>> + Enable this only if needed for support for a program that hasn't yet
>> + been converted to userspace crypto, for example iwd.
>> +
>> + See also Documentation/crypto/userspace-if.rst
>>
>> config CRYPTO_USER_API_SKCIPHER
>> - tristate "Symmetric key cipher algorithms"
>> + tristate "Symmetric key cipher algorithms (deprecated)"
>> depends on NET
>> select CRYPTO_SKCIPHER
>> select CRYPTO_USER_API
>> help
>> - Enable the userspace interface for symmetric key cipher algorithms.
>> + Enable the AF_ALG userspace interface for symmetric key algorithms.
>> + This provides unprivileged userspace programs access to arbitrary
>> + symmetric key algorithms implemented in the kernel's privileged
>> + execution context.
>> +
>> + This interface is deprecated and is supported only for backwards
>> + compatibility. It regularly has vulnerabilities, and the capabilities
>> + it provides are redundant with userspace crypto libraries.
>> +
>> + Enable this only if needed for support for a program that hasn't yet
>> + been converted to userspace crypto, for example iwd, or cryptsetup
>> + with certain algorithms.
>>
>> - See Documentation/crypto/userspace-if.rst and
>> - https://www.chronox.de/libkcapi/html/index.html
>> + See also Documentation/crypto/userspace-if.rst
>>
>> config CRYPTO_USER_API_RNG
>> - tristate "RNG (random number generator) algorithms"
>> + tristate "Random number generation algorithms (deprecated)"
>> depends on NET
>> select CRYPTO_RNG
>> select CRYPTO_USER_API
>> help
>> - Enable the userspace interface for RNG (random number generator)
>> - algorithms.
>> + Enable the AF_ALG userspace interface for random number generation
>> + (RNG) algorithms. This provides unprivileged userspace programs
>> + access to arbitrary RNG algorithms implemented in the kernel's
>> + privileged execution context.
>>
>> - See Documentation/crypto/userspace-if.rst and
>> - https://www.chronox.de/libkcapi/html/index.html
>> + This interface is deprecated and is supported only for backwards
>> + compatibility. It regularly has vulnerabilities, and the capabilities
>> + it provides are redundant with userspace crypto libraries as well as
>> + the normal kernel RNG (e.g., /dev/urandom and getrandom(2)).
>> +
>> + See also Documentation/crypto/userspace-if.rst
>>
>> config CRYPTO_USER_API_RNG_CAVP
>> bool "Enable CAVP testing of DRBG"
>> depends on CRYPTO_USER_API_RNG && CRYPTO_DRBG
>> help
>> @@ -1330,20 +1354,29 @@ config CRYPTO_USER_API_RNG_CAVP
>>
>> This should only be enabled for CAVP testing. You should say
>> no unless you know what this is.
>>
>> config CRYPTO_USER_API_AEAD
>> - tristate "AEAD cipher algorithms"
>> + tristate "AEAD cipher algorithms (deprecated)"
>> depends on NET
>> select CRYPTO_AEAD
>> select CRYPTO_SKCIPHER
>> select CRYPTO_USER_API
>> help
>> - Enable the userspace interface for AEAD cipher algorithms.
>> + Enable the AF_ALG userspace interface for authenticated encryption
>> + with associated data (AEAD) algorithms. This provides unprivileged
>> + userspace programs access to arbitrary AEAD algorithms implemented in
>> + the kernel's privileged execution context.
>> +
>> + This interface is deprecated and is supported only for backwards
>> + compatibility. It regularly has vulnerabilities, and the capabilities
>> + it provides are redundant with userspace crypto libraries.
>> +
>> + Enable this only if needed for support for a program that hasn't yet
>> + been converted to userspace crypto, for example iwd.
>>
>> - See Documentation/crypto/userspace-if.rst and
>> - https://www.chronox.de/libkcapi/html/index.html
>> + See also Documentation/crypto/userspace-if.rst
>>
>> config CRYPTO_USER_API_ENABLE_OBSOLETE
>> bool "Obsolete cryptographic algorithms"
>> depends on CRYPTO_USER_API
>> default y
>>
>> base-commit: 57b8e2d666a31fa201432d58f5fe3469a0dd83ba
>> --
>> 2.54.0
>>
>>
>
* [PATCH v14 00/15] Exposing case folding behavior
From: Chuck Lever @ 2026-05-07 8:52 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Darrick J. Wong, Roland Mainz, Steve French
Christian, let's lock this one in. I will post subsequent changes
as delta patches.
Following on from:
https://lore.kernel.org/linux-nfs/20251021-zypressen-bazillus-545a44af57fd@brauner/T/#m0ba197d75b7921d994cf284f3cef3a62abb11aaa
I'm attempting to implement enough support in the Linux VFS to
enable file services like NFSD and ksmbd (and user space
equivalents) to provide the actual status of case folding support
in local file systems. The default behavior for local file systems
not explicitly supported in this series is to reflect the usual
POSIX behaviors:
case-insensitive = false
case-nonpreserving = false
The case-insensitivity and case-nonpreserving booleans can be
consumed immediately by NFSD. These two attributes have been part of
the NFSv3 and NFSv4 protocols for decades, in order to support NFS
client implementations on non-POSIX systems.
Support for user space file servers is why this series exposes case
folding information via a user-space API. I don't know of any other
category of user-space application that requires access to case
folding info.
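For illustration, a user-space file server might consume these flags
roughly as follows. This is a minimal sketch and not code from the
series: it assumes the FS_XFLAG_CASEFOLD and FS_XFLAG_CASENONPRESERVING
values proposed in patch 02 (defined locally in case the installed UAPI
headers predate them), and the helper names decode_case_xflags() and
query_case_info() are hypothetical.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/* Values as proposed in patch 02; older UAPI headers lack them. */
#ifndef FS_XFLAG_CASEFOLD
#define FS_XFLAG_CASEFOLD		0x00040000
#endif
#ifndef FS_XFLAG_CASENONPRESERVING
#define FS_XFLAG_CASENONPRESERVING	0x00080000
#endif

/* Hypothetical helper: turn xflags into the two NFS-style booleans. */
static void decode_case_xflags(unsigned int xflags,
			       int *case_insensitive,
			       int *case_nonpreserving)
{
	*case_insensitive = !!(xflags & FS_XFLAG_CASEFOLD);
	*case_nonpreserving = !!(xflags & FS_XFLAG_CASENONPRESERVING);
}

/*
 * Hypothetical helper: query one directory.
 * Returns 0 on success, -1 on error (errno is left set by open/ioctl).
 */
static int query_case_info(const char *dir,
			   int *case_insensitive,
			   int *case_nonpreserving)
{
	struct fsxattr fsx = { 0 };
	int fd, ret;

	fd = open(dir, O_RDONLY | O_DIRECTORY);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, FS_IOC_FSGETXATTR, &fsx);
	close(fd);
	if (ret < 0)
		return -1;	/* e.g. the filesystem has no fileattr support */

	decode_case_xflags(fsx.fsx_xflags, case_insensitive,
			   case_nonpreserving);
	return 0;
}
```

A caller that gets -1 back would presumably fall back to the POSIX
defaults listed above (both booleans false); that policy is a choice
for the caller and is not dictated by the sketch.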
The Linux NFS community has a growing interest in supporting NFS
clients on Windows and macOS platforms, where file name behavior does
not align with traditional POSIX semantics.
One example of a Windows-based NFS client is [1]. This client
implementation explicitly requires servers to report
FATTR4_WORD0_CASE_INSENSITIVE = TRUE for proper operation, a hard
requirement for Windows client interoperability because Windows
applications expect case-insensitive behavior. When an NFS client
knows the server is case-insensitive, it can avoid issuing multiple
LOOKUP/READDIR requests to search for case variants, and applications
like Win32 programs work correctly without manual workarounds or
code changes.
Even the Linux client can take advantage of this information. Trond
merged patches 4 years ago [2] that introduce support for case
insensitivity, in support of the Hammerspace NFS server. In
particular, when a client detects a case-insensitive NFS share,
negative dentry caching must be disabled (a lookup for "FILE.TXT"
failing shouldn't cache a negative entry when "file.txt" exists)
and directory change invalidation must clear all cached case-folded
file name variants.
Hammerspace servers and several other NFS server implementations
operate in multi-protocol environments, where a single file service
instance caters to both NFS and SMB clients. In those cases, things
work more smoothly for everyone when the NFS client can see and adapt
to the case folding behavior that SMB users rely on and expect. NFSD
needs to support the case-insensitivity and case-nonpreserving
booleans properly in order to participate as a first-class citizen
in such environments.
[1] https://github.com/kofemann/ms-nfs41-client
[2] https://patchwork.kernel.org/project/linux-nfs/cover/20211217203658.439352-1-trondmy@kernel.org/
---
Changes since v13:
- Address findings from sashiko (gemini-3.1):
- ntfs3: Drop fileattr_get from symlink and special inode ops
- nfsd: Probe nfsd_get_case_info() under kernel creds to avoid
spurious NFS4ERR_ACCESS from per-client MAC policy
Changes since v12:
- Address findings from sashiko (gemini-3.1):
- cifs: Restrict case-handling flags to directories per UAPI
- nfs: Clear case caps before PATHCONF so a failed reply
does not retain stale bits from the prior probe
- nfsd: Document the parent-resolution corner cases of
nfsd_get_case_info() (single-file exports, disconnected
dentries, hardlinks) in the v3 and v4 commit messages
Changes since v11:
- isofs: Wire .fileattr_get only on directory inodes, since
NFSD and ksmbd query casefolding on directories (Jan Kara)
- xfs, hfsplus: Drop the FS_CASEFOLD_FL fileattr_get mask;
admit the bit through fileattr_set's allowlist instead
- Address findings from sashiko(gemini-3) and gpt-5.5:
- cifs: Wire .fileattr_get on cifs_namespace_inode_operations
so DFS referral / automount directories report case handling
- fat, ntfs3: Fill FS_IMMUTABLE_FL in fileattr_get
- hfsplus: Hide FS_CASEFOLD_FL from the legacy flags view so
chattr round-trips do not hit the setflags whitelist
- nfs: Clear NFS_CAP_CASE_INSENSITIVE and
NFS_CAP_CASE_NONPRESERVING before re-OR'ing in the v3 and
v4 probe paths so re-probe / TSM does not retain stale caps
- nfsd: Switch nfsd_get_case_info() to errno return so
v3 PATHCONF and v4 GETATTR can apply version-appropriate
policy on failure
- nfsd: Use dget_parent() in v4 case-attr probe to keep
the parent dentry referenced across the query
- isofs: Report FS_XFLAG_CASENONPRESERVING for map=n/map=a
Changes since v10:
- cifs: Source case-handling flags from the server's cached
FS_ATTRIBUTE_INFORMATION reply instead of the nocase mount
option, with a nocase fallback when the reply is absent
- Address findings from sashiko(gemini-3) and gpt-5.5:
- nfs: Skip pathconf case bits on NFSv4 (set via FATTR4_CASE_*
instead)
- xfs: Hide FS_CASEFOLD_FL from the legacy flags view so
chattr round-trips do not hit the setflags whitelist
- ext4, f2fs: Drop redundant fileattr_get patches; the
FS_CASEFOLD_FL translation in fileattr_fill_flags() already
reports FS_XFLAG_CASEFOLD for casefolded directories
- nfsd: Report FATTR4_HOMOGENEOUS = FALSE when the exported
filesystem has a Unicode encoding, since per-directory
casefold makes the fs-scoped case attributes inhomogeneous
- nfsd: Document in nfsd_get_case_info() why -ENOIOCTLCMD and
-ENOTTY are swallowed while other errors propagate
- fat: Honor vfat 'check=strict' when reporting FS_XFLAG_CASEFOLD
- Set FS_CASEFOLD_FL so FS_IOC_GETFLAGS reflects case-insensitive
mount
- isofs: Register fileattr_get on regular file and symlink inodes,
not just directories
- nfsd: Query NFSv4 FATTR4_CASE_* from the parent directory for
non-directory objects, since casefold lives on the directory
Changes since v9:
- nfs: always probe PATHCONF for case caps. Default to case-
preserving when the server does not report case_preserving
- nfsd, ksmbd: tolerate -ENOTTY from vfs_fileattr_get() so
overlayfs exports on backing filesystems without fileattr_get
do not fail the RPC
- xfs: map FS_XFLAG_CASEFOLD inside xfs_ip2xflags() so BULKSTAT
and FS_IOC_FSGETXATTR report the flag consistently
- vboxsf: reject a short host reply to SHFL_INFO_VOLUME before
trusting volinfo.properties.case_sensitive
Changes since v8:
- Rebase on v7.0-rc1
Changes since v7:
- Split file_attr initialization changes into a separate patch
Changes since v6:
- Remove the memset from vfs_fileattr_get
Changes since v5:
- Finish the conversion to FS_XFLAGs
- NFSv4 GETATTR now clears the attr mask bit if nfsd_get_case_info()
fails
Changes since v4:
- Observe the MSDOS "nocase" mount option
- Define new FS_XFLAGs for the user API
Changes since v3:
- Change fa->case_preserving to fa_case_nonpreserving
- VFAT is case preserving
- Make new fields available to user space
Changes since v2:
- Remove unicode labels
- Replace vfs_get_case_info
- Add support for several more local file system implementations
- Add support for in-kernel SMB server
Changes since RFC:
- Use file_getattr instead of statx
- Postpone exposing Unicode version until later
- Support NTFS and ext4 in addition to FAT
- Support NFSv4 fattr4 in addition to NFSv3 PATHCONF
---
Chuck Lever (15):
fs: Move file_kattr initialization to callers
fs: Add case sensitivity flags to file_kattr
fat: Implement fileattr_get for case sensitivity
exfat: Implement fileattr_get for case sensitivity
ntfs3: Implement fileattr_get for case sensitivity
hfs: Implement fileattr_get for case sensitivity
hfsplus: Report case sensitivity in fileattr_get
xfs: Report case sensitivity in fileattr_get
cifs: Implement fileattr_get for case sensitivity
nfs: Implement fileattr_get for case sensitivity
vboxsf: Implement fileattr_get for case sensitivity
isofs: Implement fileattr_get for case sensitivity
nfsd: Report export case-folding via NFSv3 PATHCONF
nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
fs/exfat/exfat_fs.h | 2 +
fs/exfat/file.c | 18 ++++++++-
fs/exfat/namei.c | 1 +
fs/fat/fat.h | 3 ++
fs/fat/file.c | 36 +++++++++++++++++
fs/fat/namei_msdos.c | 1 +
fs/fat/namei_vfat.c | 1 +
fs/file_attr.c | 16 ++++----
fs/hfs/dir.c | 1 +
fs/hfs/hfs_fs.h | 2 +
fs/hfs/inode.c | 14 +++++++
fs/hfsplus/inode.c | 16 +++++++-
fs/isofs/dir.c | 16 ++++++++
fs/isofs/isofs.h | 3 ++
fs/nfs/client.c | 21 +++++++---
fs/nfs/inode.c | 15 +++++++
fs/nfs/internal.h | 3 ++
fs/nfs/namespace.c | 2 +
fs/nfs/nfs3proc.c | 2 +
fs/nfs/nfs3xdr.c | 7 +++-
fs/nfs/nfs4proc.c | 10 +++--
fs/nfs/proc.c | 3 ++
fs/nfs/symlink.c | 3 ++
fs/nfsd/nfs3proc.c | 36 +++++++++++++----
fs/nfsd/nfs4xdr.c | 52 +++++++++++++++++++++++--
fs/nfsd/vfs.c | 88 ++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/vfs.h | 3 ++
fs/nfsd/xdr3.h | 4 +-
fs/ntfs3/file.c | 29 ++++++++++++++
fs/ntfs3/namei.c | 1 +
fs/ntfs3/ntfs_fs.h | 1 +
fs/smb/client/cifsfs.c | 53 +++++++++++++++++++++++++
fs/smb/client/cifsfs.h | 3 ++
fs/smb/client/namespace.c | 1 +
fs/smb/server/smb2pdu.c | 30 +++++++++++---
fs/vboxsf/dir.c | 1 +
fs/vboxsf/file.c | 6 ++-
fs/vboxsf/super.c | 7 ++++
fs/vboxsf/utils.c | 30 ++++++++++++++
fs/vboxsf/vfsmod.h | 6 +++
fs/xfs/libxfs/xfs_inode_util.c | 2 +
fs/xfs/xfs_ioctl.c | 22 +++++++++--
include/linux/fileattr.h | 3 +-
include/linux/nfs_fs_sb.h | 2 +-
include/linux/nfs_xdr.h | 2 +
include/uapi/linux/fs.h | 7 ++++
46 files changed, 536 insertions(+), 49 deletions(-)
---
base-commit: 6596a02b207886e9e00bb0161c7fd59fea53c081
change-id: 20260422-case-sensitivity-5cbffc8f1558
Best regards,
--
Chuck Lever <chuck.lever@oracle.com>
* [PATCH v14 01/15] fs: Move file_kattr initialization to callers
From: Chuck Lever @ 2026-05-07 8:52 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Darrick J. Wong, Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
fileattr_fill_xflags() and fileattr_fill_flags() memset the
entire file_kattr struct before populating select fields, so
callers cannot pre-set fields in fa->fsx_xflags without having
their values clobbered. Darrick Wong noted that a function
named "fill_xflags" touching more than xflags forces callers
to know implementation details beyond its apparent scope.
Drop the memset from both fill functions and initialize at the
entry points instead: ioctl_setflags(), ioctl_fssetxattr(),
the file_setattr() syscall, and xfs_ioc_fsgetxattra() now
declare fa with an aggregate initializer. ioctl_getflags(),
ioctl_fsgetxattr(), and the file_getattr() syscall already
aggregate-initialize fa to pass flags_valid/fsx_valid hints
into vfs_fileattr_get().
Subsequent patches rely on this so that ->fileattr_get()
handlers can set case-sensitivity flags (FS_XFLAG_CASEFOLD,
FS_XFLAG_CASENONPRESERVING) in fa->fsx_xflags before the fill
functions run.
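The clobbering problem can be modeled outside the kernel. The sketch
below is a toy, not the kernel code: struct toy_kattr, the TOY_*
constants, and both functions are hypothetical stand-ins that mimic
only the shape of fileattr_fill_flags() before and after this patch.

```c
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for struct file_kattr; only the fields needed here. */
struct toy_kattr {
	unsigned int flags;
	unsigned int fsx_xflags;
	bool flags_valid;
};

#define TOY_IMMUTABLE_FL	0x00000010
#define TOY_XFLAG_IMMUTABLE	0x00000008
#define TOY_XFLAG_CASEFOLD	0x00040000

/* Old shape: zero the whole struct, then fill. */
static void toy_fill_flags_old(struct toy_kattr *fa, unsigned int flags)
{
	memset(fa, 0, sizeof(*fa));	/* discards anything pre-set */
	fa->flags_valid = true;
	fa->flags = flags;
	if (fa->flags & TOY_IMMUTABLE_FL)
		fa->fsx_xflags |= TOY_XFLAG_IMMUTABLE;
}

/* New shape: the caller zero-initializes; fill only ORs in xflags. */
static void toy_fill_flags_new(struct toy_kattr *fa, unsigned int flags)
{
	fa->flags_valid = true;
	fa->flags = flags;
	if (fa->flags & TOY_IMMUTABLE_FL)
		fa->fsx_xflags |= TOY_XFLAG_IMMUTABLE;
}
```

With a struct that has TOY_XFLAG_CASEFOLD pre-set, the old shape
leaves only TOY_XFLAG_IMMUTABLE in fsx_xflags, while the new shape
leaves both bits set.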
Suggested-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/file_attr.c | 12 ++++--------
fs/xfs/xfs_ioctl.c | 2 +-
2 files changed, 5 insertions(+), 9 deletions(-)
diff --git a/fs/file_attr.c b/fs/file_attr.c
index da983e105d70..f429da66a317 100644
--- a/fs/file_attr.c
+++ b/fs/file_attr.c
@@ -15,12 +15,10 @@
* @fa: fileattr pointer
* @xflags: FS_XFLAG_* flags
*
- * Set ->fsx_xflags, ->fsx_valid and ->flags (translated xflags). All
- * other fields are zeroed.
+ * Set ->fsx_xflags, ->fsx_valid and ->flags (translated xflags).
*/
void fileattr_fill_xflags(struct file_kattr *fa, u32 xflags)
{
- memset(fa, 0, sizeof(*fa));
fa->fsx_valid = true;
fa->fsx_xflags = xflags;
if (fa->fsx_xflags & FS_XFLAG_IMMUTABLE)
@@ -48,11 +46,9 @@ EXPORT_SYMBOL(fileattr_fill_xflags);
* @flags: FS_*_FL flags
*
* Set ->flags, ->flags_valid and ->fsx_xflags (translated flags).
- * All other fields are zeroed.
*/
void fileattr_fill_flags(struct file_kattr *fa, u32 flags)
{
- memset(fa, 0, sizeof(*fa));
fa->flags_valid = true;
fa->flags = flags;
if (fa->flags & FS_SYNC_FL)
@@ -325,7 +321,7 @@ int ioctl_setflags(struct file *file, unsigned int __user *argp)
{
struct mnt_idmap *idmap = file_mnt_idmap(file);
struct dentry *dentry = file->f_path.dentry;
- struct file_kattr fa;
+ struct file_kattr fa = {};
unsigned int flags;
int err;
@@ -357,7 +353,7 @@ int ioctl_fssetxattr(struct file *file, void __user *argp)
{
struct mnt_idmap *idmap = file_mnt_idmap(file);
struct dentry *dentry = file->f_path.dentry;
- struct file_kattr fa;
+ struct file_kattr fa = {};
int err;
err = copy_fsxattr_from_user(&fa, argp);
@@ -431,7 +427,7 @@ SYSCALL_DEFINE5(file_setattr, int, dfd, const char __user *, filename,
struct path filepath __free(path_put) = {};
unsigned int lookup_flags = 0;
struct file_attr fattr;
- struct file_kattr fa;
+ struct file_kattr fa = {};
int error;
BUILD_BUG_ON(sizeof(struct file_attr) < FILE_ATTR_SIZE_VER0);
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 46e234863644..ed9b4846c05f 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -517,7 +517,7 @@ xfs_ioc_fsgetxattra(
xfs_inode_t *ip,
void __user *arg)
{
- struct file_kattr fa;
+ struct file_kattr fa = {};
xfs_ilock(ip, XFS_ILOCK_SHARED);
xfs_fill_fsxattr(ip, XFS_ATTR_FORK, &fa);
--
2.53.0
* [PATCH v14 02/15] fs: Add case sensitivity flags to file_kattr
From: Chuck Lever @ 2026-05-07 8:52 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Darrick J. Wong, Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Enable upper layers such as NFSD to retrieve case sensitivity
information from file systems by adding FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING flags.
Filesystems report case-insensitive or case-nonpreserving behavior
by setting these flags directly in fa->fsx_xflags. The default
(flags unset) indicates POSIX semantics: case-sensitive and
case-preserving. Both flags are added to FS_XFLAG_RDONLY_MASK so
FS_IOC_FSSETXATTR silently strips them, keeping the new xflags
strictly a reporting interface. Callers that want to toggle
casefolding continue to use FS_IOC_SETFLAGS with FS_CASEFOLD_FL,
the established UAPI on filesystems that support the operation
(ext4 and f2fs on empty directories).
Case sensitivity information is exported to userspace via the
fa_xflags field in the FS_IOC_FSGETXATTR ioctl and file_getattr()
system call.
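The read-only treatment amounts to masking the new bits out of any
value user space tries to set. The following is a sketch of that
masking only: the TOY_* names and toy_strip_rdonly() are hypothetical,
the numeric values mirror the UAPI definitions plus the two added by
this patch, and where exactly the kernel applies the mask in the set
path is not reproduced here.

```c
/* Mirrors of the relevant FS_XFLAG_* values (toy names). */
#define TOY_XFLAG_PREALLOC		0x00000002u
#define TOY_XFLAG_NODUMP		0x00000080u
#define TOY_XFLAG_VERITY		0x00020000u
#define TOY_XFLAG_CASEFOLD		0x00040000u	/* new in this patch */
#define TOY_XFLAG_CASENONPRESERVING	0x00080000u	/* new in this patch */
#define TOY_XFLAG_HASATTR		0x80000000u

/* Same membership as FS_XFLAG_RDONLY_MASK after this patch. */
#define TOY_XFLAG_RDONLY_MASK \
	(TOY_XFLAG_PREALLOC | TOY_XFLAG_HASATTR | TOY_XFLAG_VERITY | \
	 TOY_XFLAG_CASEFOLD | TOY_XFLAG_CASENONPRESERVING)

/* Drop read-only bits from a user-supplied xflags value. */
static unsigned int toy_strip_rdonly(unsigned int requested)
{
	return requested & ~TOY_XFLAG_RDONLY_MASK;
}
```

A request carrying TOY_XFLAG_NODUMP together with TOY_XFLAG_CASEFOLD
comes out as TOY_XFLAG_NODUMP alone, so the case bits can be reported
but never set through this path.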
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/file_attr.c | 4 ++++
include/linux/fileattr.h | 3 ++-
include/uapi/linux/fs.h | 7 +++++++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/fs/file_attr.c b/fs/file_attr.c
index f429da66a317..bfb00d256dd5 100644
--- a/fs/file_attr.c
+++ b/fs/file_attr.c
@@ -37,6 +37,8 @@ void fileattr_fill_xflags(struct file_kattr *fa, u32 xflags)
fa->flags |= FS_PROJINHERIT_FL;
if (fa->fsx_xflags & FS_XFLAG_VERITY)
fa->flags |= FS_VERITY_FL;
+ if (fa->fsx_xflags & FS_XFLAG_CASEFOLD)
+ fa->flags |= FS_CASEFOLD_FL;
}
EXPORT_SYMBOL(fileattr_fill_xflags);
@@ -67,6 +69,8 @@ void fileattr_fill_flags(struct file_kattr *fa, u32 flags)
fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
if (fa->flags & FS_VERITY_FL)
fa->fsx_xflags |= FS_XFLAG_VERITY;
+ if (fa->flags & FS_CASEFOLD_FL)
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
}
EXPORT_SYMBOL(fileattr_fill_flags);
diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
index 3780904a63a6..58044b598016 100644
--- a/include/linux/fileattr.h
+++ b/include/linux/fileattr.h
@@ -16,7 +16,8 @@
/* Read-only inode flags */
#define FS_XFLAG_RDONLY_MASK \
- (FS_XFLAG_PREALLOC | FS_XFLAG_HASATTR | FS_XFLAG_VERITY)
+ (FS_XFLAG_PREALLOC | FS_XFLAG_HASATTR | FS_XFLAG_VERITY | \
+ FS_XFLAG_CASEFOLD | FS_XFLAG_CASENONPRESERVING)
/* Flags to indicate valid value of fsx_ fields */
#define FS_XFLAG_VALUES_MASK \
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 13f71202845e..2ea4c81df08f 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -254,6 +254,13 @@ struct file_attr {
#define FS_XFLAG_DAX 0x00008000 /* use DAX for IO */
#define FS_XFLAG_COWEXTSIZE 0x00010000 /* CoW extent size allocator hint */
#define FS_XFLAG_VERITY 0x00020000 /* fs-verity enabled */
+/*
+ * Case handling flags (read-only, cannot be set via ioctl).
+ * Default (neither set) indicates POSIX semantics: case-sensitive
+ * lookups and case-preserving storage.
+ */
+#define FS_XFLAG_CASEFOLD 0x00040000 /* case-insensitive lookups */
+#define FS_XFLAG_CASENONPRESERVING 0x00080000 /* case not preserved */
#define FS_XFLAG_HASATTR 0x80000000 /* no DIFLAG for this */
/* the read-only stuff doesn't really belong here, but any other place is
--
2.53.0
* [PATCH v14 03/15] fat: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-05-07 8:52 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Report FAT's case sensitivity behavior via the FS_XFLAG_CASEFOLD
and FS_XFLAG_CASENONPRESERVING flags. FAT filesystems are
case-insensitive by default.
MSDOS supports a 'nocase' mount option that enables case-sensitive
behavior; check this option when reporting case sensitivity.
VFAT long filename entries preserve case; without VFAT, only
uppercased 8.3 short names are stored. MSDOS with 'nocase' also
preserves case since the name-formatting code skips upcasing when
'nocase' is set. Check both options when reporting case preservation.
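As a reading aid, the decision logic can be restated as a standalone
function. This is a sketch only: toy_fat_case_xflags() and the TOY_*
names are hypothetical, the parameters stand in for the isvfat,
name_check and nocase mount options used by fat_fileattr_get() in the
diff, and the legacy FS_CASEFOLD_FL and immutable handling are left
out.

```c
#include <stdbool.h>

#define TOY_XFLAG_CASEFOLD		0x00040000u
#define TOY_XFLAG_CASENONPRESERVING	0x00080000u

/* Returns the case-handling xflags implied by the mount options. */
static unsigned int toy_fat_case_xflags(bool isvfat, char name_check,
					bool nocase)
{
	unsigned int xflags = 0;
	bool case_sensitive;

	if (isvfat)
		case_sensitive = (name_check == 's');	/* check=strict */
	else
		case_sensitive = nocase;		/* msdos 'nocase' */

	if (!case_sensitive) {
		xflags |= TOY_XFLAG_CASEFOLD;
		/* Only non-VFAT mounts store upcased 8.3 names. */
		if (!isvfat)
			xflags |= TOY_XFLAG_CASENONPRESERVING;
	}
	return xflags;
}
```

Under these assumptions the four cases are: VFAT without check=strict
reports CASEFOLD only; VFAT with check=strict reports neither flag;
MSDOS without 'nocase' reports both flags; MSDOS with 'nocase' reports
neither.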
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/fat/fat.h | 3 +++
fs/fat/file.c | 36 ++++++++++++++++++++++++++++++++++++
fs/fat/namei_msdos.c | 1 +
fs/fat/namei_vfat.c | 1 +
4 files changed, 41 insertions(+)
diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 5a58f0bf8ce8..99ed9228a677 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -10,6 +10,8 @@
#include <linux/fs_context.h>
#include <linux/fs_parser.h>
+struct file_kattr;
+
/*
* vfat shortname flags
*/
@@ -408,6 +410,7 @@ extern void fat_truncate_blocks(struct inode *inode, loff_t offset);
extern int fat_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
u32 request_mask, unsigned int flags);
+int fat_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
extern int fat_file_fsync(struct file *file, loff_t start, loff_t end,
int datasync);
diff --git a/fs/fat/file.c b/fs/fat/file.c
index becccdd2e501..37e7049b4c8c 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -17,6 +17,7 @@
#include <linux/fsnotify.h>
#include <linux/security.h>
#include <linux/falloc.h>
+#include <linux/fileattr.h>
#include "fat.h"
static long fat_fallocate(struct file *file, int mode,
@@ -398,6 +399,40 @@ void fat_truncate_blocks(struct inode *inode, loff_t offset)
fat_flush_inodes(inode->i_sb, inode, NULL);
}
+int fat_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+ struct msdos_sb_info *sbi = MSDOS_SB(dentry->d_sb);
+ bool case_sensitive;
+
+ /*
+ * FAT filesystems are case-insensitive by default. VFAT
+ * becomes case-sensitive when mounted with 'check=strict',
+ * which installs vfat_dentry_ops. MSDOS has no such option;
+ * its 'nocase' mount option selects case-sensitive matching.
+ *
+ * VFAT long filename entries preserve case. Without VFAT, only
+ * uppercased 8.3 short names are stored. MSDOS with 'nocase'
+ * also preserves case.
+ */
+ if (sbi->options.isvfat)
+ case_sensitive = sbi->options.name_check == 's';
+ else
+ case_sensitive = sbi->options.nocase;
+
+ if (!case_sensitive) {
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ if (!sbi->options.isvfat)
+ fa->fsx_xflags |= FS_XFLAG_CASENONPRESERVING;
+ }
+ if (d_inode(dentry)->i_flags & S_IMMUTABLE) {
+ fa->fsx_xflags |= FS_XFLAG_IMMUTABLE;
+ fa->flags |= FS_IMMUTABLE_FL;
+ }
+ return 0;
+}
+EXPORT_SYMBOL_GPL(fat_fileattr_get);
+
int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, u32 request_mask, unsigned int flags)
{
@@ -575,5 +610,6 @@ EXPORT_SYMBOL_GPL(fat_setattr);
const struct inode_operations fat_file_inode_operations = {
.setattr = fat_setattr,
.getattr = fat_getattr,
+ .fileattr_get = fat_fileattr_get,
.update_time = fat_update_time,
};
diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
index 4cc65f330fb7..0fd2971ad4b1 100644
--- a/fs/fat/namei_msdos.c
+++ b/fs/fat/namei_msdos.c
@@ -644,6 +644,7 @@ static const struct inode_operations msdos_dir_inode_operations = {
.rename = msdos_rename,
.setattr = fat_setattr,
.getattr = fat_getattr,
+ .fileattr_get = fat_fileattr_get,
.update_time = fat_update_time,
};
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index 918b3756674c..e909447873e3 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -1185,6 +1185,7 @@ static const struct inode_operations vfat_dir_inode_operations = {
.rename = vfat_rename2,
.setattr = fat_setattr,
.getattr = fat_getattr,
+ .fileattr_get = fat_fileattr_get,
.update_time = fat_update_time,
};
--
2.53.0
* [PATCH v14 04/15] exfat: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-05-07 8:52 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Report exFAT's case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. exFAT compares names through the volume's upcase table; in
practice that table folds case, and case is preserved at rest.
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/exfat/exfat_fs.h | 2 ++
fs/exfat/file.c | 18 ++++++++++++++++--
fs/exfat/namei.c | 1 +
3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/fs/exfat/exfat_fs.h b/fs/exfat/exfat_fs.h
index 89ef5368277f..aff4dcd4e75a 100644
--- a/fs/exfat/exfat_fs.h
+++ b/fs/exfat/exfat_fs.h
@@ -496,6 +496,8 @@ int exfat_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
int exfat_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, unsigned int request_mask,
unsigned int query_flags);
+struct file_kattr;
+int exfat_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
int exfat_file_fsync(struct file *file, loff_t start, loff_t end, int datasync);
long exfat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
long exfat_compat_ioctl(struct file *filp, unsigned int cmd,
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index 354bdcfe4abc..91e5511945d1 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -14,6 +14,7 @@
#include <linux/writeback.h>
#include <linux/filelock.h>
#include <linux/falloc.h>
+#include <linux/fileattr.h>
#include "exfat_raw.h"
#include "exfat_fs.h"
@@ -323,6 +324,18 @@ int exfat_getattr(struct mnt_idmap *idmap, const struct path *path,
return 0;
}
+int exfat_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+ /*
+ * exFAT compares filenames through an upcase table, so lookup
+ * is always case-insensitive. Long names are stored in UTF-16
+ * with case intact; CASENONPRESERVING stays clear.
+ */
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ return 0;
+}
+
int exfat_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
struct iattr *attr)
{
@@ -817,6 +830,7 @@ const struct file_operations exfat_file_operations = {
};
const struct inode_operations exfat_file_inode_operations = {
- .setattr = exfat_setattr,
- .getattr = exfat_getattr,
+ .setattr = exfat_setattr,
+ .getattr = exfat_getattr,
+ .fileattr_get = exfat_fileattr_get,
};
diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c
index 2c5636634b4a..94002e43db08 100644
--- a/fs/exfat/namei.c
+++ b/fs/exfat/namei.c
@@ -1311,4 +1311,5 @@ const struct inode_operations exfat_dir_inode_operations = {
.rename = exfat_rename,
.setattr = exfat_setattr,
.getattr = exfat_getattr,
+ .fileattr_get = exfat_fileattr_get,
};
--
2.53.0
* [PATCH v14 05/15] ntfs3: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-05-07 8:52 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Report NTFS case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. NTFS always preserves case at rest; lookup is case-insensitive
only when the volume is mounted with the "nocase" option.
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/ntfs3/file.c | 29 +++++++++++++++++++++++++++++
fs/ntfs3/namei.c | 1 +
fs/ntfs3/ntfs_fs.h | 1 +
3 files changed, 31 insertions(+)
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index b041639ab406..ad9350d7fc3f 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -180,6 +180,34 @@ long ntfs_compat_ioctl(struct file *filp, u32 cmd, unsigned long arg)
}
#endif
+/*
+ * ntfs_fileattr_get - inode_operations::fileattr_get
+ */
+int ntfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+ struct inode *inode = d_inode(dentry);
+ struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
+
+ /* Avoid any operation if inode is bad. */
+ if (unlikely(is_bad_ni(ntfs_i(inode))))
+ return -EINVAL;
+
+ /*
+ * NTFS preserves case (the default). Case sensitivity depends on
+ * mount options: with "nocase", NTFS is case-insensitive;
+ * otherwise it is case-sensitive.
+ */
+ if (sbi->options->nocase) {
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ }
+ if (inode->i_flags & S_IMMUTABLE) {
+ fa->fsx_xflags |= FS_XFLAG_IMMUTABLE;
+ fa->flags |= FS_IMMUTABLE_FL;
+ }
+ return 0;
+}
+
/*
* ntfs_getattr - inode_operations::getattr
*/
@@ -1547,6 +1575,7 @@ const struct inode_operations ntfs_file_inode_operations = {
.get_acl = ntfs_get_acl,
.set_acl = ntfs_set_acl,
.fiemap = ntfs_fiemap,
+ .fileattr_get = ntfs_fileattr_get,
};
const struct file_operations ntfs_file_operations = {
diff --git a/fs/ntfs3/namei.c b/fs/ntfs3/namei.c
index b2af8f695e60..e159ba66a34a 100644
--- a/fs/ntfs3/namei.c
+++ b/fs/ntfs3/namei.c
@@ -518,6 +518,7 @@ const struct inode_operations ntfs_dir_inode_operations = {
.getattr = ntfs_getattr,
.listxattr = ntfs_listxattr,
.fiemap = ntfs_fiemap,
+ .fileattr_get = ntfs_fileattr_get,
};
const struct inode_operations ntfs_special_inode_operations = {
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index bbf3b6a1dcbe..41db22d652c4 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -529,6 +529,7 @@ bool dir_is_empty(struct inode *dir);
extern const struct file_operations ntfs_dir_operations;
/* Globals from file.c */
+int ntfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
int ntfs_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, u32 request_mask, u32 flags);
int ntfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
--
2.53.0
* [PATCH v14 06/15] hfs: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-05-07 8:52 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Report HFS case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. HFS is always case-insensitive (using Mac OS Roman case
folding) and always preserves case at rest.
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/hfs/dir.c | 1 +
fs/hfs/hfs_fs.h | 2 ++
fs/hfs/inode.c | 14 ++++++++++++++
3 files changed, 17 insertions(+)
diff --git a/fs/hfs/dir.c b/fs/hfs/dir.c
index f5e7efe924e7..c4c6e1623f55 100644
--- a/fs/hfs/dir.c
+++ b/fs/hfs/dir.c
@@ -328,4 +328,5 @@ const struct inode_operations hfs_dir_inode_operations = {
.rmdir = hfs_remove,
.rename = hfs_rename,
.setattr = hfs_inode_setattr,
+ .fileattr_get = hfs_fileattr_get,
};
diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
index ac0e83f77a0f..1b23448c9a48 100644
--- a/fs/hfs/hfs_fs.h
+++ b/fs/hfs/hfs_fs.h
@@ -177,6 +177,8 @@ extern int hfs_get_block(struct inode *inode, sector_t block,
extern const struct address_space_operations hfs_aops;
extern const struct address_space_operations hfs_btree_aops;
+struct file_kattr;
+int hfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
int hfs_write_begin(const struct kiocb *iocb, struct address_space *mapping,
loff_t pos, unsigned int len, struct folio **foliop,
void **fsdata);
diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c
index 89b33a9d46d5..f41cc261684d 100644
--- a/fs/hfs/inode.c
+++ b/fs/hfs/inode.c
@@ -18,6 +18,7 @@
#include <linux/uio.h>
#include <linux/xattr.h>
#include <linux/blkdev.h>
+#include <linux/fileattr.h>
#include "hfs_fs.h"
#include "btree.h"
@@ -699,6 +700,18 @@ static int hfs_file_fsync(struct file *filp, loff_t start, loff_t end,
return ret;
}
+int hfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+ /*
+ * HFS compares filenames using Mac OS Roman case folding, so
+ * lookup is always case-insensitive. Names are stored on disk
+ * with case intact; CASENONPRESERVING stays clear.
+ */
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ return 0;
+}
+
static const struct file_operations hfs_file_operations = {
.llseek = generic_file_llseek,
.read_iter = generic_file_read_iter,
@@ -715,4 +728,5 @@ static const struct inode_operations hfs_file_inode_operations = {
.lookup = hfs_file_lookup,
.setattr = hfs_inode_setattr,
.listxattr = generic_listxattr,
+ .fileattr_get = hfs_fileattr_get,
};
--
2.53.0
* [PATCH v14 07/15] hfsplus: Report case sensitivity in fileattr_get
From: Chuck Lever @ 2026-05-07 8:53 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Add case sensitivity reporting to the existing hfsplus_fileattr_get()
function via the FS_XFLAG_CASEFOLD flag. HFS+ always preserves case
at rest.
Case sensitivity depends on how the volume was formatted: HFSX
volumes may be either case-sensitive or case-insensitive, indicated
by the HFSPLUS_SB_CASEFOLD superblock flag.
FS_XFLAG_CASEFOLD is read-only: FS_XFLAG_RDONLY_MASK ensures
FS_IOC_FSSETXATTR strips it. The legacy FS_IOC_SETFLAGS path in
hfsplus_fileattr_set() also allows FS_CASEFOLD_FL through its
allowlist on case-insensitive volumes so that a chattr
read-modify-write cycle does not fail with EOPNOTSUPP.
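The allowlist behavior described above can be sketched in userspace C.
This is a minimal model under stated assumptions, not the kernel code:
struct demo_fa and demo_setflags_check() are hypothetical names, and
the volume_casefold parameter stands in for the HFSPLUS_SB_CASEFOLD
test. Only the FS_*_FL values are taken from include/uapi/linux/fs.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Values as defined in include/uapi/linux/fs.h. */
#define FS_IMMUTABLE_FL	0x00000010
#define FS_APPEND_FL	0x00000020
#define FS_NODUMP_FL	0x00000040
#define FS_CASEFOLD_FL	0x40000000

/* Hypothetical stand-in for the flags member of struct file_kattr. */
struct demo_fa {
	unsigned int flags;
};

/*
 * Hypothetical helper that models only the allowlist step of a
 * FS_IOC_SETFLAGS handler. volume_casefold stands in for
 * test_bit(HFSPLUS_SB_CASEFOLD, &sbi->flags).
 */
static int demo_setflags_check(const struct demo_fa *fa, bool volume_casefold)
{
	unsigned int allowed = FS_IMMUTABLE_FL | FS_APPEND_FL | FS_NODUMP_FL;

	assert(fa != NULL);

	/* Accept the bit only where GETFLAGS would have reported it. */
	if (volume_casefold)
		allowed |= FS_CASEFOLD_FL;

	/* Any bit outside the allowlist is rejected, never silently dropped. */
	return (fa->flags & ~allowed) ? -EOPNOTSUPP : 0;
}
```

In this model, a caller that reads FS_CASEFOLD_FL back from a
case-insensitive volume and writes the same word again passes the
check, while the same request against a case-sensitive volume gets
-EOPNOTSUPP.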
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/hfsplus/inode.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index d05891ec492e..5565c14b4bf6 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -740,6 +740,7 @@ int hfsplus_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
{
struct inode *inode = d_inode(dentry);
struct hfsplus_inode_info *hip = HFSPLUS_I(inode);
+ struct hfsplus_sb_info *sbi = HFSPLUS_SB(inode->i_sb);
unsigned int flags = 0;
if (inode->i_flags & S_IMMUTABLE)
@@ -748,6 +749,8 @@ int hfsplus_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
flags |= FS_APPEND_FL;
if (hip->userflags & HFSPLUS_FLG_NODUMP)
flags |= FS_NODUMP_FL;
+ if (test_bit(HFSPLUS_SB_CASEFOLD, &sbi->flags))
+ flags |= FS_CASEFOLD_FL;
fileattr_fill_flags(fa, flags);
@@ -759,13 +762,24 @@ int hfsplus_fileattr_set(struct mnt_idmap *idmap,
{
struct inode *inode = d_inode(dentry);
struct hfsplus_inode_info *hip = HFSPLUS_I(inode);
+ struct hfsplus_sb_info *sbi = HFSPLUS_SB(inode->i_sb);
+ unsigned int allowed = FS_IMMUTABLE_FL | FS_APPEND_FL | FS_NODUMP_FL;
unsigned int new_fl = 0;
if (fileattr_has_fsx(fa))
return -EOPNOTSUPP;
+ /*
+ * FS_CASEFOLD_FL reflects HFSPLUS_SB_CASEFOLD, a mount-time
+ * property. Accept it as a no-op so chattr's RMW round-trip
+ * succeeds; reject any attempt to enable it on a volume that
+ * was not formatted case-insensitive.
+ */
+ if (test_bit(HFSPLUS_SB_CASEFOLD, &sbi->flags))
+ allowed |= FS_CASEFOLD_FL;
+
/* don't silently ignore unsupported ext2 flags */
- if (fa->flags & ~(FS_IMMUTABLE_FL|FS_APPEND_FL|FS_NODUMP_FL))
+ if (fa->flags & ~allowed)
return -EOPNOTSUPP;
if (fa->flags & FS_IMMUTABLE_FL)
--
2.53.0
* [PATCH v14 08/15] xfs: Report case sensitivity in fileattr_get
From: Chuck Lever @ 2026-05-07 8:53 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Upper layers such as NFSD need to query whether a filesystem
is case-sensitive. Add FS_XFLAG_CASEFOLD to xfs_ip2xflags()
when the filesystem is formatted with the ASCIICI feature
flag. This serves both FS_IOC_FSGETXATTR (via xfs_fill_fsxattr()
in xfs_fileattr_get()) and XFS_IOC_BULKSTAT (which populates
bs_xflags directly from xfs_ip2xflags()), so bulkstat consumers
and per-inode queries see a consistent view of the filesystem's
case-folding behavior.
FS_XFLAG_CASEFOLD is read-only: FS_XFLAG_RDONLY_MASK ensures
FS_IOC_FSSETXATTR strips it, and xfs_flags2diflags() has no
clause for CASEFOLD so the on-disk diflags are unaffected.
The legacy FS_IOC_SETFLAGS path in xfs_fileattr_set() also
allows FS_CASEFOLD_FL through its allowlist on ASCIICI
filesystems so that a chattr read-modify-write cycle does
not fail with EOPNOTSUPP.
XFS always preserves case. XFS is case-sensitive by default,
but supports ASCII case-insensitive lookups when formatted
with the ASCIICI feature flag.
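The two read-only guards described above (the mask applied on the
FS_IOC_FSSETXATTR path, and the flag translation that has no CASEFOLD
clause) can be illustrated with a small userspace model. Everything
here is an assumption made for illustration: the DEMO_* names and bit
values are invented and do not match the kernel's definitions, and the
two functions only mimic the shape of the real code paths.

```c
#include <assert.h>

/* Illustrative bit values only; these are not the kernel's definitions. */
#define DEMO_XFLAG_IMMUTABLE	(1U << 0)
#define DEMO_XFLAG_APPEND	(1U << 1)
#define DEMO_XFLAG_CASEFOLD	(1U << 2)
#define DEMO_XFLAG_RDONLY_MASK	DEMO_XFLAG_CASEFOLD

#define DEMO_DIFLAG_IMMUTABLE	(1U << 8)
#define DEMO_DIFLAG_APPEND	(1U << 9)

/* Guard 1: the set path drops read-only bits from the caller's request. */
static unsigned int demo_strip_rdonly(unsigned int xflags)
{
	return xflags & ~DEMO_XFLAG_RDONLY_MASK;
}

/*
 * Guard 2: translation to on-disk flags has one clause per settable
 * flag. With no clause for CASEFOLD, that bit cannot reach the
 * on-disk word even if it is present in the input.
 */
static unsigned int demo_flags2diflags(unsigned int xflags)
{
	unsigned int di = 0;

	if (xflags & DEMO_XFLAG_IMMUTABLE)
		di |= DEMO_DIFLAG_IMMUTABLE;
	if (xflags & DEMO_XFLAG_APPEND)
		di |= DEMO_DIFLAG_APPEND;

	/* Only bits with an explicit clause may be produced. */
	assert((di & ~(DEMO_DIFLAG_IMMUTABLE | DEMO_DIFLAG_APPEND)) == 0);
	return di;
}
```

Either guard alone would keep the bit off disk in this model; having
both means the result does not depend on which ioctl the caller used.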
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/xfs/libxfs/xfs_inode_util.c | 2 ++
fs/xfs/xfs_ioctl.c | 20 +++++++++++++++++---
2 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_inode_util.c b/fs/xfs/libxfs/xfs_inode_util.c
index 551fa51befb6..82be54b6f8d3 100644
--- a/fs/xfs/libxfs/xfs_inode_util.c
+++ b/fs/xfs/libxfs/xfs_inode_util.c
@@ -130,6 +130,8 @@ xfs_ip2xflags(
if (xfs_inode_has_attr_fork(ip))
flags |= FS_XFLAG_HASATTR;
+ if (xfs_has_asciici(ip->i_mount))
+ flags |= FS_XFLAG_CASEFOLD;
return flags;
}
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index ed9b4846c05f..f8216f74679f 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -755,9 +755,23 @@ xfs_fileattr_set(
trace_xfs_ioctl_setattr(ip);
if (!fa->fsx_valid) {
- if (fa->flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL |
- FS_NOATIME_FL | FS_NODUMP_FL |
- FS_SYNC_FL | FS_DAX_FL | FS_PROJINHERIT_FL))
+ unsigned int allowed = FS_IMMUTABLE_FL | FS_APPEND_FL |
+ FS_NOATIME_FL | FS_NODUMP_FL |
+ FS_SYNC_FL | FS_DAX_FL |
+ FS_PROJINHERIT_FL;
+
+ /*
+ * FS_CASEFOLD_FL reflects the ASCIICI superblock feature,
+ * a read-only property. Accept it as a no-op so chattr's
+ * RMW round-trip succeeds; reject any attempt to enable
+ * it on a non-ASCIICI filesystem. xfs_flags2diflags()
+ * has no clause for CASEFOLD, so the bit is dropped from
+ * the on-disk diflags regardless.
+ */
+ if (xfs_has_asciici(mp))
+ allowed |= FS_CASEFOLD_FL;
+
+ if (fa->flags & ~allowed)
return -EOPNOTSUPP;
}
--
2.53.0
* [PATCH v14 09/15] cifs: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-05-07 8:53 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Steve French, Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Upper layers such as NFSD need a way to query whether a filesystem
handles filenames in a case-sensitive manner. Report CIFS/SMB case
handling behavior via FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING.
The authoritative source is the server itself: at mount time CIFS
issues QueryFSInfo(FS_ATTRIBUTE_INFORMATION) and caches the reply
on the tcon. That reply carries FILE_CASE_SENSITIVE_SEARCH and
FILE_CASE_PRESERVED_NAMES, which reflect whatever case handling
the share actually implements after SMB3.1.1 POSIX extensions
negotiation. Translating those two bits into the VFS flags lets
cifs_fileattr_get report what the server advertises rather than
what the client was asked to pretend.
QueryFSInfo is best-effort; the mount completes even if the server
does not answer. MaxPathNameComponentLength is zero in that case
and is used as the "no reply received" sentinel. When no reply is
available, fall back to the nocase mount option so that the reported
behavior agrees with the dentry comparison operations installed on
the superblock.
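The translation and its fallback, as described in the two paragraphs
above, can be sketched as a pure function in userspace C. This is a
simplified model, not the patch itself: struct demo_fsattr,
demo_case_xflags(), and the DEMO_XFLAG_* values are hypothetical, the
fields are taken as host-endian, and the directory-only restriction in
the real callback is left out. The two FILE_CASE_* values are the ones
defined for FileFsAttributeInformation in MS-FSCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Attribute bits from MS-FSCC FileFsAttributeInformation. */
#define FILE_CASE_SENSITIVE_SEARCH	0x00000001
#define FILE_CASE_PRESERVED_NAMES	0x00000002

/* Illustrative output bits; not the kernel's FS_XFLAG_* values. */
#define DEMO_XFLAG_CASEFOLD		(1U << 0)
#define DEMO_XFLAG_CASENONPRESERVING	(1U << 1)

/* Hypothetical host-endian copy of the cached QueryFSInfo reply. */
struct demo_fsattr {
	uint32_t attributes;
	uint32_t max_name_len;	/* zero means "no reply received" */
};

static unsigned int demo_case_xflags(const struct demo_fsattr *info,
				     bool nocase)
{
	unsigned int xflags = 0;

	assert(info != NULL);

	/* No server answer: report what the nocase mount option implies. */
	if (info->max_name_len == 0)
		return nocase ? DEMO_XFLAG_CASEFOLD : 0;

	/* Both VFS flags are the negation of the server's attribute bits. */
	if (!(info->attributes & FILE_CASE_SENSITIVE_SEARCH))
		xflags |= DEMO_XFLAG_CASEFOLD;
	if (!(info->attributes & FILE_CASE_PRESERVED_NAMES))
		xflags |= DEMO_XFLAG_CASENONPRESERVING;
	return xflags;
}
```

Note that once a reply is present, the nocase argument is ignored in
this sketch, which mirrors the stated preference for what the server
advertises.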
The callback is registered on cifs_dir_inode_ops so that NFSD,
ksmbd, and other consumers querying case handling against a
directory get a definitive answer, and on cifs_file_inode_ops to
preserve FS_COMPR_FL reporting on regular files. cifs_set_ops()
also installs cifs_namespace_inode_operations on DFS referral
directories that carry IS_AUTOMOUNT; register the same callback
there so the answer does not depend on whether the directory is
a referral point.
Registering fileattr_get routes FS_IOC_GETFLAGS through
vfs_fileattr_get() and short-circuits the syscall's fallback to
cifs_ioctl(). That fallback invoked CIFSGetExtAttr() under
CONFIG_CIFS_POSIX and CONFIG_CIFS_ALLOW_INSECURE_LEGACY on servers
advertising CIFS_UNIX_EXTATTR_CAP, surfacing the SMB1 Unix-extension
immutable, append, and nodump bits. cifs_fileattr_get carries over
only FS_COMPR_FL from cached cifsAttrs; the SMB1 extattr fetch is
not reproduced. SMB1 is deprecated, and acquiring a netfid from
within a dentry-only callback is not worth preserving a path tied
to an insecure legacy dialect.
Acked-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/smb/client/cifsfs.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++
fs/smb/client/cifsfs.h | 3 +++
fs/smb/client/namespace.c | 1 +
3 files changed, 57 insertions(+)
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index 2025739f070a..6c113ae7fdd3 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -30,6 +30,7 @@
#include <linux/xattr.h>
#include <linux/mm.h>
#include <linux/key-type.h>
+#include <linux/fileattr.h>
#include <uapi/linux/magic.h>
#include <net/ipv6.h>
#include "cifsfs.h"
@@ -1199,6 +1200,56 @@ struct file_system_type smb3_fs_type = {
MODULE_ALIAS_FS("smb3");
MODULE_ALIAS("smb3");
+int cifs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+ struct cifs_sb_info *cifs_sb = CIFS_SB(dentry->d_sb);
+ struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
+ struct inode *inode = d_inode(dentry);
+ u32 attrs;
+
+ /* Preserve FS_COMPR_FL previously reported by cifs_ioctl(). */
+ if (CIFS_I(inode)->cifsAttrs & ATTR_COMPRESSED)
+ fa->flags |= FS_COMPR_FL;
+
+ /*
+ * FS_CASEFOLD_FL is defined by UAPI as a folder attribute,
+ * and userspace tools (e.g., lsattr) display it only on
+ * directories. Confine the case-handling bits to directories
+ * to match that convention; for non-directories the share's
+ * case semantics are still discoverable through the parent.
+ */
+ if (!S_ISDIR(inode->i_mode))
+ return 0;
+
+ /*
+ * The server's FS_ATTRIBUTE_INFORMATION response, cached on
+ * the tcon at mount, reflects the share's case-handling
+ * semantics after any POSIX extensions negotiation. Prefer
+ * it over the client-local nocase mount option, which only
+ * governs dentry comparison on this superblock.
+ *
+ * QueryFSInfo is best-effort at mount; when it did not
+ * populate fsAttrInfo, MaxPathNameComponentLength remains
+ * zero. In that case fall back to nocase so the reporting
+ * matches the comparison behavior installed on the sb.
+ */
+ if (le32_to_cpu(tcon->fsAttrInfo.MaxPathNameComponentLength) == 0) {
+ if (tcon->nocase) {
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ }
+ return 0;
+ }
+ attrs = le32_to_cpu(tcon->fsAttrInfo.Attributes);
+ if (!(attrs & FILE_CASE_SENSITIVE_SEARCH)) {
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ }
+ if (!(attrs & FILE_CASE_PRESERVED_NAMES))
+ fa->fsx_xflags |= FS_XFLAG_CASENONPRESERVING;
+ return 0;
+}
+
const struct inode_operations cifs_dir_inode_ops = {
.create = cifs_create,
.atomic_open = cifs_atomic_open,
@@ -1217,6 +1268,7 @@ const struct inode_operations cifs_dir_inode_ops = {
.listxattr = cifs_listxattr,
.get_acl = cifs_get_acl,
.set_acl = cifs_set_acl,
+ .fileattr_get = cifs_fileattr_get,
};
const struct inode_operations cifs_file_inode_ops = {
@@ -1227,6 +1279,7 @@ const struct inode_operations cifs_file_inode_ops = {
.fiemap = cifs_fiemap,
.get_acl = cifs_get_acl,
.set_acl = cifs_set_acl,
+ .fileattr_get = cifs_fileattr_get,
};
const char *cifs_get_link(struct dentry *dentry, struct inode *inode,
diff --git a/fs/smb/client/cifsfs.h b/fs/smb/client/cifsfs.h
index 7370b38da938..5f0d459d1a89 100644
--- a/fs/smb/client/cifsfs.h
+++ b/fs/smb/client/cifsfs.h
@@ -89,6 +89,9 @@ extern const struct inode_operations cifs_file_inode_ops;
extern const struct inode_operations cifs_symlink_inode_ops;
extern const struct inode_operations cifs_namespace_inode_operations;
+struct file_kattr;
+int cifs_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
+
/* Functions related to files and directories */
extern const struct netfs_request_ops cifs_req_ops;
diff --git a/fs/smb/client/namespace.c b/fs/smb/client/namespace.c
index 52a520349cb7..52a51b032fae 100644
--- a/fs/smb/client/namespace.c
+++ b/fs/smb/client/namespace.c
@@ -294,4 +294,5 @@ struct vfsmount *cifs_d_automount(struct path *path)
}
const struct inode_operations cifs_namespace_inode_operations = {
+ .fileattr_get = cifs_fileattr_get,
};
--
2.53.0
* [PATCH v14 10/15] nfs: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-05-07 8:53 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
An NFS server re-exporting an NFS mount point needs to report
the case sensitivity behavior of the underlying filesystem to
its clients. NFSD's attribute encoder obtains that information
by calling vfs_fileattr_get() on the lower filesystem, so the
NFS client must implement fileattr_get to surface what it
learned from its own server.
The NFS client already retrieves case sensitivity information
from servers during mount via PATHCONF (NFSv3) or the
FATTR4_CASE_INSENSITIVE/FATTR4_CASE_PRESERVING attributes
(NFSv4). Expose this information through fileattr_get by
reporting the FS_XFLAG_CASEFOLD and FS_XFLAG_CASENONPRESERVING
flags. NFSv2 lacks PATHCONF support, so mounts using that protocol
version default to standard POSIX behavior: case-sensitive and
case-preserving.
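For reference, the NFSv3 side of this can be sketched outside the
kernel. In RFC 1813, the PATHCONF3resok body that follows the post-op
attributes is six 32-bit XDR words: linkmax, name_max, no_trunc,
chown_restricted, case_insensitive, case_preserving. The decoder below
is a minimal userspace illustration of reading those words; struct
demo_pathconf, demo_be32(), and demo_decode_pathconf() are
hypothetical names, and the buffer is assumed to start at linkmax.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result structure holding the fields of interest. */
struct demo_pathconf {
	uint32_t max_link;
	uint32_t max_namelen;
	bool case_insensitive;
	bool case_preserving;
};

/* Read one big-endian (XDR) 32-bit word. */
static uint32_t demo_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short. */
static int demo_decode_pathconf(const unsigned char *buf, size_t len,
				struct demo_pathconf *out)
{
	assert(buf != NULL && out != NULL);

	if (len < 6 * 4)
		return -1;

	out->max_link = demo_be32(buf);
	out->max_namelen = demo_be32(buf + 4);
	/* Words 2 and 3 (no_trunc, chown_restricted) are skipped. */
	out->case_insensitive = demo_be32(buf + 16) != 0;
	out->case_preserving = demo_be32(buf + 20) != 0;
	return 0;
}
```

XDR booleans are full 32-bit words, which is why the two ignored
fields still have to be stepped over before the case fields.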
PATHCONF is now invoked unconditionally for NFSv2 and NFSv3 mounts
so the case-sensitivity capabilities are established even when the
user pins server->namelen with the namlen= mount option. That option
is orthogonal to case handling, and skipping PATHCONF because
namelen was already known would leave the caps unset.
The two capability bits carry opposite polarity relative to the
attributes they mirror because the POSIX defaults for those
attributes differ. Most servers are case-sensitive and
case-preserving, matching "neither xflag set."
NFS_CAP_CASE_INSENSITIVE is set only when the server affirms case
insensitivity, so "server said no" and "server did not answer" both
collapse to the case-sensitive default. NFS_CAP_CASE_NONPRESERVING
applies the same rule to the negated attribute: it is set only when
the server affirms that it does not preserve case, so that silence
or a missing attribute lands on the case-preserving default. The
NFSv4 probe checks res.attr_bitmask[0] to distinguish "server said
false" from "server omitted the attribute" before setting the bit.
Both capability bits are cleared before each probe so that stale
capabilities from a prior probe do not survive a remount, an NFSv4
transparent state migration to a server with different case
semantics, or a probe whose reply never arrives.
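The capability rules from the two paragraphs above can be modeled in
a few lines of userspace C. This is a sketch under stated assumptions:
struct demo_v4_reply and demo_probe_caps() are hypothetical, the
"arrived" field stands in for a successful RPC, and
"has_case_preserving" stands in for the attr_bitmask[0] check. The two
NFS_CAP_* bit positions are the ones shown in the nfs_fs_sb.h hunk of
this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit positions as in include/linux/nfs_fs_sb.h after this patch. */
#define NFS_CAP_CASE_INSENSITIVE	(1U << 6)
#define NFS_CAP_CASE_NONPRESERVING	(1U << 7)

/* Hypothetical summary of one NFSv4 capabilities probe. */
struct demo_v4_reply {
	bool arrived;			/* false: the probe got no reply */
	bool case_insensitive;		/* false when omitted by the server */
	bool has_case_preserving;	/* server returned the attribute */
	bool case_preserving;
};

static unsigned int demo_probe_caps(unsigned int caps,
				    const struct demo_v4_reply *r)
{
	assert(r != NULL);

	/* Clear first so a failed probe cannot leave stale bits behind. */
	caps &= ~(NFS_CAP_CASE_INSENSITIVE | NFS_CAP_CASE_NONPRESERVING);
	if (!r->arrived)
		return caps;

	/* Set only on affirmation; "no" and "omitted" look the same. */
	if (r->case_insensitive)
		caps |= NFS_CAP_CASE_INSENSITIVE;

	/* Set only when the server explicitly says case is not preserved. */
	if (r->has_case_preserving && !r->case_preserving)
		caps |= NFS_CAP_CASE_NONPRESERVING;
	return caps;
}
```

With both bits defined so that zero means the POSIX default, every
failure mode in this model (lost reply, omitted attribute) lands on
case-sensitive and case-preserving.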
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfs/client.c | 21 +++++++++++++++------
fs/nfs/inode.c | 15 +++++++++++++++
fs/nfs/internal.h | 3 +++
fs/nfs/namespace.c | 2 ++
fs/nfs/nfs3proc.c | 2 ++
fs/nfs/nfs3xdr.c | 7 +++++--
fs/nfs/nfs4proc.c | 10 +++++++---
fs/nfs/proc.c | 3 +++
fs/nfs/symlink.c | 3 +++
include/linux/nfs_fs_sb.h | 2 +-
include/linux/nfs_xdr.h | 2 ++
11 files changed, 58 insertions(+), 12 deletions(-)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index be02bb227741..3db2f18315b8 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -914,6 +914,7 @@ static void nfs_server_set_fsinfo(struct nfs_server *server,
*/
static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, struct nfs_fattr *fattr)
{
+ struct nfs_pathconf pathinfo = { };
struct nfs_fsinfo fsinfo;
struct nfs_client *clp = server->nfs_client;
int error;
@@ -933,15 +934,23 @@ static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, str
nfs_server_set_fsinfo(server, &fsinfo);
- /* Get some general file system info */
- if (server->namelen == 0) {
- struct nfs_pathconf pathinfo;
+ pathinfo.fattr = fattr;
+ nfs_fattr_init(fattr);
- pathinfo.fattr = fattr;
- nfs_fattr_init(fattr);
+ /* Clear before probing so a failed RPC does not retain stale bits. */
+ if (clp->rpc_ops->version < 4)
+ server->caps &= ~(NFS_CAP_CASE_INSENSITIVE |
+ NFS_CAP_CASE_NONPRESERVING);
- if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0)
+ if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0) {
+ if (server->namelen == 0)
server->namelen = pathinfo.max_namelen;
+ if (clp->rpc_ops->version < 4) {
+ if (pathinfo.case_insensitive)
+ server->caps |= NFS_CAP_CASE_INSENSITIVE;
+ if (!pathinfo.case_preserving)
+ server->caps |= NFS_CAP_CASE_NONPRESERVING;
+ }
}
if (clp->rpc_ops->discover_trunking != NULL &&
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 98a8f0de1199..fdcbe6f2052c 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -41,6 +41,7 @@
#include <linux/freezer.h>
#include <linux/uaccess.h>
#include <linux/iversion.h>
+#include <linux/fileattr.h>
#include "nfs4_fs.h"
#include "callback.h"
@@ -1101,6 +1102,20 @@ int nfs_getattr(struct mnt_idmap *idmap, const struct path *path,
}
EXPORT_SYMBOL_GPL(nfs_getattr);
+int nfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+ struct inode *inode = d_inode(dentry);
+
+ if (nfs_server_capable(inode, NFS_CAP_CASE_INSENSITIVE)) {
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ }
+ if (nfs_server_capable(inode, NFS_CAP_CASE_NONPRESERVING))
+ fa->fsx_xflags |= FS_XFLAG_CASENONPRESERVING;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(nfs_fileattr_get);
+
static void nfs_init_lock_context(struct nfs_lock_context *l_ctx)
{
refcount_set(&l_ctx->count, 1);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index fc5456377160..309d3f679bb3 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -449,6 +449,9 @@ extern void nfs_set_cache_invalid(struct inode *inode, unsigned long flags);
extern bool nfs_check_cache_invalid(struct inode *, unsigned long);
extern int nfs_wait_bit_killable(struct wait_bit_key *key, int mode);
+struct file_kattr;
+int nfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
+
#if IS_ENABLED(CONFIG_NFS_LOCALIO)
/* localio.c */
struct nfs_local_dio {
diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
index af9be0c5f516..6d0073c24771 100644
--- a/fs/nfs/namespace.c
+++ b/fs/nfs/namespace.c
@@ -246,11 +246,13 @@ nfs_namespace_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
const struct inode_operations nfs_mountpoint_inode_operations = {
.getattr = nfs_getattr,
.setattr = nfs_setattr,
+ .fileattr_get = nfs_fileattr_get,
};
const struct inode_operations nfs_referral_inode_operations = {
.getattr = nfs_namespace_getattr,
.setattr = nfs_namespace_setattr,
+ .fileattr_get = nfs_fileattr_get,
};
static void nfs_expire_automounts(struct work_struct *work)
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index 95d7cd564b74..b80d0c5efc27 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -1053,6 +1053,7 @@ static const struct inode_operations nfs3_dir_inode_operations = {
.permission = nfs_permission,
.getattr = nfs_getattr,
.setattr = nfs_setattr,
+ .fileattr_get = nfs_fileattr_get,
#ifdef CONFIG_NFS_V3_ACL
.listxattr = nfs3_listxattr,
.get_inode_acl = nfs3_get_acl,
@@ -1064,6 +1065,7 @@ static const struct inode_operations nfs3_file_inode_operations = {
.permission = nfs_permission,
.getattr = nfs_getattr,
.setattr = nfs_setattr,
+ .fileattr_get = nfs_fileattr_get,
#ifdef CONFIG_NFS_V3_ACL
.listxattr = nfs3_listxattr,
.get_inode_acl = nfs3_get_acl,
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index e17d72908412..e745e78faab0 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -2276,8 +2276,11 @@ static int decode_pathconf3resok(struct xdr_stream *xdr,
if (unlikely(!p))
return -EIO;
result->max_link = be32_to_cpup(p++);
- result->max_namelen = be32_to_cpup(p);
- /* ignore remaining fields */
+ result->max_namelen = be32_to_cpup(p++);
+ p++; /* ignore no_trunc */
+ p++; /* ignore chown_restricted */
+ result->case_insensitive = be32_to_cpup(p++) != 0;
+ result->case_preserving = be32_to_cpup(p) != 0;
return 0;
}
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index d839a97df822..62f66684fbc8 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3933,7 +3933,8 @@ static int _nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *f
server->caps &=
~(NFS_CAP_ACLS | NFS_CAP_HARDLINKS | NFS_CAP_SYMLINKS |
NFS_CAP_SECURITY_LABEL | NFS_CAP_FS_LOCATIONS |
- NFS_CAP_OPEN_XOR | NFS_CAP_DELEGTIME);
+ NFS_CAP_OPEN_XOR | NFS_CAP_DELEGTIME |
+ NFS_CAP_CASE_INSENSITIVE | NFS_CAP_CASE_NONPRESERVING);
server->fattr_valid = NFS_ATTR_FATTR_V4;
if (res.attr_bitmask[0] & FATTR4_WORD0_ACL &&
res.acl_bitmask & ACL4_SUPPORT_ALLOW_ACL)
@@ -3944,8 +3945,9 @@ static int _nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *f
server->caps |= NFS_CAP_SYMLINKS;
if (res.case_insensitive)
server->caps |= NFS_CAP_CASE_INSENSITIVE;
- if (res.case_preserving)
- server->caps |= NFS_CAP_CASE_PRESERVING;
+ if ((res.attr_bitmask[0] & FATTR4_WORD0_CASE_PRESERVING) &&
+ !res.case_preserving)
+ server->caps |= NFS_CAP_CASE_NONPRESERVING;
#ifdef CONFIG_NFS_V4_SECURITY_LABEL
if (res.attr_bitmask[2] & FATTR4_WORD2_SECURITY_LABEL)
server->caps |= NFS_CAP_SECURITY_LABEL;
@@ -10598,6 +10600,7 @@ static const struct inode_operations nfs4_dir_inode_operations = {
.getattr = nfs_getattr,
.setattr = nfs_setattr,
.listxattr = nfs4_listxattr,
+ .fileattr_get = nfs_fileattr_get,
};
static const struct inode_operations nfs4_file_inode_operations = {
@@ -10605,6 +10608,7 @@ static const struct inode_operations nfs4_file_inode_operations = {
.getattr = nfs_getattr,
.setattr = nfs_setattr,
.listxattr = nfs4_listxattr,
+ .fileattr_get = nfs_fileattr_get,
};
static struct nfs_server *nfs4_clone_server(struct nfs_server *source,
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index 70795684b8e8..03c2c1f31be9 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -598,6 +598,7 @@ nfs_proc_pathconf(struct nfs_server *server, struct nfs_fh *fhandle,
{
info->max_link = 0;
info->max_namelen = NFS2_MAXNAMLEN;
+ info->case_preserving = true;
return 0;
}
@@ -718,12 +719,14 @@ static const struct inode_operations nfs_dir_inode_operations = {
.permission = nfs_permission,
.getattr = nfs_getattr,
.setattr = nfs_setattr,
+ .fileattr_get = nfs_fileattr_get,
};
static const struct inode_operations nfs_file_inode_operations = {
.permission = nfs_permission,
.getattr = nfs_getattr,
.setattr = nfs_setattr,
+ .fileattr_get = nfs_fileattr_get,
};
const struct nfs_rpc_ops nfs_v2_clientops = {
diff --git a/fs/nfs/symlink.c b/fs/nfs/symlink.c
index 58146e935402..74a072896f8d 100644
--- a/fs/nfs/symlink.c
+++ b/fs/nfs/symlink.c
@@ -22,6 +22,8 @@
#include <linux/mm.h>
#include <linux/string.h>
+#include "internal.h"
+
/* Symlink caching in the page cache is even more simplistic
* and straight-forward than readdir caching.
*/
@@ -74,4 +76,5 @@ const struct inode_operations nfs_symlink_inode_operations = {
.get_link = nfs_get_link,
.getattr = nfs_getattr,
.setattr = nfs_setattr,
+ .fileattr_get = nfs_fileattr_get,
};
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 4daee27fa5eb..34d294774f8c 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -306,7 +306,7 @@ struct nfs_server {
#define NFS_CAP_ATOMIC_OPEN (1U << 4)
#define NFS_CAP_LGOPEN (1U << 5)
#define NFS_CAP_CASE_INSENSITIVE (1U << 6)
-#define NFS_CAP_CASE_PRESERVING (1U << 7)
+#define NFS_CAP_CASE_NONPRESERVING (1U << 7)
#define NFS_CAP_REBOOT_LAYOUTRETURN (1U << 8)
#define NFS_CAP_OFFLOAD_STATUS (1U << 9)
#define NFS_CAP_ZERO_RANGE (1U << 10)
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index ff1f12aa73d2..7c2057e40f99 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -182,6 +182,8 @@ struct nfs_pathconf {
struct nfs_fattr *fattr; /* Post-op attributes */
__u32 max_link; /* max # of hard links */
__u32 max_namelen; /* max name length */
+ bool case_insensitive;
+ bool case_preserving;
};
struct nfs4_change_info {
--
2.53.0
* [PATCH v14 11/15] vboxsf: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-05-07 8:53 UTC
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Upper layers such as NFSD need a way to query whether a
filesystem handles filenames in a case-sensitive manner. Report
VirtualBox shared folder case handling behavior via the
FS_XFLAG_CASEFOLD flag.
The case sensitivity property is queried from the VirtualBox host
service at mount time and cached in struct vboxsf_sbi. The host
determines case sensitivity based on the underlying host filesystem
(for example, Windows NTFS is case-insensitive while Linux ext4 is
case-sensitive).
VirtualBox shared folders always preserve filename case exactly
as provided by the guest. The host interface does not expose a
separate case-preserving property; leaving
FS_XFLAG_CASENONPRESERVING unset reports the POSIX-default
case-preserving behavior, which matches vboxsf semantics.
The callback is registered in all three inode_operations
structures (directory, file, and symlink) to ensure consistent
reporting across all inode types.
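As a rough illustration of the mount-time caching described above,
the following user-space sketch models how a failed or short host
reply leaves the cached value at its case-sensitive default. It is
not the vboxsf code: struct model_volinfo, struct model_sbi, and
model_query_case() are hypothetical stand-ins invented for this
example, and the host call is reduced to an error code plus a reply
length.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the host's volume-info reply. */
struct model_volinfo {
	bool case_sensitive;
};

/* Hypothetical stand-in for the per-mount state. */
struct model_sbi {
	bool case_insensitive;	/* false means "report case-sensitive" */
};

/*
 * err:       result of the modelled host query, 0 on success
 * reply_len: number of bytes the host actually filled in
 */
int model_query_case(struct model_sbi *sbi, int err, size_t reply_len,
		     const struct model_volinfo *reply)
{
	if (err)
		return err;	/* query failed: cache left untouched */
	if (reply_len < sizeof(*reply))
		return 0;	/* short reply: keep the default */

	sbi->case_insensitive = !reply->case_sensitive;
	return 0;
}
```

A caller that ignores the return value, as the mount path in this
patch does, ends up reporting case-sensitive behavior whenever the
host cannot answer.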
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/vboxsf/dir.c | 1 +
fs/vboxsf/file.c | 6 ++++--
fs/vboxsf/super.c | 7 +++++++
fs/vboxsf/utils.c | 30 ++++++++++++++++++++++++++++++
fs/vboxsf/vfsmod.h | 6 ++++++
5 files changed, 48 insertions(+), 2 deletions(-)
diff --git a/fs/vboxsf/dir.c b/fs/vboxsf/dir.c
index 42bedc4ec7af..c5bd3271aa96 100644
--- a/fs/vboxsf/dir.c
+++ b/fs/vboxsf/dir.c
@@ -477,4 +477,5 @@ const struct inode_operations vboxsf_dir_iops = {
.symlink = vboxsf_dir_symlink,
.getattr = vboxsf_getattr,
.setattr = vboxsf_setattr,
+ .fileattr_get = vboxsf_fileattr_get,
};
diff --git a/fs/vboxsf/file.c b/fs/vboxsf/file.c
index 7a7a3fbb2651..943953867e18 100644
--- a/fs/vboxsf/file.c
+++ b/fs/vboxsf/file.c
@@ -222,7 +222,8 @@ const struct file_operations vboxsf_reg_fops = {
const struct inode_operations vboxsf_reg_iops = {
.getattr = vboxsf_getattr,
- .setattr = vboxsf_setattr
+ .setattr = vboxsf_setattr,
+ .fileattr_get = vboxsf_fileattr_get,
};
static int vboxsf_read_folio(struct file *file, struct folio *folio)
@@ -389,5 +390,6 @@ static const char *vboxsf_get_link(struct dentry *dentry, struct inode *inode,
}
const struct inode_operations vboxsf_lnk_iops = {
- .get_link = vboxsf_get_link
+ .get_link = vboxsf_get_link,
+ .fileattr_get = vboxsf_fileattr_get,
};
diff --git a/fs/vboxsf/super.c b/fs/vboxsf/super.c
index a618cb093e00..a61fbab51d37 100644
--- a/fs/vboxsf/super.c
+++ b/fs/vboxsf/super.c
@@ -185,6 +185,13 @@ static int vboxsf_fill_super(struct super_block *sb, struct fs_context *fc)
if (err)
goto fail_unmap;
+ /*
+ * A failed query leaves sbi->case_insensitive false, so the
+ * mount defaults to reporting case-sensitive behavior. Do not
+ * fail the mount over an advisory attribute.
+ */
+ vboxsf_query_case_sensitive(sbi);
+
sb->s_magic = VBOXSF_SUPER_MAGIC;
sb->s_blocksize = 1024;
sb->s_maxbytes = MAX_LFS_FILESIZE;
diff --git a/fs/vboxsf/utils.c b/fs/vboxsf/utils.c
index 440e8c50629d..298bfc93255c 100644
--- a/fs/vboxsf/utils.c
+++ b/fs/vboxsf/utils.c
@@ -11,6 +11,7 @@
#include <linux/sizes.h>
#include <linux/pagemap.h>
#include <linux/vfs.h>
+#include <linux/fileattr.h>
#include "vfsmod.h"
struct inode *vboxsf_new_inode(struct super_block *sb)
@@ -567,3 +568,32 @@ int vboxsf_dir_read_all(struct vboxsf_sbi *sbi, struct vboxsf_dir_info *sf_d,
return err;
}
+
+int vboxsf_query_case_sensitive(struct vboxsf_sbi *sbi)
+{
+ struct shfl_volinfo volinfo = {};
+ u32 buf_len;
+ int err;
+
+ buf_len = sizeof(volinfo);
+ err = vboxsf_fsinfo(sbi->root, 0, SHFL_INFO_GET | SHFL_INFO_VOLUME,
+ &buf_len, &volinfo);
+ if (err)
+ return err;
+ if (buf_len < sizeof(volinfo))
+ return 0;
+
+ sbi->case_insensitive = !volinfo.properties.case_sensitive;
+ return 0;
+}
+
+int vboxsf_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+ struct vboxsf_sbi *sbi = VBOXSF_SBI(dentry->d_sb);
+
+ if (sbi->case_insensitive) {
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ }
+ return 0;
+}
diff --git a/fs/vboxsf/vfsmod.h b/fs/vboxsf/vfsmod.h
index 05973eb89d52..b61afd0ce842 100644
--- a/fs/vboxsf/vfsmod.h
+++ b/fs/vboxsf/vfsmod.h
@@ -47,6 +47,7 @@ struct vboxsf_sbi {
u32 next_generation;
u32 root;
int bdi_id;
+ bool case_insensitive;
};
/* per-inode information */
@@ -111,6 +112,11 @@ void vboxsf_dir_info_free(struct vboxsf_dir_info *p);
int vboxsf_dir_read_all(struct vboxsf_sbi *sbi, struct vboxsf_dir_info *sf_d,
u64 handle);
+int vboxsf_query_case_sensitive(struct vboxsf_sbi *sbi);
+
+struct file_kattr;
+int vboxsf_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
+
/* from vboxsf_wrappers.c */
int vboxsf_connect(void);
void vboxsf_disconnect(void);
--
2.53.0
* [PATCH v14 12/15] isofs: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-05-07 8:53 UTC
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Upper layers such as NFSD need a way to query whether a
filesystem handles filenames in a case-sensitive manner so
they can provide correct semantics to remote clients. Without
this information, NFS exports of ISO 9660 filesystems cannot
advertise their filename case behavior.
Implement isofs_fileattr_get() to report ISO 9660 case handling
behavior. The 'check=r' (relaxed) mount option enables
case-insensitive lookups and is reported via FS_XFLAG_CASEFOLD.
By default, Joliet extensions operate in relaxed mode while
plain ISO 9660 uses strict (case-sensitive) mode.
Plain ISO 9660 names on the medium are uppercase. When neither
Rock Ridge nor Joliet is in effect, the default 'map=n' option
(and 'map=a') routes lookup and readdir through
isofs_name_translate(), which forces A-Z to a-z. The names
visible to userspace then differ in case from the on-disc form,
so report FS_XFLAG_CASENONPRESERVING in that configuration. Rock
Ridge and Joliet both deliver names as authored, and 'map=o'
emits the raw on-disc name unchanged, so those configurations
remain case-preserving.
Casefolding is a directory property, and the in-tree consumers
(NFSD, ksmbd) issue the query against a directory: NFSD walks
to the parent for non-directory dentries before calling
vfs_fileattr_get(), and ksmbd reports per-share attributes from
the share root. Wire .fileattr_get only on
isofs_dir_inode_operations. The CASEFOLD flag is set in both
fa->fsx_xflags and fa->flags so FS_IOC_FSGETXATTR and
FS_IOC_GETFLAGS agree.
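The reporting rule above can be read as a small decision table. The
sketch below restates it in plain user-space C so the combinations
are easy to check by hand. The MODEL_* bit values and
struct model_isofs_opts are placeholders chosen for this example
only; they are not the kernel's definitions.

```c
#include <stdbool.h>

/* Placeholder bit values, chosen for this sketch only. */
#define MODEL_CASEFOLD		0x1u
#define MODEL_CASENONPRESERVING	0x2u

/* Hypothetical mirror of the mount state the patch consults. */
struct model_isofs_opts {
	char check;		/* 'r' (relaxed) or 's' (strict) */
	char map;		/* 'n', 'a', or 'o' */
	bool joliet;		/* Joliet extensions in effect */
	bool rock_ridge;	/* Rock Ridge extensions in effect */
};

unsigned int model_isofs_flags(const struct model_isofs_opts *o)
{
	unsigned int flags = 0;

	/* Relaxed checking means lookups ignore case. */
	if (o->check == 'r')
		flags |= MODEL_CASEFOLD;

	/*
	 * Only plain ISO 9660 names pass through the lowercase
	 * translation, and only for map=n and map=a.
	 */
	if (!o->joliet && !o->rock_ridge &&
	    (o->map == 'n' || o->map == 'a'))
		flags |= MODEL_CASENONPRESERVING;

	return flags;
}
```

Under this model, a plain ISO 9660 mount with check=s and map=n
yields only the non-preserving bit, while a Joliet mount with
check=r yields only the casefold bit.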
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/isofs/dir.c | 16 ++++++++++++++++
fs/isofs/isofs.h | 3 +++
2 files changed, 19 insertions(+)
diff --git a/fs/isofs/dir.c b/fs/isofs/dir.c
index 2fd9948d606e..55385a72a4ce 100644
--- a/fs/isofs/dir.c
+++ b/fs/isofs/dir.c
@@ -14,6 +14,7 @@
#include <linux/gfp.h>
#include <linux/filelock.h>
#include "isofs.h"
+#include <linux/fileattr.h>
int isofs_name_translate(struct iso_directory_record *de, char *new, struct inode *inode)
{
@@ -267,6 +268,20 @@ static int isofs_readdir(struct file *file, struct dir_context *ctx)
return result;
}
+int isofs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+ struct isofs_sb_info *sbi = ISOFS_SB(dentry->d_sb);
+
+ if (sbi->s_check == 'r') {
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+ fa->flags |= FS_CASEFOLD_FL;
+ }
+ if (!sbi->s_joliet_level && !sbi->s_rock &&
+ (sbi->s_mapping == 'n' || sbi->s_mapping == 'a'))
+ fa->fsx_xflags |= FS_XFLAG_CASENONPRESERVING;
+ return 0;
+}
+
const struct file_operations isofs_dir_operations =
{
.llseek = generic_file_llseek,
@@ -281,6 +296,7 @@ const struct file_operations isofs_dir_operations =
const struct inode_operations isofs_dir_inode_operations =
{
.lookup = isofs_lookup,
+ .fileattr_get = isofs_fileattr_get,
};
diff --git a/fs/isofs/isofs.h b/fs/isofs/isofs.h
index 506555837533..0ec8b24a42ed 100644
--- a/fs/isofs/isofs.h
+++ b/fs/isofs/isofs.h
@@ -197,6 +197,9 @@ isofs_normalize_block_and_offset(struct iso_directory_record* de,
}
}
+struct file_kattr;
+int isofs_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
+
extern const struct inode_operations isofs_dir_inode_operations;
extern const struct file_operations isofs_dir_operations;
extern const struct address_space_operations isofs_symlink_aops;
--
2.53.0
* [PATCH v14 13/15] nfsd: Report export case-folding via NFSv3 PATHCONF
From: Chuck Lever @ 2026-05-07 8:53 UTC
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
The hard-coded MSDOS_SUPER_MAGIC check in nfsd3_proc_pathconf()
only recognizes FAT filesystems as case-insensitive. Modern
filesystems like F2FS, exFAT, and CIFS support case-insensitive
directories, but NFSv3 clients cannot discover this capability.
Query the export's actual case behavior through ->fileattr_get
instead. This allows NFSv3 clients to correctly handle case
sensitivity for any filesystem that implements the fileattr
interface. Filesystems without ->fileattr_get continue to report
the default POSIX behavior (case-sensitive, case-preserving).
This change depends on the earlier "fat: Implement fileattr_get
for case sensitivity" patch in this series, which ensures FAT
filesystems report their case behavior correctly via the
fileattr interface.
Case-folding is a per-directory property, so
nfsd_get_case_info() queries the parent dentry for
non-directory filehandles. Three inherent corner cases follow:
a single-file export's parent lies outside the exported
subtree, so the LSM hook evaluates against an unexported
directory; a disconnected dentry from fh_verify() has
d_parent == itself, so the file's own attributes are reported
until the dentry connects; and a hardlinked file resolves
through the alias the dcache currently holds, so when the
inode is linked into both case-folded and case-sensitive
directories the reported value tracks whichever parent is
active. These limitations are not addressable without
redefining the protocol attribute as per-parent rather than
per-object.
RFC 1813 restricts PATHCONF errors to NFS3ERR_STALE,
NFS3ERR_BADHANDLE, and NFS3ERR_SERVERFAULT. When an LSM hook
denies the case-folding query on the parent, NFS3ERR_STALE is
the only correct mapping: NFS3ERR_SERVERFAULT misrepresents a
working server as broken, and NFS3ERR_BADHANDLE implies a
decoding failure that did not occur. A client purging the
filehandle on receipt is the desired outcome, since the server
has refused to read attributes through it. Substituting POSIX
defaults instead would let the same handle report
casefold=false now and casefold=true once policy permits,
opening a silent name-collision window on case-insensitive
exports.
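For readers who want the error mapping argued above in one place,
here is a user-space sketch of it. The numeric nfsstat3 values come
from RFC 1813; the MODEL_* names and model_pathconf_status() are
hypothetical and exist only in this example.

```c
#include <errno.h>

/* nfsstat3 values as assigned by RFC 1813. */
enum {
	MODEL_NFS3_OK			= 0,
	MODEL_NFS3ERR_STALE		= 70,
	MODEL_NFS3ERR_SERVERFAULT	= 10006,
};

/* Map the helper's negative errno to a PATHCONF status. */
int model_pathconf_status(int err)
{
	switch (err) {
	case 0:
	case -EOPNOTSUPP:
		/* Either real values or POSIX defaults were filled in. */
		return MODEL_NFS3_OK;
	case -EACCES:
	case -EPERM:
		/* Policy refused the query on the parent directory. */
		return MODEL_NFS3ERR_STALE;
	default:
		return MODEL_NFS3ERR_SERVERFAULT;
	}
}
```

The point of the sketch is that a policy denial never falls through
to the success arm, so a denied query cannot be mistaken for
"case-sensitive, case-preserving".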
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfs3proc.c | 36 +++++++++++++++++-----
fs/nfsd/vfs.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/nfsd/vfs.h | 3 ++
fs/nfsd/xdr3.h | 4 +--
4 files changed, 121 insertions(+), 10 deletions(-)
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 42adc5461db0..12b9172c6be1 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -710,23 +710,43 @@ nfsd3_proc_pathconf(struct svc_rqst *rqstp)
resp->p_name_max = 255; /* at least */
resp->p_no_trunc = 0;
resp->p_chown_restricted = 1;
- resp->p_case_insensitive = 0;
- resp->p_case_preserving = 1;
+ resp->p_case_insensitive = false;
+ resp->p_case_preserving = true;
resp->status = fh_verify(rqstp, &argp->fh, 0, NFSD_MAY_NOP);
if (resp->status == nfs_ok) {
struct super_block *sb = argp->fh.fh_dentry->d_sb;
+ int err;
- /* Note that we don't care for remote fs's here */
- switch (sb->s_magic) {
- case EXT2_SUPER_MAGIC:
+ if (sb->s_magic == EXT2_SUPER_MAGIC) {
resp->p_link_max = EXT2_LINK_MAX;
resp->p_name_max = EXT2_NAME_LEN;
+ }
+
+ err = nfsd_get_case_info(argp->fh.fh_dentry,
+ &resp->p_case_insensitive,
+ &resp->p_case_preserving);
+ /*
+ * RFC 1813 lists NFS3ERR_STALE, NFS3ERR_BADHANDLE, and
+ * NFS3ERR_SERVERFAULT as the only PATHCONF errors.
+ */
+ switch (err) {
+ case 0:
+ case -EOPNOTSUPP:
+ /* Both arms leave the output booleans valid. */
break;
- case MSDOS_SUPER_MAGIC:
- resp->p_case_insensitive = 1;
- resp->p_case_preserving = 0;
+ case -EACCES:
+ case -EPERM:
+ /*
+ * Policy denied the query. Report STALE so the
+ * handle is unusable without implying a server
+ * malfunction.
+ */
+ resp->status = nfserr_stale;
+ break;
+ default:
+ resp->status = nfserr_serverfault;
break;
}
}
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index eafdf7b7890f..85ff418127c7 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -32,6 +32,7 @@
#include <linux/writeback.h>
#include <linux/security.h>
#include <linux/sunrpc/xdr.h>
+#include <linux/fileattr.h>
#include "xdr3.h"
@@ -2891,3 +2892,90 @@ nfsd_permission(struct svc_cred *cred, struct svc_export *exp,
return err? nfserrno(err) : 0;
}
+
+/**
+ * nfsd_get_case_info - get case sensitivity info for a dentry
+ * @dentry: dentry to query
+ * @case_insensitive: set to true if name comparison ignores case
+ * @case_preserving: set to true if case is preserved on disk
+ *
+ * On casefold-capable filesystems the flag lives on the directory,
+ * not on its entries, so for a non-directory @dentry the parent is
+ * queried instead. A directory (including an export root, whose
+ * parent lies outside the export) is queried as-is so its own
+ * contents' lookup behavior is reported. On filesystems that
+ * have an encoding, NFSD advertises fattr4_homogeneous as FALSE,
+ * so per-directory answers may differ within an export.
+ *
+ * The probe runs with kernel credentials. case_insensitive and
+ * case_preserving describe the directory's structural lookup
+ * behavior, not the caller's identity; running under the calling
+ * client's mapped credentials would let per-client MAC policy on
+ * the parent directory turn this query into NFS4ERR_ACCESS even
+ * though the underlying property is the same for every client.
+ *
+ * When the filesystem does not expose case-folding state (no
+ * ->fileattr_get, or the callback returns -EOPNOTSUPP /
+ * -ENOIOCTLCMD / -ENOTTY / -EINVAL), the outputs are filled with
+ * POSIX defaults (case-sensitive, case-preserving) on the premise
+ * that a filesystem with case-folding support wires up
+ * fileattr_get.
+ *
+ * Return: 0 with outputs filled, -EOPNOTSUPP with outputs filled
+ * to POSIX defaults, or a negative errno (e.g., -EIO,
+ * -ESTALE, -ENOMEM) with outputs unmodified.
+ */
+int
+nfsd_get_case_info(struct dentry *dentry, bool *case_insensitive,
+ bool *case_preserving)
+{
+ struct file_kattr fa = {};
+ const struct cred *saved;
+ struct cred *probe;
+ struct dentry *cd;
+ bool put = false;
+ int err;
+
+ if (d_is_dir(dentry)) {
+ cd = dentry;
+ } else {
+ cd = dget_parent(dentry);
+ put = true;
+ }
+
+ probe = prepare_creds();
+ if (!probe) {
+ err = -ENOMEM;
+ goto out;
+ }
+ probe->fsuid = GLOBAL_ROOT_UID;
+ probe->fsgid = GLOBAL_ROOT_GID;
+ saved = override_creds(probe);
+
+ err = vfs_fileattr_get(cd, &fa);
+
+ put_cred(revert_creds(saved));
+out:
+ if (put)
+ dput(cd);
+ switch (err) {
+ case 0:
+ *case_insensitive = fa.fsx_xflags & FS_XFLAG_CASEFOLD;
+ *case_preserving =
+ !(fa.fsx_xflags & FS_XFLAG_CASENONPRESERVING);
+ return 0;
+ case -EINVAL:
+ case -ENOTTY:
+ case -ENOIOCTLCMD:
+ case -EOPNOTSUPP:
+ /*
+ * Filesystem does not expose case state.
+ * Report POSIX defaults.
+ */
+ *case_insensitive = false;
+ *case_preserving = true;
+ return -EOPNOTSUPP;
+ default:
+ return err;
+ }
+}
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 702a844f2106..e09ea04a51b9 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -156,6 +156,9 @@ __be32 nfsd_readdir(struct svc_rqst *, struct svc_fh *,
loff_t *, struct readdir_cd *, nfsd_filldir_t);
__be32 nfsd_statfs(struct svc_rqst *, struct svc_fh *,
struct kstatfs *, int access);
+int nfsd_get_case_info(struct dentry *dentry,
+ bool *case_insensitive,
+ bool *case_preserving);
__be32 nfsd_permission(struct svc_cred *cred, struct svc_export *exp,
struct dentry *dentry, int acc);
diff --git a/fs/nfsd/xdr3.h b/fs/nfsd/xdr3.h
index 522067b7fd75..a7c9714b0b0e 100644
--- a/fs/nfsd/xdr3.h
+++ b/fs/nfsd/xdr3.h
@@ -209,8 +209,8 @@ struct nfsd3_pathconfres {
__u32 p_name_max;
__u32 p_no_trunc;
__u32 p_chown_restricted;
- __u32 p_case_insensitive;
- __u32 p_case_preserving;
+ bool p_case_insensitive;
+ bool p_case_preserving;
};
struct nfsd3_commitres {
--
2.53.0
* [PATCH v14 14/15] nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
From: Chuck Lever @ 2026-05-07 8:53 UTC
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
NFSD currently provides NFSv4 clients with hard-coded responses
indicating all exported filesystems are case-sensitive and
case-preserving. This is incorrect for case-insensitive filesystems
and ext4 directories with casefold enabled.
Query the underlying filesystem's actual case sensitivity via
nfsd_get_case_info() and return accurate values to clients. This
supports per-directory settings for filesystems that allow mixing
case-sensitive and case-insensitive directories within an export.
The helper queries the parent dentry for non-directory filehandles
because case-folding is a per-directory property. That resolution
has the same corner cases here as for NFSv3 PATHCONF: single-file
exports query an unexported parent, disconnected dentries report
defaults until reconnected, and hardlinked files track whichever
alias the dcache currently holds.
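The contract between the helper and the GETATTR encoder can be
sketched in user space as follows. This is a simplified model, not
the NFSD code: the MODEL_* flag bits are placeholders,
MODEL_ENOIOCTLCMD restates the kernel-internal errno value (515)
because <errno.h> does not export it, and both function names are
hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Placeholder flag bits, chosen for this sketch only. */
#define MODEL_CASEFOLD		0x1u
#define MODEL_CASENONPRESERVING	0x2u

/* Kernel-internal errno, not visible through <errno.h>. */
#define MODEL_ENOIOCTLCMD	515

/* err and xflags stand in for the result of the fileattr query. */
int model_case_info(int err, unsigned int xflags,
		    bool *case_insensitive, bool *case_preserving)
{
	switch (err) {
	case 0:
		*case_insensitive = (xflags & MODEL_CASEFOLD) != 0;
		*case_preserving = !(xflags & MODEL_CASENONPRESERVING);
		return 0;
	case -EINVAL:
	case -ENOTTY:
	case -MODEL_ENOIOCTLCMD:
	case -EOPNOTSUPP:
		/* No case state exposed: fill POSIX defaults. */
		*case_insensitive = false;
		*case_preserving = true;
		return -EOPNOTSUPP;
	default:
		return err;	/* outputs left unmodified */
	}
}

/* GETATTR fails unless the outputs hold a value to encode. */
bool model_getattr_must_fail(int err)
{
	return err && err != -EOPNOTSUPP;
}
```

Collapsing the "not supported" errnos into one return code lets the
caller distinguish "defaults were substituted" from a real failure
with a single comparison.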
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfs4xdr.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 49 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 2a0946c630e1..319007b79d49 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3158,6 +3158,8 @@ struct nfsd4_fattr_args {
u32 rdattr_err;
bool contextsupport;
bool ignore_crossmnt;
+ bool case_insensitive;
+ bool case_preserving;
};
typedef __be32(*nfsd4_enc_attr)(struct xdr_stream *xdr,
@@ -3356,6 +3358,33 @@ static __be32 nfsd4_encode_fattr4_acl(struct xdr_stream *xdr,
return nfs_ok;
}
+static __be32 nfsd4_encode_fattr4_case_insensitive(struct xdr_stream *xdr,
+ const struct nfsd4_fattr_args *args)
+{
+ return nfsd4_encode_bool(xdr, args->case_insensitive);
+}
+
+static __be32 nfsd4_encode_fattr4_case_preserving(struct xdr_stream *xdr,
+ const struct nfsd4_fattr_args *args)
+{
+ return nfsd4_encode_bool(xdr, args->case_preserving);
+}
+
+static __be32 nfsd4_encode_fattr4_homogeneous(struct xdr_stream *xdr,
+ const struct nfsd4_fattr_args *args)
+{
+ /*
+ * Casefold-capable filesystems (e.g. ext4 or f2fs with the
+ * casefold feature) attach a Unicode encoding at mount time
+ * but apply case folding per directory. The per-file-system
+ * case_insensitive and case_preserving values can therefore
+ * legitimately differ across objects that share the same fsid.
+ * Report FATTR4_HOMOGENEOUS = FALSE on such filesystems to
+ * keep that variation consistent with RFC 8881 Section 5.8.2.16.
+ */
+ return nfsd4_encode_bool(xdr, !sb_has_encoding(args->dentry->d_sb));
+}
+
static __be32 nfsd4_encode_fattr4_filehandle(struct xdr_stream *xdr,
const struct nfsd4_fattr_args *args)
{
@@ -3748,8 +3777,8 @@ static const nfsd4_enc_attr nfsd4_enc_fattr4_encode_ops[] = {
[FATTR4_ACLSUPPORT] = nfsd4_encode_fattr4_aclsupport,
[FATTR4_ARCHIVE] = nfsd4_encode_fattr4__noop,
[FATTR4_CANSETTIME] = nfsd4_encode_fattr4__true,
- [FATTR4_CASE_INSENSITIVE] = nfsd4_encode_fattr4__false,
- [FATTR4_CASE_PRESERVING] = nfsd4_encode_fattr4__true,
+ [FATTR4_CASE_INSENSITIVE] = nfsd4_encode_fattr4_case_insensitive,
+ [FATTR4_CASE_PRESERVING] = nfsd4_encode_fattr4_case_preserving,
[FATTR4_CHOWN_RESTRICTED] = nfsd4_encode_fattr4__true,
[FATTR4_FILEHANDLE] = nfsd4_encode_fattr4_filehandle,
[FATTR4_FILEID] = nfsd4_encode_fattr4_fileid,
@@ -3758,7 +3787,7 @@ static const nfsd4_enc_attr nfsd4_enc_fattr4_encode_ops[] = {
[FATTR4_FILES_TOTAL] = nfsd4_encode_fattr4_files_total,
[FATTR4_FS_LOCATIONS] = nfsd4_encode_fattr4_fs_locations,
[FATTR4_HIDDEN] = nfsd4_encode_fattr4__noop,
- [FATTR4_HOMOGENEOUS] = nfsd4_encode_fattr4__true,
+ [FATTR4_HOMOGENEOUS] = nfsd4_encode_fattr4_homogeneous,
[FATTR4_MAXFILESIZE] = nfsd4_encode_fattr4_maxfilesize,
[FATTR4_MAXLINK] = nfsd4_encode_fattr4_maxlink,
[FATTR4_MAXNAME] = nfsd4_encode_fattr4_maxname,
@@ -3968,6 +3997,23 @@ nfsd4_encode_fattr4(struct svc_rqst *rqstp, struct xdr_stream *xdr,
args.fhp = tempfh;
} else
args.fhp = fhp;
+ if (attrmask[0] & (FATTR4_WORD0_CASE_INSENSITIVE |
+ FATTR4_WORD0_CASE_PRESERVING)) {
+ err = nfsd_get_case_info(dentry, &args.case_insensitive,
+ &args.case_preserving);
+ /*
+ * Per RFC 8881 Section 18.7.3, an attribute advertised
+ * in SUPPORTED_ATTRS must come back with a value or the
+ * GETATTR must fail. nfsd_get_case_info() fills POSIX
+ * defaults and returns -EOPNOTSUPP when the underlying
+ * filesystem does not expose case state; encode those
+ * defaults so the reply agrees with what SUPPORTED_ATTRS
+ * advertises. Other errors fail the operation as the
+ * spec requires.
+ */
+ if (err && err != -EOPNOTSUPP)
+ goto out_nfserr;
+ }
if (attrmask[0] & FATTR4_WORD0_ACL) {
err = nfsd4_get_nfs4_acl(rqstp, dentry, &args.acl);
--
2.53.0
* [PATCH v14 15/15] ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
From: Chuck Lever @ 2026-05-07 8:53 UTC
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Roland Mainz
In-Reply-To: <20260507-case-sensitivity-v14-0-e62cc8200435@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
FS_ATTRIBUTE_INFORMATION responses have always reported
FILE_CASE_SENSITIVE_SEARCH and FILE_CASE_PRESERVED_NAMES
unconditionally. Case-insensitive filesystems like exFAT, and
casefolded directories on ext4 or f2fs, have no way to signal
their actual semantics to SMB clients.
Now that filesystems expose case behavior through ->fileattr_get,
query it via vfs_fileattr_get() and translate the FS_XFLAG_CASEFOLD
and FS_XFLAG_CASENONPRESERVING flags into the corresponding SMB
attributes. Filesystems without ->fileattr_get continue reporting
default POSIX behavior (case-sensitive, case-preserving).
SMB's FS_ATTRIBUTE_INFORMATION reports per-share attributes from
the share root, not per-file. Shares mixing casefold and
non-casefold directories report the root directory's behavior.
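The flag translation described above is an inversion: each SMB
attribute is set when the corresponding xflag is clear. A minimal
user-space sketch, assuming placeholder values for the two xflags
(the MODEL_CASEFOLD and MODEL_CASENONPRESERVING bits are invented
here) and the MS-FSCC values for the two SMB attribute bits:

```c
#include <stdint.h>

/* Placeholder xflag bits, chosen for this sketch only. */
#define MODEL_CASEFOLD			0x1u
#define MODEL_CASENONPRESERVING		0x2u

/* File system attribute bits as listed in MS-FSCC. */
#define MODEL_FILE_CASE_SENSITIVE_SEARCH	0x00000001u
#define MODEL_FILE_CASE_PRESERVED_NAMES		0x00000002u

/* Derive the case-related SMB attribute bits from the xflags. */
uint32_t model_smb_case_attrs(uint32_t xflags)
{
	uint32_t attrs = 0;

	if (!(xflags & MODEL_CASEFOLD))
		attrs |= MODEL_FILE_CASE_SENSITIVE_SEARCH;
	if (!(xflags & MODEL_CASENONPRESERVING))
		attrs |= MODEL_FILE_CASE_PRESERVED_NAMES;
	return attrs;
}
```

Because a zeroed xflags word maps to both attributes being set, a
filesystem that reports nothing keeps the previous unconditional
behavior.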
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/smb/server/smb2pdu.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
index ee32e61b6d3c..cf0bc453a036 100644
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -14,6 +14,7 @@
#include <linux/falloc.h>
#include <linux/mount.h>
#include <linux/filelock.h>
+#include <linux/fileattr.h>
#include "glob.h"
#include "smbfsctl.h"
@@ -5541,16 +5542,33 @@ static int smb2_get_info_filesystem(struct ksmbd_work *work,
case FS_ATTRIBUTE_INFORMATION:
{
FILE_SYSTEM_ATTRIBUTE_INFO *info;
+ struct file_kattr fa = {};
size_t sz;
+ u32 attrs;
+ int err;
info = (FILE_SYSTEM_ATTRIBUTE_INFO *)rsp->Buffer;
- info->Attributes = cpu_to_le32(FILE_SUPPORTS_OBJECT_IDS |
- FILE_PERSISTENT_ACLS |
- FILE_UNICODE_ON_DISK |
- FILE_CASE_PRESERVED_NAMES |
- FILE_CASE_SENSITIVE_SEARCH |
- FILE_SUPPORTS_BLOCK_REFCOUNTING);
+ attrs = FILE_SUPPORTS_OBJECT_IDS |
+ FILE_PERSISTENT_ACLS |
+ FILE_UNICODE_ON_DISK |
+ FILE_SUPPORTS_BLOCK_REFCOUNTING;
+ err = vfs_fileattr_get(path.dentry, &fa);
+ /*
+ * -EINVAL, -EOPNOTSUPP: ntfs-3g and other FUSE
+ * filesystems that lack FS_IOC_FSGETXATTR support.
+ */
+ if (err && err != -ENOIOCTLCMD && err != -ENOTTY &&
+ err != -EINVAL && err != -EOPNOTSUPP) {
+ path_put(&path);
+ return err;
+ }
+ if (!(fa.fsx_xflags & FS_XFLAG_CASEFOLD))
+ attrs |= FILE_CASE_SENSITIVE_SEARCH;
+ if (!(fa.fsx_xflags & FS_XFLAG_CASENONPRESERVING))
+ attrs |= FILE_CASE_PRESERVED_NAMES;
+
+ info->Attributes = cpu_to_le32(attrs);
info->Attributes |= cpu_to_le32(server_conf.share_fake_fscaps);
if (test_share_config_flag(work->tcon->share_conf,
--
2.53.0
* Re: [PATCH] crypto: af_alg - Document the deprecation of AF_ALG
From: Kamran Khan @ 2026-05-10 15:54 UTC
To: Jeff Barnes, Andy Lutomirski
Cc: Eric Biggers, linux-crypto@vger.kernel.org, Herbert Xu,
linux-doc@vger.kernel.org, linux-api@vger.kernel.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
Linus Torvalds
In-Reply-To: <14A441D8-5370-44BE-8732-99BF8107C3FD@getmailspring.com>
Hi,
AF_ALG is useful not just for hardware offloading but also for memory
isolation: applications get only oracle access to the crypto keys, so
a memory-safety vulnerability in a user application does not
immediately put the secret key material at risk.
I understand and appreciate the concern about the complex attack
surface and the increased frequency of attacks in this area. But I fear
that completely removing AF_ALG increases the risk for userspace
applications that rely on it for memory isolation.
What alternatives do userspace applications have on Linux for ensuring
crypto keys are not exposed in user memory? FreeBSD and NetBSD
natively provide /dev/crypto; removing AF_ALG would eliminate the only
equivalent option on Linux for kernel-delegated cryptography.
Thanks,
Kamran.
On 5/6/26 7:42 AM, Jeff Barnes wrote:
> Hi,
>
> On May 5 2026, at 7:17 pm, Andy Lutomirski <luto@amacapital.net> wrote:
>
>>> On Apr 29, 2026, at 6:19 PM, Eric Biggers <ebiggers@kernel.org> wrote:
>>>
>>> AF_ALG is almost completely unnecessary, and it exposes a massive attack
>>> surface that hasn't been standing up to modern vulnerability discovery
>>> tools. The latest one even has its own website, providing a small
>>> Python script that reliably roots most Linux distros: https://copy.fail/
>>
>> How about adding a configuration option, defaulted on, that requires
>> capable(CAP_SYS_ADMIN) to create the socket (and maybe also to bind /
>> connect it). And a sysctl to allow the administrator to override this
>> in the unlikely event that it’s needed.
>>
>> IIRC cryptsetup used to and maybe even still does require these
>> sockets sometimes and this would let it keep working. And there's all
>> the FIPS stuff downthread.
>
> Apologies in advance for the long-winded answer.
>
> The "FIPS stuff" centers on using sha512hmac -> libkcapi -> AF_ALG for
> verifying integrity. The early‑boot sha512hmac check that some
> distributions use (typically from initramfs) sits at an awkward
> intersection of multiple standards, and it may help to clarify where it
> actually fits and where it doesn't.
>
> From a standards perspective, FIPS 140‑3 requires a cryptographic module
> to perform self‑integrity verification using an approved algorithm and
> to prevent the module from entering an operational state on failure. In
> the Linux kernel, the cryptographic module is the kernel crypto
> subsystem, and these requirements are met by the kernel’s internal
> power‑up self‑tests (KATs, etc.) on the crypto code and critical data as
> loaded into memory.
>
> FIPS 199 / SP 800‑53 (e.g., SI‑7) impose system‑level integrity
> requirements (for Moderate impact systems), i.e., that unauthorized
> modification of critical components is prevented or detected and that
> failures result in a protective action. These controls are explicitly
> technology‑agnostic and are not limited to cryptographic‑module self‑tests.
>
> The sha512hmac check is not the FIPS 140‑3 cryptographic‑module
> self‑integrity test. Instead, it has historically been used as a system
> integrity control that provides auditors with assurance that the kernel
> image containing the cryptographic module has not been modified prior to
> execution, and that a failure will halt the boot.
>
> Although FIPS 140‑3 does not mandate an HMAC over the kernel image, the
> early‑boot HMAC became an accepted evidence pattern for satisfying
> system‑integrity expectations (FIPS 199 / SI‑7) alongside a kernel
> crypto validation. This is why it is often perceived as “required” for
> FIPS submissions, even though it is not normatively required by
> FIPS 140‑3 itself.
>
> With the deprecation/removal of AF_ALG for this use case, there is no
> longer a supported way to perform an early‑boot, userspace‑driven HMAC
> using validated kernel crypto without introducing circular dependencies
> (e.g., relying on userspace crypto before crypto self‑tests complete).
> As a result, there is no drop‑in replacement for sha512hmac that
> preserves all of its historical properties.
>
> This is a new development that challenges a long‑standing assumption:
> that system‑integrity evidence and cryptographic‑module self‑integrity
> can be cleanly separated while still being demonstrated by a single
> early‑boot mechanism. That assumption no longer holds given the proposed
> kernel interfaces.
>
> A more accurate decomposition (and one that aligns with the intent of
> the standards) is to separate integrity enforcement by system phase.
>
> 1. Secure Boot (or equivalent platform verification) ensures that a
> modified kernel image is not executed at all. This satisfies the
> requirement that critical components are not loaded in a modified state
> and that integrity failure results in a protective action (boot prevention).
>
> 2. IMA (with appraisal and enforcement) ensures that modified
> executables, modules, or firmware cannot be loaded or executed once the
> kernel is running.
>
> 3. Kernel crypto self‑tests continue to satisfy FIPS 140‑3
> self‑integrity requirements independently of the above.
>
> Taken together, Secure Boot + IMA provide continuous system‑integrity
> enforcement without re‑introducing early‑boot HMACs or AF_ALG
> dependencies, while keeping cryptographic‑module self‑integrity
> correctly scoped to the kernel crypto subsystem.
>
> The transition away from sha512hmac is therefore not a removal of
> integrity enforcement, but a shift from a single, early‑boot mechanism
> to a phased integrity model that better reflects the separation of
> concerns already present in the standards — even though this separation
> was previously masked by the hacky HMAC approach.
>
> This change will require updated documentation and auditor education,
> but it reflects the current technical reality and avoids perpetuating an
> interface that no longer has a sustainable implementation path.
>
>>
>>
>>>
>>> This isn't sustainable, especially as LLMs have accelerated the rate at which
>>> vulnerabilities are coming in. The effort that is being put into this
>>> thing is vastly disproportional to the few programs that actually use
>>> it, and those programs would be better served by userspace code anyway.
>>>
>>> These issues have been noted in many mailing list discussions already.
>>> But until now they haven't been reflected in the documentation or
>>> kconfig menu itself, and the vulnerabilities are still coming in.
>>>
>>> Let's go ahead and document the deprecation.
>>>
>>> This isn't intended to change anything overnight. After all, most Linux
>>> distros won't be able to disable the kconfig options quite yet, mainly
>>> because of iwd. But this should create a bit more impetus for these
>>> userspace programs to be fixed, and the documentation update should also
>>> help prevent more users from appearing.
>>>
>>> Signed-off-by: Eric Biggers <ebiggers@kernel.org>
>>> ---
>>>
>>> This patch is targeting crypto/master
>>>
>>> Documentation/crypto/userspace-if.rst | 82 ++++++++++++++++++++-------
>>> crypto/Kconfig | 69 ++++++++++++++++------
>>> 2 files changed, 113 insertions(+), 38 deletions(-)
>>>
>>> diff --git a/Documentation/crypto/userspace-if.rst b/Documentation/crypto/userspace-if.rst
>>> index 021759198fe7..c39f5c79a5b7 100644
>>> --- a/Documentation/crypto/userspace-if.rst
>>> +++ b/Documentation/crypto/userspace-if.rst
>>> @@ -2,30 +2,72 @@ User Space Interface
>>> ====================
>>>
>>> Introduction
>>> ------------
>>>
>>> -The concepts of the kernel crypto API visible to kernel space is fully
>>> -applicable to the user space interface as well. Therefore, the kernel
>>> -crypto API high level discussion for the in-kernel use cases applies
>>> -here as well.
>>> -
>>> -The major difference, however, is that user space can only act as a
>>> -consumer and never as a provider of a transformation or cipher
>>> -algorithm.
>>> -
>>> -The following covers the user space interface exported by the kernel
>>> -crypto API. A working example of this description is libkcapi that can
>>> -be obtained from [1]. That library can be used by user space
>>> -applications that require cryptographic services from the kernel.
>>> -
>>> -Some details of the in-kernel kernel crypto API aspects do not apply to
>>> -user space, however. This includes the difference between synchronous
>>> -and asynchronous invocations. The user space API call is fully
>>> -synchronous.
>>> -
>>> -[1] https://www.chronox.de/libkcapi/index.html
>>> +AF_ALG provides unprivileged userspace programs access to arbitrary hash,
>>> +symmetric cipher, AEAD, and RNG algorithms that are implemented in kernel-mode
>>> +code.
>>> +
>>> +AF_ALG is insecure and is deprecated. Originally added to the kernel in 2010,
>>> +most kernel developers now consider it to be a mistake.
>>> +
>>> +AF_ALG continues to be supported only for backwards compatibility. On systems
>>> +where no programs using AF_ALG remain, the support for it should be disabled by
>>> +disabling ``CONFIG_CRYPTO_USER_API_*``.
>>> +
>>> +Deprecation
>>> +-----------
>>> +
>>> +AF_ALG was originally intended to provide userspace programs access to crypto
>>> +accelerators that they wouldn't otherwise have access to.
>>> +
>>> +However, that capability turned out to not be useful on very many systems. More
>>> +significantly, the actual implementation exposes a vastly greater amount of
>>> +functionality than that. It actually provides access to all software algorithms.
>>> +
>>> +This includes arbitrary compositions of different algorithms created via a
>>> +complex template system, as well as algorithms that only make sense as internal
>>> +implementation details of other algorithms. It also includes full zero-copy
>>> +support, which is difficult for the kernel to implement securely.
>>> +
>>> +Ultimately, these algorithms are just math computations. They use the same
>>> +instructions that userspace programs already have access to, just accessed in a
>>> +much more convoluted and less efficient way.
>>> +
>>> +Indeed, userspace code is nearly always what is being used anyway. These same
>>> +algorithms are widely implemented in userspace crypto libraries.
>>> +
>>> +Meanwhile, AF_ALG hasn't been withstanding modern vulnerability discovery tools
>>> +such as syzbot and large language models. It receives a steady stream of CVEs.
>>> +Some of the examples include:
>>> +
>>> +- CVE-2026-31677
>>> +- CVE-2026-31431 (https://copy.fail)
>>> +- CVE-2025-38079
>>> +- CVE-2025-37808
>>> +- CVE-2024-26824
>>> +- CVE-2022-48781
>>> +- CVE-2019-8912
>>> +- CVE-2018-14619
>>> +- CVE-2017-18075
>>> +- CVE-2017-17806
>>> +- CVE-2017-17805
>>> +- CVE-2016-10147
>>> +- CVE-2015-8970
>>> +- CVE-2015-3331
>>> +- CVE-2014-9644
>>> +- CVE-2013-7421
>>> +- CVE-2011-4081
>>> +
>>> +It is recommended that, whenever possible, userspace programs be migrated to
>>> +userspace crypto code (which again, is what is normally used anyway) and
>>> +``CONFIG_CRYPTO_USER_API_*`` be disabled. On systems that use SELinux, SELinux
>>> +can also be used to restrict the use of AF_ALG to trusted programs.
>>> +
>>> +The remainder of this documentation provides the historical documentation for
>>> +the deprecated AF_ALG interface.
>>>
>>> User Space API General Remarks
>>> ------------------------------
>>>
>>> The kernel crypto API is accessible from user space. Currently, the
>>> diff --git a/crypto/Kconfig b/crypto/Kconfig
>>> index 103d1f58cb7c..6cd1c478d4be 100644
>>> --- a/crypto/Kconfig
>>> +++ b/crypto/Kconfig
>>> @@ -1278,48 +1278,72 @@ config CRYPTO_DF80090A
>>> tristate
>>> select CRYPTO_AES
>>> select CRYPTO_CTR
>>>
>>> endmenu
>>> -menu "Userspace interface"
>>> +menu "Userspace interface (deprecated)"
>>>
>>> config CRYPTO_USER_API
>>> tristate
>>>
>>> config CRYPTO_USER_API_HASH
>>> - tristate "Hash algorithms"
>>> + tristate "Hash algorithms (deprecated)"
>>> depends on NET
>>> select CRYPTO_HASH
>>> select CRYPTO_USER_API
>>> help
>>> - Enable the userspace interface for hash algorithms.
>>> + Enable the AF_ALG userspace interface for hash algorithms. This
>>> + provides unprivileged userspace programs access to arbitrary hash
>>> + algorithms implemented in the kernel's privileged execution context.
>>>
>>> - See Documentation/crypto/userspace-if.rst and
>>> - https://www.chronox.de/libkcapi/html/index.html
>>> + This interface is deprecated and is supported only for backwards
>>> + compatibility. It regularly has vulnerabilities, and the capabilities
>>> + it provides are redundant with userspace crypto libraries.
>>> +
>>> + Enable this only if needed for support for a program that hasn't yet
>>> + been converted to userspace crypto, for example iwd.
>>> +
>>> + See also Documentation/crypto/userspace-if.rst
>>>
>>> config CRYPTO_USER_API_SKCIPHER
>>> - tristate "Symmetric key cipher algorithms"
>>> + tristate "Symmetric key cipher algorithms (deprecated)"
>>> depends on NET
>>> select CRYPTO_SKCIPHER
>>> select CRYPTO_USER_API
>>> help
>>> - Enable the userspace interface for symmetric key cipher algorithms.
>>> + Enable the AF_ALG userspace interface for symmetric key algorithms.
>>> + This provides unprivileged userspace programs access to arbitrary
>>> + symmetric key algorithms implemented in the kernel's privileged
>>> + execution context.
>>> +
>>> + This interface is deprecated and is supported only for backwards
>>> + compatibility. It regularly has vulnerabilities, and the capabilities
>>> + it provides are redundant with userspace crypto libraries.
>>> +
>>> + Enable this only if needed for support for a program that hasn't yet
>>> + been converted to userspace crypto, for example iwd, or cryptsetup
>>> + with certain algorithms.
>>>
>>> - See Documentation/crypto/userspace-if.rst and
>>> - https://www.chronox.de/libkcapi/html/index.html
>>> + See also Documentation/crypto/userspace-if.rst
>>>
>>> config CRYPTO_USER_API_RNG
>>> - tristate "RNG (random number generator) algorithms"
>>> + tristate "Random number generation algorithms (deprecated)"
>>> depends on NET
>>> select CRYPTO_RNG
>>> select CRYPTO_USER_API
>>> help
>>> - Enable the userspace interface for RNG (random number generator)
>>> - algorithms.
>>> + Enable the AF_ALG userspace interface for random number generation
>>> + (RNG) algorithms. This provides unprivileged userspace programs
>>> + access to arbitrary RNG algorithms implemented in the kernel's
>>> + privileged execution context.
>>>
>>> - See Documentation/crypto/userspace-if.rst and
>>> - https://www.chronox.de/libkcapi/html/index.html
>>> + This interface is deprecated and is supported only for backwards
>>> + compatibility. It regularly has vulnerabilities, and the capabilities
>>> + it provides are redundant with userspace crypto libraries as well as
>>> + the normal kernel RNG (e.g., /dev/urandom and getrandom(2)).
>>> +
>>> + See also Documentation/crypto/userspace-if.rst
>>>
>>> config CRYPTO_USER_API_RNG_CAVP
>>> bool "Enable CAVP testing of DRBG"
>>> depends on CRYPTO_USER_API_RNG && CRYPTO_DRBG
>>> help
>>> @@ -1330,20 +1354,29 @@ config CRYPTO_USER_API_RNG_CAVP
>>>
>>> This should only be enabled for CAVP testing. You should say
>>> no unless you know what this is.
>>>
>>> config CRYPTO_USER_API_AEAD
>>> - tristate "AEAD cipher algorithms"
>>> + tristate "AEAD cipher algorithms (deprecated)"
>>> depends on NET
>>> select CRYPTO_AEAD
>>> select CRYPTO_SKCIPHER
>>> select CRYPTO_USER_API
>>> help
>>> - Enable the userspace interface for AEAD cipher algorithms.
>>> + Enable the AF_ALG userspace interface for authenticated encryption
>>> + with associated data (AEAD) algorithms. This provides unprivileged
>>> + userspace programs access to arbitrary AEAD algorithms implemented in
>>> + the kernel's privileged execution context.
>>> +
>>> + This interface is deprecated and is supported only for backwards
>>> + compatibility. It regularly has vulnerabilities, and the capabilities
>>> + it provides are redundant with userspace crypto libraries.
>>> +
>>> + Enable this only if needed for support for a program that hasn't yet
>>> + been converted to userspace crypto, for example iwd.
>>>
>>> - See Documentation/crypto/userspace-if.rst and
>>> - https://www.chronox.de/libkcapi/html/index.html
>>> + See also Documentation/crypto/userspace-if.rst
>>>
>>> config CRYPTO_USER_API_ENABLE_OBSOLETE
>>> bool "Obsolete cryptographic algorithms"
>>> depends on CRYPTO_USER_API
>>> default y
>>>
>>> base-commit: 57b8e2d666a31fa201432d58f5fe3469a0dd83ba
>>> --
>>> 2.54.0
>>>
>>>
>>
>
^ permalink raw reply
* Re: [PATCH] crypto: af_alg - Document the deprecation of AF_ALG
From: Eric Biggers @ 2026-05-10 16:32 UTC (permalink / raw)
To: Kamran Khan
Cc: Jeff Barnes, Andy Lutomirski, linux-crypto@vger.kernel.org,
Herbert Xu, linux-doc@vger.kernel.org, linux-api@vger.kernel.org,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
Linus Torvalds
In-Reply-To: <0b8bba44-f6bb-4d69-b9d4-5787c276d41a@inspirated.com>
On Sun, May 10, 2026 at 08:54:07AM -0700, Kamran Khan wrote:
> Hi,
>
> AF_ALG is useful not just for hardware-offloading, but also for memory
> isolation so that applications only get oracle access to the crypto keys and
> a memory-safety vulnerability in user applications would not immediately put
> the secret key material at risk.
Note that if that memory-safety vulnerability leads to code execution in
the application, then it doesn't matter that it "only" has oracle
access. It can still decrypt any data encrypted by that key.
The relevant threat model would be arbitrary reads, not any
"memory-safety vulnerability".
> I understand and appreciate the concern with complex attack surface and the
> increased frequency of attacks in this area. But I fear that completely
> removing AF_ALG increases the risk for userspace applications relying on it
> for memory isolation.
>
> What alternatives do userspace applications have on Linux for ensuring
> crypto keys are not exposed in user memory? That is, FreeBSD and NetBSD
> natively provide /dev/crypto; removing AF_ALG would kill the only equivalent
> option on the Linux side for kernel-delegated cryptography.
The standard solution is simply to use an isolated userspace process
like ssh-agent. Yes, the keys will be in "user memory". But "not
exposed in user memory" is *not* a correct statement of the problem.
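To illustrate the isolated-process idea, here is a minimal sketch in C. It is not ssh-agent's protocol or code. The names agent_spawn(), agent_loop(), agent_stop() and toy_tag() are hypothetical, and toy_tag() is an insecure placeholder standing in for a real MAC from a userspace crypto library. The sketch assumes Linux (AF_UNIX with SOCK_SEQPACKET) and glibc or musl for explicit_bzero(). The application process keeps only a socket and gets oracle access. The key stays in the helper's address space.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for explicit_bzero() */
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define TAG_LEN 8

/*
 * PLACEHOLDER, NOT SECURE: stands in for a real MAC (e.g. HMAC from a
 * userspace crypto library). Hypothetical helper, for illustration only.
 */
static void toy_tag(const uint8_t *key, size_t key_len,
		    const uint8_t *msg, size_t msg_len, uint8_t tag[TAG_LEN])
{
	assert(key_len > 0);
	memset(tag, 0, TAG_LEN);
	for (size_t i = 0; i < msg_len; i++)
		tag[i % TAG_LEN] ^= msg[i] ^ key[i % key_len];
}

/* Runs in the helper process, the only place the key remains. */
static void agent_loop(int fd, const uint8_t *key, size_t key_len)
{
	uint8_t msg[256], tag[TAG_LEN];	/* longer requests are truncated */
	ssize_t n;

	/* One request per packet; recv() returns 0 once the peer closes. */
	while ((n = recv(fd, msg, sizeof(msg), 0)) > 0) {
		toy_tag(key, key_len, msg, (size_t)n, tag);
		if (send(fd, tag, TAG_LEN, 0) != TAG_LEN)
			break;
	}
}

/*
 * Fork the helper and wipe the caller's copy of the key.
 * Returns the client end of the socket, or -1 on error.
 */
static int agent_spawn(uint8_t *key, size_t key_len, pid_t *pid_out)
{
	int sv[2];
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {			/* helper: keeps its copy of the key */
		close(sv[0]);
		agent_loop(sv[1], key, key_len);
		_exit(0);
	}
	close(sv[1]);
	explicit_bzero(key, key_len);	/* application no longer holds the key */
	*pid_out = pid;
	return sv[0];
}

/* Close the client end and reap the helper. */
static int agent_stop(int fd, pid_t pid)
{
	close(fd);
	return waitpid(pid, NULL, 0) == pid ? 0 : -1;
}
```

A caller would send() a message on the returned descriptor and recv() the TAG_LEN-byte result. A real design would start the helper as a separate program before the application touches untrusted input, rather than forking it from the application as this sketch does for brevity.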
(Also note that protecting not-actively-in-use data from arbitrary read
primitives doesn't require cryptography at all. That can be done simply
by using mprotect() to remove read permission from the memory, then
temporarily adding it back when it needs to be accessed.)
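The mprotect() approach can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the struct and the secret_page_*() helpers are hypothetical names, the key is assumed to fit in one page, and the mapping uses Linux's MAP_ANONYMOUS. The key lives on its own page, which is PROT_NONE except while it is in use.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical container for one page holding a secret. */
struct secret_page {
	unsigned char *addr;
	size_t len;
};

/* Copy the key onto a private page, then drop all access to it. */
static int secret_page_init(struct secret_page *sp,
			    const void *key, size_t key_len)
{
	long page = sysconf(_SC_PAGESIZE);

	if (page <= 0 || key_len > (size_t)page)
		return -1;
	sp->len = (size_t)page;
	sp->addr = mmap(NULL, sp->len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (sp->addr == MAP_FAILED) {
		sp->addr = NULL;
		return -1;
	}
	memcpy(sp->addr, key, key_len);
	return mprotect(sp->addr, sp->len, PROT_NONE);
}

/* Temporarily add read permission back while the key is needed. */
static int secret_page_open(struct secret_page *sp)
{
	assert(sp->addr != NULL);
	return mprotect(sp->addr, sp->len, PROT_READ);
}

/* Remove read permission again once the key is no longer in use. */
static int secret_page_close(struct secret_page *sp)
{
	assert(sp->addr != NULL);
	return mprotect(sp->addr, sp->len, PROT_NONE);
}

/* Wipe and unmap the page. */
static int secret_page_destroy(struct secret_page *sp)
{
	volatile unsigned char *p = sp->addr;

	if (mprotect(sp->addr, sp->len, PROT_READ | PROT_WRITE) < 0)
		return -1;
	for (size_t i = 0; i < sp->len; i++)
		p[i] = 0;	/* volatile store, so the wipe is not elided */
	return munmap(sp->addr, sp->len);
}
```

While the page is closed, a stray read of it faults instead of returning key bytes. This only addresses the arbitrary-read threat model described above. An attacker with code execution in the process can call mprotect() too, and the caller's original copy of the key still has to be wiped separately.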
In any case, any hypothetical security benefit provided by AF_ALG would
have to be *very high* to outweigh the continuous stream of
vulnerabilities in it. I understand that people using AF_ALG might not
be familiar with that continuous stream of vulnerabilities, but it would
be worth spending some time researching what has been going on.
- Eric
^ permalink raw reply