Linux userland API discussions

Linux userland API discussions
 help / color / mirror / Atom feed

* Re: [RFC PATCH 0/2] futex: how to solve the robust_list race condition?
From: Mathieu Desnoyers @ 2026-02-20 22:41 UTC (permalink / raw)
  To: André Almeida, Carlos O'Donell,
	Sebastian Andrzej Siewior, Peter Zijlstra, Florian Weimer,
	Rich Felker, Torvald Riegel, Darren Hart, Thomas Gleixner,
	Ingo Molnar, Davidlohr Bueso, Arnd Bergmann, Liam R . Howlett,
	Lorenzo Stoakes, Michal Hocko
  Cc: kernel-dev, linux-api, linux-kernel, libc-alpha
In-Reply-To: <0d334517-63ee-46c9-884d-6c2ae8388b87@efficios.com>

On 2026-02-20 16:42, Mathieu Desnoyers wrote:
> +CC libc-alpha.
> 
> On 2026-02-20 15:26, André Almeida wrote:
>> During LPC 2025, I presented a session about creating a new syscall for
>> robust_list[0][1]. However, most of the session discussion wasn't much 
>> related
>> to the new syscall itself, but much more related to an old bug that 
>> exists in
>> the current robust_list mechanism.
>>
>> Since at least 2012, there's an open bug reporting a race condition, as
>> Carlos O'Donell pointed out:
>>
>>    "File corruption race condition in robust mutex unlocking"
>>    https://sourceware.org/bugzilla/show_bug.cgi?id=14485
>>
>> To help understand the bug, I've created a reproducer (patch 1/2) and a
>> companion kernel hack (patch 2/2) that helps to make the race condition
>> more likely. When the bug happens, the reproducer shows a message
>> comparing the original memory with the corrupted one:
>>
>>    "Memory was corrupted by the kernel: 8001fe8d8001fe8d vs 
>> 8001fe8dc0000000"
>>
>> I'm not sure yet what would be the appropriated approach to fix it, so I
>> decided to reach the community before moving forward in some direction.
>> One suggestion from Peter[2] resolves around serializing the mmap() 
>> and the
>> robust list exit path, which might cause overheads for the common case,
>> where list_op_pending is empty.
>>
>> However, giving that there's a new interface being prepared, this could
>> also give the opportunity to rethink how list_op_pending works, and get
>> rid of the race condition by design.
>>
>> Feedback is very much welcome.
> 
> Looking at this bug, one thing I'm starting to consider is that it
> appears to be an issue inherent to lack of synchronization between
> pthread_mutex_destroy(3) and the per-thread list_op_pending fields
> and not so much a kernel issue.
> 
> Here is why I think the issue is purely userspace:
> 
> Let's suppose we have a shared memory area across Processes 1 and 
> Process 2,
> which internally have its own custom memory allocator in userspace to
> allocate/free space within that shared memory.
> 
> Process 1, Thread A stumbles through the scenario highlighted by this 
> bug, and
> basically gets preempted at this FIXME in libc 
> __pthread_mutex_unlock_full():
> 
>        if (__glibc_unlikely ((atomic_exchange_release (&mutex- 
>  >__data.__lock, 0)
>                               & FUTEX_WAITERS) != 0))
>          futex_wake ((unsigned int *) &mutex->__data.__lock, 1, private);
> 
>        /* We must clear op_pending after we release the mutex.
>           FIXME However, this violates the mutex destruction requirements
>           because another thread could acquire the mutex, destroy it, and
>           reuse the memory for something else; then, if this thread 
> crashes,
>           and the memory happens to have a value equal to the TID, the 
> kernel
>           will believe it is still related to the mutex (which has been
>           destroyed already) and will modify some other random object.  */
>        __asm ("" ::: "memory");
>        THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 
> Then Process 1, Thread B runs, grabs the lock, releases it, and based on
> program state it knows it can pthread_mutex_destroy() this lock, free its
> associated memory through the custom shared memory allocator, and allocate
> it for other purposes. Then we get to the point where Process 1 is
> killed, and where the robust futex kernel code corrupts data in shared
> memory because of the dangling list_op_pending pointer.
> 
> That shared memory data is still observable by Process B, which will get a
> corrupted state.
> 
> Notice how this all happens without any munmap(2)/mmap(2) in the sequence ?
> This is why I think this is purely a userspace issue rather than an issue
> we can solve by adding extra synchronization in the kernel.
> 
> The one point we have in that sequence where I think we can add 
> synchronization
> is pthread_mutex_destroy(3) in libc. One possible "big hammer" solution 
> would be
> to make pthread_mutex_destroy iterate on all other threads list_op_pending
> and busy-wait if it finds that the mutex address is in use. It would of 
> course
> only have to do that for robust futexes.
> 
> If that big hammer solution is not fast enough for many-threaded use-cases,
> then we can think of other approaches such as adding a reference counter
> in the mutex structure, or introducing hazard pointers in userspace to 
> reduce
> synchronization iteration from nr_threads to nr_cpus (or even down to max
> rseq mm_cid).

To make matters even worse, the pthread_mutex_destroy(3) and reallocation
could happen from Process 2 rather than Process 1. So iterating on a
threads from Process 1 is not sufficient. We'd need to synchronize
pthread_mutex_destroy on something within the mutex structure which is
observable from all processes using the lock, for instance a reference count.

Thanks,

Mathieu

> 
> Thoughts ?
> 
> Thanks,
> 
> Mathieu
> 
>>
>> Thanks!
>>     André
>>
>> [0] https://lore.kernel.org/lkml/20251122-tonyk-robust_futex- 
>> v6-0-05fea005a0fd@igalia.com/
>> [1] https://lpc.events/event/19/contributions/2108/
>> [2] https://lore.kernel.org/ 
>> lkml/20241219171344.GA26279@noisy.programming.kicks-ass.net/
> 


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

^ permalink raw reply

* Re: [RFC PATCH 0/2] futex: how to solve the robust_list race condition?
From: Mathieu Desnoyers @ 2026-02-20 21:42 UTC (permalink / raw)
  To: André Almeida, Carlos O'Donell,
	Sebastian Andrzej Siewior, Peter Zijlstra, Florian Weimer,
	Rich Felker, Torvald Riegel, Darren Hart, Thomas Gleixner,
	Ingo Molnar, Davidlohr Bueso, Arnd Bergmann, Liam R . Howlett,
	Lorenzo Stoakes, Michal Hocko
  Cc: kernel-dev, linux-api, linux-kernel, libc-alpha
In-Reply-To: <20260220202620.139584-1-andrealmeid@igalia.com>

+CC libc-alpha.

On 2026-02-20 15:26, André Almeida wrote:
> During LPC 2025, I presented a session about creating a new syscall for
> robust_list[0][1]. However, most of the session discussion wasn't much related
> to the new syscall itself, but much more related to an old bug that exists in
> the current robust_list mechanism.
> 
> Since at least 2012, there's an open bug reporting a race condition, as
> Carlos O'Donell pointed out:
> 
>    "File corruption race condition in robust mutex unlocking"
>    https://sourceware.org/bugzilla/show_bug.cgi?id=14485
> 
> To help understand the bug, I've created a reproducer (patch 1/2) and a
> companion kernel hack (patch 2/2) that helps to make the race condition
> more likely. When the bug happens, the reproducer shows a message
> comparing the original memory with the corrupted one:
> 
>    "Memory was corrupted by the kernel: 8001fe8d8001fe8d vs 8001fe8dc0000000"
> 
> I'm not sure yet what would be the appropriated approach to fix it, so I
> decided to reach the community before moving forward in some direction.
> One suggestion from Peter[2] resolves around serializing the mmap() and the
> robust list exit path, which might cause overheads for the common case,
> where list_op_pending is empty.
> 
> However, giving that there's a new interface being prepared, this could
> also give the opportunity to rethink how list_op_pending works, and get
> rid of the race condition by design.
> 
> Feedback is very much welcome.

Looking at this bug, one thing I'm starting to consider is that it
appears to be an issue inherent to lack of synchronization between
pthread_mutex_destroy(3) and the per-thread list_op_pending fields
and not so much a kernel issue.

Here is why I think the issue is purely userspace:

Let's suppose we have a shared memory area across Processes 1 and Process 2,
which internally have its own custom memory allocator in userspace to
allocate/free space within that shared memory.

Process 1, Thread A stumbles through the scenario highlighted by this bug, and
basically gets preempted at this FIXME in libc __pthread_mutex_unlock_full():

       if (__glibc_unlikely ((atomic_exchange_release (&mutex->__data.__lock, 0)
                              & FUTEX_WAITERS) != 0))
         futex_wake ((unsigned int *) &mutex->__data.__lock, 1, private);

       /* We must clear op_pending after we release the mutex.
          FIXME However, this violates the mutex destruction requirements
          because another thread could acquire the mutex, destroy it, and
          reuse the memory for something else; then, if this thread crashes,
          and the memory happens to have a value equal to the TID, the kernel
          will believe it is still related to the mutex (which has been
          destroyed already) and will modify some other random object.  */
       __asm ("" ::: "memory");
       THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);

Then Process 1, Thread B runs, grabs the lock, releases it, and based on
program state it knows it can pthread_mutex_destroy() this lock, free its
associated memory through the custom shared memory allocator, and allocate
it for other purposes. Then we get to the point where Process 1 is
killed, and where the robust futex kernel code corrupts data in shared
memory because of the dangling list_op_pending pointer.

That shared memory data is still observable by Process B, which will get a
corrupted state.

Notice how this all happens without any munmap(2)/mmap(2) in the sequence ?
This is why I think this is purely a userspace issue rather than an issue
we can solve by adding extra synchronization in the kernel.

The one point we have in that sequence where I think we can add synchronization
is pthread_mutex_destroy(3) in libc. One possible "big hammer" solution would be
to make pthread_mutex_destroy iterate on all other threads list_op_pending
and busy-wait if it finds that the mutex address is in use. It would of course
only have to do that for robust futexes.

If that big hammer solution is not fast enough for many-threaded use-cases,
then we can think of other approaches such as adding a reference counter
in the mutex structure, or introducing hazard pointers in userspace to reduce
synchronization iteration from nr_threads to nr_cpus (or even down to max
rseq mm_cid).

Thoughts ?

Thanks,

Mathieu

> 
> Thanks!
> 	André
> 
> [0] https://lore.kernel.org/lkml/20251122-tonyk-robust_futex-v6-0-05fea005a0fd@igalia.com/
> [1] https://lpc.events/event/19/contributions/2108/
> [2] https://lore.kernel.org/lkml/20241219171344.GA26279@noisy.programming.kicks-ass.net/

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

^ permalink raw reply

* Re: [RFC PATCH 0/2] futex: how to solve the robust_list race condition?
From: Liam R. Howlett @ 2026-02-20 20:51 UTC (permalink / raw)
  To: André Almeida
  Cc: Carlos O'Donell, Sebastian Andrzej Siewior, Peter Zijlstra,
	Florian Weimer, Rich Felker, Torvald Riegel, Darren Hart,
	Thomas Gleixner, Ingo Molnar, Davidlohr Bueso, Arnd Bergmann,
	Mathieu Desnoyers, kernel-dev, linux-api, linux-kernel,
	Suren Baghdasaryan, Lorenzo Stoakes, Michal Hocko
In-Reply-To: <20260220202620.139584-1-andrealmeid@igalia.com>

+Cc Suren, Lorenzo, and Michal

* André Almeida <andrealmeid@igalia.com> [260220 15:27]:
> During LPC 2025, I presented a session about creating a new syscall for
> robust_list[0][1]. However, most of the session discussion wasn't much related
> to the new syscall itself, but much more related to an old bug that exists in
> the current robust_list mechanism.

Ah, sorry for hijacking the session, that was not my intention, but this
needs to be addressed before we propagate the issue into the next
iteration.

> 
> Since at least 2012, there's an open bug reporting a race condition, as
> Carlos O'Donell pointed out:
> 
>   "File corruption race condition in robust mutex unlocking"
>   https://sourceware.org/bugzilla/show_bug.cgi?id=14485
> 
> To help understand the bug, I've created a reproducer (patch 1/2) and a
> companion kernel hack (patch 2/2) that helps to make the race condition
> more likely. When the bug happens, the reproducer shows a message
> comparing the original memory with the corrupted one:
> 
>   "Memory was corrupted by the kernel: 8001fe8d8001fe8d vs 8001fe8dc0000000"
> 
> I'm not sure yet what would be the appropriated approach to fix it, so I
> decided to reach the community before moving forward in some direction.
> One suggestion from Peter[2] resolves around serializing the mmap() and the
> robust list exit path, which might cause overheads for the common case,
> where list_op_pending is empty.
> 
> However, giving that there's a new interface being prepared, this could
> also give the opportunity to rethink how list_op_pending works, and get
> rid of the race condition by design.
> 
> Feedback is very much welcome.

There was a delay added to the oom reaper for these tasks [1] by commit
e4a38402c36e ("oom_kill.c: futex: delay the OOM reaper to allow time for
proper futex cleanup")

We did discuss marking the vmas as needing to be skipped by the oom
manager, but no clear path forward was clear.  It's also not clear if
that's the only area where such a problem exists.

[1].  https://lore.kernel.org/all/20220414144042.677008-1-npache@redhat.com/T/#u

> 
> Thanks!
> 	André
> 
> [0] https://lore.kernel.org/lkml/20251122-tonyk-robust_futex-v6-0-05fea005a0fd@igalia.com/
> [1] https://lpc.events/event/19/contributions/2108/
> [2] https://lore.kernel.org/lkml/20241219171344.GA26279@noisy.programming.kicks-ass.net/
> 
> André Almeida (2):
>   futex: Create reproducer for robust_list race condition
>   futex: Add debug delays
> 
>  kernel/futex/core.c |  10 +++
>  robust_bug.c        | 178 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 188 insertions(+)
>  create mode 100644 robust_bug.c
> 
> -- 
> 2.53.0
> 

^ permalink raw reply

* [RFC PATCH 0/2] futex: how to solve the robust_list race condition?
From: André Almeida @ 2026-02-20 20:26 UTC (permalink / raw)
  To: Carlos O'Donell, Sebastian Andrzej Siewior, Peter Zijlstra,
	Florian Weimer, Rich Felker, Torvald Riegel, Darren Hart,
	Thomas Gleixner, Ingo Molnar, Davidlohr Bueso, Arnd Bergmann,
	Mathieu Desnoyers, Liam R . Howlett
  Cc: kernel-dev, linux-api, linux-kernel, André Almeida

During LPC 2025, I presented a session about creating a new syscall for
robust_list[0][1]. However, most of the session discussion wasn't much related
to the new syscall itself, but much more related to an old bug that exists in
the current robust_list mechanism.

Since at least 2012, there's an open bug reporting a race condition, as
Carlos O'Donell pointed out:

  "File corruption race condition in robust mutex unlocking"
  https://sourceware.org/bugzilla/show_bug.cgi?id=14485

To help understand the bug, I've created a reproducer (patch 1/2) and a
companion kernel hack (patch 2/2) that helps to make the race condition
more likely. When the bug happens, the reproducer shows a message
comparing the original memory with the corrupted one:

  "Memory was corrupted by the kernel: 8001fe8d8001fe8d vs 8001fe8dc0000000"

I'm not sure yet what would be the appropriated approach to fix it, so I
decided to reach the community before moving forward in some direction.
One suggestion from Peter[2] resolves around serializing the mmap() and the
robust list exit path, which might cause overheads for the common case,
where list_op_pending is empty.

However, giving that there's a new interface being prepared, this could
also give the opportunity to rethink how list_op_pending works, and get
rid of the race condition by design.

Feedback is very much welcome.

Thanks!
	André

[0] https://lore.kernel.org/lkml/20251122-tonyk-robust_futex-v6-0-05fea005a0fd@igalia.com/
[1] https://lpc.events/event/19/contributions/2108/
[2] https://lore.kernel.org/lkml/20241219171344.GA26279@noisy.programming.kicks-ass.net/

André Almeida (2):
  futex: Create reproducer for robust_list race condition
  futex: Add debug delays

 kernel/futex/core.c |  10 +++
 robust_bug.c        | 178 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 188 insertions(+)
 create mode 100644 robust_bug.c

-- 
2.53.0

^ permalink raw reply

* [RFC PATCH 2/2] futex: hack: Add debug delays
From: André Almeida @ 2026-02-20 20:26 UTC (permalink / raw)
  To: Carlos O'Donell, Sebastian Andrzej Siewior, Peter Zijlstra,
	Florian Weimer, Rich Felker, Torvald Riegel, Darren Hart,
	Thomas Gleixner, Ingo Molnar, Davidlohr Bueso, Arnd Bergmann,
	Mathieu Desnoyers, Liam R . Howlett
  Cc: kernel-dev, linux-api, linux-kernel, André Almeida
In-Reply-To: <20260220202620.139584-1-andrealmeid@igalia.com>

Add delays to handle_futex_death() to increase the chance of hitting the race
condition.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
 kernel/futex/core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index cf7e610eac42..d409b3368cb3 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -44,6 +44,7 @@
 #include <linux/prctl.h>
 #include <linux/mempolicy.h>
 #include <linux/mmap_lock.h>
+#include <linux/delay.h>
 
 #include "futex.h"
 #include "../locking/rtmutex_common.h"
@@ -1095,6 +1096,12 @@ static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr,
 	 * does not guarantee R/W access. If that fails we
 	 * give up and leave the futex locked.
 	 */
+
+	if (!strcmp(current->comm, "robust_bug")) {
+		printk("robust_bug is exiting\n");
+		msleep(500);
+	}
+
 	if ((err = futex_cmpxchg_value_locked(&nval, uaddr, uval, mval))) {
 		switch (err) {
 		case -EFAULT:
@@ -1112,6 +1119,9 @@ static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr,
 		}
 	}
 
+	if (!strcmp(current->comm, "robust_bug"))
+		printk("memory written\n");
+
 	if (nval != uval)
 		goto retry;
 
-- 
2.53.0


^ permalink raw reply related

* [RFC PATCH 1/2] futex: Create reproducer for robust_list race condition
From: André Almeida @ 2026-02-20 20:26 UTC (permalink / raw)
  To: Carlos O'Donell, Sebastian Andrzej Siewior, Peter Zijlstra,
	Florian Weimer, Rich Felker, Torvald Riegel, Darren Hart,
	Thomas Gleixner, Ingo Molnar, Davidlohr Bueso, Arnd Bergmann,
	Mathieu Desnoyers, Liam R . Howlett
  Cc: kernel-dev, linux-api, linux-kernel, André Almeida
In-Reply-To: <20260220202620.139584-1-andrealmeid@igalia.com>

Create a reproducer for https://sourceware.org/bugzilla/show_bug.cgi?id=14485

This is not supposed to be merged.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
 robust_bug.c | 178 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 178 insertions(+)
 create mode 100644 robust_bug.c

diff --git a/robust_bug.c b/robust_bug.c
new file mode 100644
index 000000000000..1ade4e6d66dd
--- /dev/null
+++ b/robust_bug.c
@@ -0,0 +1,178 @@
+/*
+ *  gcc robust_bug.c -o robust_bug
+ *
+ * This is a reproducer for "File corruption race condition in robust
+ * mutex unlocking" from https://sourceware.org/bugzilla/show_bug.cgi?id=14485
+ *
+ * To increase the changes of reaching the race condition, a delay can be added
+ * to the kernel function handle_futex_death(), just before the user memory
+ * write futex_cmpxchg_value_locked().
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+#include <linux/futex.h>
+#include <pthread.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+#include <time.h>
+
+#define cpu_relax() asm volatile("rep; nop");
+
+/*
+ * This struct is an example of a lock struct, shared between the threads.
+ */
+struct lock_struct {
+	uint32_t 		futex;
+	struct robust_list	list;
+};
+
+static struct lock_struct *lock;
+
+/*
+ * This is the struct that we are going to use to allocate on top of the 
+ * freed memory to observe the race condition.
+ */
+struct another_struct {
+	uint64_t value;
+};
+
+static pthread_barrier_t barrier;
+
+static int set_robust_list(struct robust_list_head *head)
+{
+	return syscall(SYS_set_robust_list, head, sizeof(*head));
+}
+
+/*
+ * This thread emulates the behaviour of a thread releasing a robust mutex:
+ * - It starts by adding the mutex to the op_pending field
+ * - Remove the mutex from the robust list
+ * - Release the lock and wake up waiters
+ * - Remove the mutex from the op_pending field
+ *
+ * However, this thread dies before doing this last step, leaving the mutex
+ * behind in the op_pending field.
+ */
+void *func_b(void *arg)
+{
+	static struct robust_list_head head;
+	pid_t tid = gettid() | FUTEX_WAITERS;
+
+	/*
+	 * Initial thread setup. This would happen in an earlier stage of the
+	 * thread execution.
+	 */
+	set_robust_list(&head);
+	head.list.next = &head.list;
+	head.futex_offset = (size_t) offsetof(struct lock_struct, futex) -
+		(size_t) offsetof(struct lock_struct, list);
+
+	/* This thread takes the lock... */
+	lock->futex = tid;
+
+	/* ...would do some work here... */
+
+	/*
+	 * ...and starts the release process. Adds the mutex to be released on
+	 * the op_pending.
+	 */
+	head.list_op_pending = &lock->list;
+
+	/* Barrier to synchronize thread B taking the lock */
+	pthread_barrier_wait(&barrier);
+	usleep(100);
+
+	/*
+	 * Here we would release the lock and wake up any waiters.
+	 *
+	 * lock->futex = LOCK_FREE;
+	 * futex_wake(lock->futex, 1);
+	 */
+
+	/*
+	 * We would remove the lock from op_pending, but we emulate a thread
+	 * exiting before doing it.
+	 */
+	return NULL;
+}
+
+int main(int argc, char *argv[])
+{
+	struct another_struct *new;
+	uint64_t original_val;
+	pthread_t thread_b;
+	uint32_t value;
+	int ret;
+
+	ret = pthread_barrier_init(&barrier, NULL, 2);
+	if (ret) {
+		puts("pthread_barrier_init failed");
+		return -1;
+	}
+
+	/* Initialize the lock */
+	lock = mmap(NULL, sizeof(struct lock_struct), PROT_READ | PROT_WRITE,
+		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+	if (lock == MAP_FAILED) {
+		puts("mmap failed");
+		return -1;
+	}
+	memset(lock, 0, sizeof(*lock));
+
+	/* Create the thread B that will take the lock */
+	pthread_create(&thread_b, NULL, func_b, NULL);
+
+	/* Barrier to synchronize thread B taking the lock */
+	pthread_barrier_wait(&barrier);
+
+	/* Copy this value as we will use it later */
+	value = lock->futex;
+
+	/*
+	 * Here, this thread would do the following:
+	 * - It would wait for the lock, and be wake from thread B
+	 * - Take the lock, do some work, and release it
+	 * - After releasing the lock and being the last user, it can correctly
+	 *   free it
+	 */
+	munmap(lock, sizeof(struct lock_struct));
+
+	/*
+	 * After freeing the lock, this thread allocates memory, which
+	 * happens to be at the same address of the lock, and by chance, it fills
+	 * the memory with the TID of thread B.
+	 */
+	new = mmap(NULL, sizeof(struct another_struct), PROT_READ | PROT_WRITE,
+		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+	if (new == MAP_FAILED) {
+		puts("mmap failed");
+		return -1;
+	}
+	if ((uintptr_t) lock != (uintptr_t) new) {
+		puts("mmap got a different address");
+		return -1;
+	}
+
+	new->value = ((uint64_t) value << 32) + value;
+
+	/* Create a backup of the current value */
+	original_val = new->value;
+
+	/* Wait for the memory corruption to happen... */	
+	while (new->value == original_val)
+		cpu_relax();
+
+	/* ...and now the kernel just overwrote an unrelated user memory! */
+	printf("Memory was corrupted by the kernel: %lx vs %lx\n",
+		original_val, new->value);
+
+	munmap(new, sizeof(struct another_struct));
+
+	return 0;
+}
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH 0/2] mount: add OPEN_TREE_NAMESPACE
From: Askar Safin @ 2026-02-19 23:42 UTC (permalink / raw)
  To: rob; +Cc: containers, initramfs, linux-api, linux-fsdevel, linux-kernel
In-Reply-To: <6375f293-709c-41b8-a23d-12010baa3cae@landley.net>

Rob Landley <rob@landley.net>:
> Also, could you guys make CONFIG_DEVTMPFS_MOUNT work with initramfs?

I did something similar:
https://lore.kernel.org/initramfs/20260219210312.3468980-1-safinaskar@gmail.com/T/#u

Does this solve your problem?

-- 
Askar Safin

^ permalink raw reply

* Re: [PATCH v8 08/17] ext4: Report case sensitivity in fileattr_get
From: Theodore Tso @ 2026-02-19 13:14 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Al Viro, Christian Brauner, Jan Kara, linux-fsdevel, linux-ext4,
	linux-xfs, linux-cifs, linux-nfs, linux-api, linux-f2fs-devel,
	hirofumi, linkinjeon, sj1557.seo, yuezhang.mo,
	almaz.alexandrovich, slava, glaubitz, frank.li, adilger.kernel,
	cem, sfrench, pc, ronniesahlberg, sprasad, trondmy, anna, jaegeuk,
	chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-9-cel@kernel.org>

On Tue, Feb 17, 2026 at 04:47:32PM -0500, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Report ext4's case sensitivity behavior via the FS_XFLAG_CASEFOLD
> flag. ext4 always preserves case at rest.
> 
> Case sensitivity is a per-directory setting in ext4. If the queried
> inode is a casefolded directory, report case-insensitive; otherwise
> report case-sensitive (standard POSIX behavior).
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Acked-by: Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply

* Re: [PATCH bpf-next v10 4/8] bpf: Add syscall common attributes support for prog_load
From: Andrii Nakryiko @ 2026-02-18 18:44 UTC (permalink / raw)
  To: Leon Hwang
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
	Andrey Albershteyn, Willem de Bruijn, Jason Xing, Tao Chen,
	Mykyta Yatsenko, Kumar Kartikeya Dwivedi, Anton Protopopov,
	Amery Hung, Rong Tao, linux-kernel, linux-api, linux-kselftest,
	kernel-patches-bot
In-Reply-To: <eb82cc40-e5c0-4f23-ad92-92633ccb2e0d@linux.dev>

On Wed, Feb 11, 2026 at 9:50 PM Leon Hwang <leon.hwang@linux.dev> wrote:
>
>
>
> On 12/2/26 06:08, Andrii Nakryiko wrote:
> > On Wed, Feb 11, 2026 at 7:13 AM Leon Hwang <leon.hwang@linux.dev> wrote:
> >>
>
> [...]
>
> >> diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
> >> index e31747b84fe2..a2b41bf5e9cb 100644
> >> --- a/kernel/bpf/log.c
> >> +++ b/kernel/bpf/log.c
> >> @@ -864,14 +864,43 @@ void print_insn_state(struct bpf_verifier_env *env, const struct bpf_verifier_st
> >>         print_verifier_state(env, vstate, frameno, false);
> >>  }
> >>
> >> +static bool bpf_log_attrs_set(u64 log_buf, u32 log_size, u32 log_level)
> >> +{
> >> +       return log_buf && log_size && log_level;
> >> +}
> >> +
> >> +static bool bpf_log_attrs_diff(struct bpf_common_attr *common, u64 log_buf, u32 log_size,
> >> +                              u32 log_level)
> >> +{
> >> +       return bpf_log_attrs_set(log_buf, log_size, log_level) &&
> >> +               bpf_log_attrs_set(common->log_buf, common->log_size, common->log_level) &&
> >> +               (log_buf != common->log_buf || log_size != common->log_size ||
> >> +                log_level != common->log_level);
> >> +}
> >> +
> >
> > I'm not sure this check is doing what we discussed previously?... If
> > log_buf is set, but log_size or log_level is zero, you'll just ignore
> > log_buf here...
> >
> > Maybe let's keep it super simple:
> >
> > if (log_buf && common->log_buf && log_buf != common->log_buf)
> >     return -EINVAL;
> > /* same for log_size, log_level, log_true_size */
> >
> > and then below just
> >
> > log->log_buf = u64_to_user_ptr(log_buf ?: common->log_buf);
> > log->log_size = log_size ?: common->log_size;
> >
> > and so on
> >
> >
> > We can be stricter than that, of course (as in, all triplets have to
> > be completely set in either/both common_attr and attr, and they should
> > completely match), but it's just more code for little benefit.
> >
>
> We cannot mix fields across the two sources. For example, using log_buf
> from attr together with common->log_size when log_size is zero would mix
> the configuration and make the effective log setup ambiguous.
>
> The intent is to align strictly with the semantics enforced by
> bpf_verifier_log_attr_valid():
>
> * log_buf and log_size must be specified together.
> * A non-NULL log_buf requires log_level != 0.
> * All values must pass basic sanity checks.
>
> Given that contract, we should:
>
> 1. Validate the log attributes from attr and common independently using
>    the same helper.
> 2. if both sides provide log buffers, require the tuples to match
>    exactly.
> 3. select either the attr tuple or the common tuple as a whole — never
>    mix fields across the two.
>
> The patch below implements this by reusing bpf_verifier_log_attr_valid()
> for both sources and resolving conflicts before selecting the effective
> log configuration.
> >
> >>  int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 log_level,
> >> -                     u32 __user *log_true_size)
> >> +                     u32 __user *log_true_size, struct bpf_common_attr *common, bpfptr_t uattr,
> >> +                     u32 size)
> >>  {
> >> +       if (bpf_log_attrs_diff(common, log_buf, log_size, log_level))
> >> +               return -EINVAL;
> >> +
> >>         memset(log, 0, sizeof(*log));
> >>         log->log_buf = u64_to_user_ptr(log_buf);
> >>         log->log_size = log_size;
> >>         log->log_level = log_level;
> >>         log->log_true_size = log_true_size;
> >> +
> >> +       if (!log_buf && common->log_buf) {
> >> +               log->log_buf = u64_to_user_ptr(common->log_buf);
> >> +               log->log_size = common->log_size;
> >> +               log->log_level = common->log_level;
> >> +               if (size >= offsetofend(struct bpf_common_attr, log_true_size))
> >> +                       log->log_true_size = uattr.user +
> >> +                               offsetof(struct bpf_common_attr, log_true_size);
> >> +               else
> >> +                       log->log_true_size = NULL;
> >
> > why not treat log_true_size same as log_buf/log_level/log_size? If
> > both are provided, they should match, and then we don't have a
> > possibility of inconsistency?
> >
> log_true_size is different from log_buf/log_size/log_level.
>
> It is not a regular attribute stored in either union bpf_attr or
> struct bpf_common_attr. Instead, it is a user pointer derived from
> uattr.user + offset.
>
> As a result, the computed log_true_size pointer for union bpf_attr
> and for struct bpf_common_attr will always differ, because they are
> based on different base user pointers (uattr.user vs
> uattr_common.user).
>
> So unlike the other log attributes, pointer equality is not a
> meaningful consistency check for log_true_size. The only sensible
> rule is that whichever side provides the effective log triplet also
> determines the write-back destination.

yeah, you are right, I forgot that log_true_size is not a pointer
itself, it's just a field in user-provided attrs. I'll check what you
did in v11, let's continue there.

>
> Thanks,
> Leon
>
> ---
>
> Based-on commit 19de32d4cb58 ("selftests/bpf: Migrate align.c tests to
> test_loader framework").
>
> From 32ec02c06d2abacbde17a45edbda46ef8a16fa2d Mon Sep 17 00:00:00 2001
> From: Leon Hwang <leon.hwang@linux.dev>
> Date: Wed, 11 Feb 2026 23:11:11 +0800
> Subject: [PATCH bpf-next v11 4/8] bpf: Add syscall common attributes support
>  for prog_load
>
> BPF_PROG_LOAD can now take log parameters from both union bpf_attr and
> struct bpf_common_attr. The merge rules are:
>
> - if both sides provide a complete log tuple (buf/size/level) and they
>   match, use it;
> - if only one side provides log parameters, use that one;
> - if both sides provide complete tuples but they differ, return -EINVAL.
>
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> ---
>  include/linux/bpf_verifier.h |  3 ++-
>  kernel/bpf/log.c             | 38 ++++++++++++++++++++++++++++--------
>  kernel/bpf/syscall.c         |  2 +-
>  3 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index dbd9bdb955b3..34f28d40022a 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -643,7 +643,8 @@ struct bpf_log_attr {
>  };
>
>  int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32
> log_size, u32 log_level,
> -                     u32 __user *log_true_size);
> +                     u32 __user *log_true_size, struct bpf_common_attr *common,
> bpfptr_t uattr,
> +                     u32 size);
>  int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct
> bpf_verifier_log *log);
>
>  #define BPF_MAX_SUBPROGS 256
> diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
> index e31747b84fe2..47bf496b673e 100644
> --- a/kernel/bpf/log.c
> +++ b/kernel/bpf/log.c
> @@ -13,17 +13,17 @@
>
>  #define verbose(env, fmt, args...) bpf_verifier_log_write(env, fmt, ##args)
>
> -static bool bpf_verifier_log_attr_valid(const struct bpf_verifier_log *log)
> +static bool bpf_verifier_log_attr_valid(u32 log_level, char __user
> *log_buf, u32 log_size)
>  {
>         /* ubuf and len_total should both be specified (or not) together */
> -       if (!!log->ubuf != !!log->len_total)
> +       if (!!log_buf != !!log_size)
>                 return false;
>         /* log buf without log_level is meaningless */
> -       if (log->ubuf && log->level == 0)
> +       if (log_buf && log_level == 0)
>                 return false;
> -       if (log->level & ~BPF_LOG_MASK)
> +       if (log_level & ~BPF_LOG_MASK)
>                 return false;
> -       if (log->len_total > UINT_MAX >> 2)
> +       if (log_size > UINT_MAX >> 2)
>                 return false;
>         return true;
>  }
> @@ -36,7 +36,7 @@ int bpf_vlog_init(struct bpf_verifier_log *log, u32
> log_level,
>         log->len_total = log_size;
>
>         /* log attributes have to be sane */
> -       if (!bpf_verifier_log_attr_valid(log))
> +       if (!bpf_verifier_log_attr_valid(log_level, log_buf, log_size))
>                 return -EINVAL;
>
>         return 0;
> @@ -865,13 +865,35 @@ void print_insn_state(struct bpf_verifier_env
> *env, const struct bpf_verifier_st
>  }
>
>  int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32
> log_size, u32 log_level,
> -                     u32 __user *log_true_size)
> +                     u32 __user *log_true_size, struct bpf_common_attr *common,
> bpfptr_t uattr,
> +                     u32 size)
>  {
> +       char __user *ubuf_common = u64_to_user_ptr(common->log_buf);
> +       char __user *ubuf = u64_to_user_ptr(log_buf);
> +
> +       if (!bpf_verifier_log_attr_valid(common->log_level, ubuf_common,
> common->log_size) ||
> +           !bpf_verifier_log_attr_valid(log_level, ubuf, log_size))
> +               return -EINVAL;
> +
> +       if (ubuf && ubuf_common && (ubuf != ubuf_common || log_size !=
> common->log_size ||
> +                                   log_level != common->log_level))
> +               return -EINVAL;
> +
>         memset(log, 0, sizeof(*log));
> -       log->log_buf = u64_to_user_ptr(log_buf);
> +       log->log_buf = ubuf;
>         log->log_size = log_size;
>         log->log_level = log_level;
>         log->log_true_size = log_true_size;
> +
> +       if (!ubuf && ubuf_common) {
> +               log->log_buf = ubuf_common;
> +               log->log_size = common->log_size;
> +               log->log_level = common->log_level;
> +               log->log_true_size = NULL;
> +               if (size >= offsetofend(struct bpf_common_attr, log_true_size))
> +                       log->log_true_size = uattr.user +
> +                               offsetof(struct bpf_common_attr, log_true_size);
> +       }
>         return 0;
>  }
>
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index e86674811996..17116603ff51 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -6247,7 +6247,7 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t
> uattr, unsigned int size,
>                 if (from_user && size >= offsetofend(union bpf_attr, log_true_size))
>                         log_true_size = uattr.user + offsetof(union bpf_attr, log_true_size);
>                 err = bpf_log_attr_init(&attr_log, attr.log_buf, attr.log_size,
> attr.log_level,
> -                                       log_true_size);
> +                                       log_true_size, &attr_common, uattr_common, size_common);
>                 err = err ?: bpf_prog_load(&attr, uattr, &attr_log);
>                 break;
>         case BPF_OBJ_PIN:
> --
> 2.52.0
>
>

^ permalink raw reply

* [PATCH v8 17/17] ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

ksmbd hard-codes FILE_CASE_SENSITIVE_SEARCH and
FILE_CASE_PRESERVED_NAMES in FS_ATTRIBUTE_INFORMATION responses,
incorrectly indicating all exports are case-sensitive. This breaks
clients accessing case-insensitive filesystems like exFAT or
ext4/f2fs directories with casefold enabled.

Query actual case behavior via vfs_fileattr_get() and report accurate
attributes to SMB clients. Filesystems without ->fileattr_get continue
reporting default POSIX behavior (case-sensitive, case-preserving).

SMB's FS_ATTRIBUTE_INFORMATION reports per-share attributes from the
share root, not per-file. Shares mixing casefold and non-casefold
directories report the root directory's behavior.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/smb/server/smb2pdu.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
index cbb31efdbaa2..d2a64afdd950 100644
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -13,6 +13,7 @@
 #include <linux/falloc.h>
 #include <linux/mount.h>
 #include <linux/filelock.h>
+#include <linux/fileattr.h>
 
 #include "glob.h"
 #include "smbfsctl.h"
@@ -5497,16 +5498,28 @@ static int smb2_get_info_filesystem(struct ksmbd_work *work,
 	case FS_ATTRIBUTE_INFORMATION:
 	{
 		FILE_SYSTEM_ATTRIBUTE_INFO *info;
+		struct file_kattr fa = {};
 		size_t sz;
+		u32 attrs;
+		int err;
 
 		info = (FILE_SYSTEM_ATTRIBUTE_INFO *)rsp->Buffer;
-		info->Attributes = cpu_to_le32(FILE_SUPPORTS_OBJECT_IDS |
-					       FILE_PERSISTENT_ACLS |
-					       FILE_UNICODE_ON_DISK |
-					       FILE_CASE_PRESERVED_NAMES |
-					       FILE_CASE_SENSITIVE_SEARCH |
-					       FILE_SUPPORTS_BLOCK_REFCOUNTING);
+		attrs = FILE_SUPPORTS_OBJECT_IDS |
+			FILE_PERSISTENT_ACLS |
+			FILE_UNICODE_ON_DISK |
+			FILE_SUPPORTS_BLOCK_REFCOUNTING;
 
+		err = vfs_fileattr_get(path.dentry, &fa);
+		if (err && err != -ENOIOCTLCMD) {
+			path_put(&path);
+			return err;
+		}
+		if (!(fa.fsx_xflags & FS_XFLAG_CASEFOLD))
+			attrs |= FILE_CASE_SENSITIVE_SEARCH;
+		if (!(fa.fsx_xflags & FS_XFLAG_CASENONPRESERVING))
+			attrs |= FILE_CASE_PRESERVED_NAMES;
+
+		info->Attributes = cpu_to_le32(attrs);
 		info->Attributes |= cpu_to_le32(server_conf.share_fake_fscaps);
 
 		if (test_share_config_flag(work->tcon->share_conf,
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 16/17] nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

NFSD currently provides NFSv4 clients with hard-coded responses
indicating all exported filesystems are case-sensitive and
case-preserving. This is incorrect for case-insensitive filesystems
and ext4 directories with casefold enabled.

Query the underlying filesystem's actual case sensitivity via
nfsd_get_case_info() and return accurate values to clients. This
supports per-directory settings for filesystems that allow mixing
case-sensitive and case-insensitive directories within an export.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4xdr.c | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 5172dbd0cb05..26f77b622612 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3157,6 +3157,8 @@ struct nfsd4_fattr_args {
 	u32			rdattr_err;
 	bool			contextsupport;
 	bool			ignore_crossmnt;
+	bool			case_insensitive;
+	bool			case_preserving;
 };
 
 typedef __be32(*nfsd4_enc_attr)(struct xdr_stream *xdr,
@@ -3355,6 +3357,18 @@ static __be32 nfsd4_encode_fattr4_acl(struct xdr_stream *xdr,
 	return nfs_ok;
 }
 
+static __be32 nfsd4_encode_fattr4_case_insensitive(struct xdr_stream *xdr,
+					const struct nfsd4_fattr_args *args)
+{
+	return nfsd4_encode_bool(xdr, args->case_insensitive);
+}
+
+static __be32 nfsd4_encode_fattr4_case_preserving(struct xdr_stream *xdr,
+					const struct nfsd4_fattr_args *args)
+{
+	return nfsd4_encode_bool(xdr, args->case_preserving);
+}
+
 static __be32 nfsd4_encode_fattr4_filehandle(struct xdr_stream *xdr,
 					     const struct nfsd4_fattr_args *args)
 {
@@ -3747,8 +3761,8 @@ static const nfsd4_enc_attr nfsd4_enc_fattr4_encode_ops[] = {
 	[FATTR4_ACLSUPPORT]		= nfsd4_encode_fattr4_aclsupport,
 	[FATTR4_ARCHIVE]		= nfsd4_encode_fattr4__noop,
 	[FATTR4_CANSETTIME]		= nfsd4_encode_fattr4__true,
-	[FATTR4_CASE_INSENSITIVE]	= nfsd4_encode_fattr4__false,
-	[FATTR4_CASE_PRESERVING]	= nfsd4_encode_fattr4__true,
+	[FATTR4_CASE_INSENSITIVE]	= nfsd4_encode_fattr4_case_insensitive,
+	[FATTR4_CASE_PRESERVING]	= nfsd4_encode_fattr4_case_preserving,
 	[FATTR4_CHOWN_RESTRICTED]	= nfsd4_encode_fattr4__true,
 	[FATTR4_FILEHANDLE]		= nfsd4_encode_fattr4_filehandle,
 	[FATTR4_FILEID]			= nfsd4_encode_fattr4_fileid,
@@ -3954,8 +3968,9 @@ nfsd4_encode_fattr4(struct svc_rqst *rqstp, struct xdr_stream *xdr,
 		if (err)
 			goto out_nfserr;
 	}
-	if ((attrmask[0] & (FATTR4_WORD0_FILEHANDLE | FATTR4_WORD0_FSID)) &&
-	    !fhp) {
+	if ((attrmask[0] & (FATTR4_WORD0_FILEHANDLE | FATTR4_WORD0_FSID |
+			    FATTR4_WORD0_CASE_INSENSITIVE |
+			    FATTR4_WORD0_CASE_PRESERVING)) && !fhp) {
 		tempfh = kmalloc(sizeof(struct svc_fh), GFP_KERNEL);
 		status = nfserr_jukebox;
 		if (!tempfh)
@@ -3967,6 +3982,14 @@ nfsd4_encode_fattr4(struct svc_rqst *rqstp, struct xdr_stream *xdr,
 		args.fhp = tempfh;
 	} else
 		args.fhp = fhp;
+	if (attrmask[0] & (FATTR4_WORD0_CASE_INSENSITIVE |
+			   FATTR4_WORD0_CASE_PRESERVING)) {
+		status = nfsd_get_case_info(args.fhp, &args.case_insensitive,
+					    &args.case_preserving);
+		if (status != nfs_ok)
+			attrmask[0] &= ~(FATTR4_WORD0_CASE_INSENSITIVE |
+					 FATTR4_WORD0_CASE_PRESERVING);
+	}
 
 	if (attrmask[0] & FATTR4_WORD0_ACL) {
 		err = nfsd4_get_nfs4_acl(rqstp, dentry, &args.acl);
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 15/17] nfsd: Report export case-folding via NFSv3 PATHCONF
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

The hard-coded MSDOS_SUPER_MAGIC check in nfsd3_proc_pathconf()
only recognizes FAT filesystems as case-insensitive. Modern
filesystems like F2FS, exFAT, and CIFS support case-insensitive
directories, but NFSv3 clients cannot discover this capability.

Query the export's actual case behavior through ->fileattr_get
instead. This allows NFSv3 clients to correctly handle case
sensitivity for any filesystem that implements the fileattr
interface. Filesystems without ->fileattr_get continue to report
the default POSIX behavior (case-sensitive, case-preserving).

This change assumes the ("fat: Implement fileattr_get for case
sensitivity") has been applied, which ensures FAT filesystems
report their case behavior correctly via the fileattr interface.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs3proc.c | 18 ++++++++++--------
 fs/nfsd/vfs.c      | 25 +++++++++++++++++++++++++
 fs/nfsd/vfs.h      |  2 ++
 3 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 42adc5461db0..9be0aca01de0 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -717,17 +717,19 @@ nfsd3_proc_pathconf(struct svc_rqst *rqstp)
 
 	if (resp->status == nfs_ok) {
 		struct super_block *sb = argp->fh.fh_dentry->d_sb;
+		bool case_insensitive, case_preserving;
 
-		/* Note that we don't care for remote fs's here */
-		switch (sb->s_magic) {
-		case EXT2_SUPER_MAGIC:
+		if (sb->s_magic == EXT2_SUPER_MAGIC) {
 			resp->p_link_max = EXT2_LINK_MAX;
 			resp->p_name_max = EXT2_NAME_LEN;
-			break;
-		case MSDOS_SUPER_MAGIC:
-			resp->p_case_insensitive = 1;
-			resp->p_case_preserving  = 0;
-			break;
+		}
+
+		resp->status = nfsd_get_case_info(&argp->fh,
+						  &case_insensitive,
+						  &case_preserving);
+		if (resp->status == nfs_ok) {
+			resp->p_case_insensitive = case_insensitive;
+			resp->p_case_preserving = case_preserving;
 		}
 	}
 
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index c884c3f34afb..75f862beb61f 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -32,6 +32,7 @@
 #include <linux/writeback.h>
 #include <linux/security.h>
 #include <linux/sunrpc/xdr.h>
+#include <linux/fileattr.h>
 
 #include "xdr3.h"
 
@@ -2891,3 +2892,27 @@ nfsd_permission(struct svc_cred *cred, struct svc_export *exp,
 
 	return err? nfserrno(err) : 0;
 }
+
+/**
+ * nfsd_get_case_info - get case sensitivity info for a file handle
+ * @fhp: file handle that has already been verified
+ * @case_insensitive: output, true if the filesystem is case-insensitive
+ * @case_preserving: output, true if the filesystem preserves case
+ *
+ * Returns nfs_ok on success, or an nfserr on failure.
+ */
+__be32
+nfsd_get_case_info(struct svc_fh *fhp, bool *case_insensitive,
+		   bool *case_preserving)
+{
+	struct file_kattr fa = {};
+	int err;
+
+	err = vfs_fileattr_get(fhp->fh_dentry, &fa);
+	if (err && err != -ENOIOCTLCMD)
+		return nfserrno(err);
+
+	*case_insensitive = fa.fsx_xflags & FS_XFLAG_CASEFOLD;
+	*case_preserving = !(fa.fsx_xflags & FS_XFLAG_CASENONPRESERVING);
+	return nfs_ok;
+}
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index 702a844f2106..9425f2667d95 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -156,6 +156,8 @@ __be32		nfsd_readdir(struct svc_rqst *, struct svc_fh *,
 			     loff_t *, struct readdir_cd *, nfsd_filldir_t);
 __be32		nfsd_statfs(struct svc_rqst *, struct svc_fh *,
 				struct kstatfs *, int access);
+__be32		nfsd_get_case_info(struct svc_fh *fhp, bool *case_insensitive,
+				   bool *case_preserving);
 
 __be32		nfsd_permission(struct svc_cred *cred, struct svc_export *exp,
 				struct dentry *dentry, int acc);
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 14/17] isofs: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Upper layers such as NFSD need a way to query whether a
filesystem handles filenames in a case-sensitive manner so
they can provide correct semantics to remote clients. Without
this information, NFS exports of ISO 9660 filesystems cannot
properly advertise their filename case behavior.

Implement isofs_fileattr_get() to report ISO 9660 case handling
behavior via the FS_XFLAG_CASEFOLD flag. The 'check=r' (relaxed)
mount option enables case-insensitive lookups, and this setting
determines the value reported. By default, Joliet extensions
operate in relaxed mode while plain ISO 9660 uses strict
(case-sensitive) mode. All ISO 9660 variants are case-preserving,
meaning filenames are stored exactly as they appear on the disc.

The callback is registered only on isofs_dir_inode_operations
because isofs has no custom inode_operations for regular
files, and symlinks use the generic page_symlink_inode_operations.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/isofs/dir.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/isofs/dir.c b/fs/isofs/dir.c
index 2ca16c3fe5ef..225124eefa46 100644
--- a/fs/isofs/dir.c
+++ b/fs/isofs/dir.c
@@ -14,6 +14,7 @@
 #include <linux/gfp.h>
 #include <linux/filelock.h>
 #include "isofs.h"
+#include <linux/fileattr.h>

 int isofs_name_translate(struct iso_directory_record *de, char *new, struct inode *inode)
 {
@@ -267,6 +268,19 @@ static int isofs_readdir(struct file *file, struct dir_context *ctx)
 	return result;
 }

+static int isofs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+	struct isofs_sb_info *sbi = ISOFS_SB(dentry->d_sb);
+
+	/*
+	 * FS_XFLAG_CASEFOLD indicates case-insensitive lookups.
+	 * When check=r (relaxed) is set, lookups ignore case.
+	 */
+	if (sbi->s_check == 'r')
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+	return 0;
+}
+
 const struct file_operations isofs_dir_operations =
 {
 	.llseek = generic_file_llseek,
@@ -281,6 +295,7 @@ const struct file_operations isofs_dir_operations =
 const struct inode_operations isofs_dir_inode_operations =
 {
 	.lookup = isofs_lookup,
+	.fileattr_get = isofs_fileattr_get,
 };

-- 
2.53.0

^ permalink raw reply related

* [PATCH v8 13/17] vboxsf: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Upper layers such as NFSD need a way to query whether a
filesystem handles filenames in a case-sensitive manner. Report
VirtualBox shared folder case handling behavior via the
FS_XFLAG_CASEFOLD flag.

The case sensitivity property is queried from the VirtualBox host
service at mount time and cached in struct vboxsf_sbi. The host
determines case sensitivity based on the underlying host filesystem
(for example, Windows NTFS is case-insensitive while Linux ext4 is
case-sensitive).

VirtualBox shared folders always preserve filename case exactly
as provided by the guest. The host interface does not expose a
case_preserving property, so this is hardcoded to true.

The callback is registered in all three inode_operations
structures (directory, file, and symlink) to ensure consistent
reporting across all inode types.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/vboxsf/dir.c    |  1 +
 fs/vboxsf/file.c   |  6 ++++--
 fs/vboxsf/super.c  |  4 ++++
 fs/vboxsf/utils.c  | 31 +++++++++++++++++++++++++++++++
 fs/vboxsf/vfsmod.h |  6 ++++++
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/fs/vboxsf/dir.c b/fs/vboxsf/dir.c
index 42bedc4ec7af..c5bd3271aa96 100644
--- a/fs/vboxsf/dir.c
+++ b/fs/vboxsf/dir.c
@@ -477,4 +477,5 @@ const struct inode_operations vboxsf_dir_iops = {
 	.symlink = vboxsf_dir_symlink,
 	.getattr = vboxsf_getattr,
 	.setattr = vboxsf_setattr,
+	.fileattr_get = vboxsf_fileattr_get,
 };
diff --git a/fs/vboxsf/file.c b/fs/vboxsf/file.c
index 111752010edb..30a4f4262928 100644
--- a/fs/vboxsf/file.c
+++ b/fs/vboxsf/file.c
@@ -222,7 +222,8 @@ const struct file_operations vboxsf_reg_fops = {
 
 const struct inode_operations vboxsf_reg_iops = {
 	.getattr = vboxsf_getattr,
-	.setattr = vboxsf_setattr
+	.setattr = vboxsf_setattr,
+	.fileattr_get = vboxsf_fileattr_get,
 };
 
 static int vboxsf_read_folio(struct file *file, struct folio *folio)
@@ -389,5 +390,6 @@ static const char *vboxsf_get_link(struct dentry *dentry, struct inode *inode,
 }
 
 const struct inode_operations vboxsf_lnk_iops = {
-	.get_link = vboxsf_get_link
+	.get_link = vboxsf_get_link,
+	.fileattr_get = vboxsf_fileattr_get,
 };
diff --git a/fs/vboxsf/super.c b/fs/vboxsf/super.c
index 241647b060ee..fcabeca2a339 100644
--- a/fs/vboxsf/super.c
+++ b/fs/vboxsf/super.c
@@ -185,6 +185,10 @@ static int vboxsf_fill_super(struct super_block *sb, struct fs_context *fc)
 	if (err)
 		goto fail_unmap;
 
+	err = vboxsf_query_case_sensitive(sbi);
+	if (err)
+		goto fail_unmap;
+
 	sb->s_magic = VBOXSF_SUPER_MAGIC;
 	sb->s_blocksize = 1024;
 	sb->s_maxbytes = MAX_LFS_FILESIZE;
diff --git a/fs/vboxsf/utils.c b/fs/vboxsf/utils.c
index 9515bbf0b54c..658b8b0ebbd7 100644
--- a/fs/vboxsf/utils.c
+++ b/fs/vboxsf/utils.c
@@ -11,6 +11,7 @@
 #include <linux/sizes.h>
 #include <linux/pagemap.h>
 #include <linux/vfs.h>
+#include <linux/fileattr.h>
 #include "vfsmod.h"
 
 struct inode *vboxsf_new_inode(struct super_block *sb)
@@ -567,3 +568,33 @@ int vboxsf_dir_read_all(struct vboxsf_sbi *sbi, struct vboxsf_dir_info *sf_d,
 
 	return err;
 }
+
+int vboxsf_query_case_sensitive(struct vboxsf_sbi *sbi)
+{
+	struct shfl_volinfo volinfo = {};
+	u32 buf_len;
+	int err;
+
+	buf_len = sizeof(volinfo);
+	err = vboxsf_fsinfo(sbi->root, 0, SHFL_INFO_GET | SHFL_INFO_VOLUME,
+			    &buf_len, &volinfo);
+	if (err)
+		return err;
+
+	sbi->case_insensitive = !volinfo.properties.case_sensitive;
+	return 0;
+}
+
+int vboxsf_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+	struct vboxsf_sbi *sbi = VBOXSF_SBI(dentry->d_sb);
+
+	/*
+	 * VirtualBox shared folders preserve filename case exactly as
+	 * provided by the guest (the default). The host interface does
+	 * not expose a case-preservation property.
+	 */
+	if (sbi->case_insensitive)
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+	return 0;
+}
diff --git a/fs/vboxsf/vfsmod.h b/fs/vboxsf/vfsmod.h
index 05973eb89d52..b61afd0ce842 100644
--- a/fs/vboxsf/vfsmod.h
+++ b/fs/vboxsf/vfsmod.h
@@ -47,6 +47,7 @@ struct vboxsf_sbi {
 	u32 next_generation;
 	u32 root;
 	int bdi_id;
+	bool case_insensitive;
 };
 
 /* per-inode information */
@@ -111,6 +112,11 @@ void vboxsf_dir_info_free(struct vboxsf_dir_info *p);
 int vboxsf_dir_read_all(struct vboxsf_sbi *sbi, struct vboxsf_dir_info *sf_d,
 			u64 handle);
 
+int vboxsf_query_case_sensitive(struct vboxsf_sbi *sbi);
+
+struct file_kattr;
+int vboxsf_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
+
 /* from vboxsf_wrappers.c */
 int vboxsf_connect(void);
 void vboxsf_disconnect(void);
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 12/17] f2fs: Add case sensitivity reporting to fileattr_get
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

NFS and other remote filesystem protocols need to determine
whether a local filesystem performs case-insensitive lookups
so they can provide correct semantics to clients. Without
this information, f2fs exports cannot properly advertise
their filename case behavior.

Report f2fs case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. Like ext4, f2fs supports per-directory case folding via
the casefold flag (IS_CASEFOLDED). Files are always case-preserving.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/f2fs/file.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index c8a2f17a8f11..f7f616e041dc 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3453,6 +3453,13 @@ int f2fs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
 	if (f2fs_sb_has_project_quota(F2FS_I_SB(inode)))
 		fa->fsx_projid = from_kprojid(&init_user_ns, fi->i_projid);
 
+	/*
+	 * f2fs preserves case (the default). If this inode is a
+	 * casefolded directory, report case-insensitive; otherwise
+	 * report case-sensitive (standard POSIX behavior).
+	 */
+	if (IS_CASEFOLDED(inode))
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
 	return 0;
 }
 
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 11/17] nfs: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

An NFS server re-exporting an NFS mount point needs to report the
case sensitivity behavior of the underlying filesystem to its
clients. Without this, re-export servers cannot accurately convey
case handling semantics, potentially causing client applications to
make incorrect assumptions about filename collisions and lookups.

The NFS client already retrieves case sensitivity information from
servers during mount via PATHCONF (NFSv3) or the
FATTR4_CASE_INSENSITIVE/FATTR4_CASE_PRESERVING attributes (NFSv4).
Expose this information through fileattr_get by reporting the
FS_XFLAG_CASEFOLD and FS_XFLAG_CASENONPRESERVING flags. NFSv2 lacks
PATHCONF support, so mounts using that protocol version default to
standard POSIX behavior: case-sensitive and case-preserving.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/client.c         |  9 +++++++--
 fs/nfs/inode.c          | 20 ++++++++++++++++++++
 fs/nfs/internal.h       |  3 +++
 fs/nfs/nfs3proc.c       |  2 ++
 fs/nfs/nfs3xdr.c        |  7 +++++--
 fs/nfs/nfs4proc.c       |  2 ++
 fs/nfs/proc.c           |  3 +++
 fs/nfs/symlink.c        |  3 +++
 include/linux/nfs_xdr.h |  2 ++
 9 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index fd15731cf361..0410f6053bfd 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -935,13 +935,18 @@ static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, str
 
 	/* Get some general file system info */
 	if (server->namelen == 0) {
-		struct nfs_pathconf pathinfo;
+		struct nfs_pathconf pathinfo = { };
 
 		pathinfo.fattr = fattr;
 		nfs_fattr_init(fattr);
 
-		if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0)
+		if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0) {
 			server->namelen = pathinfo.max_namelen;
+			if (pathinfo.case_insensitive)
+				server->caps |= NFS_CAP_CASE_INSENSITIVE;
+			if (pathinfo.case_preserving)
+				server->caps |= NFS_CAP_CASE_PRESERVING;
+		}
 	}
 
 	if (clp->rpc_ops->discover_trunking != NULL &&
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 331cdecdd966..8a2edce22a53 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -41,6 +41,7 @@
 #include <linux/freezer.h>
 #include <linux/uaccess.h>
 #include <linux/iversion.h>
+#include <linux/fileattr.h>
 
 #include "nfs4_fs.h"
 #include "callback.h"
@@ -1101,6 +1102,25 @@ int nfs_getattr(struct mnt_idmap *idmap, const struct path *path,
 }
 EXPORT_SYMBOL_GPL(nfs_getattr);
 
+/**
+ * nfs_fileattr_get - Retrieve file attributes
+ * @dentry: object to query
+ * @fa: file attributes to fill in
+ *
+ * Return: 0 on success
+ */
+int nfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+	struct inode *inode = d_inode(dentry);
+
+	if (nfs_server_capable(inode, NFS_CAP_CASE_INSENSITIVE))
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+	if (!nfs_server_capable(inode, NFS_CAP_CASE_PRESERVING))
+		fa->fsx_xflags |= FS_XFLAG_CASENONPRESERVING;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nfs_fileattr_get);
+
 static void nfs_init_lock_context(struct nfs_lock_context *l_ctx)
 {
 	refcount_set(&l_ctx->count, 1);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 63e09dfc27a8..e29695d4b7ed 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -449,6 +449,9 @@ extern void nfs_set_cache_invalid(struct inode *inode, unsigned long flags);
 extern bool nfs_check_cache_invalid(struct inode *, unsigned long);
 extern int nfs_wait_bit_killable(struct wait_bit_key *key, int mode);
 
+struct file_kattr;
+int nfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
+
 #if IS_ENABLED(CONFIG_NFS_LOCALIO)
 /* localio.c */
 struct nfs_local_dio {
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index d3d2fbeba89d..c30c8c6b1b9e 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -1047,6 +1047,7 @@ static const struct inode_operations nfs3_dir_inode_operations = {
 	.permission	= nfs_permission,
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
+	.fileattr_get	= nfs_fileattr_get,
 #ifdef CONFIG_NFS_V3_ACL
 	.listxattr	= nfs3_listxattr,
 	.get_inode_acl	= nfs3_get_acl,
@@ -1058,6 +1059,7 @@ static const struct inode_operations nfs3_file_inode_operations = {
 	.permission	= nfs_permission,
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
+	.fileattr_get	= nfs_fileattr_get,
 #ifdef CONFIG_NFS_V3_ACL
 	.listxattr	= nfs3_listxattr,
 	.get_inode_acl	= nfs3_get_acl,
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index e17d72908412..e745e78faab0 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -2276,8 +2276,11 @@ static int decode_pathconf3resok(struct xdr_stream *xdr,
 	if (unlikely(!p))
 		return -EIO;
 	result->max_link = be32_to_cpup(p++);
-	result->max_namelen = be32_to_cpup(p);
-	/* ignore remaining fields */
+	result->max_namelen = be32_to_cpup(p++);
+	p++;	/* ignore no_trunc */
+	p++;	/* ignore chown_restricted */
+	result->case_insensitive = be32_to_cpup(p++) != 0;
+	result->case_preserving = be32_to_cpup(p) != 0;
 	return 0;
 }
 
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 180229320731..c9ab542bdb90 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -10598,6 +10598,7 @@ static const struct inode_operations nfs4_dir_inode_operations = {
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
 	.listxattr	= nfs4_listxattr,
+	.fileattr_get	= nfs_fileattr_get,
 };
 
 static const struct inode_operations nfs4_file_inode_operations = {
@@ -10605,6 +10606,7 @@ static const struct inode_operations nfs4_file_inode_operations = {
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
 	.listxattr	= nfs4_listxattr,
+	.fileattr_get	= nfs_fileattr_get,
 };
 
 static struct nfs_server *nfs4_clone_server(struct nfs_server *source,
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index 0e440ebf5335..a92c0ea26ea0 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -597,6 +597,7 @@ nfs_proc_pathconf(struct nfs_server *server, struct nfs_fh *fhandle,
 {
 	info->max_link = 0;
 	info->max_namelen = NFS2_MAXNAMLEN;
+	info->case_preserving = true;
 	return 0;
 }
 
@@ -717,12 +718,14 @@ static const struct inode_operations nfs_dir_inode_operations = {
 	.permission	= nfs_permission,
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
+	.fileattr_get	= nfs_fileattr_get,
 };
 
 static const struct inode_operations nfs_file_inode_operations = {
 	.permission	= nfs_permission,
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
+	.fileattr_get	= nfs_fileattr_get,
 };
 
 const struct nfs_rpc_ops nfs_v2_clientops = {
diff --git a/fs/nfs/symlink.c b/fs/nfs/symlink.c
index 58146e935402..74a072896f8d 100644
--- a/fs/nfs/symlink.c
+++ b/fs/nfs/symlink.c
@@ -22,6 +22,8 @@
 #include <linux/mm.h>
 #include <linux/string.h>
 
+#include "internal.h"
+
 /* Symlink caching in the page cache is even more simplistic
  * and straight-forward than readdir caching.
  */
@@ -74,4 +76,5 @@ const struct inode_operations nfs_symlink_inode_operations = {
 	.get_link	= nfs_get_link,
 	.getattr	= nfs_getattr,
 	.setattr	= nfs_setattr,
+	.fileattr_get	= nfs_fileattr_get,
 };
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index ff1f12aa73d2..7c2057e40f99 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -182,6 +182,8 @@ struct nfs_pathconf {
 	struct nfs_fattr	*fattr; /* Post-op attributes */
 	__u32			max_link; /* max # of hard links */
 	__u32			max_namelen; /* max name length */
+	bool			case_insensitive;
+	bool			case_preserving;
 };
 
 struct nfs4_change_info {
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 10/17] cifs: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Upper layers such as NFSD need a way to query whether a filesystem
handles filenames in a case-sensitive manner. Report CIFS/SMB case
handling behavior via the FS_XFLAG_CASEFOLD flag.

CIFS servers (typically Windows or Samba) are usually case-insensitive
but case-preserving, meaning they ignore case during lookups but store
filenames exactly as provided.

The implementation reports case sensitivity based on the nocase mount
option, which reflects whether the client expects the server to perform
case-insensitive comparisons. When nocase is set, the mount is reported
as case-insensitive.

The callback is registered in all three inode_operations structures
(directory, file, and symlink) to ensure consistent reporting across
all inode types.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/smb/client/cifsfs.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index afda1d7c1ee1..9d85785a1c67 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -30,6 +30,7 @@
 #include <linux/xattr.h>
 #include <linux/mm.h>
 #include <linux/key-type.h>
+#include <linux/fileattr.h>
 #include <uapi/linux/magic.h>
 #include <net/ipv6.h>
 #include "cifsfs.h"
@@ -1189,6 +1190,22 @@ struct file_system_type smb3_fs_type = {
 MODULE_ALIAS_FS("smb3");
 MODULE_ALIAS("smb3");
 
+static int cifs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+	struct cifs_sb_info *cifs_sb = CIFS_SB(dentry->d_sb);
+	struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
+
+	/*
+	 * Case sensitivity is reported based on the nocase mount option.
+	 * CIFS servers typically perform case-insensitive lookups while
+	 * preserving case in stored filenames. The nocase option indicates
+	 * case-insensitive comparison is in effect for this mount.
+	 */
+	if (tcon->nocase)
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+	return 0;
+}
+
 const struct inode_operations cifs_dir_inode_ops = {
 	.create = cifs_create,
 	.atomic_open = cifs_atomic_open,
@@ -1206,6 +1223,7 @@ const struct inode_operations cifs_dir_inode_ops = {
 	.listxattr = cifs_listxattr,
 	.get_acl = cifs_get_acl,
 	.set_acl = cifs_set_acl,
+	.fileattr_get = cifs_fileattr_get,
 };
 
 const struct inode_operations cifs_file_inode_ops = {
@@ -1216,6 +1234,7 @@ const struct inode_operations cifs_file_inode_ops = {
 	.fiemap = cifs_fiemap,
 	.get_acl = cifs_get_acl,
 	.set_acl = cifs_set_acl,
+	.fileattr_get = cifs_fileattr_get,
 };
 
 const char *cifs_get_link(struct dentry *dentry, struct inode *inode,
@@ -1250,6 +1269,7 @@ const struct inode_operations cifs_symlink_inode_ops = {
 	.setattr = cifs_setattr,
 	.permission = cifs_permission,
 	.listxattr = cifs_listxattr,
+	.fileattr_get = cifs_fileattr_get,
 };
 
 /*
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 09/17] xfs: Report case sensitivity in fileattr_get
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
	Darrick J. Wong
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Upper layers such as NFSD need to query whether a filesystem is
case-sensitive. Report case sensitivity via the FS_XFLAG_CASEFOLD
flag in xfs_fileattr_get(). XFS always preserves case. XFS is
case-sensitive by default, but supports ASCII case-insensitive
lookups when formatted with the ASCIICI feature flag.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/xfs/xfs_ioctl.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 369555275140..41c6b4cd8ac2 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -518,6 +518,13 @@ xfs_fileattr_get(
 	xfs_fill_fsxattr(ip, XFS_DATA_FORK, fa);
 	xfs_iunlock(ip, XFS_ILOCK_SHARED);
 
+	/*
+	 * FS_XFLAG_CASEFOLD indicates case-insensitive lookups with
+	 * case preservation. This matches ASCIICI behavior: lookups
+	 * fold ASCII case while filenames remain stored verbatim.
+	 */
+	if (xfs_has_asciici(ip->i_mount))
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
 	return 0;
 }
 
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 08/17] ext4: Report case sensitivity in fileattr_get
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Report ext4's case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. ext4 always preserves case at rest.

Case sensitivity is a per-directory setting in ext4. If the queried
inode is a casefolded directory, report case-insensitive; otherwise
report case-sensitive (standard POSIX behavior).

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/ext4/ioctl.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 3ae9cb50a0c0..a93017b80aa0 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -999,6 +999,13 @@ int ext4_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
 	if (ext4_has_feature_project(inode->i_sb))
 		fa->fsx_projid = from_kprojid(&init_user_ns, ei->i_projid);
 
+	/*
+	 * Case folding is a directory attribute in ext4. Set FS_XFLAG_CASEFOLD
+	 * for directories with the casefold attribute; all other inodes use
+	 * standard case-sensitive semantics.
+	 */
+	if (IS_CASEFOLDED(inode))
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
 	return 0;
 }
 
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 07/17] hfsplus: Report case sensitivity in fileattr_get
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Add case sensitivity reporting to the existing hfsplus_fileattr_get()
function via the FS_XFLAG_CASEFOLD flag. HFS+ always preserves case
at rest.

Case sensitivity depends on how the volume was formatted: HFSX
volumes may be either case-sensitive or case-insensitive, indicated
by the HFSPLUS_SB_CASEFOLD superblock flag.

Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/hfsplus/inode.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index 922ff41df042..fd50f0d52c81 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -728,6 +728,7 @@ int hfsplus_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
 {
 	struct inode *inode = d_inode(dentry);
 	struct hfsplus_inode_info *hip = HFSPLUS_I(inode);
+	struct hfsplus_sb_info *sbi = HFSPLUS_SB(inode->i_sb);
 	unsigned int flags = 0;
 
 	if (inode->i_flags & S_IMMUTABLE)
@@ -739,6 +740,13 @@ int hfsplus_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
 
 	fileattr_fill_flags(fa, flags);
 
+	/*
+	 * HFS+ preserves case (the default). Case sensitivity depends
+	 * on how the filesystem was formatted: HFSX volumes may be
+	 * either case-sensitive or case-insensitive.
+	 */
+	if (test_bit(HFSPLUS_SB_CASEFOLD, &sbi->flags))
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
 	return 0;
 }
 
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 06/17] hfs: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Report HFS case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. HFS is always case-insensitive (using Mac OS Roman case
folding) and always preserves case at rest.

Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/hfs/dir.c    |  1 +
 fs/hfs/hfs_fs.h |  2 ++
 fs/hfs/inode.c  | 13 +++++++++++++
 3 files changed, 16 insertions(+)

diff --git a/fs/hfs/dir.c b/fs/hfs/dir.c
index 0c615c078650..f85ef69a8375 100644
--- a/fs/hfs/dir.c
+++ b/fs/hfs/dir.c
@@ -328,4 +328,5 @@ const struct inode_operations hfs_dir_inode_operations = {
 	.rmdir		= hfs_remove,
 	.rename		= hfs_rename,
 	.setattr	= hfs_inode_setattr,
+	.fileattr_get	= hfs_fileattr_get,
 };
diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
index ac0e83f77a0f..1b23448c9a48 100644
--- a/fs/hfs/hfs_fs.h
+++ b/fs/hfs/hfs_fs.h
@@ -177,6 +177,8 @@ extern int hfs_get_block(struct inode *inode, sector_t block,
 extern const struct address_space_operations hfs_aops;
 extern const struct address_space_operations hfs_btree_aops;
 
+struct file_kattr;
+int hfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
 int hfs_write_begin(const struct kiocb *iocb, struct address_space *mapping,
 		    loff_t pos, unsigned int len, struct folio **foliop,
 		    void **fsdata);
diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c
index 878535db64d6..b3e6e46ddd14 100644
--- a/fs/hfs/inode.c
+++ b/fs/hfs/inode.c
@@ -18,6 +18,7 @@
 #include <linux/uio.h>
 #include <linux/xattr.h>
 #include <linux/blkdev.h>
+#include <linux/fileattr.h>
 
 #include "hfs_fs.h"
 #include "btree.h"
@@ -716,6 +717,17 @@ static int hfs_file_fsync(struct file *filp, loff_t start, loff_t end,
 	return ret;
 }
 
+int hfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+	/*
+	 * Report case-insensitive behavior: all name comparisons use
+	 * Mac OS Roman case folding. FS_XFLAG_CASENONPRESERVING remains
+	 * unset because original case is preserved on disk.
+	 */
+	fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+	return 0;
+}
+
 static const struct file_operations hfs_file_operations = {
 	.llseek		= generic_file_llseek,
 	.read_iter	= generic_file_read_iter,
@@ -732,4 +744,5 @@ static const struct inode_operations hfs_file_inode_operations = {
 	.lookup		= hfs_file_lookup,
 	.setattr	= hfs_inode_setattr,
 	.listxattr	= generic_listxattr,
+	.fileattr_get	= hfs_fileattr_get,
 };
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 05/17] ntfs3: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Report NTFS case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. NTFS always preserves case at rest.

Case sensitivity depends on mount options: with "nocase", NTFS
is case-insensitive; otherwise it is case-sensitive.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/ntfs3/file.c    | 23 +++++++++++++++++++++++
 fs/ntfs3/inode.c   |  1 +
 fs/ntfs3/namei.c   |  2 ++
 fs/ntfs3/ntfs_fs.h |  1 +
 4 files changed, 27 insertions(+)

diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 6cb4479072a6..bf7e950ccdb8 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -147,6 +147,28 @@ long ntfs_compat_ioctl(struct file *filp, u32 cmd, unsigned long arg)
 }
 #endif
 
+/*
+ * ntfs_fileattr_get - inode_operations::fileattr_get
+ */
+int ntfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+	struct inode *inode = d_inode(dentry);
+	struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
+
+	/* Avoid any operation if inode is bad. */
+	if (unlikely(is_bad_ni(ntfs_i(inode))))
+		return -EINVAL;
+
+	/*
+	 * NTFS preserves case (the default). Case sensitivity depends on
+	 * mount options: with "nocase", NTFS is case-insensitive;
+	 * otherwise it is case-sensitive.
+	 */
+	if (sbi->options && sbi->options->nocase)
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+	return 0;
+}
+
 /*
  * ntfs_getattr - inode_operations::getattr
  */
@@ -1461,6 +1483,7 @@ const struct inode_operations ntfs_file_inode_operations = {
 	.get_acl	= ntfs_get_acl,
 	.set_acl	= ntfs_set_acl,
 	.fiemap		= ntfs_fiemap,
+	.fileattr_get	= ntfs_fileattr_get,
 };
 
 const struct file_operations ntfs_file_operations = {
diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index edfb973e4e82..a6d1489cc362 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -2088,6 +2088,7 @@ const struct inode_operations ntfs_link_inode_operations = {
 	.get_link	= ntfs_get_link,
 	.setattr	= ntfs_setattr,
 	.listxattr	= ntfs_listxattr,
+	.fileattr_get	= ntfs_fileattr_get,
 };
 
 const struct address_space_operations ntfs_aops = {
diff --git a/fs/ntfs3/namei.c b/fs/ntfs3/namei.c
index b2af8f695e60..eb241d7796ba 100644
--- a/fs/ntfs3/namei.c
+++ b/fs/ntfs3/namei.c
@@ -518,6 +518,7 @@ const struct inode_operations ntfs_dir_inode_operations = {
 	.getattr	= ntfs_getattr,
 	.listxattr	= ntfs_listxattr,
 	.fiemap		= ntfs_fiemap,
+	.fileattr_get	= ntfs_fileattr_get,
 };
 
 const struct inode_operations ntfs_special_inode_operations = {
@@ -526,6 +527,7 @@ const struct inode_operations ntfs_special_inode_operations = {
 	.listxattr	= ntfs_listxattr,
 	.get_acl	= ntfs_get_acl,
 	.set_acl	= ntfs_set_acl,
+	.fileattr_get	= ntfs_fileattr_get,
 };
 
 const struct dentry_operations ntfs_dentry_ops = {
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index f18349689458..94a23464c51f 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -505,6 +505,7 @@ extern const struct file_operations ntfs_dir_operations;
 extern const struct file_operations ntfs_legacy_dir_operations;
 
 /* Globals from file.c */
+int ntfs_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
 int ntfs_getattr(struct mnt_idmap *idmap, const struct path *path,
 		 struct kstat *stat, u32 request_mask, u32 flags);
 int ntfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 04/17] exfat: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Report exFAT's case sensitivity behavior via the FS_XFLAG_CASEFOLD
flag. exFAT is always case-insensitive (using an upcase table for
comparison) and always preserves case at rest.

Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/exfat/exfat_fs.h |  2 ++
 fs/exfat/file.c     | 16 ++++++++++++++--
 fs/exfat/namei.c    |  1 +
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/exfat/exfat_fs.h b/fs/exfat/exfat_fs.h
index 2dbed5f8ec26..686d4cd49546 100644
--- a/fs/exfat/exfat_fs.h
+++ b/fs/exfat/exfat_fs.h
@@ -468,6 +468,8 @@ int exfat_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 int exfat_getattr(struct mnt_idmap *idmap, const struct path *path,
 		  struct kstat *stat, unsigned int request_mask,
 		  unsigned int query_flags);
+struct file_kattr;
+int exfat_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
 int exfat_file_fsync(struct file *file, loff_t start, loff_t end, int datasync);
 long exfat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
 long exfat_compat_ioctl(struct file *filp, unsigned int cmd,
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index 90cd540afeaa..15629b0a6f6d 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -13,6 +13,7 @@
 #include <linux/msdos_fs.h>
 #include <linux/writeback.h>
 #include <linux/filelock.h>
+#include <linux/fileattr.h>
 
 #include "exfat_raw.h"
 #include "exfat_fs.h"
@@ -282,6 +283,16 @@ int exfat_getattr(struct mnt_idmap *idmap, const struct path *path,
 	return 0;
 }
 
+int exfat_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+	/*
+	 * exFAT is always case-insensitive (using upcase table).
+	 * Case is preserved at rest (the default).
+	 */
+	fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+	return 0;
+}
+
 int exfat_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 		  struct iattr *attr)
 {
@@ -775,6 +786,7 @@ const struct file_operations exfat_file_operations = {
 };
 
 const struct inode_operations exfat_file_inode_operations = {
-	.setattr     = exfat_setattr,
-	.getattr     = exfat_getattr,
+	.setattr	= exfat_setattr,
+	.getattr	= exfat_getattr,
+	.fileattr_get	= exfat_fileattr_get,
 };
diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c
index 670116ae9ec8..7895dda5cdb4 100644
--- a/fs/exfat/namei.c
+++ b/fs/exfat/namei.c
@@ -1323,4 +1323,5 @@ const struct inode_operations exfat_dir_inode_operations = {
 	.rename		= exfat_rename,
 	.setattr	= exfat_setattr,
 	.getattr	= exfat_getattr,
+	.fileattr_get	= exfat_fileattr_get,
 };
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 03/17] fat: Implement fileattr_get for case sensitivity
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Report FAT's case sensitivity behavior via the FS_XFLAG_CASEFOLD
and FS_XFLAG_CASENONPRESERVING flags. FAT filesystems are
case-insensitive by default.

MSDOS supports a 'nocase' mount option that enables case-sensitive
behavior; check this option when reporting case sensitivity.

VFAT long filename entries preserve case; without VFAT, only
uppercased 8.3 short names are stored. MSDOS with 'nocase' also
preserves case since the name-formatting code skips upcasing when
'nocase' is set. Check both options when reporting case preservation.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/fat/fat.h         |  3 +++
 fs/fat/file.c        | 22 ++++++++++++++++++++++
 fs/fat/namei_msdos.c |  1 +
 fs/fat/namei_vfat.c  |  1 +
 4 files changed, 27 insertions(+)

diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 0d269dba897b..c5bcd1063f9c 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -10,6 +10,8 @@
 #include <linux/fs_context.h>
 #include <linux/fs_parser.h>
 
+struct file_kattr;
+
 /*
  * vfat shortname flags
  */
@@ -407,6 +409,7 @@ extern void fat_truncate_blocks(struct inode *inode, loff_t offset);
 extern int fat_getattr(struct mnt_idmap *idmap,
 		       const struct path *path, struct kstat *stat,
 		       u32 request_mask, unsigned int flags);
+int fat_fileattr_get(struct dentry *dentry, struct file_kattr *fa);
 extern int fat_file_fsync(struct file *file, loff_t start, loff_t end,
 			  int datasync);
 
diff --git a/fs/fat/file.c b/fs/fat/file.c
index 124d9c5431c8..6823269a8604 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -17,6 +17,7 @@
 #include <linux/fsnotify.h>
 #include <linux/security.h>
 #include <linux/falloc.h>
+#include <linux/fileattr.h>
 #include "fat.h"
 
 static long fat_fallocate(struct file *file, int mode,
@@ -396,6 +397,26 @@ void fat_truncate_blocks(struct inode *inode, loff_t offset)
 	fat_flush_inodes(inode->i_sb, inode, NULL);
 }
 
+int fat_fileattr_get(struct dentry *dentry, struct file_kattr *fa)
+{
+	struct msdos_sb_info *sbi = MSDOS_SB(dentry->d_sb);
+
+	/*
+	 * FAT filesystems are case-insensitive by default. MSDOS
+	 * supports a 'nocase' mount option for case-sensitive behavior.
+	 *
+	 * VFAT long filename entries preserve case. Without VFAT, only
+	 * uppercased 8.3 short names are stored. MSDOS with 'nocase'
+	 * also preserves case.
+	 */
+	if (!sbi->options.nocase)
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
+	if (!sbi->options.isvfat && !sbi->options.nocase)
+		fa->fsx_xflags |= FS_XFLAG_CASENONPRESERVING;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(fat_fileattr_get);
+
 int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
 		struct kstat *stat, u32 request_mask, unsigned int flags)
 {
@@ -573,5 +594,6 @@ EXPORT_SYMBOL_GPL(fat_setattr);
 const struct inode_operations fat_file_inode_operations = {
 	.setattr	= fat_setattr,
 	.getattr	= fat_getattr,
+	.fileattr_get	= fat_fileattr_get,
 	.update_time	= fat_update_time,
 };
diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
index 048c103b506a..4a3db08e51c0 100644
--- a/fs/fat/namei_msdos.c
+++ b/fs/fat/namei_msdos.c
@@ -642,6 +642,7 @@ static const struct inode_operations msdos_dir_inode_operations = {
 	.rename		= msdos_rename,
 	.setattr	= fat_setattr,
 	.getattr	= fat_getattr,
+	.fileattr_get	= fat_fileattr_get,
 	.update_time	= fat_update_time,
 };
 
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index 2acfe3123a72..18f4c316aa05 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -1185,6 +1185,7 @@ static const struct inode_operations vfat_dir_inode_operations = {
 	.rename		= vfat_rename2,
 	.setattr	= fat_setattr,
 	.getattr	= fat_getattr,
+	.fileattr_get	= fat_fileattr_get,
 	.update_time	= fat_update_time,
 };
 
-- 
2.53.0


^ permalink raw reply related

* [PATCH v8 02/17] fs: Add case sensitivity flags to file_kattr
From: Chuck Lever @ 2026-02-17 21:47 UTC (permalink / raw)
  To: Al Viro, Christian Brauner, Jan Kara
  Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
	linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
	yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
	tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
	trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
	Darrick J. Wong
In-Reply-To: <20260217214741.1928576-1-cel@kernel.org>

From: Chuck Lever <chuck.lever@oracle.com>

Enable upper layers such as NFSD to retrieve case sensitivity
information from file systems by adding FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING flags.

Filesystems report case-insensitive or case-nonpreserving behavior
by setting these flags directly in fa->fsx_xflags. The default
(flags unset) indicates POSIX semantics: case-sensitive and
case-preserving. These flags are read-only; userspace cannot set
them via ioctl.

Case sensitivity information is exported to userspace via the
fa_xflags field in the FS_IOC_FSGETXATTR ioctl and file_getattr()
system call.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/file_attr.c           | 4 ++++
 include/linux/fileattr.h | 3 ++-
 include/uapi/linux/fs.h  | 7 +++++++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/file_attr.c b/fs/file_attr.c
index 42aa511111a0..5d9a7ed159fb 100644
--- a/fs/file_attr.c
+++ b/fs/file_attr.c
@@ -37,6 +37,8 @@ void fileattr_fill_xflags(struct file_kattr *fa, u32 xflags)
 		fa->flags |= FS_PROJINHERIT_FL;
 	if (fa->fsx_xflags & FS_XFLAG_VERITY)
 		fa->flags |= FS_VERITY_FL;
+	if (fa->fsx_xflags & FS_XFLAG_CASEFOLD)
+		fa->flags |= FS_CASEFOLD_FL;
 }
 EXPORT_SYMBOL(fileattr_fill_xflags);
 
@@ -67,6 +69,8 @@ void fileattr_fill_flags(struct file_kattr *fa, u32 flags)
 		fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
 	if (fa->flags & FS_VERITY_FL)
 		fa->fsx_xflags |= FS_XFLAG_VERITY;
+	if (fa->flags & FS_CASEFOLD_FL)
+		fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
 }
 EXPORT_SYMBOL(fileattr_fill_flags);
 
diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
index 3780904a63a6..58044b598016 100644
--- a/include/linux/fileattr.h
+++ b/include/linux/fileattr.h
@@ -16,7 +16,8 @@
 
 /* Read-only inode flags */
 #define FS_XFLAG_RDONLY_MASK \
-	(FS_XFLAG_PREALLOC | FS_XFLAG_HASATTR | FS_XFLAG_VERITY)
+	(FS_XFLAG_PREALLOC | FS_XFLAG_HASATTR | FS_XFLAG_VERITY | \
+	 FS_XFLAG_CASEFOLD | FS_XFLAG_CASENONPRESERVING)
 
 /* Flags to indicate valid value of fsx_ fields */
 #define FS_XFLAG_VALUES_MASK \
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 70b2b661f42c..2fa003575e8b 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -254,6 +254,13 @@ struct file_attr {
 #define FS_XFLAG_DAX		0x00008000	/* use DAX for IO */
 #define FS_XFLAG_COWEXTSIZE	0x00010000	/* CoW extent size allocator hint */
 #define FS_XFLAG_VERITY		0x00020000	/* fs-verity enabled */
+/*
+ * Case handling flags (read-only, cannot be set via ioctl).
+ * Default (neither set) indicates POSIX semantics: case-sensitive
+ * lookups and case-preserving storage.
+ */
+#define FS_XFLAG_CASEFOLD	0x00040000	/* case-insensitive lookups */
+#define FS_XFLAG_CASENONPRESERVING 0x00080000	/* case not preserved */
 #define FS_XFLAG_HASATTR	0x80000000	/* no DIFLAG for this	*/
 
 /* the read-only stuff doesn't really belong here, but any other place is
-- 
2.53.0


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox