* [patch 0/4] uaccess: Provide and use helpers for user masked access
@ 2025-08-13 15:57 Thomas Gleixner
2025-08-13 15:57 ` [patch 1/4] uaccess: Provide common helpers for masked user access Thomas Gleixner
` (4 more replies)
0 siblings, 5 replies; 23+ messages in thread
From: Thomas Gleixner @ 2025-08-13 15:57 UTC (permalink / raw)
To: LKML
Cc: Linus Torvalds, Mathieu Desnoyers, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, André Almeida, x86, Alexander Viro,
Christian Brauner, Jan Kara, linux-fsdevel
commit 2865baf54077 ("x86: support user address masking instead of
non-speculative conditional") provided an optimization for
unsafe_get/put_user(), which optimizes the Spectre-V1 mitigation in an
architecture specific way. Currently only x86_64 supports that.
The required code pattern screams for helper functions before it is copied
all over the kernel. So far the exposure is limited to futex, x86 and
fs/select.
Provide a set of helpers for common single size access patterns:
- get/put_user_masked_uNN() where $NN is the variable size in
bits, i.e. 8, 16, 32, 64.
- user_read/write_masked_begin() for scenarios, where several
unsafe_put/get_user() invocations are required.
Convert the existing users over to this.
The series applies on top of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/urgent
which has a bug fix pending for the futex implementation of this.
It's also available from git:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git uaccess/masked
There is another candidate for conversion: RSEQ. It has not been
converted yet because there are more severe performance issues with that
code, which have started to become prominent in exit path profiling because
glibc started to use it. This will be addressed in a separate series.
As this creates dependencies, merging it should go through one central tree,
i.e. tip, but that's a debate for later :)
Thanks,
tglx
---
arch/x86/include/asm/futex.h | 12 ++----
fs/select.c | 4 --
include/linux/uaccess.h | 78 +++++++++++++++++++++++++++++++++++++++++++
kernel/futex/futex.h | 37 +-------------------
4 files changed, 85 insertions(+), 46 deletions(-)
^ permalink raw reply [flat|nested] 23+ messages in thread
* [patch 1/4] uaccess: Provide common helpers for masked user access
2025-08-13 15:57 [patch 0/4] uaccess: Provide and use helpers for user masked access Thomas Gleixner
@ 2025-08-13 15:57 ` Thomas Gleixner
2025-08-26 7:04 ` Christophe Leroy
2025-08-13 15:57 ` [patch 2/4] futex: Convert to get/put_user_masked_u32() Thomas Gleixner
` (3 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2025-08-13 15:57 UTC (permalink / raw)
To: LKML
Cc: Linus Torvalds, Mathieu Desnoyers, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, André Almeida, x86, Alexander Viro,
Christian Brauner, Jan Kara, linux-fsdevel
commit 2865baf54077 ("x86: support user address masking instead of
non-speculative conditional") provided an optimization for
unsafe_get/put_user(), which optimizes the Spectre-V1 mitigation in an
architecture specific way. Currently only x86_64 supports that.
The required code pattern is:
	if (can_do_masked_user_access())
		dst = masked_user_access_begin(dst);
	else if (!user_write_access_begin(dst, sizeof(*dst)))
		return -EFAULT;

	unsafe_put_user(val, dst, Efault);
	user_write_access_end();
	return 0;

Efault:
	user_write_access_end();
	return -EFAULT;
The futex code already grew an instance of that, and there are other areas
which can be optimized when the calling code has already verified that the
user pointer is both aligned and actually in user space.
Use the futex example and provide generic helper inlines for that to avoid
having tons of copies all over the tree.
This provides get/put_user_masked_uNN() where $NN is the variable size in
bits, i.e. 8, 16, 32, 64.
The second set of helpers encapsulates the prologue for larger access
patterns, e.g. multiple consecutive unsafe_put/get_user() scenarios:
	if (can_do_masked_user_access())
		dst = masked_user_access_begin(dst);
	else if (!user_write_access_begin(dst, sizeof(*dst)))
		return -EFAULT;

	unsafe_put_user(a, &dst->a, Efault);
	unsafe_put_user(b, &dst->b, Efault);
	user_write_access_end();
	return 0;

Efault:
	user_write_access_end();
	return -EFAULT;
which allows shortening this to:

	if (!user_write_masked_begin(dst))
		return -EFAULT;

	unsafe_put_user(a, &dst->a, Efault);
	...
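For illustration, the shortened form can be compiled as a standalone sketch
outside the kernel, with every kernel primitive replaced by a trivial stub.
The stubs, the struct and put_pair() below are hypothetical and do not
reflect the real implementations; only the control flow of the pattern
carries over:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stubbed stand-ins for the kernel primitives (illustration only). */
static bool can_do_masked_user_access(void) { return false; }
#define masked_user_access_begin(p)	(p)
static bool user_write_access_begin(const void *p, unsigned long len)
{
	(void)len;
	return p != NULL;	/* stand-in for the real access check */
}
static void user_write_access_end(void) { }
#define unsafe_put_user(val, ptr, label)			\
	do { if (!(ptr)) goto label; *(ptr) = (val); } while (0)

#define user_write_masked_begin(dst) ({				\
	bool __ret = true;					\
								\
	if (can_do_masked_user_access())			\
		dst = masked_user_access_begin(dst);		\
	else if (!user_write_access_begin(dst, sizeof(*dst)))	\
		__ret = false;					\
	__ret;							\
})

struct pair { int a; int b; };

/* The pattern from the changelog, verbatim apart from the stubs. */
static int put_pair(struct pair *dst, int a, int b)
{
	if (!user_write_masked_begin(dst))
		return -EFAULT;

	unsafe_put_user(a, &dst->a, Efault);
	unsafe_put_user(b, &dst->b, Efault);
	user_write_access_end();
	return 0;

Efault:
	user_write_access_end();
	return -EFAULT;
}
```

Note that the macro relies on GNU statement expressions and silently
reassigns 'dst', which is exactly the point debated further down the thread.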
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
include/linux/uaccess.h | 78 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 78 insertions(+)
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -569,6 +569,84 @@ static inline void user_access_restore(u
#define user_read_access_end user_access_end
#endif
+/*
+ * Conveniance macros to avoid spreading this pattern all over the place
+ */
+#define user_read_masked_begin(src) ({ \
+ bool __ret = true; \
+ \
+ if (can_do_masked_user_access()) \
+ src = masked_user_access_begin(src); \
+ else if (!user_read_access_begin(src, sizeof(*src))) \
+ __ret = false; \
+ __ret; \
+})
+
+#define user_write_masked_begin(dst) ({ \
+ bool __ret = true; \
+ \
+ if (can_do_masked_user_access()) \
+ dst = masked_user_access_begin(dst); \
+ else if (!user_write_access_begin(dst, sizeof(*dst))) \
+ __ret = false; \
+ __ret; \
+})
+
+/*
+ * get_user_masked_uNN() and put_user_masked_uNN() where NN is the size of
+ * the variable in bits. Supported values are 8, 16, 32 and 64.
+ *
+ * These functions can be used to optimize __get_user() and __put_user()
+ * scenarios, if the architecture supports masked user access. This avoids
+ * the more costly speculation barriers. If the architecture does not
+ * support it, it falls back to user_*_access_begin().
+ *
+ * As with __get/put_user() the user pointer has to be verified by the
+ * caller to be actually in user space.
+ */
+#define GEN_GET_USER_MASKED(type) \
+ static __always_inline \
+ int get_user_masked_##type (type *dst, type __user *src) \
+ { \
+ type val; \
+ \
+ if (!user_read_masked_begin(src)) \
+ return -EFAULT; \
+ unsafe_get_user(val, src, Efault); \
+ user_read_access_end(); \
+ *dst = val; \
+ return 0; \
+ Efault: \
+ user_read_access_end(); \
+ return -EFAULT; \
+ }
+
+GEN_GET_USER_MASKED(u8)
+GEN_GET_USER_MASKED(u16)
+GEN_GET_USER_MASKED(u32)
+GEN_GET_USER_MASKED(u64)
+#undef GEN_GET_USER_MASKED
+
+#define GEN_PUT_USER_MASKED(type) \
+ static __always_inline \
+ int put_user_masked_##type (type val, type __user *dst) \
+ { \
+ if (!user_write_masked_begin(dst)) \
+ return -EFAULT; \
+ unsafe_put_user(val, dst, Efault); \
+ user_write_access_end(); \
+ return 0; \
+ Efault: \
+ user_write_access_end(); \
+ return -EFAULT; \
+ }
+
+GEN_PUT_USER_MASKED(u8)
+GEN_PUT_USER_MASKED(u16)
+GEN_PUT_USER_MASKED(u32)
+GEN_PUT_USER_MASKED(u64)
+#undef GEN_PUT_USER_MASKED
+
#ifdef CONFIG_HARDENED_USERCOPY
void __noreturn usercopy_abort(const char *name, const char *detail,
bool to_user, unsigned long offset,
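The GEN_*_USER_MASKED() token-pasting generators above can be modelled
outside the kernel. In this hypothetical sketch the "user access" is a
plain memory read, so only the generation technique (one macro body
stamping out a type-safe helper per width) carries over, not the masking:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

/* Simplified model of GEN_GET_USER_MASKED(): the macro argument is
 * pasted into the function name, yielding get_model_u8() .. u64(). */
#define GEN_GET_MODEL(type)						\
	static inline int get_model_##type(type *dst, const type *src)	\
	{								\
		if (!src)						\
			return -EFAULT;					\
		*dst = *src;	/* models unsafe_get_user() */		\
		return 0;						\
	}

GEN_GET_MODEL(u8)
GEN_GET_MODEL(u16)
GEN_GET_MODEL(u32)
GEN_GET_MODEL(u64)
#undef GEN_GET_MODEL
```

Because each generated helper takes 'type *', passing a mismatched width
pointer is a compile-time diagnostic, which a single void-pointer helper
would not give you.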
* [patch 2/4] futex: Convert to get/put_user_masked_u32()
2025-08-13 15:57 [patch 0/4] uaccess: Provide and use helpers for user masked access Thomas Gleixner
2025-08-13 15:57 ` [patch 1/4] uaccess: Provide common helpers for masked user access Thomas Gleixner
@ 2025-08-13 15:57 ` Thomas Gleixner
2025-08-13 15:57 ` [patch 3/4] x86/futex: Use user_*_masked_begin() Thomas Gleixner
` (2 subsequent siblings)
4 siblings, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2025-08-13 15:57 UTC (permalink / raw)
To: LKML
Cc: Linus Torvalds, Mathieu Desnoyers, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, André Almeida, x86, Alexander Viro,
Christian Brauner, Jan Kara, linux-fsdevel
Use the generic variants for futex_put/get_value() instead of the open
coded implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "André Almeida" <andrealmeid@igalia.com>
---
kernel/futex/futex.h | 37 ++-----------------------------------
1 file changed, 2 insertions(+), 35 deletions(-)
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -285,48 +285,15 @@ static inline int futex_cmpxchg_value_lo
* This does a plain atomic user space read, and the user pointer has
* already been verified earlier by get_futex_key() to be both aligned
* and actually in user space, just like futex_atomic_cmpxchg_inatomic().
- *
- * We still want to avoid any speculation, and while __get_user() is
- * the traditional model for this, it's actually slower than doing
- * this manually these days.
- *
- * We could just have a per-architecture special function for it,
- * the same way we do futex_atomic_cmpxchg_inatomic(), but rather
- * than force everybody to do that, write it out long-hand using
- * the low-level user-access infrastructure.
- *
- * This looks a bit overkill, but generally just results in a couple
- * of instructions.
*/
static __always_inline int futex_get_value(u32 *dest, u32 __user *from)
{
- u32 val;
-
- if (can_do_masked_user_access())
- from = masked_user_access_begin(from);
- else if (!user_read_access_begin(from, sizeof(*from)))
- return -EFAULT;
- unsafe_get_user(val, from, Efault);
- user_read_access_end();
- *dest = val;
- return 0;
-Efault:
- user_read_access_end();
- return -EFAULT;
+ return get_user_masked_u32(dest, from);
}
static __always_inline int futex_put_value(u32 val, u32 __user *to)
{
- if (can_do_masked_user_access())
- to = masked_user_access_begin(to);
- else if (!user_write_access_begin(to, sizeof(*to)))
- return -EFAULT;
- unsafe_put_user(val, to, Efault);
- user_write_access_end();
- return 0;
-Efault:
- user_write_access_end();
- return -EFAULT;
+ return put_user_masked_u32(val, to);
}
static inline int futex_get_value_locked(u32 *dest, u32 __user *from)
* [patch 3/4] x86/futex: Use user_*_masked_begin()
2025-08-13 15:57 [patch 0/4] uaccess: Provide and use helpers for user masked access Thomas Gleixner
2025-08-13 15:57 ` [patch 1/4] uaccess: Provide common helpers for masked user access Thomas Gleixner
2025-08-13 15:57 ` [patch 2/4] futex: Convert to get/put_user_masked_u32() Thomas Gleixner
@ 2025-08-13 15:57 ` Thomas Gleixner
2025-08-26 7:09 ` Christophe Leroy
2025-08-13 15:57 ` [patch 4/4] select: Use user_read_masked_begin() Thomas Gleixner
2025-08-17 13:49 ` [patch 0/4] uaccess: Provide and use helpers for user masked access David Laight
4 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2025-08-13 15:57 UTC (permalink / raw)
To: LKML
Cc: Linus Torvalds, Mathieu Desnoyers, x86, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, Alexander Viro,
Christian Brauner, Jan Kara, linux-fsdevel
Replace the can_do_masked_user_access() conditional with the generic macro.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
---
arch/x86/include/asm/futex.h | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -48,9 +48,7 @@ do { \
static __always_inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
u32 __user *uaddr)
{
- if (can_do_masked_user_access())
- uaddr = masked_user_access_begin(uaddr);
- else if (!user_access_begin(uaddr, sizeof(u32)))
+ if (!user_write_masked_begin(uaddr))
return -EFAULT;
switch (op) {
@@ -74,7 +72,7 @@ static __always_inline int arch_futex_at
user_access_end();
return -ENOSYS;
}
- user_access_end();
+ user_write_access_end();
return 0;
Efault:
user_access_end();
@@ -86,9 +84,7 @@ static inline int futex_atomic_cmpxchg_i
{
int ret = 0;
- if (can_do_masked_user_access())
- uaddr = masked_user_access_begin(uaddr);
- else if (!user_access_begin(uaddr, sizeof(u32)))
+ if (!user_write_masked_begin(uaddr))
return -EFAULT;
asm volatile("\n"
"1:\t" LOCK_PREFIX "cmpxchgl %3, %2\n"
@@ -98,7 +94,7 @@ static inline int futex_atomic_cmpxchg_i
: "r" (newval), "1" (oldval)
: "memory"
);
- user_access_end();
+ user_write_access_end();
*uval = oldval;
return ret;
}
* [patch 4/4] select: Use user_read_masked_begin()
2025-08-13 15:57 [patch 0/4] uaccess: Provide and use helpers for user masked access Thomas Gleixner
` (2 preceding siblings ...)
2025-08-13 15:57 ` [patch 3/4] x86/futex: Use user_*_masked_begin() Thomas Gleixner
@ 2025-08-13 15:57 ` Thomas Gleixner
2025-08-17 13:49 ` [patch 0/4] uaccess: Provide and use helpers for user masked access David Laight
4 siblings, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2025-08-13 15:57 UTC (permalink / raw)
To: LKML
Cc: Linus Torvalds, Mathieu Desnoyers, Alexander Viro,
Christian Brauner, Jan Kara, linux-fsdevel, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86
Replace the can_do_masked_user_access() conditional with the generic macro.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org
---
fs/select.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/select.c
+++ b/fs/select.c
@@ -776,9 +776,7 @@ static inline int get_sigset_argpack(str
{
// the path is hot enough for overhead of copy_from_user() to matter
if (from) {
- if (can_do_masked_user_access())
- from = masked_user_access_begin(from);
- else if (!user_read_access_begin(from, sizeof(*from)))
+ if (!user_read_masked_begin(from))
return -EFAULT;
unsafe_get_user(to->p, &from->p, Efault);
unsafe_get_user(to->size, &from->size, Efault);
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-13 15:57 [patch 0/4] uaccess: Provide and use helpers for user masked access Thomas Gleixner
` (3 preceding siblings ...)
2025-08-13 15:57 ` [patch 4/4] select: Use user_read_masked_begin() Thomas Gleixner
@ 2025-08-17 13:49 ` David Laight
2025-08-17 14:00 ` Linus Torvalds
2025-08-18 21:21 ` David Laight
4 siblings, 2 replies; 23+ messages in thread
From: David Laight @ 2025-08-17 13:49 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Linus Torvalds, Mathieu Desnoyers, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86,
Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Wed, 13 Aug 2025 17:57:00 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:
> commit 2865baf54077 ("x86: support user address masking instead of
> non-speculative conditional") provided an optimization for
> unsafe_get/put_user(), which optimizes the Spectre-V1 mitigation in an
> architecture specific way. Currently only x86_64 supports that.
>
> The required code pattern screams for helper functions before it is copied
> all over the kernel. So far the exposure is limited to futex, x86 and
> fs/select.
>
> Provide a set of helpers for common single size access patterns:
(gmail hasn't decided to accept 1/4 yet - I need to find a better
mail relay...)
+/*
+ * Conveniance macros to avoid spreading this pattern all over the place
^ spelling...
+ */
+#define user_read_masked_begin(src) ({ \
+ bool __ret = true; \
+ \
+ if (can_do_masked_user_access()) \
+ src = masked_user_access_begin(src); \
+ else if (!user_read_access_begin(src, sizeof(*src))) \
+ __ret = false; \
+ __ret; \
+})
I proposed something very similar a while back.
Since it updates 'src' it really ought to be passed by address.
For the general case you also need a parameter for the size.
Linus didn't like it, but I've forgotten why.
I'm also not convinced of the name.
There isn't any 'masking' involved, so it shouldn't be propagated.
There is also an implementation issue.
The original masked_user_access_begin() returned ~0 for kernel addresses.
That requires that the code always access offset zero first.
I looked up some candidates for this code and found one (possibly epoll)
that did the accesses in the wrong order.
The current x86-64 'cmp+cmov' version returns the base of the guard page,
so is safe provided the accesses are 'reasonably sequential'.
That probably ought to be a requirement.
David
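[The cmp+cmov behaviour referred to above can be sketched as a standalone
model. The constants and the function name below are purely illustrative,
not the kernel's actual values; the point is that any out-of-range address
is redirected to the base of a guard region, so accesses at small,
"reasonably sequential" offsets stay inside that region:]

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for TASK_SIZE_MAX and the guard region base. */
#define MODEL_TASK_SIZE		0x00007ffffffff000ULL
#define MODEL_GUARD_BASE	MODEL_TASK_SIZE

/* Compiles to cmp + cmov on x86-64: no conditional branch, hence no
 * Spectre-V1 speculation window to mitigate with a barrier. */
static uint64_t model_mask_user_address(uint64_t addr)
{
	return addr <= MODEL_TASK_SIZE ? addr : MODEL_GUARD_BASE;
}
```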
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-17 13:49 ` [patch 0/4] uaccess: Provide and use helpers for user masked access David Laight
@ 2025-08-17 14:00 ` Linus Torvalds
2025-08-17 15:29 ` David Laight
2025-08-18 21:21 ` David Laight
1 sibling, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2025-08-17 14:00 UTC (permalink / raw)
To: David Laight
Cc: Thomas Gleixner, LKML, Mathieu Desnoyers, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86,
Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Sun, 17 Aug 2025 at 06:50, David Laight <david.laight.linux@gmail.com> wrote:
>
> Linus didn't like it, but I've forgotten why.
I think the reason I didn't love it is that it has somewhat subtle
semantics, and I think you just proved my point:
> I'm also not convinced of the name.
> There isn't any 'masking' involved, so it shouldn't be propagated.
Sure there is. Look closer at that patch:
+ if (can_do_masked_user_access()) \
+ src = masked_user_access_begin(src); \
IOW, that macro changes the argument and masks it.
So it's actually really easy to use, but it's also really easy to miss
that it does that.
We've done this before, and I have done it myself. The classic example
is the whole "do_div()" macro that everybody hated because it did
exactly the same thing (we also have "save_flags()" etc that have this
behavior).
So I don't love it - but I can't claim I've not done the same thing,
and honestly, it does make it very easy to use, so when Thomas sent
this series out I didn't speak out against it.
Linus
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-17 14:00 ` Linus Torvalds
@ 2025-08-17 15:29 ` David Laight
2025-08-17 15:36 ` Linus Torvalds
0 siblings, 1 reply; 23+ messages in thread
From: David Laight @ 2025-08-17 15:29 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, LKML, Mathieu Desnoyers, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86,
Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Sun, 17 Aug 2025 07:00:17 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Sun, 17 Aug 2025 at 06:50, David Laight <david.laight.linux@gmail.com> wrote:
> >
> > Linus didn't like it, but I've forgotten why.
>
> I think the reason I didn't love it is that it has a bit subtle
> semantics, and I think you just proved my point:
Just requiring the caller pass &user_ptr would make it more obvious.
The generated code (with 'src' -> *&src) will be the same.
>
> > I'm also not convinced of the name.
> > There isn't any 'masking' involved, so it shouldn't be propagated.
>
> Sure there is. Look closer at that patch:
>
> + if (can_do_masked_user_access()) \
> + src = masked_user_access_begin(src); \
>
> IOW, that macro changes the argument and masks it.
Except the change has never been a 'mask' in the traditional sense.
Neither the original cmp+sbb+or nor the current cmp+cmov is really applying a mask.
I think the 'guard page' might even be the highest user page, so it isn't
even the case that kernel addresses get their low bits masked off.
The function could just be user_read_begin(void __user *addr, unsigned long len);
Although since it is the start of an unsafe_get_user() sequence perhaps
it should be unsafe_get_user_begin()?
>
> So it's actually really easy to use, but it's also really easy to miss
> that it does that.
>
> We've done this before, and I have done it myself. The classic example
> is the whole "do_div()" macro that everybody hated because it did
> exactly the same thing
Divide is (well was, I think my zen5 has a fast divide) also slow enough that
I doubt it would have mattered.
- can you drop a 'must_check' on the div_u64() that people keep putting in
patches as a drop-in replacement for do_div()?
David
> (we also have "save_flags()" etc that have this
> behavior).
>
> So I don't love it - but I can't claim I've not done the same thing,
> and honestly, it does make it very easy to use, so when Thomas sent
> this series out I didn't speak out against it.
>
> Linus
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-17 15:29 ` David Laight
@ 2025-08-17 15:36 ` Linus Torvalds
2025-08-18 11:59 ` David Laight
0 siblings, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2025-08-17 15:36 UTC (permalink / raw)
To: David Laight
Cc: Thomas Gleixner, LKML, Mathieu Desnoyers, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86,
Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Sun, 17 Aug 2025 at 08:29, David Laight <david.laight.linux@gmail.com> wrote:
>
> Just requiring the caller pass &user_ptr would make it more obvious.
> The generated code (with 'src' -> *&src) will be the same.
I personally absolutely detest passing pointers in and then hoping the
compiler will fix it up and not make the assembler do the stupid thing
that the source code does.
That's actually true in general - I strive to make the source code and
the generated code line up fairly closely, rather than "compilers fix
up the mistakes in the source code".
> > We've done this before, and I have done it myself. The classic example
> > is the whole "do_div()" macro that everybody hated because it did
> > exactly the same thing
>
> Divide is (well was, I think my zen5 has a fast divide) also slow enough that
> I doubt it would have mattered.
It mattered for code generation quality and smaller simpler code to look at.
I still look at the generated asm (not for do_div(), but other
things), and having compiler-generated code that makes sense and
matches the source is a big plus in my book.
Linus
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-17 15:36 ` Linus Torvalds
@ 2025-08-18 11:59 ` David Laight
0 siblings, 0 replies; 23+ messages in thread
From: David Laight @ 2025-08-18 11:59 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, LKML, Mathieu Desnoyers, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86,
Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Sun, 17 Aug 2025 08:36:05 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Sun, 17 Aug 2025 at 08:29, David Laight <david.laight.linux@gmail.com> wrote:
> >
> > Just requiring the caller pass &user_ptr would make it more obvious.
> > The generated code (with 'src' -> *&src) will be the same.
>
> I personally absolutely detest passing pointers in and then hoping the
> compiler will fix it up and not make the assembler do the stupid thing
> that the source code does.
I do generally dislike passing integers and pointers by reference.
It typically generates horrid code further down the function as well
as all the costs of the memory references themselves.
But hidden updates to variables are worse.
And we know (and can verify) that the generated code here is sane.
A possible problem with the 'hidden update' is that someone could easily
write code (eg in an ioctl handler) where the user pointer is in a buffer
that gets written back to user space.
Passing the pointer by reference makes it rather more obvious it can get
changed.
> That's actually true in general - I strive to make the source code and
> the generated code line up fairly closely, rather than "compilers fix
> up the mistakes in the source code".
Especially when you are trying to code what it thinks is 'a mistake' :-)
> > > We've done this before, and I have done it myself. The classic example
> > > is the whole "do_div()" macro that everybody hated because it did
> > > exactly the same thing
> >
> > Divide is (well was, I think my zen5 has a fast divide) also slow enough that
> > I doubt it would have mattered.
>
> It mattered for code generation quality and smaller simpler code to look at.
>
> I still look at the generated asm (not for do_div(), but other
> things), and having compiler-generated code that makes sense and
> matches the source is a big plus in my book.
Don't worry I also keep looking at generated code, some of it is
stunningly bad (and seems to get worse with each new compiler version).
Don't even think about what happens for C++ std::ostringstream.
David
>
> Linus
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-17 13:49 ` [patch 0/4] uaccess: Provide and use helpers for user masked access David Laight
2025-08-17 14:00 ` Linus Torvalds
@ 2025-08-18 21:21 ` David Laight
2025-08-18 21:36 ` Linus Torvalds
2025-08-19 4:44 ` Thomas Weißschuh
1 sibling, 2 replies; 23+ messages in thread
From: David Laight @ 2025-08-18 21:21 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Linus Torvalds, Mathieu Desnoyers, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86,
Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Sun, 17 Aug 2025 14:49:43 +0100
David Laight <david.laight.linux@gmail.com> wrote:
> On Wed, 13 Aug 2025 17:57:00 +0200 (CEST)
> Thomas Gleixner <tglx@linutronix.de> wrote:
>
> > commit 2865baf54077 ("x86: support user address masking instead of
> > non-speculative conditional") provided an optimization for
> > unsafe_get/put_user(), which optimizes the Spectre-V1 mitigation in an
> > architecture specific way. Currently only x86_64 supports that.
> >
> > The required code pattern screams for helper functions before it is copied
> > all over the kernel. So far the exposure is limited to futex, x86 and
> > fs/select.
> >
> > Provide a set of helpers for common single size access patterns:
>
> (gmail hasn't decided to accept 1/4 yet - I need to find a better
> mail relay...)
>
> +/*
> + * Conveniance macros to avoid spreading this pattern all over the place
> ^ spelling...
> + */
> +#define user_read_masked_begin(src) ({ \
> + bool __ret = true; \
> + \
> + if (can_do_masked_user_access()) \
> + src = masked_user_access_begin(src); \
> + else if (!user_read_access_begin(src, sizeof(*src))) \
> + __ret = false; \
> + __ret; \
> +})
Would something like this work (to avoid the hidden update)?
#define user_read_begin(uaddr, size, error_code) ({ \
typeof(uaddr) __uaddr; \
if (can_do_masked_user_access()) \
__uaddr = masked_user_access_begin(uaddr);\
else if (user_read_access_begin(uaddr, size)) \
__uaddr = uaddr; \
else { \
error_code; \
} \
__uaddr; \
})
With typical use being either:
uaddr = user_read_begin(uaddr, sizeof (*uaddr), return -EFAULT);
or:
uaddr = user_read_begin(uaddr, sizeof (*uaddr), goto bad_uaddr);
One problem is I don't think you can easily enforce the assignment.
Ideally you'd want something that made the compiler think that 'uaddr' was unset.
It could be done in a debug/diagnostic compile by adding 'uaddr = NULL'
at the bottom of the #define and COMPILE_ASSERT(!statically_true(uaddr == NULL))
inside unsafe_get/put_user().
David
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-18 21:21 ` David Laight
@ 2025-08-18 21:36 ` Linus Torvalds
2025-08-18 22:21 ` Al Viro
` (2 more replies)
2025-08-19 4:44 ` Thomas Weißschuh
1 sibling, 3 replies; 23+ messages in thread
From: Linus Torvalds @ 2025-08-18 21:36 UTC (permalink / raw)
To: David Laight
Cc: Thomas Gleixner, LKML, Mathieu Desnoyers, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86,
Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Mon, 18 Aug 2025 at 14:21, David Laight <david.laight.linux@gmail.com> wrote:
>
> Would something like this work (to avoid the hidden update)?
It would certainly work, but I despise code inside macro arguments
even more than I dislike the hidden update.
If we want something like this, we should just make that last argument
be a label, the same way unsafe_{get,put}_user() already works.
That would not only match existing user access exception handling, it
might allow for architecture-specific asm code that uses synchronous
trap instructions (ie the label might turn into an exception entry)
It's basically "manual exception handling", whether it then uses
actual exceptions (like user accesses do) or ends up being some
software implementation with just a "goto label" for the error case.
I realize some people have grown up being told that "goto is bad". Or
have been told that exception handling should be baked into the
language and be asynchronous. Both of those ideas are complete and
utter garbage, and the result of minds that cannot comprehend reality.
Asynchronous exceptions are horrific and tend to cause huge
performance problems (think setjmp()). The Linux kernel exception
model with explicit exception points is not only "that's how you have
to do it in C", it's also technically superior.
And "goto" is fine, as long as you have legible syntax and don't use
it to generate spaghetti code. Being able to write bad code with goto
doesn't make 'goto' bad - you can write bad code with *anything*.
Linus
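[The label-based error handling being argued for looks like this in plain
C. The structure and names below are a generic sketch, not kernel code:
each failure jumps to a label that unwinds exactly what was set up so far,
giving one visible error path with no artificial nesting:]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ctx { char *buf_a; char *buf_b; };

static int ctx_init(struct ctx *c, size_t na, size_t nb)
{
	c->buf_a = malloc(na);
	if (!c->buf_a)
		goto err;
	c->buf_b = malloc(nb);
	if (!c->buf_b)
		goto err_free_a;	/* unwind only what succeeded */
	return 0;

err_free_a:
	free(c->buf_a);
	c->buf_a = NULL;
err:
	return -1;
}

static void ctx_destroy(struct ctx *c)
{
	free(c->buf_b);
	free(c->buf_a);
}
```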
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-18 21:36 ` Linus Torvalds
@ 2025-08-18 22:21 ` Al Viro
2025-08-18 23:00 ` Linus Torvalds
2025-08-19 2:39 ` Matthew Wilcox
2025-08-19 21:33 ` David Laight
2 siblings, 1 reply; 23+ messages in thread
From: Al Viro @ 2025-08-18 22:21 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Laight, Thomas Gleixner, LKML, Mathieu Desnoyers,
Peter Zijlstra, Darren Hart, Davidlohr Bueso, André Almeida,
x86, Christian Brauner, Jan Kara, linux-fsdevel
On Mon, Aug 18, 2025 at 02:36:31PM -0700, Linus Torvalds wrote:
> And "goto" is fine, as long as you have legible syntax and don't use
> it to generate spaghetti code. Being able to write bad code with goto
> doesn't make 'goto' bad - you can write bad code with *anything*.
Put it another way, one can massage code into a strictly structured
(no goto, only one exit from each block, etc.) equivalent and every
hard-to-answer question about the original will map to the replacement -
just as hard as it had been.
I suspect that folks with "goto is a Bad Word(tm)" hardon had been told
that goto was always avoidable, but had never bothered to read the proof...
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-18 22:21 ` Al Viro
@ 2025-08-18 23:00 ` Linus Torvalds
2025-08-19 0:39 ` Al Viro
0 siblings, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2025-08-18 23:00 UTC (permalink / raw)
To: Al Viro
Cc: David Laight, Thomas Gleixner, LKML, Mathieu Desnoyers,
Peter Zijlstra, Darren Hart, Davidlohr Bueso, André Almeida,
x86, Christian Brauner, Jan Kara, linux-fsdevel
On Mon, 18 Aug 2025 at 15:21, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> I suspect that folks with "goto is a Bad Word(tm)" hardon had been told
> that goto was always avoidable, but had never bothered to read the proof...
I think it's just a combination of compiler people historically
finding purely structured loops easier to analyze (back before modern
things like SSA).
Together with language people who wanted to point out that "modern"
languages had structured loop constructs.
Both issues go back to the 1960s, and both are entirely irrelevant
today - and have been for decades. It's not like you need to actively
teach people to use for-loops instead of 'goto' these days.
Now, I don't advocate 'goto' as a general programming model, but for
exception handling it's superior to any alternative I know of.
Exceptions simply DO NOT NEST, and 'try-catch-finally' is an insane
model for exceptions that has only made things worse both for
compilers and for programmers.
So I do think using labels (without any crazy attempt at nesting syntax)
is objectively the superior model.
And the 'finally' mess is much better handled by compilers dealing
with cleanup - again without any pointless artificial nesting
structures. I think most of our <linux/cleanup.h> models have been
quite successful.
Linus
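The cleanup model Linus refers to rests on the GCC/Clang cleanup attribute
that <linux/cleanup.h> wraps; a minimal userspace sketch of the mechanism
(the names and the counter are illustrative, not kernel code):

```c
#include <stdio.h>
#include <stdlib.h>

static int frees;	/* counts how many times the cleanup ran */

static void free_buf(char **p)
{
	free(*p);
	frees++;
}

/* Every exit path from the scope runs free_buf() automatically,
 * replacing the 'goto out_free' / 'finally' boilerplate. */
static int use_buffer(int fail)
{
	char *buf __attribute__((cleanup(free_buf))) = malloc(64);

	if (!buf)
		return -1;
	if (fail)
		return -1;	/* early error exit: buf is still freed */
	snprintf(buf, 64, "ok");
	return 0;		/* normal exit: buf is freed here too */
}

static int frees_so_far(void)
{
	return frees;
}
```

The compiler inserts the cleanup call on every exit edge, which is exactly
the "'finally' handled by compilers dealing with cleanup" point above: no
artificial nesting, and no error path can forget the free.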
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-18 23:00 ` Linus Torvalds
@ 2025-08-19 0:39 ` Al Viro
2025-08-20 23:48 ` Al Viro
0 siblings, 1 reply; 23+ messages in thread
From: Al Viro @ 2025-08-19 0:39 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Laight, Thomas Gleixner, LKML, Mathieu Desnoyers,
Peter Zijlstra, Darren Hart, Davidlohr Bueso, André Almeida,
x86, Christian Brauner, Jan Kara, linux-fsdevel
On Mon, Aug 18, 2025 at 04:00:44PM -0700, Linus Torvalds wrote:
> Now, I don't advocate 'goto' as a general programming model, but for
> exception handling it's superior to any alternative I know of.
>
> Exceptions simply DO NOT NEST, and 'try-catch-finally' is an insane
> model for exceptions that has only made things worse both for
> compilers and for programmers.
>
> So I do think using labels (without any crazy attempt nesting syntax)
> is objectively the superior model.
>
> And the 'finally' mess is much better handled by compilers dealing
> with cleanup - again without any pointless artificial nesting
> structures. I think most of our <linux/cleanup.h> models have been
> quite successful.
I'm still rather cautious about the uses related to locks - it's
very easy to overextend the area where the lock is held (witness the
fs/namespace.c bugs of the "oops, that should've been scoped_guard(),
not guard()" variety - we had several this year) and the "grab lock,
except it might fail" stuff all appears to be awful - when a macro
is supposed to be used like
scoped_cond_guard(lock_timer, return -EINVAL, _id)
(hidden in the bowels of another macro, no less)...
I'm still trying to come up with something edible for lock_mount() -
the best approximation I've got so far is
CLASS(lock_mount, mp)(path);
if (IS_ERR(mp.mp))
bugger off
...
with things like do_add_mount() avoiding the IS_ERR(...) part by
starting with if (IS_ERR(mp)) return PTR_ERR(mp);
With that we get e.g.
CLASS(lock_mount, mp)(mountpoint);
error = do_add_mount(real_mount(mnt), mp.mp, mountpoint, mnt_flags);
if (!error) // mnt is consumed by successful do_add_mount()
retain_and_null_ptr(mnt);
return error;
but it takes some massage to get there...
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-18 21:36 ` Linus Torvalds
2025-08-18 22:21 ` Al Viro
@ 2025-08-19 2:39 ` Matthew Wilcox
2025-08-19 21:33 ` David Laight
2 siblings, 0 replies; 23+ messages in thread
From: Matthew Wilcox @ 2025-08-19 2:39 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Laight, Thomas Gleixner, LKML, Mathieu Desnoyers,
Peter Zijlstra, Darren Hart, Davidlohr Bueso, André Almeida,
x86, Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Mon, Aug 18, 2025 at 02:36:31PM -0700, Linus Torvalds wrote:
> I realize some people have grown up being told that "goto is bad". Or
> have been told that exception handling should be baked into the
> language and be asynchronous. Both of those ideas are complete and
> utter garbage, and the result of minds that cannot comprehend reality.
>
> Asynchronous exceptions are horrific and tend to cause huge
> performance problems (think setjmp()). The Linux kernel exception
> model with explicit exception points is not only "that's how you have
> to do it in C", it's also technically superior.
>
> And "goto" is fine, as long as you have legible syntax and don't use
> it to generate spaghetti code. Being able to write bad code with goto
> doesn't make 'goto' bad - you can write bad code with *anything*.
Even Dijkstra doesn't argue against using goto for exception handling,
"I remember having read the explicit recommendation to restrict the use
of the go to statement to alarm exits, but I have not been able to trace
it; presumably, it has been made by C.A.R. Hoare."
https://www.cs.utexas.edu/~EWD/transcriptions/EWD02xx/EWD215.html
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-18 21:21 ` David Laight
2025-08-18 21:36 ` Linus Torvalds
@ 2025-08-19 4:44 ` Thomas Weißschuh
1 sibling, 0 replies; 23+ messages in thread
From: Thomas Weißschuh @ 2025-08-19 4:44 UTC (permalink / raw)
To: David Laight
Cc: Thomas Gleixner, LKML, Linus Torvalds, Mathieu Desnoyers,
Peter Zijlstra, Darren Hart, Davidlohr Bueso, André Almeida,
x86, Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On 2025-08-18 22:21:06+0100, David Laight wrote:
> On Sun, 17 Aug 2025 14:49:43 +0100
> David Laight <david.laight.linux@gmail.com> wrote:
(...)
> Would something like this work (to avoid the hidden update)?
>
> #define user_read_begin(uaddr, size, error_code) ({ \
> typeof(uaddr) __uaddr; \
> if (can_do_masked_user_access()) \
> __uaddr = masked_user_access_begin(uaddr);\
> else if (user_read_access_begin(uaddr, size)) \
> __uaddr = uaddr; \
> else { \
> error_code; \
> } \
> __uaddr; \
> })
>
> With typical use being either:
> uaddr = user_read_begin(uaddr, sizeof (*uaddr), return -EFAULT);
> or:
> uaddr = user_read_begin(uaddr, sizeof (*uaddr), goto bad_uaddr);
>
> One problem is I don't think you can easily enforce the assignment.
> Ideally you'd want something that made the compiler think that 'uaddr' was unset.
> It could be done in a debug/diagnostic compile by adding 'uaddr = NULL'
> at the bottom of the #define and COMPILE_ASSERT(!statically_true(uaddr == NULL))
> inside unsafe_get/put_user().
To enforce some assignment, but not to the exact same variable as the argument,
you can wrap user_read_begin() in a function marked as __must_check.
#define __user_read_begin(uaddr, size, error_code) ({ \
/* See above */
})
static __always_inline void __must_check __user *__user_read_check(void __user *val)
{
return val;
}
#define user_read_begin(uaddr, size, error_code) \
((typeof(uaddr))__user_read_check(__user_read_begin(uaddr, size, error_code)))
Ignoring the return value gives:
error: ignoring return value of ‘__user_read_check’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result]
1629 | ((typeof(uaddr))__user_read_check(__user_read_begin(uaddr, size, error_code)))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: in expansion of macro ‘user_read_begin’
1635 | user_read_begin(uaddr, sizeof (*uaddr), return -EFAULT);
| ^~~~~~~~~~~~~~~
Thomas
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-18 21:36 ` Linus Torvalds
2025-08-18 22:21 ` Al Viro
2025-08-19 2:39 ` Matthew Wilcox
@ 2025-08-19 21:33 ` David Laight
2 siblings, 0 replies; 23+ messages in thread
From: David Laight @ 2025-08-19 21:33 UTC (permalink / raw)
To: Linus Torvalds
Cc: Thomas Gleixner, LKML, Mathieu Desnoyers, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, x86,
Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel
On Mon, 18 Aug 2025 14:36:31 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, 18 Aug 2025 at 14:21, David Laight <david.laight.linux@gmail.com> wrote:
> >
> > Would something like this work (to avoid the hidden update)?
>
> It would certainly work, but I despise code inside macro arguments
> even more than I dislike the hidden update.
>
> If we want something like this, we should just make that last argument
> be a label, the same way unsafe_{get,put}_user() already works.
A 'goto label' is probably a bit more readable than just 'label'.
But there will be similar code in the same function.
But it can't use the same label - one needs the 'user_access_end()'.
I wanted to allow an immediate 'return -EFAULT' as well as a goto.
But I don't really expect anything more than 'rval = -EFAULT; goto label'.
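The label-as-last-argument convention under discussion can be sketched in
plain C - try_fetch() and its NULL check are stand-ins for unsafe_get_user()
and the fault, not real kernel API:

```c
#include <stddef.h>

/* On failure the macro jumps to the caller-supplied label instead of
 * returning an error code - the caller's label does the cleanup. */
#define try_fetch(dst, src, label)		\
	do {					\
		if (!(src))			\
			goto label;		\
		(dst) = *(src);			\
	} while (0)

static int read_two(const int *a, const int *b, int *out)
{
	int x, y;

	try_fetch(x, a, Efault);
	try_fetch(y, b, Efault);
	*out = x + y;
	return 0;
Efault:
	return -14;	/* -EFAULT, illustrative */
}
```

All failure exits funnel through one label, so a real user would put the
user_access_end() there exactly once.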
I do agree that some of the 'condvar wait' macros are a PITA when
a chunk of code is executed repeatedly in a loop.
There are also the iov_iter ones - did they get better?
> That would not only match existing user access exception handling, it
> might allow for architecture-specific asm code that uses synchronous
> trap instructions (ie the label might turn into an exception entry)
Unlikely to be a trap in this case, but I guess you might want to jump
directly from asm.
OTOH the real aim of this code has to be for all architectures to
have a guard/invalid page that kernel addresses get converted to.
So eventually the conditional jump disappears.
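The masking whose conditional jump is meant to disappear can be sketched in
userspace C. The limit and guard values are illustrative, and the kernel
does this with cmp/sbb in asm so the compiler cannot reintroduce a branch:

```c
#include <stdint.h>

#define USER_LIMIT 0x00007fffffffffffULL	/* illustrative stand-in for TASK_SIZE */

/* Branchless clamp: a valid user address passes through unchanged; an
 * out-of-range (kernel) address becomes all-ones, i.e. a pointer into
 * a guard region where any dereference faults.  With no conditional
 * jump there is nothing for the CPU to speculate past. */
static uint64_t mask_user_ptr(uint64_t addr)
{
	/* all-ones when out of range, zero otherwise */
	uint64_t mask = 0ULL - (uint64_t)(addr > USER_LIMIT);

	return addr | mask;
}
```

The access then proceeds unconditionally; a bad pointer simply faults on the
guard page instead of being tested first, which is the whole point of the
masked-access scheme.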
>
> It's basically "manual exception handling", whether it then uses
> actual exceptions (like user accesses do) or ends up being some
> software implementation with just a "goto label" for the error case.
>
> I realize some people have grown up being told that "goto is bad". Or
> have been told that exception handling should be baked into the
> language and be asynchronous. Both of those ideas are complete and
> utter garbage, and the result of minds that cannot comprehend reality.
Test: which language has a 'whenever' statement?
IIRC the use is (effectively) 'whenever variable == value goto label'.
(I hope my brain remembers that correctly; the implementation of
that language I used didn't support it.)
> Asynchronous exceptions are horrific and tend to cause huge
> performance problems (think setjmp()).
The original setjmp was trivial, no callee saved registers,
so just saved the program counter and stack pointer.
The original SYSV kernel used setjmp/longjmp to exit the kernel
on error (after writing the errno value to u.u_error).
The real killer is having to follow the stack to execute all the
destructors.
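The old SYSV error-exit pattern David describes can be sketched in a few
lines - u_error stands in for the u.u_error field, and the error number is
illustrative:

```c
#include <setjmp.h>

static jmp_buf syscall_entry;
static int u_error;	/* stand-in for the SYSV u.u_error field */

/* Deep in the call chain: on error, record the errno value and
 * longjmp() straight back to the syscall entry point. */
static void deep_helper(int fail)
{
	if (fail) {
		u_error = 14;	/* EFAULT, illustrative */
		longjmp(syscall_entry, 1);
	}
}

/* "Kernel entry": setjmp() once, then any error anywhere below
 * unwinds directly to here.  No destructors run along the way -
 * which is exactly why this only works when there is nothing to
 * clean up on the skipped frames. */
static int do_syscall(int fail)
{
	if (setjmp(syscall_entry))
		return -u_error;
	deep_helper(fail);
	return 0;
}
```

This is cheap precisely because nothing is saved or unwound per frame - the
cost David mentions only appears once each frame owns resources that need
destructors.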
> The Linux kernel exception
> model with explicit exception points is not only "that's how you have
> to do it in C", it's also technically superior.
Indeed.
I've seen C++ code that did 'new buffer', called into some deep code
that would normally save the pointer, but had a try/catch block that
always freed it.
The code had no way of knowing whether the exception happened before
or after the pointer was saved.
And it is pretty much impossible to check all the places that might 'throw'.
>
> And "goto" is fine, as long as you have legible syntax and don't use
> it to generate spaghetti code. Being able to write bad code with goto
> doesn't make 'goto' bad - you can write bad code with *anything*.
I fixed some 'dayjob' code that tried not to use goto, break or return.
The error paths were just all wrong.
David
>
> Linus
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-19 0:39 ` Al Viro
@ 2025-08-20 23:48 ` Al Viro
2025-08-21 7:45 ` Christian Brauner
0 siblings, 1 reply; 23+ messages in thread
From: Al Viro @ 2025-08-20 23:48 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Laight, Thomas Gleixner, LKML, Mathieu Desnoyers,
Peter Zijlstra, Darren Hart, Davidlohr Bueso, André Almeida,
x86, Christian Brauner, Jan Kara, linux-fsdevel
On Tue, Aug 19, 2025 at 01:39:09AM +0100, Al Viro wrote:
> I'm still trying to come up with something edible for lock_mount() -
> the best approximation I've got so far is
>
> CLASS(lock_mount, mp)(path);
> if (IS_ERR(mp.mp))
> bugger off
... and that does not work, since DEFINE_CLASS() has constructor return
a value that gets copied into the local variable in question.
Which is unusable for situations when a part of what constructor is
doing is insertion of that local variable into a list.
__cleanup() per se is still usable, but... no DEFINE_CLASS for that kind
of data structures ;-/
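The failure mode Al describes is easy to demonstrate outside the kernel - a
deliberately broken sketch, not real DEFINE_CLASS() code:

```c
struct node {
	struct node *next;
};

static struct node *list_head;

/* DEFINE_CLASS()-style constructor: builds the object, links it into
 * a list, then returns it BY VALUE.  noinline keeps the temporary in
 * its own stack frame so the problem is visible. */
static __attribute__((noinline)) struct node make_and_link(void)
{
	struct node n = { .next = list_head };

	list_head = &n;	/* the list records the address of the temporary... */
	return n;	/* ...but the caller receives a copy elsewhere */
}

/* True when the list actually points at the caller's object. */
static int list_points_at(const struct node *obj)
{
	return list_head == obj;
}
```

The list ends up holding the address of the constructor's dead temporary
rather than the caller's variable, which is why a constructor that
self-registers its object cannot be expressed with a by-value DEFINE_CLASS()
and needs plain __cleanup() instead.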
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-20 23:48 ` Al Viro
@ 2025-08-21 7:45 ` Christian Brauner
2025-08-21 22:49 ` Al Viro
0 siblings, 1 reply; 23+ messages in thread
From: Christian Brauner @ 2025-08-21 7:45 UTC (permalink / raw)
To: Al Viro
Cc: Linus Torvalds, David Laight, Thomas Gleixner, LKML,
Mathieu Desnoyers, Peter Zijlstra, Darren Hart, Davidlohr Bueso,
André Almeida, x86, Jan Kara, linux-fsdevel
On Thu, Aug 21, 2025 at 12:48:15AM +0100, Al Viro wrote:
> On Tue, Aug 19, 2025 at 01:39:09AM +0100, Al Viro wrote:
> > I'm still trying to come up with something edible for lock_mount() -
> > the best approximation I've got so far is
> >
> > CLASS(lock_mount, mp)(path);
> > if (IS_ERR(mp.mp))
> > bugger off
>
> ... and that does not work, since DEFINE_CLASS() has constructor return
> a value that gets copied into the local variable in question.
>
> Which is unusable for situations when a part of what constructor is
> doing is insertion of that local variable into a list.
>
> __cleanup() per se is still usable, but... no DEFINE_CLASS for that kind
> of data structures ;-/
Just add the custom infrastructure that we need for this to work out imho.
If it's useful outside of our own realm then we can add it to cleanup.h
and if not we can just add our own header...
* Re: [patch 0/4] uaccess: Provide and use helpers for user masked access
2025-08-21 7:45 ` Christian Brauner
@ 2025-08-21 22:49 ` Al Viro
0 siblings, 0 replies; 23+ messages in thread
From: Al Viro @ 2025-08-21 22:49 UTC (permalink / raw)
To: Christian Brauner
Cc: Linus Torvalds, David Laight, Thomas Gleixner, LKML,
Mathieu Desnoyers, Peter Zijlstra, Darren Hart, Davidlohr Bueso,
André Almeida, x86, Jan Kara, linux-fsdevel
On Thu, Aug 21, 2025 at 09:45:22AM +0200, Christian Brauner wrote:
> On Thu, Aug 21, 2025 at 12:48:15AM +0100, Al Viro wrote:
> > On Tue, Aug 19, 2025 at 01:39:09AM +0100, Al Viro wrote:
> > > I'm still trying to come up with something edible for lock_mount() -
> > > the best approximation I've got so far is
> > >
> > > CLASS(lock_mount, mp)(path);
> > > if (IS_ERR(mp.mp))
> > > bugger off
> >
> > ... and that does not work, since DEFINE_CLASS() has constructor return
> > a value that gets copied into the local variable in question.
> >
> > Which is unusable for situations when a part of what constructor is
> > doing is insertion of that local variable into a list.
> >
> > __cleanup() per se is still usable, but... no DEFINE_CLASS for that kind
> > of data structures ;-/
>
> Just add the custom infrastructure that we need for this to work out imho.
Obviously... I'm going to put that into a branch on top of -rc3 and keep
the more infrastructural parts in the beginning, so they could be merged
into other branches in vfs/vfs.git without disrupting things on reordering.
> If it's useful outside of our own realm then we can add it to cleanup.h
> and if not we can just add our own header...
lock_mount() et al. are purely fs/namespace.c, so no header is needed at
all. FWIW, existing guards in there have problems - I ended up with
DEFINE_LOCK_GUARD_0(namespace_excl, namespace_lock(), namespace_unlock())
DEFINE_LOCK_GUARD_0(namespace_shared, down_read(&namespace_sem),
up_read(&namespace_sem))
in fs/namespace.c and
DEFINE_LOCK_GUARD_0(mount_writer, write_seqlock(&mount_lock),
write_sequnlock(&mount_lock))
DEFINE_LOCK_GUARD_0(mount_locked_reader, read_seqlock_excl(&mount_lock),
read_sequnlock_excl(&mount_lock))
in fs/mount.h; I'm doing conversions to those where they clearly are
good fit and documenting as I go.
mount_lock ones really should not be done in a blanket way - right
now they are wrong in quite a few cases, where writer is used instead
of the locked reader; we'll need to sort that out and I'd rather
keep the open-coded ones for the stuff yet to be considered and/or
tricky.
BTW, the comments I'm using for functions are along the lines of
* locks: mount_locked_reader || namespace_shared && is_mounted(mnt)
this one - for is_path_reachable(). If you look through the comments
there you'll see things like "vfsmount lock must be held for write" and
the rwlock those are referring to had been gone for more than a decade...
DEFINE_LOCK_GUARD_0 vs. DEFINE_GUARD makes for saner code generation;
having it essentially check IS_ERR_OR_NULL(&namespace_sem) is already
ridiculous, but when it decides to sacrifice a register for that, complete
with a bunch of spills...
* Re: [patch 1/4] uaccess: Provide common helpers for masked user access
2025-08-13 15:57 ` [patch 1/4] uaccess: Provide common helpers for masked user access Thomas Gleixner
@ 2025-08-26 7:04 ` Christophe Leroy
0 siblings, 0 replies; 23+ messages in thread
From: Christophe Leroy @ 2025-08-26 7:04 UTC (permalink / raw)
To: Thomas Gleixner, LKML
Cc: Linus Torvalds, Mathieu Desnoyers, Peter Zijlstra, Darren Hart,
Davidlohr Bueso, André Almeida, x86, Alexander Viro,
Christian Brauner, Jan Kara, linux-fsdevel
On 13/08/2025 17:57, Thomas Gleixner wrote:
> commit 2865baf54077 ("x86: support user address masking instead of
> non-speculative conditional") provided an optimization for
> unsafe_get/put_user(), which optimizes the Spectre-V1 mitigation in an
> architecture specific way. Currently only x86_64 supports that.
>
> The required code pattern is:
>
> if (can_do_masked_user_access())
> dst = masked_user_access_begin(dst);
> else if (!user_write_access_begin(dst, sizeof(*dst)))
> return -EFAULT;
> unsafe_put_user(val, dst, Efault);
> user_read_access_end();
You previously called user_write_access_begin(), so this must be a
user_write_access_end() here, not a user_read_access_end().
> return 0;
> Efault:
> user_read_access_end();
Same.
> return -EFAULT;
>
> The futex code already grew an instance of that and there are other areas,
> which can be optimized, when the calling code actually verified before,
> that the user pointer is both aligned and actually in user space.
>
> Use the futex example and provide generic helper inlines for that to avoid
> having tons of copies all over the tree.
>
> This provides get/put_user_masked_uNN() where $NN is the variable size in
> bits, i.e. 8, 16, 32, 64.
Couldn't the $NN be automatically determined through the type of the
provided user pointer (i.e. the 'from' and 'to' in patch 2)?
>
> The second set of helpers is to encapsulate the prologue for larger access
> patterns, e.g. multiple consecutive unsafe_put/get_user() scenarios:
>
> if (can_do_masked_user_access())
> dst = masked_user_access_begin(dst);
> else if (!user_write_access_begin(dst, sizeof(*dst)))
> return -EFAULT;
> unsafe_put_user(a, &dst->a, Efault);
> unsafe_put_user(b, &dst->b, Efault);
> user_write_access_end();
> return 0;
> Efault:
> user_write_access_end();
> return -EFAULT;
>
> which allows shortening this to:
>
> if (!user_write_masked_begin(dst))
> return -EFAULT;
> unsafe_put_user(a, &dst->a, Efault);
> ...
That's nice but ... it hides even deeper the fact that
masked_user_access_begin() opens a read/write access to userspace. On
x86 it doesn't matter because all userspace accesses are read/write. But
on architectures like powerpc it becomes a problem if you do a
read/write begin and then only call user_read_access_end(), as write
access might remain open.
I have a patch (See [1]) that splits masked_user_access_begin() into
three versions, one for read-only, one for write-only and one for
read-write, so that they match user_read_access_end(),
user_write_access_end() and user_access_end() respectively.
[1]
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/7b570e237f7099d564d7b1a270169428ac1f3099.1755854833.git.christophe.leroy@csgroup.eu/
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> include/linux/uaccess.h | 78 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 78 insertions(+)
>
> --- a/include/linux/uaccess.h
> +++ b/include/linux/uaccess.h
> @@ -569,6 +569,84 @@ static inline void user_access_restore(u
> #define user_read_access_end user_access_end
> #endif
>
> +/*
> + * Convenience macros to avoid spreading this pattern all over the place
> + */
> +#define user_read_masked_begin(src) ({ \
> + bool __ret = true; \
> + \
> + if (can_do_masked_user_access()) \
> + src = masked_user_access_begin(src); \
Should call a masked_user_read_access_begin() to perform a read-only
masked access begin, matching the read-only access begin below
> + else if (!user_read_access_begin(src, sizeof(*src))) \
> + __ret = false; \
> + __ret; \
> +})
> +
> +#define user_write_masked_begin(dst) ({ \
> + bool __ret = true; \
> + \
> + if (can_do_masked_user_access()) \
> + dst = masked_user_access_begin(dst); \
Should call masked_user_write_access_begin() to perform a write-only
masked access begin, matching the write-only access begin below
> + else if (!user_write_access_begin(dst, sizeof(*dst))) \
> + __ret = false; \
> + __ret; \
> +})
You are missing a user_masked_begin() for read-write operations.
> +
> +/*
> + * get_user_masked_uNN() and put_user_masked_uNN() where NN is the size of
> + * the variable in bits. Supported values are 8, 16, 32 and 64.
> + *
> + * These functions can be used to optimize __get_user() and __put_user()
> + * scenarios, if the architecture supports masked user access. This avoids
> + * the more costly speculation barriers. If the architecture does not
> + * support it, it falls back to user_*_access_begin().
> + *
> + * As with __get/put_user() the user pointer has to be verified by the
> + * caller to be actually in user space.
> + */
> +#define GEN_GET_USER_MASKED(type) \
> + static __always_inline \
> + int get_user_masked_##type (type *dst, type __user *src) \
> + { \
> + type val; \
> + \
> + if (!user_read_masked_begin(src)) \
> + return -EFAULT; \
> + unsafe_get_user(val, src, Efault); \
> + user_read_access_end(); \
> + *dst = val; \
> + return 0; \
> + Efault: \
> + user_read_access_end(); \
> + return -EFAULT; \
> + }
> +
> +GEN_GET_USER_MASKED(u8)
> +GEN_GET_USER_MASKED(u16)
> +GEN_GET_USER_MASKED(u32)
> +GEN_GET_USER_MASKED(u64)
> +#undef GEN_GET_USER_MASKED
Do we need four functions ? Can't we just have a get_user_masked() macro
that relies on the type of src, just like unsafe_get_user()?
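A type-generic front end of the kind suggested here could be sketched with
C11 _Generic - hypothetical naming, and the stand-in helpers just copy the
value with no fault handling:

```c
#include <stdint.h>

/* Stand-ins for the four sized helpers from the patch. */
static int get_user_masked_u8(uint8_t *dst, uint8_t *src)    { *dst = *src; return 0; }
static int get_user_masked_u16(uint16_t *dst, uint16_t *src) { *dst = *src; return 0; }
static int get_user_masked_u32(uint32_t *dst, uint32_t *src) { *dst = *src; return 0; }
static int get_user_masked_u64(uint64_t *dst, uint64_t *src) { *dst = *src; return 0; }

/* Dispatch on the pointee type of src, in the spirit of the type
 * machinery behind unsafe_get_user(). */
#define get_user_masked(dst, src)			\
	_Generic(*(src),				\
		 uint8_t:  get_user_masked_u8,		\
		 uint16_t: get_user_masked_u16,		\
		 uint32_t: get_user_masked_u32,		\
		 uint64_t: get_user_masked_u64)(dst, src)
```

One caveat: _Generic matches the exact pointee type, so a plain 'int'
pointer would need its own association or a cast - part of why explicitly
sized helpers are the simpler interface.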
> +
> +#define GEN_PUT_USER_MASKED(type) \
> + static __always_inline \
> + int put_user_masked_##type (type val, type __user *dst) \
> + { \
> + if (!user_write_masked_begin(dst)) \
> + return -EFAULT; \
> + unsafe_put_user(val, dst, Efault); \
> + user_write_access_end(); \
> + return 0; \
> + Efault: \
> + user_write_access_end(); \
> + return -EFAULT; \
> + }
> +
> +GEN_PUT_USER_MASKED(u8)
> +GEN_PUT_USER_MASKED(u16)
> +GEN_PUT_USER_MASKED(u32)
> +GEN_PUT_USER_MASKED(u64)
Same.
> +#undef GEN_PUT_USER_MASKED
> +
> #ifdef CONFIG_HARDENED_USERCOPY
> void __noreturn usercopy_abort(const char *name, const char *detail,
> bool to_user, unsigned long offset,
>
>
* Re: [patch 3/4] x86/futex: Use user_*_masked_begin()
2025-08-13 15:57 ` [patch 3/4] x86/futex: Use user_*_masked_begin() Thomas Gleixner
@ 2025-08-26 7:09 ` Christophe Leroy
0 siblings, 0 replies; 23+ messages in thread
From: Christophe Leroy @ 2025-08-26 7:09 UTC (permalink / raw)
To: Thomas Gleixner, LKML
Cc: Linus Torvalds, Mathieu Desnoyers, x86, Peter Zijlstra,
Darren Hart, Davidlohr Bueso, André Almeida, Alexander Viro,
Christian Brauner, Jan Kara, linux-fsdevel
On 13/08/2025 17:57, Thomas Gleixner wrote:
> Replace the can_do_masked_user_access() conditional with the generic macro.
>
> No functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: x86@kernel.org
> ---
> arch/x86/include/asm/futex.h | 12 ++++--------
> 1 file changed, 4 insertions(+), 8 deletions(-)
>
> --- a/arch/x86/include/asm/futex.h
> +++ b/arch/x86/include/asm/futex.h
> @@ -48,9 +48,7 @@ do { \
> static __always_inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
> u32 __user *uaddr)
> {
> - if (can_do_masked_user_access())
> - uaddr = masked_user_access_begin(uaddr);
> - else if (!user_access_begin(uaddr, sizeof(u32)))
> + if (!user_write_masked_begin(uaddr))
You are replacing a user_access_begin() with a macro that calls
user_write_access_begin(). It means that reads will not be allowed,
although arch_futex_atomic_op_inuser() performs a read-then-write, so it
requires full read-write user access.
> return -EFAULT;
>
> switch (op) {
> @@ -74,7 +72,7 @@ static __always_inline int arch_futex_at
> user_access_end();
> return -ENOSYS;
> }
> - user_access_end();
> + user_write_access_end();
Same, can't be changed to write-only, read permission is required as well.
> return 0;
> Efault:
> user_access_end();
> @@ -86,9 +84,7 @@ static inline int futex_atomic_cmpxchg_i
> {
> int ret = 0;
>
> - if (can_do_masked_user_access())
> - uaddr = masked_user_access_begin(uaddr);
> - else if (!user_access_begin(uaddr, sizeof(u32)))
> + if (!user_write_masked_begin(uaddr))
Same, read access is also needed.
> return -EFAULT;
> asm volatile("\n"
> "1:\t" LOCK_PREFIX "cmpxchgl %3, %2\n"
> @@ -98,7 +94,7 @@ static inline int futex_atomic_cmpxchg_i
> : "r" (newval), "1" (oldval)
> : "memory"
> );
> - user_access_end();
> + user_write_access_end();
Same, read access is also needed.
> *uval = oldval;
> return ret;
> }
>
>