* arm64 uaccess series
@ 2024-07-09 16:01 Linus Torvalds
2024-07-09 16:01 ` [PATCH 1/3] arm64: start using 'asm goto' for get_user() when available Linus Torvalds
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-07-09 16:01 UTC
To: Mark Rutland; +Cc: Linux ARM
This is also available at
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git arm64-uaccess
and is three patches, although I expect that I'll only merge the first
two in the 6.11 merge window because that final one is still a bit
special.
I've been running variations of this on my Altra machine for the last
month or more, but admittedly my loads are trivial and uninteresting (ie
mostly kernel builds). So my test coverage is not very wide.
I like the bit 55 checks in that access_ok() rewrite - and they are
actually simpler than worrying about 64-bit overflow - but they are also
admittedly quite different from what the code does elsewhere, and
there's the whole discussion about how the top byte ignore should really
work.
NOTE: This does *not* contain the "user address masking" part. I've
only written the x86-64 version of that, and while it touches similar
areas, it's pretty orthogonal to this part which is about the regular
low-level accesses.
Linus
arm64: start using 'asm goto' for get_user() when available
arm64: start using 'asm goto' for put_user()
arm64: access_ok() optimization
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/asm-extable.h | 3 +
arch/arm64/include/asm/uaccess.h | 192 ++++++++++++++++++++++++-----------
arch/arm64/kernel/mte.c | 12 +--
4 files changed, 142 insertions(+), 66 deletions(-)
* [PATCH 1/3] arm64: start using 'asm goto' for get_user() when available
2024-07-09 16:01 arm64 uaccess series Linus Torvalds
@ 2024-07-09 16:01 ` Linus Torvalds
2024-07-17 16:22 ` GCC asm goto outputs workaround (Was: "Re: [PATCH 1/3] arm64: start using 'asm goto' for get_user() when") available Mark Rutland
2024-07-17 16:28 ` RESEND: " Mark Rutland
2024-07-09 16:02 ` [PATCH 2/3] arm64: start using 'asm goto' for put_user() Linus Torvalds
` (2 subsequent siblings)
3 siblings, 2 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-07-09 16:01 UTC
To: Mark Rutland; +Cc: Linux ARM, Linus Torvalds
This generates noticeably better code with compilers that support it,
since we don't need to test the error register etc, the exception just
jumps to the error handling directly.
Note that this also marks SW_TTBR0_PAN incompatible with KCSAN support,
since KCSAN wants to save and restore the user access state.
KCSAN and SW_TTBR0_PAN were probably always incompatible, but it became
obvious only when implementing the unsafe user access functions: at that
point the generic fallback code no longer provided the default empty
user_access_save/restore() functions.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/uaccess.h | 121 +++++++++++++++++++++++++------
arch/arm64/kernel/mte.c | 12 ++-
3 files changed, 104 insertions(+), 30 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5d91259ee7b5..b6e8920364de 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1649,6 +1649,7 @@ config RODATA_FULL_DEFAULT_ENABLED
config ARM64_SW_TTBR0_PAN
bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
+ depends on !KCSAN
help
Enabling this option prevents the kernel from accessing
user-space memory directly by pointing TTBR0_EL1 to a reserved
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 14be5000c5a0..3e721f73bcaf 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -184,29 +184,40 @@ static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
* The "__xxx_error" versions set the third argument to -EFAULT if an error
* occurs, and leave it unchanged on success.
*/
-#define __get_mem_asm(load, reg, x, addr, err, type) \
+#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
+#define __get_mem_asm(load, reg, x, addr, label, type) \
+ asm_goto_output( \
+ "1: " load " " reg "0, [%1]\n" \
+ _ASM_EXTABLE_##type##ACCESS_ERR(1b, %l2, %w0) \
+ : "=r" (x) \
+ : "r" (addr) : : label)
+#else
+#define __get_mem_asm(load, reg, x, addr, label, type) do { \
+ int __gma_err = 0; \
asm volatile( \
"1: " load " " reg "1, [%2]\n" \
"2:\n" \
_ASM_EXTABLE_##type##ACCESS_ERR_ZERO(1b, 2b, %w0, %w1) \
- : "+r" (err), "=r" (x) \
- : "r" (addr))
+ : "+r" (__gma_err), "=r" (x) \
+ : "r" (addr)); \
+ if (__gma_err) goto label; } while (0)
+#endif
-#define __raw_get_mem(ldr, x, ptr, err, type) \
+#define __raw_get_mem(ldr, x, ptr, label, type) \
do { \
unsigned long __gu_val; \
switch (sizeof(*(ptr))) { \
case 1: \
- __get_mem_asm(ldr "b", "%w", __gu_val, (ptr), (err), type); \
+ __get_mem_asm(ldr "b", "%w", __gu_val, (ptr), label, type); \
break; \
case 2: \
- __get_mem_asm(ldr "h", "%w", __gu_val, (ptr), (err), type); \
+ __get_mem_asm(ldr "h", "%w", __gu_val, (ptr), label, type); \
break; \
case 4: \
- __get_mem_asm(ldr, "%w", __gu_val, (ptr), (err), type); \
+ __get_mem_asm(ldr, "%w", __gu_val, (ptr), label, type); \
break; \
case 8: \
- __get_mem_asm(ldr, "%x", __gu_val, (ptr), (err), type); \
+ __get_mem_asm(ldr, "%x", __gu_val, (ptr), label, type); \
break; \
default: \
BUILD_BUG(); \
@@ -219,27 +230,34 @@ do { \
* uaccess_ttbr0_disable(). As `x` and `ptr` could contain blocking functions,
* we must evaluate these outside of the critical section.
*/
-#define __raw_get_user(x, ptr, err) \
+#define __raw_get_user(x, ptr, label) \
do { \
__typeof__(*(ptr)) __user *__rgu_ptr = (ptr); \
__typeof__(x) __rgu_val; \
__chk_user_ptr(ptr); \
- \
- uaccess_ttbr0_enable(); \
- __raw_get_mem("ldtr", __rgu_val, __rgu_ptr, err, U); \
- uaccess_ttbr0_disable(); \
- \
- (x) = __rgu_val; \
+ do { \
+ __label__ __rgu_failed; \
+ uaccess_ttbr0_enable(); \
+ __raw_get_mem("ldtr", __rgu_val, __rgu_ptr, __rgu_failed, U); \
+ uaccess_ttbr0_disable(); \
+ (x) = __rgu_val; \
+ break; \
+ __rgu_failed: \
+ uaccess_ttbr0_disable(); \
+ goto label; \
+ } while (0); \
} while (0)
#define __get_user_error(x, ptr, err) \
do { \
+ __label__ __gu_failed; \
__typeof__(*(ptr)) __user *__p = (ptr); \
might_fault(); \
if (access_ok(__p, sizeof(*__p))) { \
__p = uaccess_mask_ptr(__p); \
- __raw_get_user((x), __p, (err)); \
+ __raw_get_user((x), __p, __gu_failed); \
} else { \
+ __gu_failed: \
(x) = (__force __typeof__(x))0; (err) = -EFAULT; \
} \
} while (0)
@@ -262,15 +280,18 @@ do { \
do { \
__typeof__(dst) __gkn_dst = (dst); \
__typeof__(src) __gkn_src = (src); \
- int __gkn_err = 0; \
+ do { \
+ __label__ __gkn_label; \
\
- __mte_enable_tco_async(); \
- __raw_get_mem("ldr", *((type *)(__gkn_dst)), \
- (__force type *)(__gkn_src), __gkn_err, K); \
- __mte_disable_tco_async(); \
- \
- if (unlikely(__gkn_err)) \
+ __mte_enable_tco_async(); \
+ __raw_get_mem("ldr", *((type *)(__gkn_dst)), \
+ (__force type *)(__gkn_src), __gkn_label, K); \
+ __mte_disable_tco_async(); \
+ break; \
+ __gkn_label: \
+ __mte_disable_tco_async(); \
goto err_label; \
+ } while (0); \
} while (0)
#define __put_mem_asm(store, reg, x, addr, err, type) \
@@ -381,6 +402,60 @@ extern unsigned long __must_check __arch_copy_to_user(void __user *to, const voi
__actu_ret; \
})
+static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len)
+{
+ if (unlikely(!access_ok(ptr,len)))
+ return 0;
+ uaccess_ttbr0_enable();
+ return 1;
+}
+#define user_access_begin(a,b) user_access_begin(a,b)
+#define user_access_end() uaccess_ttbr0_disable()
+
+/*
+ * The arm64 inline asms should learn about asm goto, and we should
+ * teach user_access_begin() about address masking.
+ */
+#define unsafe_put_user(x, ptr, label) do { \
+ int __upu_err = 0; \
+ __raw_put_mem("sttr", x, uaccess_mask_ptr(ptr), __upu_err, U); \
+ if (__upu_err) goto label; \
+} while (0)
+
+#define unsafe_get_user(x, ptr, label) \
+ __raw_get_mem("ldtr", x, uaccess_mask_ptr(ptr), label, U)
+
+/*
+ * KCSAN uses these to save and restore ttbr state.
+ * We do not support KCSAN with ARM64_SW_TTBR0_PAN, so
+ * they are no-ops.
+ */
+static inline unsigned long user_access_save(void) { return 0; }
+static inline void user_access_restore(unsigned long enabled) { }
+
+/*
+ * We want the unsafe accessors to always be inlined and use
+ * the error labels - thus the macro games.
+ */
+#define unsafe_copy_loop(dst, src, len, type, label) \
+ while (len >= sizeof(type)) { \
+ unsafe_put_user(*(type *)(src),(type __user *)(dst),label); \
+ dst += sizeof(type); \
+ src += sizeof(type); \
+ len -= sizeof(type); \
+ }
+
+#define unsafe_copy_to_user(_dst,_src,_len,label) \
+do { \
+ char __user *__ucu_dst = (_dst); \
+ const char *__ucu_src = (_src); \
+ size_t __ucu_len = (_len); \
+ unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u64, label); \
+ unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u32, label); \
+ unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u16, label); \
+ unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label); \
+} while (0)
+
#define INLINE_COPY_TO_USER
#define INLINE_COPY_FROM_USER
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index dcdcccd40891..6174671be7c1 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -582,12 +582,9 @@ subsys_initcall(register_mte_tcf_preferred_sysctl);
size_t mte_probe_user_range(const char __user *uaddr, size_t size)
{
const char __user *end = uaddr + size;
- int err = 0;
char val;
- __raw_get_user(val, uaddr, err);
- if (err)
- return size;
+ __raw_get_user(val, uaddr, efault);
uaddr = PTR_ALIGN(uaddr, MTE_GRANULE_SIZE);
while (uaddr < end) {
@@ -595,12 +592,13 @@ size_t mte_probe_user_range(const char __user *uaddr, size_t size)
* A read is sufficient for mte, the caller should have probed
* for the pte write permission if required.
*/
- __raw_get_user(val, uaddr, err);
- if (err)
- return end - uaddr;
+ __raw_get_user(val, uaddr, efault);
uaddr += MTE_GRANULE_SIZE;
}
(void)val;
return 0;
+
+efault:
+ return end - uaddr;
}
--
2.45.1.209.gc6f12300df
* [PATCH 2/3] arm64: start using 'asm goto' for put_user()
2024-07-09 16:01 arm64 uaccess series Linus Torvalds
2024-07-09 16:01 ` [PATCH 1/3] arm64: start using 'asm goto' for get_user() when available Linus Torvalds
@ 2024-07-09 16:02 ` Linus Torvalds
2024-07-09 16:02 ` [PATCH 3/3] arm64: access_ok() optimization Linus Torvalds
2024-07-09 16:52 ` arm64 uaccess series Catalin Marinas
3 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-07-09 16:02 UTC (permalink / raw)
To: Mark Rutland; +Cc: Linux ARM, Linus Torvalds
This generates noticeably better code since we don't need to test the
error register etc, the exception just jumps to the error handling
directly.
Unlike get_user(), there's no need to worry about old compilers. All
supported compilers support the regular non-output 'asm goto', as
pointed out by Nathan Chancellor.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/arm64/include/asm/asm-extable.h | 3 ++
arch/arm64/include/asm/uaccess.h | 70 ++++++++++++++--------------
2 files changed, 39 insertions(+), 34 deletions(-)
diff --git a/arch/arm64/include/asm/asm-extable.h b/arch/arm64/include/asm/asm-extable.h
index 980d1dd8e1a3..b8a5861dc7b7 100644
--- a/arch/arm64/include/asm/asm-extable.h
+++ b/arch/arm64/include/asm/asm-extable.h
@@ -112,6 +112,9 @@
#define _ASM_EXTABLE_KACCESS_ERR(insn, fixup, err) \
_ASM_EXTABLE_KACCESS_ERR_ZERO(insn, fixup, err, wzr)
+#define _ASM_EXTABLE_KACCESS(insn, fixup) \
+ _ASM_EXTABLE_KACCESS_ERR_ZERO(insn, fixup, wzr, wzr)
+
#define _ASM_EXTABLE_LOAD_UNALIGNED_ZEROPAD(insn, fixup, data, addr) \
__DEFINE_ASM_GPR_NUMS \
__ASM_EXTABLE_RAW(#insn, #fixup, \
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 3e721f73bcaf..28f665e0975a 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -294,29 +294,28 @@ do { \
} while (0); \
} while (0)
-#define __put_mem_asm(store, reg, x, addr, err, type) \
- asm volatile( \
- "1: " store " " reg "1, [%2]\n" \
+#define __put_mem_asm(store, reg, x, addr, label, type) \
+ asm goto( \
+ "1: " store " " reg "0, [%1]\n" \
"2:\n" \
- _ASM_EXTABLE_##type##ACCESS_ERR(1b, 2b, %w0) \
- : "+r" (err) \
- : "rZ" (x), "r" (addr))
+ _ASM_EXTABLE_##type##ACCESS(1b, %l2) \
+ : : "rZ" (x), "r" (addr) : : label)
-#define __raw_put_mem(str, x, ptr, err, type) \
+#define __raw_put_mem(str, x, ptr, label, type) \
do { \
__typeof__(*(ptr)) __pu_val = (x); \
switch (sizeof(*(ptr))) { \
case 1: \
- __put_mem_asm(str "b", "%w", __pu_val, (ptr), (err), type); \
+ __put_mem_asm(str "b", "%w", __pu_val, (ptr), label, type); \
break; \
case 2: \
- __put_mem_asm(str "h", "%w", __pu_val, (ptr), (err), type); \
+ __put_mem_asm(str "h", "%w", __pu_val, (ptr), label, type); \
break; \
case 4: \
- __put_mem_asm(str, "%w", __pu_val, (ptr), (err), type); \
+ __put_mem_asm(str, "%w", __pu_val, (ptr), label, type); \
break; \
case 8: \
- __put_mem_asm(str, "%x", __pu_val, (ptr), (err), type); \
+ __put_mem_asm(str, "%x", __pu_val, (ptr), label, type); \
break; \
default: \
BUILD_BUG(); \
@@ -328,25 +327,34 @@ do { \
* uaccess_ttbr0_disable(). As `x` and `ptr` could contain blocking functions,
* we must evaluate these outside of the critical section.
*/
-#define __raw_put_user(x, ptr, err) \
+#define __raw_put_user(x, ptr, label) \
do { \
+ __label__ __rpu_failed; \
__typeof__(*(ptr)) __user *__rpu_ptr = (ptr); \
__typeof__(*(ptr)) __rpu_val = (x); \
__chk_user_ptr(__rpu_ptr); \
\
- uaccess_ttbr0_enable(); \
- __raw_put_mem("sttr", __rpu_val, __rpu_ptr, err, U); \
- uaccess_ttbr0_disable(); \
+ do { \
+ uaccess_ttbr0_enable(); \
+ __raw_put_mem("sttr", __rpu_val, __rpu_ptr, __rpu_failed, U); \
+ uaccess_ttbr0_disable(); \
+ break; \
+ __rpu_failed: \
+ uaccess_ttbr0_disable(); \
+ goto label; \
+ } while (0); \
} while (0)
#define __put_user_error(x, ptr, err) \
do { \
+ __label__ __pu_failed; \
__typeof__(*(ptr)) __user *__p = (ptr); \
might_fault(); \
if (access_ok(__p, sizeof(*__p))) { \
__p = uaccess_mask_ptr(__p); \
- __raw_put_user((x), __p, (err)); \
+ __raw_put_user((x), __p, __pu_failed); \
} else { \
+ __pu_failed: \
(err) = -EFAULT; \
} \
} while (0)
@@ -369,15 +377,18 @@ do { \
do { \
__typeof__(dst) __pkn_dst = (dst); \
__typeof__(src) __pkn_src = (src); \
- int __pkn_err = 0; \
\
- __mte_enable_tco_async(); \
- __raw_put_mem("str", *((type *)(__pkn_src)), \
- (__force type *)(__pkn_dst), __pkn_err, K); \
- __mte_disable_tco_async(); \
- \
- if (unlikely(__pkn_err)) \
+ do { \
+ __label__ __pkn_err; \
+ __mte_enable_tco_async(); \
+ __raw_put_mem("str", *((type *)(__pkn_src)), \
+ (__force type *)(__pkn_dst), __pkn_err, K); \
+ __mte_disable_tco_async(); \
+ break; \
+ __pkn_err: \
+ __mte_disable_tco_async(); \
goto err_label; \
+ } while (0); \
} while(0)
extern unsigned long __must_check __arch_copy_from_user(void *to, const void __user *from, unsigned long n);
@@ -411,17 +422,8 @@ static __must_check __always_inline bool user_access_begin(const void __user *pt
}
#define user_access_begin(a,b) user_access_begin(a,b)
#define user_access_end() uaccess_ttbr0_disable()
-
-/*
- * The arm64 inline asms should learn about asm goto, and we should
- * teach user_access_begin() about address masking.
- */
-#define unsafe_put_user(x, ptr, label) do { \
- int __upu_err = 0; \
- __raw_put_mem("sttr", x, uaccess_mask_ptr(ptr), __upu_err, U); \
- if (__upu_err) goto label; \
-} while (0)
-
+#define unsafe_put_user(x, ptr, label) \
+ __raw_put_mem("sttr", x, uaccess_mask_ptr(ptr), label, U)
#define unsafe_get_user(x, ptr, label) \
__raw_get_mem("ldtr", x, uaccess_mask_ptr(ptr), label, U)
--
2.45.1.209.gc6f12300df
* [PATCH 3/3] arm64: access_ok() optimization
2024-07-09 16:01 arm64 uaccess series Linus Torvalds
2024-07-09 16:01 ` [PATCH 1/3] arm64: start using 'asm goto' for get_user() when available Linus Torvalds
2024-07-09 16:02 ` [PATCH 2/3] arm64: start using 'asm goto' for put_user() Linus Torvalds
@ 2024-07-09 16:02 ` Linus Torvalds
2024-07-09 16:52 ` arm64 uaccess series Catalin Marinas
3 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-07-09 16:02 UTC
To: Mark Rutland; +Cc: Linux ARM, Linus Torvalds
The TBI setup on arm64 is very strange: HW is set up to always do TBI,
but the kernel enforcement for system calls is purely a software
contract, and user space is supposed to mask off the top bits before the
system call.
Except all the actual brk/mmap/etc() system calls then mask it in kernel
space anyway, and accept any TBI address.
This basically unifies things and makes access_ok() also ignore it.
This is an ABI change, but the current situation is very odd, and this
change avoids the current mess and makes the kernel more permissive, and
as such is unlikely to break anything.
The way forward - for some possible future situation when people want to
use more bits - is probably to introduce a new "I actually want the full
64-bit address space" prctl. But we should make sure that the software
and hardware rules actually match at that point.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/arm64/include/asm/uaccess.h | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 28f665e0975a..1f21190d4db5 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -30,23 +30,20 @@ static inline int __access_ok(const void __user *ptr, unsigned long size);
/*
* Test whether a block of memory is a valid user space address.
- * Returns 1 if the range is valid, 0 otherwise.
*
- * This is equivalent to the following test:
- * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
+ * We only care that the address cannot reach the kernel mapping, and
+ * that an invalid address will fault.
*/
-static inline int access_ok(const void __user *addr, unsigned long size)
+static inline int access_ok(const void __user *p, unsigned long size)
{
- /*
- * Asynchronous I/O running in a kernel thread does not have the
- * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag
- * the user address before checking.
- */
- if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
- (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
- addr = untagged_addr(addr);
+ unsigned long addr = (unsigned long)p;
- return likely(__access_ok(addr, size));
+ /* Only bit 55 of the address matters */
+ addr |= addr+size;
+ addr = (addr >> 55) & 1;
+ size >>= 55;
+
+ return !(addr | size);
}
#define access_ok access_ok
--
2.45.1.209.gc6f12300df
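To make the bit-55 arithmetic above concrete, here is a minimal
standalone sketch (not part of the patch; the function name and the
sample addresses are invented for illustration) that applies the same
classification in plain userspace C:

#include <stdio.h>

/*
 * Standalone copy of the bit-55 logic from the patch above: an access
 * is rejected if bit 55 is set in either addr or addr+size, or if size
 * itself reaches bit 55. The TBI tag bits (63:56) never influence the
 * result, which is the point of the relaxation.
 */
static int access_ok_55(unsigned long addr, unsigned long size)
{
	addr |= addr + size;
	addr = (addr >> 55) & 1;
	size >>= 55;
	return !(addr | size);
}

int main(void)
{
	/* plain user pointer: accepted */
	printf("%d\n", access_ok_55(0x0000aaaadeadb000UL, 8));     /* 1 */
	/* top-byte-tagged user pointer (tag 0x5a): also accepted */
	printf("%d\n", access_ok_55(0x5a00aaaadeadb000UL, 8));     /* 1 */
	/* kernel pointer (bit 55 set): rejected */
	printf("%d\n", access_ok_55(0xffff000012345678UL, 8));     /* 0 */
	/* range whose end carries into bit 55: rejected */
	printf("%d\n", access_ok_55(0x007fffffffffff00UL, 0x200)); /* 0 */
	return 0;
}

The second case is the ABI change under discussion: a tagged pointer now
passes access_ok() regardless of TIF_TAGGED_ADDR, and the subsequent
uaccess instructions rely on TBI for the access to work or fault.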
* Re: arm64 uaccess series
2024-07-09 16:01 arm64 uaccess series Linus Torvalds
` (2 preceding siblings ...)
2024-07-09 16:02 ` [PATCH 3/3] arm64: access_ok() optimization Linus Torvalds
@ 2024-07-09 16:52 ` Catalin Marinas
2024-07-09 17:12 ` Linus Torvalds
3 siblings, 1 reply; 11+ messages in thread
From: Catalin Marinas @ 2024-07-09 16:52 UTC
To: Linus Torvalds; +Cc: Mark Rutland, Linux ARM
On Tue, Jul 09, 2024 at 09:01:58AM -0700, Linus Torvalds wrote:
> This is also available at
>
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git arm64-uaccess
>
> and is three patches, although I expect that I'll only merge the first
> two in the 6.11 merge window because that final one is still a bit
> special.
>
> I've been running variations of this on my Altra machine for the last
> month or more, but admittedly my loads are trivial and uninteresting (ie
> mostly kernel builds). So my test coverage is not very wide.
I can temporarily pull this branch and 'runtime-constants' into arm64
'for-kernelci'. It's an unstable branch, it doesn't end up in -next.
It's just pointed at by various CI systems to get some wider testing. I
can even pull all four branches if you think it's useful.
> I like the bit 55 checks in that access_ok() rewrite - and they are
> actually simpler than worrying about 64-bit overflow - but they are also
> admittedly quite different from what the code does elsewhere, and
> there's the whole discussion about how the top byte ignore should really
> work.
We are still debating this. I don't think the ABI change is that bad
but, OTOH, user programs with MTE enabled (which would relax the
access_ok()) haven't been tested much. As a kind of precaution, we could
enforce the current behaviour via the sysctl abi.tagged_addr_disabled
and wire it up via a static key. Currently this sysctl only prevents
setting of the TIF_TAGGED_ADDR flag (and implicitly enforces stricter
checks in access_ok()).
--
Catalin
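A minimal sketch of how that precaution might look (hypothetical: the
key name, the sysctl hook, and __access_ok_strict() are invented here;
only the jump-label API itself is the real kernel interface):

#include <linux/jump_label.h>

/* Default off: tagged addresses stay allowed, access_ok() stays relaxed. */
DEFINE_STATIC_KEY_FALSE(tagged_addr_disabled_key);

/* Would be called from the abi.tagged_addr_disabled sysctl handler. */
static void tagged_addr_disable(void)
{
	static_branch_enable(&tagged_addr_disabled_key);
}

static inline int access_ok_sketch(const void __user *p, unsigned long size)
{
	unsigned long addr = (unsigned long)p;

	/* distro escape hatch: fall back to the old strict check */
	if (static_branch_unlikely(&tagged_addr_disabled_key))
		return __access_ok_strict(p, size);

	/* otherwise the relaxed bit-55 check from patch 3 */
	addr |= addr + size;
	return !(((addr >> 55) & 1) | (size >> 55));
}

The static key keeps the common (relaxed) path to a single patched-out
branch, so the escape hatch costs nothing unless a distro enables it.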
* Re: arm64 uaccess series
2024-07-09 16:52 ` arm64 uaccess series Catalin Marinas
@ 2024-07-09 17:12 ` Linus Torvalds
2024-07-09 18:30 ` Catalin Marinas
0 siblings, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2024-07-09 17:12 UTC
To: Catalin Marinas; +Cc: Mark Rutland, Linux ARM
On Tue, 9 Jul 2024 at 09:52, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Tue, Jul 09, 2024 at 09:01:58AM -0700, Linus Torvalds wrote:
> >
> > I've been running variations of this on my Altra machine for the last
> > month or more, but admittedly my loads are trivial and uninteresting (ie
> > mostly kernel builds). So my test coverage is not very wide.
>
> I can temporarily pull this branch and 'runtime-constants' into arm64
> 'for-kernelci'. It's an unstable branch, it doesn't end up in -next.
> It's just pointed at by various CI systems to get some wider testing. I
> can even pull all four branches if you think it's useful.
Pulling all four branches is probably just as well. The only one that
doesn't have any arm64 changes at all is the link_path_walk one, but
looking at code generation of link_parh_walk (and strncpy_from_user())
was why the three other branches exist, so they are related in that
way ;)
> We are still debating this. I don't think the ABI change is that bad
> but, OTOH, user programs with MTE enabled (which would relax the
> access_ok()) haven't been tested much. As a kind of precaution, we could
> enforce the current behaviour via the sysctl abi.tagged_addr_disabled
> and wire it up via a static key. Currently this sysctl only prevents
> setting of the TIF_TAGGED_ADDR flag (and implicitly enforces stricter
> checks in access_ok()).
I'd actually be interested to hear if anybody really notices. If the
plan is to make a future sane ABI wrt this, wouldn't it be nice if it
turns out nobody even cares about the odd legacy behavior and we can
just leave it behind?
IOW, why go to extra pain if it turns out that there really are no
reasons to do that?
Linus
* Re: arm64 uaccess series
2024-07-09 17:12 ` Linus Torvalds
@ 2024-07-09 18:30 ` Catalin Marinas
0 siblings, 0 replies; 11+ messages in thread
From: Catalin Marinas @ 2024-07-09 18:30 UTC
To: Linus Torvalds; +Cc: Mark Rutland, Linux ARM
On Tue, Jul 09, 2024 at 10:12:34AM -0700, Linus Torvalds wrote:
> On Tue, 9 Jul 2024 at 09:52, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, Jul 09, 2024 at 09:01:58AM -0700, Linus Torvalds wrote:
> > > I've been running variations of this on my Altra machine for the last
> > > month or more, but admittedly my loads are trivial and uninteresting (ie
> > > mostly kernel builds). So my test coverage is not very wide.
> >
> > I can temporarily pull this branch and 'runtime-constants' into arm64
> > 'for-kernelci'. It's an unstable branch, it doesn't end up in -next.
> > It's just pointed at by various CI systems to get some wider testing. I
> > can even pull all four branches if you think it's useful.
>
> Pulling all four branches is probably just as well. The only one that
> doesn't have any arm64 changes at all is the link_path_walk one, but
> looking at code generation of link_path_walk (and strncpy_from_user())
> was why the three other branches exist, so they are related in that
> way ;)
Done, pushed the branch. It's usually picked up by CI systems in a day
or two.
> > We are still debating this. I don't think the ABI change is that bad
> > but, OTOH, user programs with MTE enabled (which would relax the
> > access_ok()) haven't been tested much. As a kind of precaution, we could
> > enforce the current behaviour via the sysctl abi.tagged_addr_disabled
> > and wire it up via a static key. Currently this sysctl only prevents
> > setting of the TIF_TAGGED_ADDR flag (and implicitly enforces stricter
> > checks in access_ok()).
>
> I'd actually be interested to hear if anybody really notices. If the
> plan is to make a future sane ABI wrt this, wouldn't it be nice if it
> turns out nobody even cares about the odd legacy behavior and we can
> just leave it behind?
>
> IOW, why go to extra pain if it turns out that there really are no
> reasons to do that?
It depends on how late we find out about some weird application relying
on the current behaviour.
current behaviour. The sysctl would have been a quick hack for a distro
to restore the old behaviour without patching the kernel. But I agree
that most likely it's just an unnecessary extra pain. We can try our
luck.
--
Catalin
* GCC asm goto outputs workaround (Was: "Re: [PATCH 1/3] arm64: start using 'asm goto' for get_user() when") available
2024-07-09 16:01 ` [PATCH 1/3] arm64: start using 'asm goto' for get_user() when available Linus Torvalds
@ 2024-07-17 16:22 ` Mark Rutland
2024-07-17 17:54 ` Linus Torvalds
2024-07-17 16:28 ` RESEND: " Mark Rutland
1 sibling, 1 reply; 11+ messages in thread
From: Mark Rutland @ 2024-07-17 16:22 UTC
To: Linus Torvalds
Cc: Linux ARM, lkml, Will Deacon, Catalin Marinas, Jakub Jelinek,
Peter Zijlstra, Sean Christopherson, Alex Coplan, Szabolcs Nagy
Hi Linus,
As a heads-up, this is tickling a GCC bug with asm goto outputs in GCC
versions prior to 14.1.0, and the existing asm_goto_output() workaround
doesn't seem to be sufficient. More details on that below, and I've added
LKML and the usual folk to CC, including compiler folk who looked into
this area before. The gist is that in the paths where we jump to a goto
label GCC may erroneously omit assignments to variables which get
assigned an output of the asm in the non-branching case.
The big question is: can we actually work around this, or do we need to
bite the bullet and say that we need GCC 14.1.0 or later?
I spotted that while preparing a fixup for a thinko with
_ASM_EXTABLE_##type##ACCESS_ERR(), where we put the error code (-EFAULT)
into the value register. That *should* be benign, since in the faulting
case we subsequently assign 0, but due to the compiler bug GCC discards
that later assignment...
On Tue, Jul 09, 2024 at 09:01:59AM -0700, Linus Torvalds wrote:
> -#define __get_mem_asm(load, reg, x, addr, err, type) \
> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
> +#define __get_mem_asm(load, reg, x, addr, label, type) \
> + asm_goto_output( \
> + "1: " load " " reg "0, [%1]\n" \
> + _ASM_EXTABLE_##type##ACCESS_ERR(1b, %l2, %w0) \
> + : "=r" (x) \
> + : "r" (addr) : : label)
Ignoring the compiler bug, the extable entry should be:
_ASM_EXTABLE_##type##ACCESS(1b, %l2)
... since we don't have an error variable, and we don't intend to move
-EFAULT into the value destination register (which the extable fixup
handler will do for the 'error' value register).
I'll send a fixup for that, but the more pressing issue is
miscompilation, which seems to be the issue fixed in GCC in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
... which we tried to work around in commits:
4356e9f841f7fbb9 ("work around gcc bugs with 'asm goto' with outputs")
68fb3ca0e408e00d ("update workarounds for gcc "asm goto" issue")
... which we currently try to work around with the
asm_goto_output() macro:
| #define asm_goto_output(x...) \
| do { asm volatile goto(x); asm (""); } while (0)
... but testing indicates that's insufficient on arm64 and x86_64, and
even with that I see GCC erroneously omitting assignments to variables
in the 'goto' paths, where those are assigned an output in the non-goto
path.
As a reduced test case I have:
| #define asm_goto_output(x...) \
| do { asm volatile goto(x); asm (""); } while (0)
|
| #define __good_or_bad(__val, __key) \
| do { \
| __label__ __failed; \
| unsigned long __tmp; \
| asm_goto_output( \
| " cbnz %[key], %l[__failed]\n" \
| " mov %[val], #0x900d\n" \
| : [val] "=r" (__tmp) \
| : [key] "r" (__key) \
| : \
| : __failed); \
| (__val) = __tmp; \
| break; \
| __failed: \
| (__val) = 0xbad; \
| } while (0)
|
| unsigned long get_val(unsigned long key);
| unsigned long get_val(unsigned long key)
| {
| unsigned long val = 0xbad;
|
| __good_or_bad(val, key);
|
| return val;
| }
... which GCC 13.2.0 (at -O2) compiles to:
| cbnz x0, .Lfailed
| mov x0, #0x900d
| .Lfailed:
| ret
... where the assignment to '__val' in the '__failed' case has been
omitted.
If I add an 'asm("");' immediately after the '__failed' label, before the
assignment, GCC 13.2.0 generates:
| cbnz x0, .Lfailed
| mov x0, #0x900d
| ret
| .Lfailed:
| mov x0, #0xbad
| ret
... where the assignment is retained as we'd hope.
GCC 14.1.0 generates the latter sequence regardless of the presence of an asm
after the __failed label.
Can anyone from the GCC side comment as to whether placing the dummy asm("")
block after the goto labels is a sufficient workaround, or whether that's just
avoiding the issue by chance?
Further examples below.
With current mainline, building the following:
| #include <linux/uaccess.h>
| #include <linux/types.h>
|
| noinline unsigned long test_get_user(unsigned long __user *ptr);
| noinline unsigned long test_get_user(unsigned long __user *ptr)
| {
| unsigned long val;
|
| if (!access_ok(ptr, sizeof(*ptr)))
| return 0xdead;
|
| if (get_user(val, ptr))
| val = 0x900d;
|
| return val;
| }
GCC 13.2.0, arm64 defconfig + ARM64_TAGGED_ADDR_ABI=n generates:
| mov x1, #0xffffffffffff8
| cmp x0, x1
| b.hi .Laccess_not_ok
| and x0, x0, #0xff7fffffffffffff
| ldtr x0, [x0]
| .Lextable_fixup:
| ret
| .Laccess_not_ok:
| mov x0, #0xdead
| ret
... entirely omitting the assignment to 'val' in the error path.
GCC 14.1.0, arm64 defconfig + ARM64_TAGGED_ADDR_ABI=n generates:
| mov x1, #0xffffffffffff8
| cmp x0, x1
| b.hi .Laccess_not_ok
| and x0, x0, #0xff7fffffffffffff
| ldtr x0, [x0]
| ret
| .Laccess_not_ok:
| mov x0, #0xdead
| ret
| .Lextable_fixup:
| mov x0, #0x900d
| ret
... with the expected assignment to 'val' in the error path.
On x86 we don't see this for regular get_user() because that calls
out-of-line asm rather than using asm goto with outputs. However
unsafe_get_user() usage does get miscompiled on both arm64 and x86_64.
In the unsafe_* case we generally don't manipulate the value register in
the error path, so we largely don't notice, but this is fragile.
With current mainline, building the following:
| #include <linux/uaccess.h>
| #include <linux/types.h>
|
| noinline unsigned long test_unsafe_get_user(unsigned long __user *ptr);
| noinline unsigned long test_unsafe_get_user(unsigned long __user *ptr)
| {
| unsigned long val;
|
| unsafe_get_user(val, ptr, Efault);
| return val;
|
| Efault:
| val = 0x900d;
| return val;
| }
GCC 13.2.0, arm64 defconfig generates:
| and x0, x0, #0xff7fffffffffffff
| ldtr x0, [x0]
| .Lextable_fixup:
| ret
GCC 13.2.0, x86_64 defconfig + MITIGATION_RETPOLINE=n generates:
| endbr64
| mov (%rdi),%rax
| .Lextable_fixup:
| ret
... omitting the assignment to 'val' in the error path.
GCC 14.1.0, arm64 defconfig generates:
| and x0, x0, #0xff7fffffffffffff
| ldtr x0, [x0]
| ret
| .Lextable_fixup:
| mov x0, #0x900d // #36877
| ret
GCC 14.1.0, x86_64 defconfig + MITIGATION_RETPOLINE=n generates:
| endbr64
| mov (%rdi),%rax
| ret
| .Lextable_fixup:
| mov $0x900d,%eax
| ret
... with the expected assignment to 'val' in the error path.
Mark.
* RESEND: GCC asm goto outputs workaround (Was: "Re: [PATCH 1/3] arm64: start using 'asm goto' for get_user() when") available
2024-07-09 16:01 ` [PATCH 1/3] arm64: start using 'asm goto' for get_user() when available Linus Torvalds
2024-07-17 16:22 ` GCC asm goto outputs workaround (Was: "Re: [PATCH 1/3] arm64: start using 'asm goto' for get_user() when") available Mark Rutland
@ 2024-07-17 16:28 ` Mark Rutland
2024-07-17 18:29 ` Linus Torvalds
1 sibling, 1 reply; 11+ messages in thread
From: Mark Rutland @ 2024-07-17 16:28 UTC
To: Linus Torvalds
Cc: Linux ARM, linux-kernel, Will Deacon, Catalin Marinas,
Jakub Jelinek, Peter Zijlstra, Sean Christopherson, Alex Coplan,
Szabolcs Nagy
[resending as I messed up the LKML address; sorry!]
Mark.
* Re: GCC asm goto outputs workaround (Was: "Re: [PATCH 1/3] arm64: start using 'asm goto' for get_user() when") available
2024-07-17 16:22 ` GCC asm goto outputs workaround (Was: "Re: [PATCH 1/3] arm64: start using 'asm goto' for get_user() when") available Mark Rutland
@ 2024-07-17 17:54 ` Linus Torvalds
0 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-07-17 17:54 UTC
To: Mark Rutland
Cc: Linux ARM, lkml, Will Deacon, Catalin Marinas, Jakub Jelinek,
Peter Zijlstra, Sean Christopherson, Alex Coplan, Szabolcs Nagy
On Wed, 17 Jul 2024 at 09:23, Mark Rutland <mark.rutland@arm.com> wrote:
>
> As a heads-up, this is tickling a GCC bug with asm goto outputs in GCC
> versions prior to 14.1.0, and the existing asm_goto_output() workaround
> doesn't seem to be sufficient.
Uhhuh.
I've done all my testing with gcc version 13.3.1, but that is one of
the fixed versions.
> I spotted that while preparing a fixup for a thinko with
> _ASM_EXTABLE_##type##ACCESS_ERR(),
Ack. Your fix looks obviously correct to me, but yeah, the deeper
issue is the compiler bug interaction.
> | #define asm_goto_output(x...) \
> | do { asm volatile goto(x); asm (""); } while (0)
>
> ... but testing indicates that's insufficient on arm64 and x86_64, and
> even with that I see GCC erroneously omitting assignments to variables
> in the 'goto' paths, where those are assigned an output in the non-goto
> path.
>
> As a reduced test case I have: [ snip snip ]
>
> ... which GCC 13.2.0 (at -O2) compiles to:
>
> | cbnz x0, .Lfailed
> | mov x0, #0x900d
> | .Lfailed:
> | ret
Yeah, that test-case works fine for me, and I get the expected result:
get_val:
.LFB0:
.cfi_startproc
cbnz x0, .L3
mov x0, #0x900d
ret
.p2align 2,,3
.L3:
.L2:
mov x0, 2989
ret
.cfi_endproc
but as mentioned, I have one of the gcc versions that doesn't need
that GCC_ASM_GOTO_OUTPUT_WORKAROUND:
# Fixed in GCC 14, 13.3, 12.4 and 11.5
so my gcc-13.3.1 doesn't have the issue.
> Can anyone from the GCC side comment as to whether placing the dummy asm("")
> block after the goto labels is a sufficient workaround, or whether that's just
> avoiding the issue by chance?
I have to say that the workaround of putting the empty asm on the
target label rather than next to the "asm goto" itself would seem to
be more sensible than our current workaround, but the problem is that
the target is entirely separate.
The current workaround was very much a "this works in practice in the
(very few) cases we saw" and works syntactically.
And yes, due to how arm64 inlines get_user(), arm64 ends up seeing a
lot more effects of this.
Hmm. I was thinking that we could change the "asm_goto_output()" macro
to take the exception label as the first argument, and create some
kind of trampoline for the exception case with the workaround doing
something like
asm goto(/* for real */ ... : temp_label);
if (0) { temp_label: asm volatile(""); goto real_label; }
but it turns out that we actually have cases of multiple output labels
on x86: see __vmcs_readl() in arch/x86/kvm/vmx/vmx_ops.h.
So doing that kind of wrapping looks non-trivial.
Maybe we just need to give up on the workaround and say that "asm goto
with outputs only works when ASM_GOTO_OUTPUT_WORKAROUND is not set".
The set of broken gcc versions will get progressively smaller as time goes on.
Linus
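Spelled out, the trampoline idea from the message above might look
something like this (a hypothetical, untested sketch, limited to a
single exception label, which is exactly the limitation the
__vmcs_readl() case runs into):

/*
 * The caller passes the error label first and leaves the asm's label
 * list to the macro, so the asm always branches to a local trampoline
 * (referenced positionally in the template, e.g. %l2 with one output
 * and one input). The dummy asm("") at the trampoline is what would
 * keep buggy GCC versions from dropping assignments in the error path;
 * control then continues to the caller's real label.
 */
#define asm_goto_output_tramp(label, x...)		\
do {							\
	__label__ __agt_tramp;				\
	asm volatile goto(x : __agt_tramp);		\
	if (0) {					\
__agt_tramp:						\
		asm volatile("");			\
		goto label;				\
	}						\
} while (0)

Whether the asm("") actually suffices at the trampoline is the same open
question Mark raised for the workaround-at-target-label placement.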
* Re: RESEND: GCC asm goto outputs workaround (Was: "Re: [PATCH 1/3] arm64: start using 'asm goto' for get_user() when") available
2024-07-17 16:28 ` RESEND: " Mark Rutland
@ 2024-07-17 18:29 ` Linus Torvalds
0 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2024-07-17 18:29 UTC
To: Mark Rutland
Cc: Linux ARM, linux-kernel, Will Deacon, Catalin Marinas,
Jakub Jelinek, Peter Zijlstra, Sean Christopherson, Alex Coplan,
Szabolcs Nagy
On Wed, 17 Jul 2024 at 09:28, Mark Rutland <mark.rutland@arm.com> wrote:
>
> [resending as I messed up the LKML address; sorry!]
Bah. And my reply was to your original.
I'll just leave the lore link to the reply (that did go to the arm
list successfully) here for anybody that only gets lkml..
https://lore.kernel.org/all/CAHk-=wghzGt7J9XaQgcmLniYrQMtoXGcv+FvdGcyspkb+FxUsw@mail.gmail.com/
but the gist of it probably boils down to the final few sentences:
"Maybe we just need to give up on the workaround and say that "asm goto
with outputs only works when ASM_GOTO_OUTPUT_WORKAROUND is not set".
The set of broken gcc versions will get progressively smaller as time goes on"
which is sad, but not the end of the world.
Unless the gcc people can maybe come up with some variation of our
current workaround that actually works (because the "workaround at
target label" model looks very inconvenient).
Linus
Thread overview: 11+ messages
2024-07-09 16:01 arm64 uaccess series Linus Torvalds
2024-07-09 16:01 ` [PATCH 1/3] arm64: start using 'asm goto' for get_user() when available Linus Torvalds
2024-07-17 16:22 ` GCC asm goto outputs workaround (Was: "Re: [PATCH 1/3] arm64: start using 'asm goto' for get_user() when") available Mark Rutland
2024-07-17 17:54 ` Linus Torvalds
2024-07-17 16:28 ` RESEND: " Mark Rutland
2024-07-17 18:29 ` Linus Torvalds
2024-07-09 16:02 ` [PATCH 2/3] arm64: start using 'asm goto' for put_user() Linus Torvalds
2024-07-09 16:02 ` [PATCH 3/3] arm64: access_ok() optimization Linus Torvalds
2024-07-09 16:52 ` arm64 uaccess series Catalin Marinas
2024-07-09 17:12 ` Linus Torvalds
2024-07-09 18:30 ` Catalin Marinas