linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops
@ 2025-07-21  8:36 Yeoreum Yun
  2025-07-21  8:36 ` [PATCH v4 1/7] arm64: cpufeature: add FEAT_LSUI Yeoreum Yun
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-21  8:36 UTC (permalink / raw)
  To: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz
  Cc: linux-arm-kernel, linux-kernel, Yeoreum Yun

Since Armv9.6, FEAT_LSUI supplies load/store instructions that allow
privileged code to access user memory without clearing the PSTATE.PAN
bit.

This patchset supports FEAT_LSUI and applies it to the futex atomic
operations, replacing the ldxr/stlxr-pair implementation (which clears
the PSTATE.PAN bit) with the corresponding unprivileged load/store
atomic operations, which require no PSTATE.PAN manipulation.

Patch Sequences
================

Patch #1 adds the cpufeature for FEAT_LSUI

Patch #2 exposes FEAT_LSUI to the guest

Patch #3 adds the Kconfig entry for FEAT_LSUI

Patch #4 separates the former futex atomic-op implementation out of
futex.h into futex_ll_sc_u.h

Patch #5 implements the futex atomic operations using LSUI instructions.

Patch #6 introduces lsui.h, which applies runtime patching to use the
former implementation when FEAT_LSUI isn't supported.

Patch #7 applies lsui.h in arch_futex_atomic_op_inuser().

Patch History
==============
from v3 to v4:
  - rebase to v6.16-rc7
  - modify some patch titles.
  - https://lore.kernel.org/all/20250617183635.1266015-1-yeoreum.yun@arm.com/

from v2 to v3:
  - expose FEAT_LUSI to guest
  - add help section for LSUI Kconfig
  - https://lore.kernel.org/all/20250611151154.46362-1-yeoreum.yun@arm.com/

from v1 to v2:
  - remove empty v9.6 menu entry
  - locate HAS_LSUI in cpucaps in order
  - https://lore.kernel.org/all/20250611104916.10636-1-yeoreum.yun@arm.com/


Yeoreum Yun (7):
  arm64: cpufeature: add FEAT_LSUI
  KVM/arm64: expose FEAT_LSUI to guest
  arm64/Kconfig: add LSUI Kconfig
  arm64/futex: move futex atomic logic with clearing PAN bit
  arm64/futex: add futex atomic operation with FEAT_LSUI
  arm64/asm: introduce lsui.h
  arm64/futex: support futex with FEAT_LSUI

 arch/arm64/Kconfig                     |   9 ++
 arch/arm64/include/asm/futex.h         |  99 ++++++-------------
 arch/arm64/include/asm/futex_ll_sc_u.h | 115 +++++++++++++++++++++
 arch/arm64/include/asm/futex_lsui.h    | 132 +++++++++++++++++++++++++
 arch/arm64/include/asm/lsui.h          |  37 +++++++
 arch/arm64/kernel/cpufeature.c         |   8 ++
 arch/arm64/kvm/sys_regs.c              |   5 +-
 arch/arm64/tools/cpucaps               |   1 +
 8 files changed, 336 insertions(+), 70 deletions(-)
 create mode 100644 arch/arm64/include/asm/futex_ll_sc_u.h
 create mode 100644 arch/arm64/include/asm/futex_lsui.h
 create mode 100644 arch/arm64/include/asm/lsui.h

--
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v4 1/7] arm64: cpufeature: add FEAT_LSUI
  2025-07-21  8:36 [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops Yeoreum Yun
@ 2025-07-21  8:36 ` Yeoreum Yun
  2025-07-21  8:36 ` [PATCH v4 2/7] KVM/arm64: expose FEAT_LSUI to guest Yeoreum Yun
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-21  8:36 UTC (permalink / raw)
  To: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz
  Cc: linux-arm-kernel, linux-kernel, Yeoreum Yun

Since Armv9.6, FEAT_LSUI supplies load/store instructions that allow
the privileged level to access user memory without clearing the
PSTATE.PAN bit.

Add the LSUI cpufeature so that the unprivileged load/store
instructions can be used when the kernel accesses user memory, without
clearing the PSTATE.PAN bit.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 arch/arm64/kernel/cpufeature.c | 8 ++++++++
 arch/arm64/tools/cpucaps       | 1 +
 2 files changed, 9 insertions(+)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e151585c6cca..eaf958a0d8bc 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -278,6 +278,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
 
 static const struct arm64_ftr_bits ftr_id_aa64isar3[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_FPRCVT_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_LSUI_SHIFT, 4, ID_AA64ISAR3_EL1_LSUI_NI),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_FAMINMAX_SHIFT, 4, 0),
 	ARM64_FTR_END,
 };
@@ -3061,6 +3062,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_pmuv3,
 	},
 #endif
+	{
+		.desc = "Unprivileged Load Store Instructions (LSUI)",
+		.capability = ARM64_HAS_LSUI,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_cpuid_feature,
+		ARM64_CPUID_FIELDS(ID_AA64ISAR3_EL1, LSUI, IMP)
+	},
 	{},
 };
 
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 10effd4cff6b..31f2cd655666 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -43,6 +43,7 @@ HAS_HCX
 HAS_LDAPR
 HAS_LPA2
 HAS_LSE_ATOMICS
+HAS_LSUI
 HAS_MOPS
 HAS_NESTED_VIRT
 HAS_PAN
-- 




* [PATCH v4 2/7] KVM/arm64: expose FEAT_LSUI to guest
  2025-07-21  8:36 [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops Yeoreum Yun
  2025-07-21  8:36 ` [PATCH v4 1/7] arm64: cpufeature: add FEAT_LSUI Yeoreum Yun
@ 2025-07-21  8:36 ` Yeoreum Yun
  2025-07-21  8:36 ` [PATCH v4 3/7] arm64/Kconfig: add LSUI Kconfig Yeoreum Yun
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-21  8:36 UTC (permalink / raw)
  To: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz
  Cc: linux-arm-kernel, linux-kernel, Yeoreum Yun

Expose FEAT_LSUI to the guest.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/sys_regs.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c20bd6f21e60..cfdf99e92cda 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1636,7 +1636,8 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
 			val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
 		break;
 	case SYS_ID_AA64ISAR3_EL1:
-		val &= ID_AA64ISAR3_EL1_FPRCVT | ID_AA64ISAR3_EL1_FAMINMAX;
+		val &= ID_AA64ISAR3_EL1_FPRCVT | ID_AA64ISAR3_EL1_FAMINMAX |
+		       ID_AA64ISAR3_EL1_LSUI;
 		break;
 	case SYS_ID_AA64MMFR2_EL1:
 		val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK;
@@ -2921,7 +2922,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 					ID_AA64ISAR2_EL1_APA3 |
 					ID_AA64ISAR2_EL1_GPA3)),
 	ID_WRITABLE(ID_AA64ISAR3_EL1, (ID_AA64ISAR3_EL1_FPRCVT |
-				       ID_AA64ISAR3_EL1_FAMINMAX)),
+				       ID_AA64ISAR3_EL1_FAMINMAX | ID_AA64ISAR3_EL1_LSUI)),
 	ID_UNALLOCATED(6,4),
 	ID_UNALLOCATED(6,5),
 	ID_UNALLOCATED(6,6),
-- 




* [PATCH v4 3/7] arm64/Kconfig: add LSUI Kconfig
  2025-07-21  8:36 [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops Yeoreum Yun
  2025-07-21  8:36 ` [PATCH v4 1/7] arm64: cpufeature: add FEAT_LSUI Yeoreum Yun
  2025-07-21  8:36 ` [PATCH v4 2/7] KVM/arm64: expose FEAT_LSUI to guest Yeoreum Yun
@ 2025-07-21  8:36 ` Yeoreum Yun
  2025-07-21 10:52   ` Mark Rutland
  2025-07-21  8:36 ` [PATCH v4 4/7] arm64/futex: move futex atomic logic with clearing PAN bit Yeoreum Yun
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-21  8:36 UTC (permalink / raw)
  To: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz
  Cc: linux-arm-kernel, linux-kernel, Yeoreum Yun

Since Armv9.6, FEAT_LSUI supplies the load/store instructions for
previleged level to access to access user memory without clearing
PSTATE.PAN bit.
It's enough to add CONFIG_AS_HAS_LSUI only because the code for LUSI uses
indiviual `.arch_extension` entries.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 arch/arm64/Kconfig | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 393d71124f5d..c0beb44ed5b8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2238,6 +2238,15 @@ config ARM64_GCS
 
 endmenu # "v9.4 architectural features"
 
+config AS_HAS_LSUI
+	def_bool $(as-instr,.arch_extension lsui)
+	help
+	 Unprivileged Load Store is an extension to introduce unprivileged
+	 variants of load and store instructions so that clearing PSTATE.PAN
+	 is never required in privileged mode.
+	 This feature is available with clang version 20 and later and not yet
+	 supported by gcc.
+
 config ARM64_SVE
 	bool "ARM Scalable Vector Extension support"
 	default y
-- 




* [PATCH v4 4/7] arm64/futex: move futex atomic logic with clearing PAN bit
  2025-07-21  8:36 [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops Yeoreum Yun
                   ` (2 preceding siblings ...)
  2025-07-21  8:36 ` [PATCH v4 3/7] arm64/Kconfig: add LSUI Kconfig Yeoreum Yun
@ 2025-07-21  8:36 ` Yeoreum Yun
  2025-07-21 10:56   ` Mark Rutland
  2025-07-21  8:36 ` [PATCH v4 5/7] arm64/futex: add futex atomic operation with FEAT_LSUI Yeoreum Yun
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-21  8:36 UTC (permalink / raw)
  To: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz
  Cc: linux-arm-kernel, linux-kernel, Yeoreum Yun

Move the current futex atomic logic, which uses the ll/sc method with
clearing of PSTATE.PAN, to a separate file (futex_ll_sc_u.h) so that
the former method is used only when FEAT_LSUI isn't supported.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 arch/arm64/include/asm/futex_ll_sc_u.h | 115 +++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
 create mode 100644 arch/arm64/include/asm/futex_ll_sc_u.h

diff --git a/arch/arm64/include/asm/futex_ll_sc_u.h b/arch/arm64/include/asm/futex_ll_sc_u.h
new file mode 100644
index 000000000000..6702ba66f1b2
--- /dev/null
+++ b/arch/arm64/include/asm/futex_ll_sc_u.h
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2025 Arm Ltd.
+ */
+#ifndef __ASM_FUTEX_LL_SC_U_H
+#define __ASM_FUTEX_LL_SC_U_H
+
+#include <linux/uaccess.h>
+#include <linux/stringify.h>
+
+#define FUTEX_ATOMIC_OP(op, asm_op)					\
+static __always_inline int						\
+__ll_sc_u_futex_atomic_##op(int oparg, u32 __user *uaddr, int *oval)	\
+{									\
+	unsigned int loops = LL_SC_MAX_LOOPS;				\
+	int ret, val, tmp;						\
+									\
+	uaccess_enable_privileged();					\
+	asm volatile("// __ll_sc_u_futex_atomic_" #op "\n"		\
+	"	prfm	pstl1strm, %2\n"				\
+	"1:	ldxr	%w1, %2\n"					\
+	"	" #asm_op "	%w3, %w1, %w5\n"			\
+	"2:	stlxr	%w0, %w3, %2\n"					\
+	"	cbz	%w0, 3f\n"					\
+	"	sub	%w4, %w4, %w0\n"				\
+	"	cbnz	%w4, 1b\n"					\
+	"	mov	%w0, %w6\n"					\
+	"3:\n"								\
+	"	dmb	ish\n"						\
+	_ASM_EXTABLE_UACCESS_ERR(1b, 3b, %w0)				\
+	_ASM_EXTABLE_UACCESS_ERR(2b, 3b, %w0)				\
+	: "=&r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp),		\
+	  "+r" (loops)							\
+	: "r" (oparg), "Ir" (-EAGAIN)					\
+	: "memory");							\
+	uaccess_disable_privileged();					\
+									\
+	if (!ret)							\
+		*oval = val;						\
+									\
+	return ret;							\
+}
+
+FUTEX_ATOMIC_OP(add, add)
+FUTEX_ATOMIC_OP(or, orr)
+FUTEX_ATOMIC_OP(and, and)
+FUTEX_ATOMIC_OP(eor, eor)
+
+#undef FUTEX_ATOMIC_OP
+
+static __always_inline int
+__ll_sc_u_futex_atomic_set(int oparg, u32 __user *uaddr, int *oval)
+{
+	unsigned int loops = LL_SC_MAX_LOOPS;
+	int ret, val;
+
+	uaccess_enable_privileged();
+	asm volatile("//__ll_sc_u_futex_xchg\n"
+	"	prfm	pstl1strm, %2\n"
+	"1:	ldxr	%w1, %2\n"
+	"2:	stlxr	%w0, %w4, %2\n"
+	"	cbz	%w3, 3f\n"
+	"	sub	%w3, %w3, %w0\n"
+	"	cbnz	%w3, 1b\n"
+	"	mov	%w0, %w5\n"
+	"3:\n"
+	"	dmb	ish\n"
+	_ASM_EXTABLE_UACCESS_ERR(1b, 3b, %w0)
+	_ASM_EXTABLE_UACCESS_ERR(2b, 3b, %w0)
+	: "=&r" (ret), "=&r" (val), "+Q" (*uaddr), "+r" (loops)
+	: "r" (oparg), "Ir" (-EAGAIN)
+	: "memory");
+	uaccess_disable_privileged();
+
+	if (!ret)
+		*oval = val;
+
+	return ret;
+}
+
+static __always_inline int
+__ll_sc_u_futex_cmpxchg(u32 __user *uaddr, u32 oldval, u32 newval, u32 *oval)
+{
+	int ret = 0;
+	unsigned int loops = LL_SC_MAX_LOOPS;
+	u32 val, tmp;
+
+	uaccess_enable_privileged();
+	asm volatile("//__ll_sc_u_futex_cmpxchg\n"
+	"	prfm	pstl1strm, %2\n"
+	"1:	ldxr	%w1, %2\n"
+	"	eor	%w3, %w1, %w5\n"
+	"	cbnz	%w3, 4f\n"
+	"2:	stlxr	%w3, %w6, %2\n"
+	"	cbz	%w3, 3f\n"
+	"	sub	%w4, %w4, %w3\n"
+	"	cbnz	%w4, 1b\n"
+	"	mov	%w0, %w7\n"
+	"3:\n"
+	"	dmb	ish\n"
+	"4:\n"
+	_ASM_EXTABLE_UACCESS_ERR(1b, 4b, %w0)
+	_ASM_EXTABLE_UACCESS_ERR(2b, 4b, %w0)
+	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp), "+r" (loops)
+	: "r" (oldval), "r" (newval), "Ir" (-EAGAIN)
+	: "memory");
+	uaccess_disable_privileged();
+
+	if (!ret)
+		*oval = val;
+
+	return ret;
+}
+
+#endif /* __ASM_FUTEX_LL_SC_U_H */
-- 




* [PATCH v4 5/7] arm64/futex: add futex atomic operation with FEAT_LSUI
  2025-07-21  8:36 [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops Yeoreum Yun
                   ` (3 preceding siblings ...)
  2025-07-21  8:36 ` [PATCH v4 4/7] arm64/futex: move futex atomic logic with clearing PAN bit Yeoreum Yun
@ 2025-07-21  8:36 ` Yeoreum Yun
  2025-07-21 11:03   ` Mark Rutland
  2025-07-21  8:36 ` [PATCH v4 6/7] arm64/asm: introduce lsui.h Yeoreum Yun
  2025-07-21  8:36 ` [PATCH v4 7/7] arm64/futex: support futex with FEAT_LSUI Yeoreum Yun
  6 siblings, 1 reply; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-21  8:36 UTC (permalink / raw)
  To: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz
  Cc: linux-arm-kernel, linux-kernel, Yeoreum Yun

Current futex atomic operations are implemented with ll/sc instructions and
clearing PSTATE.PAN.

Since Armv9.6, FEAT_LSUI supplies not only load/store instructions but
also atomic operations for user memory accesses from the kernel, so
the kernel no longer needs to clear the PSTATE.PAN bit.

With these instructions, some of the futex atomic operations no longer
need to be implemented with an ldxr/stlxr pair; they can instead be
implemented with a single atomic operation supplied by FEAT_LSUI.

However, some futex atomic operations still need to use the ll/sc
method, via the ldtxr/stltxr instructions supplied by FEAT_LSUI, since
there is either no corresponding atomic instruction or no word-size
variant (e.g. eor, cas{mb}t). Even so, they work without clearing the
PSTATE.PAN bit.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 arch/arm64/include/asm/futex_lsui.h | 132 ++++++++++++++++++++++++++++
 1 file changed, 132 insertions(+)
 create mode 100644 arch/arm64/include/asm/futex_lsui.h

diff --git a/arch/arm64/include/asm/futex_lsui.h b/arch/arm64/include/asm/futex_lsui.h
new file mode 100644
index 000000000000..0dc7dca91cdb
--- /dev/null
+++ b/arch/arm64/include/asm/futex_lsui.h
@@ -0,0 +1,132 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2025 Arm Ltd.
+ */
+
+#ifndef __ASM_FUTEX_LSUI_H
+#define __ASM_FUTEX_LSUI_H
+
+#include <linux/uaccess.h>
+#include <linux/stringify.h>
+
+#define FUTEX_ATOMIC_OP(op, asm_op, mb)					\
+static __always_inline int						\
+__lsui_futex_atomic_##op(int oparg, u32 __user *uaddr, int *oval)	\
+{									\
+	int ret = 0;							\
+	int val;							\
+									\
+	mte_enable_tco();						\
+	uaccess_ttbr0_enable();						\
+									\
+	asm volatile("// __lsui_futex_atomic_" #op "\n"			\
+	__LSUI_PREAMBLE							\
+	"1:	" #asm_op #mb "	%w3, %w2, %1\n"				\
+	"2:\n"								\
+	_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)				\
+	: "+r" (ret), "+Q" (*uaddr), "=r" (val)				\
+	: "r" (oparg)							\
+	: "memory");							\
+									\
+	mte_disable_tco();						\
+	uaccess_ttbr0_disable();					\
+									\
+	if (!ret)							\
+		*oval = val;						\
+									\
+	return ret;							\
+}
+
+FUTEX_ATOMIC_OP(add, ldtadd, al)
+FUTEX_ATOMIC_OP(or, ldtset, al)
+FUTEX_ATOMIC_OP(andnot, ldtclr, al)
+FUTEX_ATOMIC_OP(set, swpt, al)
+
+#undef FUTEX_ATOMIC_OP
+
+static __always_inline int
+__lsui_futex_atomic_and(int oparg, u32 __user *uaddr, int *oval)
+{
+	return __lsui_futex_atomic_andnot(~oparg, uaddr, oval);
+}
+
+static __always_inline int
+__lsui_futex_atomic_eor(int oparg, u32 __user *uaddr, int *oval)
+{
+	unsigned int loops = LL_SC_MAX_LOOPS;
+	int ret, val, tmp;
+
+	mte_enable_tco();
+	uaccess_ttbr0_enable();
+
+	asm volatile("// __lsui_futex_atomic_eor\n"
+	__LSUI_PREAMBLE
+	"	prfm	pstl1strm, %2\n"
+	"1:	ldtxr	%w1, %2\n"
+	"	eor	%w3, %w1, %w5\n"
+	"2:	stltxr	%w0, %w3, %2\n"
+	"	cbz	%w0, 3f\n"
+	"	sub	%w4, %w4, %w0\n"
+	"	cbnz	%w4, 1b\n"
+	"	mov	%w0, %w6\n"
+	"3:\n"
+	"	dmb	ish\n"
+	_ASM_EXTABLE_UACCESS_ERR(1b, 3b, %w0)
+	_ASM_EXTABLE_UACCESS_ERR(2b, 3b, %w0)
+	: "=&r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp),
+	  "+r" (loops)
+	: "r" (oparg), "Ir" (-EAGAIN)
+	: "memory");
+
+	mte_disable_tco();
+	uaccess_ttbr0_disable();
+
+	if (!ret)
+		*oval = val;
+
+	return ret;
+}
+
+static __always_inline int
+__lsui_futex_cmpxchg(u32 __user *uaddr, u32 oldval, u32 newval, u32 *oval)
+{
+	int ret = 0;
+	unsigned int loops = LL_SC_MAX_LOOPS;
+	u32 val, tmp;
+
+	mte_enable_tco();
+	uaccess_ttbr0_enable();
+
+	/*
+	 * cas{al}t doesn't support word size...
+	 */
+	asm volatile("//__lsui_futex_cmpxchg\n"
+	__LSUI_PREAMBLE
+	"	prfm	pstl1strm, %2\n"
+	"1:	ldtxr	%w1, %2\n"
+	"	eor	%w3, %w1, %w5\n"
+	"	cbnz	%w3, 4f\n"
+	"2:	stltxr	%w3, %w6, %2\n"
+	"	cbz	%w3, 3f\n"
+	"	sub	%w4, %w4, %w3\n"
+	"	cbnz	%w4, 1b\n"
+	"	mov	%w0, %w7\n"
+	"3:\n"
+	"	dmb	ish\n"
+	"4:\n"
+	_ASM_EXTABLE_UACCESS_ERR(1b, 4b, %w0)
+	_ASM_EXTABLE_UACCESS_ERR(2b, 4b, %w0)
+	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp), "+r" (loops)
+	: "r" (oldval), "r" (newval), "Ir" (-EAGAIN)
+	: "memory");
+
+	mte_disable_tco();
+	uaccess_ttbr0_disable();
+
+	if (!ret)
+		*oval = oldval;
+
+	return ret;
+}
+
+#endif /* __ASM_FUTEX_LSUI_H */
-- 




* [PATCH v4 6/7] arm64/asm: introduce lsui.h
  2025-07-21  8:36 [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops Yeoreum Yun
                   ` (4 preceding siblings ...)
  2025-07-21  8:36 ` [PATCH v4 5/7] arm64/futex: add futex atomic operation with FEAT_LSUI Yeoreum Yun
@ 2025-07-21  8:36 ` Yeoreum Yun
  2025-07-21  8:36 ` [PATCH v4 7/7] arm64/futex: support futex with FEAT_LSUI Yeoreum Yun
  6 siblings, 0 replies; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-21  8:36 UTC (permalink / raw)
  To: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz
  Cc: linux-arm-kernel, linux-kernel, Yeoreum Yun

Introduce the lsui.h header, which applies runtime patching to use the
unprivileged load/store instructions when the CPU supports FEAT_LSUI,
and otherwise falls back to the method implemented via ll/sc with
clearing of the PSTATE.PAN bit.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 arch/arm64/include/asm/lsui.h | 37 +++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 arch/arm64/include/asm/lsui.h

diff --git a/arch/arm64/include/asm/lsui.h b/arch/arm64/include/asm/lsui.h
new file mode 100644
index 000000000000..39bf232f3eb7
--- /dev/null
+++ b/arch/arm64/include/asm/lsui.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2025 Arm Ltd.
+ */
+#ifndef __ASM_LSUI_H
+#define __ASM_LSUI_H
+
+#define LL_SC_MAX_LOOPS	128 /* What's the largest number you can think of? */
+
+#include <asm/futex_ll_sc_u.h>
+
+#ifdef CONFIG_AS_HAS_LSUI
+
+#define __LSUI_PREAMBLE	".arch_extension lsui\n"
+
+#include <linux/compiler_types.h>
+#include <linux/export.h>
+#include <linux/stringify.h>
+#include <asm/alternative.h>
+#include <asm/alternative-macros.h>
+#include <asm/cpucaps.h>
+
+#include <asm/futex_lsui.h>
+
+#define __lsui_ll_sc_u_body(op, ...)					\
+({									\
+	alternative_has_cap_likely(ARM64_HAS_LSUI) ?		\
+		__lsui_##op(__VA_ARGS__) :				\
+		__ll_sc_u_##op(__VA_ARGS__);				\
+})
+
+#else	/* CONFIG_AS_HAS_LSUI */
+
+#define __lsui_ll_sc_u_body(op, ...)		__ll_sc_u_##op(__VA_ARGS__)
+
+#endif	/* CONFIG_AS_HAS_LSUI */
+#endif	/* __ASM_LSUI_H */
-- 




* [PATCH v4 7/7] arm64/futex: support futex with FEAT_LSUI
  2025-07-21  8:36 [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops Yeoreum Yun
                   ` (5 preceding siblings ...)
  2025-07-21  8:36 ` [PATCH v4 6/7] arm64/asm: introduce lsui.h Yeoreum Yun
@ 2025-07-21  8:36 ` Yeoreum Yun
  6 siblings, 0 replies; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-21  8:36 UTC (permalink / raw)
  To: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz
  Cc: linux-arm-kernel, linux-kernel, Yeoreum Yun

Since Armv9.6, FEAT_LSUI supplies unprivileged load/store instructions
that allow the kernel to access user memory without clearing
PSTATE.PAN.

Make futex use the futex atomic operations implemented with these
instructions when the CPU supports FEAT_LSUI; otherwise, they fall
back to ldxr/stlxr with clearing of the PSTATE.PAN bit.

Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
---
 arch/arm64/include/asm/futex.h | 99 +++++++++++-----------------------
 1 file changed, 31 insertions(+), 68 deletions(-)

diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
index bc06691d2062..ed4586776655 100644
--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -9,71 +9,60 @@
 #include <linux/uaccess.h>
 
 #include <asm/errno.h>
+#include <asm/lsui.h>
 
-#define FUTEX_MAX_LOOPS	128 /* What's the largest number you can think of? */
-
-#define __futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg)		\
-do {									\
-	unsigned int loops = FUTEX_MAX_LOOPS;				\
-									\
-	uaccess_enable_privileged();					\
-	asm volatile(							\
-"	prfm	pstl1strm, %2\n"					\
-"1:	ldxr	%w1, %2\n"						\
-	insn "\n"							\
-"2:	stlxr	%w0, %w3, %2\n"						\
-"	cbz	%w0, 3f\n"						\
-"	sub	%w4, %w4, %w0\n"					\
-"	cbnz	%w4, 1b\n"						\
-"	mov	%w0, %w6\n"						\
-"3:\n"									\
-"	dmb	ish\n"							\
-	_ASM_EXTABLE_UACCESS_ERR(1b, 3b, %w0)				\
-	_ASM_EXTABLE_UACCESS_ERR(2b, 3b, %w0)				\
-	: "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp),	\
-	  "+r" (loops)							\
-	: "r" (oparg), "Ir" (-EAGAIN)					\
-	: "memory");							\
-	uaccess_disable_privileged();					\
-} while (0)
+#define FUTEX_ATOMIC_OP(op)										\
+static __always_inline int										\
+__futex_atomic_##op(int oparg, u32 __user *uaddr, int *oval)	\
+{									\
+	return __lsui_ll_sc_u_body(futex_atomic_##op, oparg, uaddr, oval);					\
+}
+
+FUTEX_ATOMIC_OP(add)
+FUTEX_ATOMIC_OP(or)
+FUTEX_ATOMIC_OP(and)
+FUTEX_ATOMIC_OP(eor)
+FUTEX_ATOMIC_OP(set)
+
+#undef FUTEX_ATOMIC_OP
+
+static __always_inline int
+__futex_cmpxchg(u32 __user *uaddr, u32 oldval, u32 newval, u32 *oval)
+{
+	return __lsui_ll_sc_u_body(futex_cmpxchg, uaddr, oldval, newval, oval);
+}
 
 static inline int
 arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *_uaddr)
 {
-	int oldval = 0, ret, tmp;
-	u32 __user *uaddr = __uaccess_mask_ptr(_uaddr);
+	int ret;
+	u32 __user *uaddr;
 
 	if (!access_ok(_uaddr, sizeof(u32)))
 		return -EFAULT;
 
+	uaddr = __uaccess_mask_ptr(_uaddr);
+
 	switch (op) {
 	case FUTEX_OP_SET:
-		__futex_atomic_op("mov	%w3, %w5",
-				  ret, oldval, uaddr, tmp, oparg);
+		ret = __futex_atomic_set(oparg, uaddr, oval);
 		break;
 	case FUTEX_OP_ADD:
-		__futex_atomic_op("add	%w3, %w1, %w5",
-				  ret, oldval, uaddr, tmp, oparg);
+		ret = __futex_atomic_add(oparg, uaddr, oval);
 		break;
 	case FUTEX_OP_OR:
-		__futex_atomic_op("orr	%w3, %w1, %w5",
-				  ret, oldval, uaddr, tmp, oparg);
+		ret = __futex_atomic_or(oparg, uaddr, oval);
 		break;
 	case FUTEX_OP_ANDN:
-		__futex_atomic_op("and	%w3, %w1, %w5",
-				  ret, oldval, uaddr, tmp, ~oparg);
+		ret = __futex_atomic_and(~oparg, uaddr, oval);
 		break;
 	case FUTEX_OP_XOR:
-		__futex_atomic_op("eor	%w3, %w1, %w5",
-				  ret, oldval, uaddr, tmp, oparg);
+		ret = __futex_atomic_eor(oparg, uaddr, oval);
 		break;
 	default:
 		ret = -ENOSYS;
 	}
 
-	if (!ret)
-		*oval = oldval;
-
 	return ret;
 }
 
@@ -81,40 +70,14 @@ static inline int
 futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *_uaddr,
 			      u32 oldval, u32 newval)
 {
-	int ret = 0;
-	unsigned int loops = FUTEX_MAX_LOOPS;
-	u32 val, tmp;
 	u32 __user *uaddr;
 
 	if (!access_ok(_uaddr, sizeof(u32)))
 		return -EFAULT;
 
 	uaddr = __uaccess_mask_ptr(_uaddr);
-	uaccess_enable_privileged();
-	asm volatile("// futex_atomic_cmpxchg_inatomic\n"
-"	prfm	pstl1strm, %2\n"
-"1:	ldxr	%w1, %2\n"
-"	sub	%w3, %w1, %w5\n"
-"	cbnz	%w3, 4f\n"
-"2:	stlxr	%w3, %w6, %2\n"
-"	cbz	%w3, 3f\n"
-"	sub	%w4, %w4, %w3\n"
-"	cbnz	%w4, 1b\n"
-"	mov	%w0, %w7\n"
-"3:\n"
-"	dmb	ish\n"
-"4:\n"
-	_ASM_EXTABLE_UACCESS_ERR(1b, 4b, %w0)
-	_ASM_EXTABLE_UACCESS_ERR(2b, 4b, %w0)
-	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp), "+r" (loops)
-	: "r" (oldval), "r" (newval), "Ir" (-EAGAIN)
-	: "memory");
-	uaccess_disable_privileged();
-
-	if (!ret)
-		*uval = val;
 
-	return ret;
+	return __futex_cmpxchg(uaddr, oldval, newval, uval);
 }
 
 #endif /* __ASM_FUTEX_H */
-- 




* Re: [PATCH v4 3/7] arm64/Kconfig: add LSUI Kconfig
  2025-07-21  8:36 ` [PATCH v4 3/7] arm64/Kconfig: add LSUI Kconfig Yeoreum Yun
@ 2025-07-21 10:52   ` Mark Rutland
  2025-07-22  8:17     ` Yeoreum Yun
  0 siblings, 1 reply; 14+ messages in thread
From: Mark Rutland @ 2025-07-21 10:52 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz, linux-arm-kernel,
	linux-kernel

On Mon, Jul 21, 2025 at 09:36:14AM +0100, Yeoreum Yun wrote:
> Since Armv9.6, FEAT_LSUI supplies the load/store instructions for
> previleged level to access to access user memory without clearing
> PSTATE.PAN bit.
> It's enough to add CONFIG_AS_HAS_LSUI only because the code for LUSI uses

Nit: s/LUSI/LSUI/

> indiviual `.arch_extension` entries.

Nit: s/indiviual/individual/

> 
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>  arch/arm64/Kconfig | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 393d71124f5d..c0beb44ed5b8 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2238,6 +2238,15 @@ config ARM64_GCS
>  
>  endmenu # "v9.4 architectural features"
>  
> +config AS_HAS_LSUI
> +	def_bool $(as-instr,.arch_extension lsui)
> +	help
> +	 Unprivileged Load Store is an extension to introduce unprivileged
> +	 variants of load and store instructions so that clearing PSTATE.PAN
> +	 is never required in privileged mode.
> +	 This feature is available with clang version 20 and later and not yet
> +	 supported by gcc.

I don't think we need to describe the feature in detail for the AS_HAS_*
config symbol; I think all we need to say is:

	Supported by LLVM 20 and later, not yet supported by GNU AS.

Otherwise this looks fine.

Mark.

> +
>  config ARM64_SVE
>  	bool "ARM Scalable Vector Extension support"
>  	default y
> -- 
> 
> 



* Re: [PATCH v4 4/7] arm64/futex: move futex atomic logic with clearing PAN bit
  2025-07-21  8:36 ` [PATCH v4 4/7] arm64/futex: move futex atomic logic with clearing PAN bit Yeoreum Yun
@ 2025-07-21 10:56   ` Mark Rutland
  2025-07-22  8:21     ` Yeoreum Yun
  0 siblings, 1 reply; 14+ messages in thread
From: Mark Rutland @ 2025-07-21 10:56 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz, linux-arm-kernel,
	linux-kernel

On Mon, Jul 21, 2025 at 09:36:15AM +0100, Yeoreum Yun wrote:
> Move current futex atomic logics which uses ll/sc method with cleraing
> PSTATE.PAN to separate file (futex_ll_sc_u.h) so that
> former method will be used only when FEAT_LSUI isn't supported.

This isn't moving logic, this is *duplicating* the existing logic. As of
this patch, this logic in the <asm/futex_ll_sc_u.h> header is unused,
and the existing logic in <asm/futex.h> is still used as-is.

Please refactor the existing logic first. The deletion of the existing
code should happen at the same time as this addition. That way it's
possible to see that the deleted logic corresponds to what is being
added in the header, and it's generally nicer for bisection.

Mark.

> 
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>  arch/arm64/include/asm/futex_ll_sc_u.h | 115 +++++++++++++++++++++++++
>  1 file changed, 115 insertions(+)
>  create mode 100644 arch/arm64/include/asm/futex_ll_sc_u.h
> 
> diff --git a/arch/arm64/include/asm/futex_ll_sc_u.h b/arch/arm64/include/asm/futex_ll_sc_u.h
> new file mode 100644
> index 000000000000..6702ba66f1b2
> --- /dev/null
> +++ b/arch/arm64/include/asm/futex_ll_sc_u.h
> @@ -0,0 +1,115 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2025 Arm Ltd.
> + */
> +#ifndef __ASM_FUTEX_LL_SC_U_H
> +#define __ASM_FUTEX_LL_SC_U_H
> +
> +#include <linux/uaccess.h>
> +#include <linux/stringify.h>
> +
> +#define FUTEX_ATOMIC_OP(op, asm_op)					\
> +static __always_inline int						\
> +__ll_sc_u_futex_atomic_##op(int oparg, u32 __user *uaddr, int *oval)	\
> +{									\
> +	unsigned int loops = LL_SC_MAX_LOOPS;				\
> +	int ret, val, tmp;						\
> +									\
> +	uaccess_enable_privileged();					\
> +	asm volatile("// __ll_sc_u_futex_atomic_" #op "\n"		\
> +	"	prfm	pstl1strm, %2\n"				\
> +	"1:	ldxr	%w1, %2\n"					\
> +	"	" #asm_op "	%w3, %w1, %w5\n"			\
> +	"2:	stlxr	%w0, %w3, %2\n"					\
> +	"	cbz	%w0, 3f\n"					\
> +	"	sub	%w4, %w4, %w0\n"				\
> +	"	cbnz	%w4, 1b\n"					\
> +	"	mov	%w0, %w6\n"					\
> +	"3:\n"								\
> +	"	dmb	ish\n"						\
> +	_ASM_EXTABLE_UACCESS_ERR(1b, 3b, %w0)				\
> +	_ASM_EXTABLE_UACCESS_ERR(2b, 3b, %w0)				\
> +	: "=&r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp),		\
> +	  "+r" (loops)							\
> +	: "r" (oparg), "Ir" (-EAGAIN)					\
> +	: "memory");							\
> +	uaccess_disable_privileged();					\
> +									\
> +	if (!ret)							\
> +		*oval = val;						\
> +									\
> +	return ret;							\
> +}
> +
> +FUTEX_ATOMIC_OP(add, add)
> +FUTEX_ATOMIC_OP(or, orr)
> +FUTEX_ATOMIC_OP(and, and)
> +FUTEX_ATOMIC_OP(eor, eor)
> +
> +#undef FUTEX_ATOMIC_OP
> +
> +static __always_inline int
> +__ll_sc_u_futex_atomic_set(int oparg, u32 __user *uaddr, int *oval)
> +{
> +	unsigned int loops = LL_SC_MAX_LOOPS;
> +	int ret, val;
> +
> +	uaccess_enable_privileged();
> +	asm volatile("//__ll_sc_u_futex_xchg\n"
> +	"	prfm	pstl1strm, %2\n"
> +	"1:	ldxr	%w1, %2\n"
> +	"2:	stlxr	%w0, %w4, %2\n"
> +	"	cbz	%w3, 3f\n"
> +	"	sub	%w3, %w3, %w0\n"
> +	"	cbnz	%w3, 1b\n"
> +	"	mov	%w0, %w5\n"
> +	"3:\n"
> +	"	dmb	ish\n"
> +	_ASM_EXTABLE_UACCESS_ERR(1b, 3b, %w0)
> +	_ASM_EXTABLE_UACCESS_ERR(2b, 3b, %w0)
> +	: "=&r" (ret), "=&r" (val), "+Q" (*uaddr), "+r" (loops)
> +	: "r" (oparg), "Ir" (-EAGAIN)
> +	: "memory");
> +	uaccess_disable_privileged();
> +
> +	if (!ret)
> +		*oval = val;
> +
> +	return ret;
> +}
> +
> +static __always_inline int
> +__ll_sc_u_futex_cmpxchg(u32 __user *uaddr, u32 oldval, u32 newval, u32 *oval)
> +{
> +	int ret = 0;
> +	unsigned int loops = LL_SC_MAX_LOOPS;
> +	u32 val, tmp;
> +
> +	uaccess_enable_privileged();
> +	asm volatile("//__ll_sc_u_futex_cmpxchg\n"
> +	"	prfm	pstl1strm, %2\n"
> +	"1:	ldxr	%w1, %2\n"
> +	"	eor	%w3, %w1, %w5\n"
> +	"	cbnz	%w3, 4f\n"
> +	"2:	stlxr	%w3, %w6, %2\n"
> +	"	cbz	%w3, 3f\n"
> +	"	sub	%w4, %w4, %w3\n"
> +	"	cbnz	%w4, 1b\n"
> +	"	mov	%w0, %w7\n"
> +	"3:\n"
> +	"	dmb	ish\n"
> +	"4:\n"
> +	_ASM_EXTABLE_UACCESS_ERR(1b, 4b, %w0)
> +	_ASM_EXTABLE_UACCESS_ERR(2b, 4b, %w0)
> +	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp), "+r" (loops)
> +	: "r" (oldval), "r" (newval), "Ir" (-EAGAIN)
> +	: "memory");
> +	uaccess_disable_privileged();
> +
> +	if (!ret)
> +		*oval = val;
> +
> +	return ret;
> +}
> +
> +#endif /* __ASM_FUTEX_LL_SC_U_H */
> -- 
> 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 5/7] arm64/futex: add futex atomic operation with FEAT_LSUI
  2025-07-21  8:36 ` [PATCH v4 5/7] arm64/futex: add futex atomic operation with FEAT_LSUI Yeoreum Yun
@ 2025-07-21 11:03   ` Mark Rutland
  2025-07-22  8:34     ` Yeoreum Yun
  0 siblings, 1 reply; 14+ messages in thread
From: Mark Rutland @ 2025-07-21 11:03 UTC (permalink / raw)
  To: Yeoreum Yun
  Cc: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz, linux-arm-kernel,
	linux-kernel

On Mon, Jul 21, 2025 at 09:36:16AM +0100, Yeoreum Yun wrote:
> Current futex atomic operations are implemented with ll/sc instructions
> while clearing PSTATE.PAN.
>
> Since Armv9.6, FEAT_LSUI supplies not only unprivileged load/store
> instructions but also atomic operations for user memory access from the
> kernel, so the PSTATE.PAN bit no longer needs to be cleared.
>
> With these instructions, some futex atomic operations no longer need to
> be implemented with an ldxr/stlxr pair; they can instead be implemented
> with a single atomic operation supplied by FEAT_LSUI.
>
> However, some futex atomic operations still need to use the ll/sc method
> via the ldtxr/stltxr instructions supplied by FEAT_LSUI, since there is
> either no corresponding atomic instruction or no word-size variant
> (e.g. eor, cas{mb}t). Still, these work without clearing PSTATE.PAN.

That's unfortunate; have we fed back to Arm's architecture folks that we
care about those cases?

> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>  arch/arm64/include/asm/futex_lsui.h | 132 ++++++++++++++++++++++++++++
>  1 file changed, 132 insertions(+)
>  create mode 100644 arch/arm64/include/asm/futex_lsui.h

This logic is introduced unused, and TBH I don't think this needs to be
in a separate header.

I reckon it'd be better to keep all of this in <asm/futex.h> and rework
the series to:

(1) Factor out the existing LL/SC logic into separate LL/SC helpers in
    <asm/futex.h>, with an __llsc_ prefix, called by the existing
    functions.

(2) Add the new __lsui_ futex operations to <asm/futex.h>, along with
    code to select between the __llsc_ and __lsui_ versions.

We split the regular atomics differently because there are *many* generic
atomic operations, but I don't think it's worthwhile to split the futex
logic over several headers.

Maybe it's worth having <asm/lsui.h>, but for now I reckon it's best to
also fold that into <asm/futex.h>, and we can split it out later if we
need it for something else.

Mark.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 3/7] arm64/Kconfig: add LSUI Kconfig
  2025-07-21 10:52   ` Mark Rutland
@ 2025-07-22  8:17     ` Yeoreum Yun
  0 siblings, 0 replies; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-22  8:17 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz, linux-arm-kernel,
	linux-kernel

Hi Mark,

> On Mon, Jul 21, 2025 at 09:36:14AM +0100, Yeoreum Yun wrote:
> > Since Armv9.6, FEAT_LSUI supplies the load/store instructions for
> > privileged level to access user memory without clearing the
> > PSTATE.PAN bit.
> > It's enough to add CONFIG_AS_HAS_LSUI only because the code for LUSI uses
>
> Nit: s/LUSI/LSUI/
>
> > indiviual `.arch_extension` entries.
>
> Nit: s/indiviual/individual/

Sorry. I'll change it...


> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > ---
> >  arch/arm64/Kconfig | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 393d71124f5d..c0beb44ed5b8 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -2238,6 +2238,15 @@ config ARM64_GCS
> >
> >  endmenu # "v9.4 architectural features"
> >
> > +config AS_HAS_LSUI
> > +	def_bool $(as-instr,.arch_extension lsui)
> > +	help
> > +	 Unprivileged Load Store is an extension to introduce unprivileged
> > +	 variants of load and store instructions so that clearing PSTATE.PAN
> > +	 is never required in privileged mode.
> > +	 This feature is available with clang version 20 and later and not yet
> > +	 supported by gcc.
>
> I don't think we need to describe the feature in detail for the AS_HAS_*
> config symbol; I think all we need to say is:
>
> 	Supported by LLVM 20 and later, not yet supported by GNU AS.
>

Okay. I'll change it.

> Otherwise this looks fine.

Thanks!

[...]

--
Sincerely,
Yeoreum Yun


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 4/7] arm64/futex: move futex atomic logic with clearing PAN bit
  2025-07-21 10:56   ` Mark Rutland
@ 2025-07-22  8:21     ` Yeoreum Yun
  0 siblings, 0 replies; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-22  8:21 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz, linux-arm-kernel,
	linux-kernel

Hi Mark,

> > Move the current futex atomic logic, which uses the ll/sc method with
> > clearing of PSTATE.PAN, to a separate file (futex_ll_sc_u.h) so that the
> > former method will be used only when FEAT_LSUI isn't supported.
>
> This isn't moving logic, this is *duplicating* the existing logic. As of
> this patch, this logic in the <asm/futex_ll_sc_u.h> header is unused,
> and the existing logic in <asm/futex.h> is still used as-is.
>
> Please refactor the existing logic first. The deletion of the existing
> code should happen at the same time as this addition. That way it's
> possible to see that the deleted logic corresponds to what is being
> added in the header, and it's generally nicer for bisection.
>
> Mark.

Thanks for this :)
As you suggest in other comments, I'll respin in <asm/futex.h> only.

[...]

--
Sincerely,
Yeoreum Yun


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v4 5/7] arm64/futex: add futex atomic operation with FEAT_LSUI
  2025-07-21 11:03   ` Mark Rutland
@ 2025-07-22  8:34     ` Yeoreum Yun
  0 siblings, 0 replies; 14+ messages in thread
From: Yeoreum Yun @ 2025-07-22  8:34 UTC (permalink / raw)
  To: Mark Rutland
  Cc: catalin.marinas, will, broonie, oliver.upton, ardb, frederic,
	james.morse, joey.gouly, scott, maz, linux-arm-kernel,
	linux-kernel

Hi Mark,

> > Current futex atomic operations are implemented with ll/sc instructions
> > while clearing PSTATE.PAN.
> >
> > Since Armv9.6, FEAT_LSUI supplies not only unprivileged load/store
> > instructions but also atomic operations for user memory access from the
> > kernel, so the PSTATE.PAN bit no longer needs to be cleared.
> >
> > With these instructions, some futex atomic operations no longer need to
> > be implemented with an ldxr/stlxr pair; they can instead be implemented
> > with a single atomic operation supplied by FEAT_LSUI.
> >
> > However, some futex atomic operations still need to use the ll/sc method
> > via the ldtxr/stltxr instructions supplied by FEAT_LSUI, since there is
> > either no corresponding atomic instruction or no word-size variant
> > (e.g. eor, cas{mb}t). Still, these work without clearing PSTATE.PAN.
>
> That's unfortunate; have we fed back to Arm's architecture folks that we
> care about those cases?

I haven’t done so yet. If you don’t mind,
could you let me know the appropriate person to give the feedback to?

>
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > ---
> >  arch/arm64/include/asm/futex_lsui.h | 132 ++++++++++++++++++++++++++++
> >  1 file changed, 132 insertions(+)
> >  create mode 100644 arch/arm64/include/asm/futex_lsui.h
>
> This logic is introduced unused, and TBH I don't think this needs to be
> in a separate header.
>
> I reckon it'd be better to keep all of this in <asm/futex.h> and rework
> the series to:
>
> (1) Factor out the existing LL/SC logic into separate LL/SC helpers in
>     <asm/futex.h>, with an __llsc_ prefix, called by the existing
>     functions.
>
> (2) Add the new __lsui_ futex operations to <asm/futex.h>, along with
>     code to select between the __llsc_ and __lsui_ versions.
>
> We split the regular atomics differently because there are *many* generic
> atomic operations, but I don't think it's worthwhile to split the futex
> logic over several headers.
>
> Maybe it's worth having <asm/lsui.h>, but for now I reckon it's best to
> also fold that into <asm/futex.h>, and we can split it out later if we
> need it for something else.

Thanks for your suggestion.
I’ll rework it while keeping this implementation.

Thanks!

--
Sincerely,
Yeoreum Yun


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-07-22  8:37 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-21  8:36 [PATCH v4 0/7] support FEAT_LSUI and apply it on futex atomic ops Yeoreum Yun
2025-07-21  8:36 ` [PATCH v4 1/7] arm64: cpufeature: add FEAT_LSUI Yeoreum Yun
2025-07-21  8:36 ` [PATCH v4 2/7] KVM/arm64: expose FEAT_LSUI to guest Yeoreum Yun
2025-07-21  8:36 ` [PATCH v4 3/7] arm64/Kconfig: add LSUI Kconfig Yeoreum Yun
2025-07-21 10:52   ` Mark Rutland
2025-07-22  8:17     ` Yeoreum Yun
2025-07-21  8:36 ` [PATCH v4 4/7] arm64/futex: move futex atomic logic with clearing PAN bit Yeoreum Yun
2025-07-21 10:56   ` Mark Rutland
2025-07-22  8:21     ` Yeoreum Yun
2025-07-21  8:36 ` [PATCH v4 5/7] arm64/futex: add futex atomic operation with FEAT_LSUI Yeoreum Yun
2025-07-21 11:03   ` Mark Rutland
2025-07-22  8:34     ` Yeoreum Yun
2025-07-21  8:36 ` [PATCH v4 6/7] arm64/asm: introduce lsui.h Yeoreum Yun
2025-07-21  8:36 ` [PATCH v4 7/7] arm64/futex: support futex with FEAT_LSUI Yeoreum Yun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).