* [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests
@ 2025-11-07 7:21 Zhou Wang
2025-11-07 7:21 ` [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots Zhou Wang
` (8 more replies)
0 siblings, 9 replies; 25+ messages in thread
From: Zhou Wang @ 2025-11-07 7:21 UTC (permalink / raw)
To: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5,
wangzhou1
Armv8.7 introduces single-copy atomic 64-byte load and store
instructions and their variants under FEAT_{LS64, LS64_V}.
Add support for Armv8.7 FEAT_{LS64, LS64_V}:
- Add identification and enablement in the cpufeature list
- Expose the support of these features to userspace through HWCAP3 and cpuinfo
- Add a related hwcap test
- Handle traps on accesses to unsupported (normal/uncacheable) memory in a VM
A real-world scenario for this feature is that a userspace driver can use it
to implement direct WQE (workqueue entry) - a mechanism to write WQEs
directly to the hardware.
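For illustration only, a minimal userspace sketch of this direct WQE usage
could look as below. The device path, mmap() layout and doorbell semantics
are invented for the example, error handling is omitted, and the 0xf83f9100
encoding is ST64B x0, [x8] as used by the hwcap selftest in patch 7:

#include <stdint.h>
#include <fcntl.h>
#include <linux/auxvec.h>
#include <sys/auxv.h>
#include <sys/mman.h>

#ifndef HWCAP3_LS64
#define HWCAP3_LS64	(1UL << 3)	/* from asm/hwcap.h in this series */
#endif

static void st64b(void *dst, const uint64_t wqe[8])
{
	/* x0..x7 carry the 64-byte payload, x8 the target address */
	register uint64_t d0 asm("x0") = wqe[0];
	register uint64_t d1 asm("x1") = wqe[1];
	register uint64_t d2 asm("x2") = wqe[2];
	register uint64_t d3 asm("x3") = wqe[3];
	register uint64_t d4 asm("x4") = wqe[4];
	register uint64_t d5 asm("x5") = wqe[5];
	register uint64_t d6 asm("x6") = wqe[6];
	register uint64_t d7 asm("x7") = wqe[7];
	register void *addr asm("x8") = dst;

	/* ST64B x0, [x8]: single-copy atomic 64-byte store */
	asm volatile(".inst 0xf83f9100"
		     : : "r" (d0), "r" (d1), "r" (d2), "r" (d3),
			 "r" (d4), "r" (d5), "r" (d6), "r" (d7), "r" (addr)
		     : "memory");
}

int main(void)
{
	uint64_t wqe[8] = { 0 };

	if (!(getauxval(AT_HWCAP3) & HWCAP3_LS64))
		return 1;	/* fall back to a non-atomic submission path */

	/* hypothetical device exposing a 64-byte doorbell region via mmap() */
	int fd = open("/dev/example_dwqe", O_RDWR);
	void *db = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	st64b(db, wqe);
	return 0;
}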
Picked Marc's two patches from [1] for handling the LS64 trap in a VM on emulated
MMIO and the introduction of KVM_EXIT_ARM_LDST64B.
[1] https://lore.kernel.org/linux-arm-kernel/20240815125959.2097734-1-maz@kernel.org/
Tested with the updated hwcap test (host in VHE):
[root@localhost tmp]# ./hwcap
[...]
# LS64 present
ok 226 cpuinfo_match_LS64
ok 227 sigill_LS64
ok 228 # SKIP sigbus_LS64
# LS64_V present
ok 229 cpuinfo_match_LS64_V
ok 230 sigill_LS64_V
ok 231 # SKIP sigbus_LS64_V
# 121 skipped test(s) detected. Consider enabling relevant config options to improve coverage.
# Totals: pass:110 fail:0 xfail:0 xpass:0 skip:121 error:0
Tested with the updated hwcap test (guest in VHE):
root@localhost:/mnt# ./hwcap
[...]
# LS64 present
ok 226 cpuinfo_match_LS64
ok 227 sigill_LS64
ok 228 # SKIP sigbus_LS64
# LS64_V present
ok 229 cpuinfo_match_LS64_V
ok 230 sigill_LS64_V
ok 231 # SKIP sigbus_LS64_V
# 121 skipped test(s) detected. Consider enabling relevant config options to improve coverage.
# Totals: pass:110 fail:0 xfail:0 xpass:0 skip:121 error:0
Tested the nVHE case as well; same log as above.
Changes since v6:
- Add exception injection for nested VMs in 3/7
- Remove the __maybe_unused related code in 7/7 and use asm clobbers instead
- Add my Signed-off-by to each patch
- Rebase on v6.18-rc4
Link: https://lore.kernel.org/all/20251024090819.4097819-1-wangzhou1@hisilicon.com/
Changes since v5:
- Rebase on v6.18-rc2 and fix the conflicts
- Add more description in elf_hwcaps.rst as suggested by Catalin Marinas
Link: https://lore.kernel.org/all/20250818064806.25417-1-yangyicong@huawei.com/
Changes since v4:
- Rebase on v6.17-rc2 and fix the conflicts
Link: https://lore.kernel.org/linux-arm-kernel/20250715081356.12442-1-yangyicong@huawei.com/
Changes since v3:
- Inject a DABT for an LS64 fault on unsupported memory with a valid memslot
Link: https://lore.kernel.org/linux-arm-kernel/20250626080906.64230-1-yangyicong@huawei.com/
Changes since v2:
- Forward the LS64 fault to userspace and allow userspace to inject an LS64 fault
- Reorder the patches so that the KVM handling comes before the feature support
Link: https://lore.kernel.org/linux-arm-kernel/20250331094320.35226-1-yangyicong@huawei.com/
Changes since v1:
- Drop the support for LS64_ACCDATA
- Handle the DABT on unsupported memory types after checking the memory attributes
Link: https://lore.kernel.org/linux-arm-kernel/20241202135504.14252-1-yangyicong@huawei.com/
Marc Zyngier (2):
KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B
Yicong Yang (5):
KVM: arm64: Handle DABT caused by LS64* instructions on unsupported
memory
arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
arm64: Add support for FEAT_{LS64, LS64_V}
KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V}
Documentation/arch/arm64/booting.rst | 12 ++++
Documentation/arch/arm64/elf_hwcaps.rst | 14 ++++
Documentation/virt/kvm/api.rst | 43 ++++++++++--
arch/arm64/include/asm/el2_setup.h | 12 +++-
arch/arm64/include/asm/esr.h | 8 +++
arch/arm64/include/asm/hwcap.h | 2 +
arch/arm64/include/asm/kvm_emulate.h | 7 ++
arch/arm64/include/uapi/asm/hwcap.h | 2 +
arch/arm64/kernel/cpufeature.c | 51 +++++++++++++++
arch/arm64/kernel/cpuinfo.c | 2 +
arch/arm64/kvm/inject_fault.c | 34 ++++++++++
arch/arm64/kvm/mmio.c | 27 +++++++-
arch/arm64/kvm/mmu.c | 14 +++-
arch/arm64/tools/cpucaps | 2 +
include/uapi/linux/kvm.h | 3 +-
tools/testing/selftests/arm64/abi/hwcap.c | 79 +++++++++++++++++++++++
16 files changed, 301 insertions(+), 11 deletions(-)
--
2.33.0
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
@ 2025-11-07 7:21 ` Zhou Wang
2025-11-07 11:48 ` Suzuki K Poulose
2025-11-07 7:21 ` [PATCH v7 2/7] KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B Zhou Wang
` (7 subsequent siblings)
8 siblings, 1 reply; 25+ messages in thread
From: Zhou Wang @ 2025-11-07 7:21 UTC (permalink / raw)
To: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5,
wangzhou1
From: Marc Zyngier <maz@kernel.org>
The main use of {LD,ST}64B* is to talk to a device, which is hopefully
directly assigned to the guest and requires no additional handling.
However, this does not preclude a VMM from exposing a virtual device
to the guest, and allowing 64-byte accesses as part of the programming
interface. A direct consequence of this is that we need to be able
to forward such access to userspace.
Given that such a contraption is very unlikely to ever exist, we choose
to offer a limited service: userspace gets (as part of a new exit reason)
the ESR, the IPA, and that's it. It is fully expected to handle the full
semantics of the instructions, deal with ACCDATA, the return values and
increment PC. Much fun.
A canonical implementation can also simply inject an abort and be done
with it. Frankly, don't try to do anything else unless you have time
to waste.
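
As a rough illustration (not part of this patch), the "inject an abort"
option could look like the sketch below in a VMM, relying on
KVM_CAP_ARM_INJECT_EXT_DABT as documented in the following patch; names and
error handling are simplified:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* minimal sketch: treat any KVM_EXIT_ARM_LDST64B as an external abort */
static int handle_ldst64b(int vcpu_fd, struct kvm_run *run)
{
	struct kvm_vcpu_events events;

	/*
	 * run->arm_nisv.esr_iss and run->arm_nisv.fault_ipa describe the
	 * access, should the VMM want to emulate it instead of aborting.
	 */
	(void)run;

	memset(&events, 0, sizeof(events));
	events.exception.ext_dabt_pending = 1;

	/* requires KVM_CAP_ARM_INJECT_EXT_DABT */
	return ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &events);
}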
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
arch/arm64/kvm/mmio.c | 27 ++++++++++++++++++++++++++-
include/uapi/linux/kvm.h | 3 ++-
2 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index 54f9358c9e0e..2a6261abb647 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -159,6 +159,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
bool is_write;
int len;
u8 data_buf[8];
+ u64 esr;
+
+ esr = kvm_vcpu_get_esr(vcpu);
/*
* No valid syndrome? Ask userspace for help if it has
@@ -168,7 +171,7 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
* though, so directly deliver an exception to the guest.
*/
if (!kvm_vcpu_dabt_isvalid(vcpu)) {
- trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
+ trace_kvm_mmio_nisv(*vcpu_pc(vcpu), esr,
kvm_vcpu_get_hfar(vcpu), fault_ipa);
if (vcpu_is_protected(vcpu))
@@ -185,6 +188,28 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
return -ENOSYS;
}
+ /*
+ * When (DFSC == 0b00xxxx || DFSC == 0b10101x) && DFSC != 0b0000xx
+ * ESR_EL2[12:11] describe the Load/Store Type. This allows us to
+ * punt the LD64B/ST64B/ST64BV/ST64BV0 instructions to luserspace,
+ * which will have to provide a full emulation of these 4
+ * instructions. No, we don't expect this to be fast.
+ *
+ * We rely on traps being set if the corresponding features are not
+ * enabled, so if we get here, userspace has promised us to handle
+ * it already.
+ */
+ switch (kvm_vcpu_trap_get_fault(vcpu)) {
+ case 0b000100 ... 0b001111:
+ case 0b101010 ... 0b101011:
+ if (FIELD_GET(GENMASK(12, 11), esr)) {
+ run->exit_reason = KVM_EXIT_ARM_LDST64B;
+ run->arm_nisv.esr_iss = esr & ~(u64)ESR_ELx_FSC;
+ run->arm_nisv.fault_ipa = fault_ipa;
+ return 0;
+ }
+ }
+
/*
* Prepare MMIO operation. First decode the syndrome data we get
* from the CPU. Then try if some in-kernel emulation feels
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 52f6000ab020..d219946b96be 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -179,6 +179,7 @@ struct kvm_xen_exit {
#define KVM_EXIT_LOONGARCH_IOCSR 38
#define KVM_EXIT_MEMORY_FAULT 39
#define KVM_EXIT_TDX 40
+#define KVM_EXIT_ARM_LDST64B 41
/* For KVM_EXIT_INTERNAL_ERROR */
/* Emulate instruction failed. */
@@ -401,7 +402,7 @@ struct kvm_run {
} eoi;
/* KVM_EXIT_HYPERV */
struct kvm_hyperv_exit hyperv;
- /* KVM_EXIT_ARM_NISV */
+ /* KVM_EXIT_ARM_NISV / KVM_EXIT_ARM_LDST64B */
struct {
__u64 esr_iss;
__u64 fault_ipa;
--
2.33.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 2/7] KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
2025-11-07 7:21 ` [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots Zhou Wang
@ 2025-11-07 7:21 ` Zhou Wang
2025-11-07 7:21 ` [PATCH v7 3/7] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory Zhou Wang
` (6 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Zhou Wang @ 2025-11-07 7:21 UTC (permalink / raw)
To: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5,
wangzhou1
From: Marc Zyngier <maz@kernel.org>
Add a bit of documentation for KVM_EXIT_ARM_LDST64B so that userspace
knows what to expect.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
Documentation/virt/kvm/api.rst | 43 ++++++++++++++++++++++++++++------
1 file changed, 36 insertions(+), 7 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 57061fa29e6a..b4af2a04b51a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1303,12 +1303,13 @@ userspace, for example because of missing instruction syndrome decode
information or because there is no device mapped at the accessed IPA, then
userspace can ask the kernel to inject an external abort using the address
from the exiting fault on the VCPU. It is a programming error to set
-ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO or
-KVM_EXIT_ARM_NISV. This feature is only available if the system supports
-KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in
-how userspace reports accesses for the above cases to guests, across different
-userspace implementations. Nevertheless, userspace can still emulate all Arm
-exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.
+ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO,
+KVM_EXIT_ARM_NISV, or KVM_EXIT_ARM_LDST64B. This feature is only available if
+the system supports KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which
+provides commonality in how userspace reports accesses for the above cases to
+guests, across different userspace implementations. Nevertheless, userspace
+can still emulate all Arm exceptions by manipulating individual registers
+using the KVM_SET_ONE_REG API.
See KVM_GET_VCPU_EVENTS for the data structure.
@@ -7050,12 +7051,14 @@ in send_page or recv a buffer to recv_page).
::
- /* KVM_EXIT_ARM_NISV */
+ /* KVM_EXIT_ARM_NISV / KVM_EXIT_ARM_LDST64B */
struct {
__u64 esr_iss;
__u64 fault_ipa;
} arm_nisv;
+- KVM_EXIT_ARM_NISV:
+
Used on arm64 systems. If a guest accesses memory not in a memslot,
KVM will typically return to userspace and ask it to do MMIO emulation on its
behalf. However, for certain classes of instructions, no instruction decode
@@ -7089,6 +7092,32 @@ Note that although KVM_CAP_ARM_NISV_TO_USER will be reported if
queried outside of a protected VM context, the feature will not be
exposed if queried on a protected VM file descriptor.
+- KVM_EXIT_ARM_LDST64B:
+
+Used on arm64 systems. When a guest uses one of the LD64B, ST64B, ST64BV or
+ST64BV0 instructions outside of a memslot, KVM will return to userspace with
+KVM_EXIT_ARM_LDST64B, exposing the relevant ESR_EL2 information and the
+faulting IPA, similarly to KVM_EXIT_ARM_NISV.
+
+Userspace is supposed to fully emulate the instructions, which includes:
+
+ - fetch the operands for a store, including ACCDATA_EL1 in the case
+ of a ST64BV0 instruction
+ - deal with the endianness if the guest is big-endian
+ - emulate the access, including the delivery of an exception if the
+ access didn't succeed
+ - provide a return value in the case of ST64BV/ST64BV0
+ - return the data in the case of a load
+ - increment PC if the instruction was successfully executed
+
+Note that there is no expectation of performance for this emulation, as it
+involves a large number of interactions with the guest state. It is, however,
+expected that the instruction's semantics are preserved, especially the
+single-copy atomicity property of the 64-byte access.
+
+This exit reason must be handled if userspace sets ID_AA64ISAR1_EL1.LS64 to a
+non-zero value, indicating that FEAT_LS64* is enabled.
+
::
/* KVM_EXIT_X86_RDMSR / KVM_EXIT_X86_WRMSR */
--
2.33.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 3/7] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
2025-11-07 7:21 ` [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots Zhou Wang
2025-11-07 7:21 ` [PATCH v7 2/7] KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B Zhou Wang
@ 2025-11-07 7:21 ` Zhou Wang
2025-11-07 7:21 ` [PATCH v7 4/7] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1 Zhou Wang
` (5 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Zhou Wang @ 2025-11-07 7:21 UTC (permalink / raw)
To: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5,
wangzhou1
From: Yicong Yang <yangyicong@hisilicon.com>
If FEAT_LS64WB is not supported, the FEAT_LS64* instructions only
support accesses to Device/Uncacheable memory; otherwise a data abort
for unsupported Exclusive or atomic access (0x35, UAoEF) is generated
per the spec. The target exception level is IMPLEMENTATION DEFINED,
and the abort may be routed to EL2 for a VHE VM, according to
DDI0487L.b Section C3.2.6 "Single-copy atomic 64-byte load/store".
If the implementation generates the DABT at the final enabled stage
(stage-2), inject the UAoEF back to the guest after checking that the
memslot is valid.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
arch/arm64/include/asm/esr.h | 8 +++++++
arch/arm64/include/asm/kvm_emulate.h | 1 +
arch/arm64/kvm/inject_fault.c | 34 ++++++++++++++++++++++++++++
arch/arm64/kvm/mmu.c | 14 +++++++++++-
4 files changed, 56 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index e1deed824464..63cd17f830da 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -124,6 +124,7 @@
#define ESR_ELx_FSC_SEA_TTW(n) (0x14 + (n))
#define ESR_ELx_FSC_SECC (0x18)
#define ESR_ELx_FSC_SECC_TTW(n) (0x1c + (n))
+#define ESR_ELx_FSC_EXCL_ATOMIC (0x35)
#define ESR_ELx_FSC_ADDRSZ (0x00)
/*
@@ -488,6 +489,13 @@ static inline bool esr_fsc_is_access_flag_fault(unsigned long esr)
(esr == ESR_ELx_FSC_ACCESS_L(0));
}
+static inline bool esr_fsc_is_excl_atomic_fault(unsigned long esr)
+{
+ esr = esr & ESR_ELx_FSC;
+
+ return esr == ESR_ELx_FSC_EXCL_ATOMIC;
+}
+
static inline bool esr_fsc_is_addr_sz_fault(unsigned long esr)
{
esr &= ESR_ELx_FSC;
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index c9eab316398e..bab967d65715 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -47,6 +47,7 @@ void kvm_skip_instr32(struct kvm_vcpu *vcpu);
void kvm_inject_undefined(struct kvm_vcpu *vcpu);
int kvm_inject_serror_esr(struct kvm_vcpu *vcpu, u64 esr);
int kvm_inject_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr);
+int kvm_inject_dabt_excl_atomic(struct kvm_vcpu *vcpu, u64 addr);
void kvm_inject_size_fault(struct kvm_vcpu *vcpu);
static inline int kvm_inject_sea_dabt(struct kvm_vcpu *vcpu, u64 addr)
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index dfcd66c65517..6cc7ad84d7d8 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -253,6 +253,40 @@ int kvm_inject_sea(struct kvm_vcpu *vcpu, bool iabt, u64 addr)
return 1;
}
+static int kvm_inject_nested_excl_atomic(struct kvm_vcpu *vcpu, u64 addr)
+{
+ u64 esr = FIELD_PREP(ESR_ELx_EC_MASK, ESR_ELx_EC_DABT_LOW) |
+ FIELD_PREP(ESR_ELx_FSC, ESR_ELx_FSC_EXCL_ATOMIC) |
+ ESR_ELx_IL;
+
+ vcpu_write_sys_reg(vcpu, addr, FAR_EL2);
+ return kvm_inject_nested_sync(vcpu, esr);
+}
+
+/**
+ * kvm_inject_dabt_excl_atomic - inject a data abort for unsupported exclusive
+ * or atomic access
+ * @vcpu: The VCPU to receive the data abort
+ * @addr: The address to report in the DFAR
+ *
+ * It is assumed that this code is called from the VCPU thread and that the
+ * VCPU therefore is not currently executing guest code.
+ */
+int kvm_inject_dabt_excl_atomic(struct kvm_vcpu *vcpu, u64 addr)
+{
+ u64 esr;
+
+ if (is_nested_ctxt(vcpu) && (vcpu_read_sys_reg(vcpu, HCR_EL2) & HCR_VM))
+ return kvm_inject_nested_excl_atomic(vcpu, addr);
+
+ __kvm_inject_sea(vcpu, false, addr);
+ esr = vcpu_read_sys_reg(vcpu, exception_esr_elx(vcpu));
+ esr &= ~ESR_ELx_FSC;
+ esr |= ESR_ELx_FSC_EXCL_ATOMIC;
+ vcpu_write_sys_reg(vcpu, esr, exception_esr_elx(vcpu));
+ return 1;
+}
+
void kvm_inject_size_fault(struct kvm_vcpu *vcpu)
{
unsigned long addr, esr;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7cc964af8d30..06cec9070ea6 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1802,6 +1802,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
return ret;
}
+ /*
+ * The guest performed an atomic/exclusive operation on memory with
+ * unsupported attributes (e.g. ld64b/st64b on normal memory when no
+ * FEAT_LS64WB) and triggered this exception. Since the memslot is
+ * valid, inject the fault back to the guest.
+ */
+ if (esr_fsc_is_excl_atomic_fault(kvm_vcpu_get_esr(vcpu))) {
+ kvm_inject_dabt_excl_atomic(vcpu, kvm_vcpu_get_hfar(vcpu));
+ return 1;
+ }
+
if (nested)
adjust_nested_fault_perms(nested, &prot, &writable);
@@ -1971,7 +1982,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
/* Check the stage-2 fault is trans. fault or write fault */
if (!esr_fsc_is_translation_fault(esr) &&
!esr_fsc_is_permission_fault(esr) &&
- !esr_fsc_is_access_flag_fault(esr)) {
+ !esr_fsc_is_access_flag_fault(esr) &&
+ !esr_fsc_is_excl_atomic_fault(esr)) {
kvm_err("Unsupported FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n",
kvm_vcpu_trap_get_class(vcpu),
(unsigned long)kvm_vcpu_trap_get_fault(vcpu),
--
2.33.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 4/7] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
` (2 preceding siblings ...)
2025-11-07 7:21 ` [PATCH v7 3/7] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory Zhou Wang
@ 2025-11-07 7:21 ` Zhou Wang
2025-11-07 7:21 ` [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V} Zhou Wang
` (4 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Zhou Wang @ 2025-11-07 7:21 UTC (permalink / raw)
To: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5,
wangzhou1
From: Yicong Yang <yangyicong@hisilicon.com>
The instructions introduced by FEAT_{LS64, LS64_V} are controlled by
HCRX_EL2.{EnALS, EnASR}. Configure all of these bits to allow usage
at EL0/1.
This doesn't mean these instructions are always available at
EL0/1 when implemented: the hypervisor still has control at
runtime.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
arch/arm64/include/asm/el2_setup.h | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index 99a7c0235e6d..9dbd0da8eee8 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -83,9 +83,19 @@
/* Enable GCS if supported */
mrs_s x1, SYS_ID_AA64PFR1_EL1
ubfx x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4
- cbz x1, .Lset_hcrx_\@
+ cbz x1, .Lskip_gcs_hcrx_\@
orr x0, x0, #HCRX_EL2_GCSEn
+.Lskip_gcs_hcrx_\@:
+ /* Enable LS64, LS64_V if supported */
+ mrs_s x1, SYS_ID_AA64ISAR1_EL1
+ ubfx x1, x1, #ID_AA64ISAR1_EL1_LS64_SHIFT, #4
+ cbz x1, .Lset_hcrx_\@
+ orr x0, x0, #HCRX_EL2_EnALS
+ cmp x1, #ID_AA64ISAR1_EL1_LS64_LS64_V
+ b.lt .Lset_hcrx_\@
+ orr x0, x0, #HCRX_EL2_EnASR
+
.Lset_hcrx_\@:
msr_s SYS_HCRX_EL2, x0
.Lskip_hcrx_\@:
--
2.33.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
` (3 preceding siblings ...)
2025-11-07 7:21 ` [PATCH v7 4/7] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1 Zhou Wang
@ 2025-11-07 7:21 ` Zhou Wang
2025-11-07 12:05 ` Suzuki K Poulose
2025-11-11 11:15 ` Marc Zyngier
2025-11-07 7:21 ` [PATCH v7 6/7] KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest Zhou Wang
` (3 subsequent siblings)
8 siblings, 2 replies; 25+ messages in thread
From: Zhou Wang @ 2025-11-07 7:21 UTC (permalink / raw)
To: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5,
wangzhou1
From: Yicong Yang <yangyicong@hisilicon.com>
Armv8.7 introduces single-copy atomic 64-byte load and store
instructions and their variants under FEAT_{LS64, LS64_V}.
These features are identified by ID_AA64ISAR1_EL1.LS64 and the
use of such instructions in userspace (EL0) can be trapped. In
order to support the use of the corresponding instructions in userspace:
- Make ID_AA64ISAR1_EL1.LS64 visible to userspace
- Add identification and enablement in the cpufeature list
- Expose the support of these features to userspace through HWCAP3
and cpuinfo
ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) are intended for
special memory (device memory) and so require support from the CPU,
the system and the target memory location (a device that supports
these instructions). HWCAP3_{LS64, LS64_V} implies support in the CPU
and the system (there is no identification method for the system, so
SoC vendors should only advertise support in the CPU if the system
also supports them).
Otherwise, for ld64b/st64b the atomicity may not be guaranteed or a
DABT will be generated, so users (probably userspace driver developers)
should make sure the target memory (device) also has the support.
For st64bv, 0xffffffffffffffff will be returned as the status result
for unsupported memory, so users should check it.
Document the restrictions along with HWCAP3_{LS64, LS64_V}.
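
For illustration only (not part of this patch), a userspace check of the
st64bv status result could look like the sketch below; the 0xf829b100
encoding is ST64BV x9, x0, [x8], matching the hwcap selftest in this
series, and dst is assumed to point at suitably aligned memory that the
device accepts:

#include <stdint.h>

/* sketch: returns 0 on success, -1 if the completer rejected the store */
static int st64bv_write(void *dst, const uint64_t data[8])
{
	register uint64_t d0 asm("x0") = data[0];
	register uint64_t d1 asm("x1") = data[1];
	register uint64_t d2 asm("x2") = data[2];
	register uint64_t d3 asm("x3") = data[3];
	register uint64_t d4 asm("x4") = data[4];
	register uint64_t d5 asm("x5") = data[5];
	register uint64_t d6 asm("x6") = data[6];
	register uint64_t d7 asm("x7") = data[7];
	register void *addr asm("x8") = dst;
	register uint64_t status asm("x9");

	/* ST64BV x9, x0, [x8]: 64-byte store returning a status result */
	asm volatile(".inst 0xf829b100"
		     : "=r" (status)
		     : "r" (d0), "r" (d1), "r" (d2), "r" (d3),
		       "r" (d4), "r" (d5), "r" (d6), "r" (d7), "r" (addr)
		     : "memory");

	/* all-ones means the target memory did not accept the store */
	return status == ~0ULL ? -1 : 0;
}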
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
Documentation/arch/arm64/booting.rst | 12 ++++++
Documentation/arch/arm64/elf_hwcaps.rst | 14 +++++++
arch/arm64/include/asm/hwcap.h | 2 +
arch/arm64/include/uapi/asm/hwcap.h | 2 +
arch/arm64/kernel/cpufeature.c | 51 +++++++++++++++++++++++++
arch/arm64/kernel/cpuinfo.c | 2 +
arch/arm64/tools/cpucaps | 2 +
7 files changed, 85 insertions(+)
diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
index e4f953839f71..2c56d76ecafb 100644
--- a/Documentation/arch/arm64/booting.rst
+++ b/Documentation/arch/arm64/booting.rst
@@ -556,6 +556,18 @@ Before jumping into the kernel, the following conditions must be met:
- MDCR_EL3.TPM (bit 6) must be initialized to 0b0
+ For CPUs with support for 64-byte loads and stores without status (FEAT_LS64):
+
+ - If the kernel is entered at EL1 and EL2 is present:
+
+ - HCRX_EL2.EnALS (bit 1) must be initialised to 0b1.
+
+ For CPUs with support for 64-byte stores with status (FEAT_LS64_V):
+
+ - If the kernel is entered at EL1 and EL2 is present:
+
+ - HCRX_EL2.EnASR (bit 2) must be initialised to 0b1.
+
The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
enter the kernel in the same exception level. Where the values documented
diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
index a15df4956849..b86059bc288b 100644
--- a/Documentation/arch/arm64/elf_hwcaps.rst
+++ b/Documentation/arch/arm64/elf_hwcaps.rst
@@ -444,6 +444,20 @@ HWCAP3_MTE_STORE_ONLY
HWCAP3_LSFE
Functionality implied by ID_AA64ISAR3_EL1.LSFE == 0b0001
+HWCAP3_LS64
+ Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0001. Note that
+ ld64b/st64b require support from the CPU, the system and the target
+ (device) memory location; HWCAP3_LS64 only implies support in the CPU.
+ Users should only use ld64b/st64b on supported target (device) memory
+ locations, and otherwise fall back to the non-atomic alternatives.
+
+HWCAP3_LS64_V
+ Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0010. As with
+ HWCAP3_LS64, HWCAP3_LS64_V implies the CPU's support for the st64bv
+ instruction, but it also requires support from the system and the
+ target (device) memory location. st64bv returns a status result;
+ 0xFFFFFFFFFFFFFFFF is returned for an unsupported memory location.
+
4. Unused AT_HWCAP bits
-----------------------
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 6d567265467c..3c0804fb3435 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -179,6 +179,8 @@
#define KERNEL_HWCAP_MTE_FAR __khwcap3_feature(MTE_FAR)
#define KERNEL_HWCAP_MTE_STORE_ONLY __khwcap3_feature(MTE_STORE_ONLY)
#define KERNEL_HWCAP_LSFE __khwcap3_feature(LSFE)
+#define KERNEL_HWCAP_LS64 __khwcap3_feature(LS64)
+#define KERNEL_HWCAP_LS64_V __khwcap3_feature(LS64_V)
/*
* This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 575564ecdb0b..79bc77425b82 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -146,5 +146,7 @@
#define HWCAP3_MTE_FAR (1UL << 0)
#define HWCAP3_MTE_STORE_ONLY (1UL << 1)
#define HWCAP3_LSFE (1UL << 2)
+#define HWCAP3_LS64 (1UL << 3)
+#define HWCAP3_LS64_V (1UL << 4)
#endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 5ed401ff79e3..dcc5ba620a7e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -239,6 +239,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
};
static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_LS64_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_XS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_I8MM_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_DGH_SHIFT, 4, 0),
@@ -2259,6 +2260,38 @@ static void cpu_enable_e0pd(struct arm64_cpu_capabilities const *cap)
}
#endif /* CONFIG_ARM64_E0PD */
+static bool has_ls64(const struct arm64_cpu_capabilities *entry, int __unused)
+{
+ u64 ls64;
+
+ ls64 = cpuid_feature_extract_field(__read_sysreg_by_encoding(entry->sys_reg),
+ entry->field_pos, entry->sign);
+
+ if (ls64 == ID_AA64ISAR1_EL1_LS64_NI ||
+ ls64 > ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA)
+ return false;
+
+ if (entry->capability == ARM64_HAS_LS64 &&
+ ls64 >= ID_AA64ISAR1_EL1_LS64_LS64)
+ return true;
+
+ if (entry->capability == ARM64_HAS_LS64_V &&
+ ls64 >= ID_AA64ISAR1_EL1_LS64_LS64_V)
+ return true;
+
+ return false;
+}
+
+static void cpu_enable_ls64(struct arm64_cpu_capabilities const *cap)
+{
+ sysreg_clear_set(sctlr_el1, SCTLR_EL1_EnALS, SCTLR_EL1_EnALS);
+}
+
+static void cpu_enable_ls64_v(struct arm64_cpu_capabilities const *cap)
+{
+ sysreg_clear_set(sctlr_el1, SCTLR_EL1_EnASR, SCTLR_EL1_EnASR);
+}
+
#ifdef CONFIG_ARM64_PSEUDO_NMI
static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
int scope)
@@ -3088,6 +3121,22 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.capability = ARM64_HAS_GICV5_LEGACY,
.matches = test_has_gicv5_legacy,
},
+ {
+ .desc = "LS64",
+ .capability = ARM64_HAS_LS64,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = has_ls64,
+ .cpu_enable = cpu_enable_ls64,
+ ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64)
+ },
+ {
+ .desc = "LS64_V",
+ .capability = ARM64_HAS_LS64_V,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = has_ls64,
+ .cpu_enable = cpu_enable_ls64_v,
+ ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64_V)
+ },
{},
};
@@ -3207,6 +3256,8 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(ID_AA64ISAR1_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_EBF16),
HWCAP_CAP(ID_AA64ISAR1_EL1, DGH, IMP, CAP_HWCAP, KERNEL_HWCAP_DGH),
HWCAP_CAP(ID_AA64ISAR1_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_I8MM),
+ HWCAP_CAP(ID_AA64ISAR1_EL1, LS64, LS64, CAP_HWCAP, KERNEL_HWCAP_LS64),
+ HWCAP_CAP(ID_AA64ISAR1_EL1, LS64, LS64_V, CAP_HWCAP, KERNEL_HWCAP_LS64_V),
HWCAP_CAP(ID_AA64ISAR2_EL1, LUT, IMP, CAP_HWCAP, KERNEL_HWCAP_LUT),
HWCAP_CAP(ID_AA64ISAR3_EL1, FAMINMAX, IMP, CAP_HWCAP, KERNEL_HWCAP_FAMINMAX),
HWCAP_CAP(ID_AA64ISAR3_EL1, LSFE, IMP, CAP_HWCAP, KERNEL_HWCAP_LSFE),
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index c44e6d94f5de..e8ae0a1885e0 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -81,6 +81,8 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_PACA] = "paca",
[KERNEL_HWCAP_PACG] = "pacg",
[KERNEL_HWCAP_GCS] = "gcs",
+ [KERNEL_HWCAP_LS64] = "ls64",
+ [KERNEL_HWCAP_LS64_V] = "ls64_v",
[KERNEL_HWCAP_DCPODP] = "dcpodp",
[KERNEL_HWCAP_SVE2] = "sve2",
[KERNEL_HWCAP_SVEAES] = "sveaes",
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 1b32c1232d28..9aae80ac2280 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -45,6 +45,8 @@ HAS_HCX
HAS_LDAPR
HAS_LPA2
HAS_LSE_ATOMICS
+HAS_LS64
+HAS_LS64_V
HAS_MOPS
HAS_NESTED_VIRT
HAS_BBML2_NOABORT
--
2.33.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 6/7] KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
` (4 preceding siblings ...)
2025-11-07 7:21 ` [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V} Zhou Wang
@ 2025-11-07 7:21 ` Zhou Wang
2025-11-07 18:53 ` Oliver Upton
2025-11-07 7:21 ` [PATCH v7 7/7] kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V} Zhou Wang
` (2 subsequent siblings)
8 siblings, 1 reply; 25+ messages in thread
From: Zhou Wang @ 2025-11-07 7:21 UTC (permalink / raw)
To: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5,
wangzhou1
From: Yicong Yang <yangyicong@hisilicon.com>
Using the FEAT_{LS64, LS64_V} instructions in a guest is also controlled
by HCRX_EL2.{EnALS, EnASR}. Enable them if the guest has the related features.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
arch/arm64/include/asm/kvm_emulate.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index bab967d65715..29291e25ecfd 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -695,6 +695,12 @@ static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu)
if (kvm_has_sctlr2(kvm))
vcpu->arch.hcrx_el2 |= HCRX_EL2_SCTLR2En;
+
+ if (kvm_has_feat(kvm, ID_AA64ISAR1_EL1, LS64, LS64))
+ vcpu->arch.hcrx_el2 |= HCRX_EL2_EnALS;
+
+ if (kvm_has_feat(kvm, ID_AA64ISAR1_EL1, LS64, LS64_V))
+ vcpu->arch.hcrx_el2 |= HCRX_EL2_EnASR;
}
}
#endif /* __ARM64_KVM_EMULATE_H__ */
--
2.33.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v7 7/7] kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V}
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
` (5 preceding siblings ...)
2025-11-07 7:21 ` [PATCH v7 6/7] KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest Zhou Wang
@ 2025-11-07 7:21 ` Zhou Wang
2025-11-07 9:21 ` Arnd Bergmann
2025-11-07 9:23 ` [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Arnd Bergmann
2025-11-07 18:57 ` Oliver Upton
8 siblings, 1 reply; 25+ messages in thread
From: Zhou Wang @ 2025-11-07 7:21 UTC (permalink / raw)
To: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5,
wangzhou1
From: Yicong Yang <yangyicong@hisilicon.com>
Add tests for FEAT_{LS64, LS64_V}. Issue the related instructions if
the feature is present; no SIGILL should be received. When such
instructions operate on Device memory or non-cacheable memory,
we may receive a SIGBUS during the test (without FEAT_LS64WB).
Just ignore it, since we only test whether the instruction
itself can be issued as expected on platforms declaring
support for these features.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
---
tools/testing/selftests/arm64/abi/hwcap.c | 79 +++++++++++++++++++++++
1 file changed, 79 insertions(+)
diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c
index 3b96d090c5eb..4d41ab10988f 100644
--- a/tools/testing/selftests/arm64/abi/hwcap.c
+++ b/tools/testing/selftests/arm64/abi/hwcap.c
@@ -11,6 +11,8 @@
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
+#include <linux/auxvec.h>
+#include <linux/compiler.h>
#include <sys/auxv.h>
#include <sys/prctl.h>
#include <asm/hwcap.h>
@@ -595,6 +597,67 @@ static void lrcpc3_sigill(void)
: "=r" (data0), "=r" (data1) : "r" (src) :);
}
+static void ignore_signal(int sig, siginfo_t *info, void *context)
+{
+ ucontext_t *uc = context;
+
+ uc->uc_mcontext.pc += 4;
+}
+
+static void ls64_sigill(void)
+{
+ struct sigaction ign, old;
+ char src[64] __aligned(64) = { 1 };
+
+ /*
+ * LS64 and LS64_V require the target memory to be Device/Non-cacheable
+ * (if FEAT_LS64WB is not supported) and the completer to support these
+ * instructions; otherwise we'll receive a SIGBUS. Since we are only
+ * testing the ABI here, just ignore the SIGBUS and check that we can
+ * execute the instructions without receiving a SIGILL. Restore the
+ * SIGBUS handler after this test.
+ */
+ ign.sa_sigaction = ignore_signal;
+ ign.sa_flags = SA_SIGINFO | SA_RESTART;
+ sigemptyset(&ign.sa_mask);
+ sigaction(SIGBUS, &ign, &old);
+
+ register void *xn asm ("x8") = src;
+ register u64 xt_1 asm ("x0");
+
+ /* LD64B x0, [x8] */
+ asm volatile(".inst 0xf83fd100" : "=r" (xt_1) : "r" (xn)
+ : "x1", "x2", "x3", "x4", "x5", "x6", "x7");
+
+ /* ST64B x0, [x8] */
+ asm volatile(".inst 0xf83f9100" : : "r" (xt_1), "r" (xn)
+ : "x1", "x2", "x3", "x4", "x5", "x6", "x7");
+
+ sigaction(SIGBUS, &old, NULL);
+}
+
+static void ls64_v_sigill(void)
+{
+ struct sigaction ign, old;
+ char dst[64] __aligned(64);
+
+ /* See comment in ls64_sigill() */
+ ign.sa_sigaction = ignore_signal;
+ ign.sa_flags = SA_SIGINFO | SA_RESTART;
+ sigemptyset(&ign.sa_mask);
+ sigaction(SIGBUS, &ign, &old);
+
+ register void *xn asm ("x8") = dst;
+ register u64 xt_1 asm ("x0") = 1;
+ register u64 st asm ("x9");
+
+ /* ST64BV x9, x0, [x8] */
+ asm volatile(".inst 0xf829b100" : "=r" (st) : "r" (xt_1), "r" (xn)
+ : "x1", "x2", "x3", "x4", "x5", "x6", "x7");
+
+ sigaction(SIGBUS, &old, NULL);
+}
+
static const struct hwcap_data {
const char *name;
unsigned long at_hwcap;
@@ -1134,6 +1197,22 @@ static const struct hwcap_data {
.hwcap_bit = HWCAP3_MTE_STORE_ONLY,
.cpuinfo = "mtestoreonly",
},
+ {
+ .name = "LS64",
+ .at_hwcap = AT_HWCAP3,
+ .hwcap_bit = HWCAP3_LS64,
+ .cpuinfo = "ls64",
+ .sigill_fn = ls64_sigill,
+ .sigill_reliable = true,
+ },
+ {
+ .name = "LS64_V",
+ .at_hwcap = AT_HWCAP3,
+ .hwcap_bit = HWCAP3_LS64_V,
+ .cpuinfo = "ls64_v",
+ .sigill_fn = ls64_v_sigill,
+ .sigill_reliable = true,
+ },
};
typedef void (*sighandler_fn)(int, siginfo_t *, void *);
--
2.33.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH v7 7/7] kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V}
2025-11-07 7:21 ` [PATCH v7 7/7] kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V} Zhou Wang
@ 2025-11-07 9:21 ` Arnd Bergmann
0 siblings, 0 replies; 25+ messages in thread
From: Arnd Bergmann @ 2025-11-07 9:21 UTC (permalink / raw)
To: Zhou Wang, Catalin Marinas, Will Deacon, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu
Cc: linux-arm-kernel, kvmarm, Yicong Yang, prime.zeng, xuwei5
On Fri, Nov 7, 2025, at 08:21, Zhou Wang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> Add tests for FEAT_{LS64, LS64_V}. Issue the related instructions if
> the feature is present; no SIGILL should be received. When such
> instructions operate on Device memory or non-cacheable memory,
> we may receive a SIGBUS during the test (without FEAT_LS64WB).
> Just ignore it, since we only test whether the instruction
> itself can be issued as expected on platforms declaring
> support for these features.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Thanks for the changes,
Acked-by: Arnd Bergmann <arnd@arndb.de>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
` (6 preceding siblings ...)
2025-11-07 7:21 ` [PATCH v7 7/7] kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V} Zhou Wang
@ 2025-11-07 9:23 ` Arnd Bergmann
2025-11-07 18:57 ` Oliver Upton
8 siblings, 0 replies; 25+ messages in thread
From: Arnd Bergmann @ 2025-11-07 9:23 UTC (permalink / raw)
To: Zhou Wang, Catalin Marinas, Will Deacon, Marc Zyngier,
Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu
Cc: linux-arm-kernel, kvmarm, Yicong Yang, prime.zeng, xuwei5
On Fri, Nov 7, 2025, at 08:21, Zhou Wang wrote:
> Armv8.7 introduces single-copy atomic 64-byte load and store
> instructions and their variants under FEAT_{LS64, LS64_V}.
> Add support for Armv8.7 FEAT_{LS64, LS64_V}:
> - Add identification and enablement in the cpufeature list
> - Expose the support of these features to userspace through HWCAP3 and cpuinfo
> - Add a related hwcap test
> - Handle traps on accesses to unsupported (normal/uncacheable) memory in a VM
>
> A real-world scenario for this feature is that a userspace driver can use it
> to implement direct WQE (workqueue entry) - a mechanism to write WQEs
> directly to the hardware.
>
> Picked Marc's two patches from [1] for handling the LS64 trap in a VM on emulated
> MMIO and the introduction of KVM_EXIT_ARM_LDST64B.
This all looks good to me now, no further comments from my side,
and I hope we can get this into the next kernel.
Thanks a lot for taking care of this!
Acked-by: Arnd Bergmann <arnd@arndb.de>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
2025-11-07 7:21 ` [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots Zhou Wang
@ 2025-11-07 11:48 ` Suzuki K Poulose
2025-11-07 11:49 ` Suzuki K Poulose
2025-11-11 2:12 ` Zhou Wang
0 siblings, 2 replies; 25+ messages in thread
From: Suzuki K Poulose @ 2025-11-07 11:48 UTC (permalink / raw)
To: Zhou Wang, catalin.marinas, will, maz, oliver.upton, joey.gouly,
yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5
On 07/11/2025 07:21, Zhou Wang wrote:
> From: Marc Zyngier <maz@kernel.org>
>
> The main use of {LD,ST}64B* is to talk to a device, which is hopefully
> directly assigned to the guest and requires no additional handling.
>
> However, this does not preclude a VMM from exposing a virtual device
> to the guest, and allowing 64-byte accesses as part of the programming
> interface. A direct consequence of this is that we need to be able
> to forward such access to userspace.
>
> Given that such a contraption is very unlikely to ever exist, we choose
> to offer a limited service: userspace gets (as part of a new exit reason)
> the ESR, the IPA, and that's it. It is fully expected to handle the full
> semantics of the instructions, deal with ACCDATA, the return values and
> increment PC. Much fun.
>
> A canonical implementation can also simply inject an abort and be done
> with it. Frankly, don't try to do anything else unless you have time
> to waste.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
We also need to document this new EXIT reason here :
Documentation/virt/kvm/api.rst
> ---
> arch/arm64/kvm/mmio.c | 27 ++++++++++++++++++++++++++-
> include/uapi/linux/kvm.h | 3 ++-
> 2 files changed, 28 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
> index 54f9358c9e0e..2a6261abb647 100644
> --- a/arch/arm64/kvm/mmio.c
> +++ b/arch/arm64/kvm/mmio.c
> @@ -159,6 +159,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
> bool is_write;
> int len;
> u8 data_buf[8];
> + u64 esr;
> +
> + esr = kvm_vcpu_get_esr(vcpu);
>
> /*
> * No valid syndrome? Ask userspace for help if it has
> @@ -168,7 +171,7 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
> * though, so directly deliver an exception to the guest.
> */
> if (!kvm_vcpu_dabt_isvalid(vcpu)) {
> - trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
> + trace_kvm_mmio_nisv(*vcpu_pc(vcpu), esr,
> kvm_vcpu_get_hfar(vcpu), fault_ipa);
>
> if (vcpu_is_protected(vcpu))
> @@ -185,6 +188,28 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
> return -ENOSYS;
> }
>
> + /*
> + * When (DFSC == 0b00xxxx || DFSC == 0b10101x) && DFSC != 0b0000xx
> + * ESR_EL2[12:11] describe the Load/Store Type. This allows us to
> + * punt the LD64B/ST64B/ST64BV/ST64BV0 instructions to luserspace,
minor nit: typo: s/luserspace/userspace/ ^^^
> + * which will have to provide a full emulation of these 4
> + * instructions. No, we don't expect this to be fast.
> + *
> + * We rely on traps being set if the corresponding features are not
> + * enabled, so if we get here, userspace has promised us to handle
> + * it already.
> + */
> + switch (kvm_vcpu_trap_get_fault(vcpu)) {
> + case 0b000100 ... 0b001111:
> + case 0b101010 ... 0b101011:
Matches Arm ARM.
> + if (FIELD_GET(GENMASK(12, 11), esr)) {
> + run->exit_reason = KVM_EXIT_ARM_LDST64B;
> + run->arm_nisv.esr_iss = esr & ~(u64)ESR_ELx_FSC;
Any particular reason why we diverge from the NISV case, where the FSC
is provided, but not here? Maybe this needs to be documented too.
Suzuki
> + run->arm_nisv.fault_ipa = fault_ipa;
> + return 0;
> + }
> + }
> +
> /*
> * Prepare MMIO operation. First decode the syndrome data we get
> * from the CPU. Then try if some in-kernel emulation feels
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 52f6000ab020..d219946b96be 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -179,6 +179,7 @@ struct kvm_xen_exit {
> #define KVM_EXIT_LOONGARCH_IOCSR 38
> #define KVM_EXIT_MEMORY_FAULT 39
> #define KVM_EXIT_TDX 40
> +#define KVM_EXIT_ARM_LDST64B 41
>
> /* For KVM_EXIT_INTERNAL_ERROR */
> /* Emulate instruction failed. */
> @@ -401,7 +402,7 @@ struct kvm_run {
> } eoi;
> /* KVM_EXIT_HYPERV */
> struct kvm_hyperv_exit hyperv;
> - /* KVM_EXIT_ARM_NISV */
> + /* KVM_EXIT_ARM_NISV / KVM_EXIT_ARM_LDST64B */
> struct {
> __u64 esr_iss;
> __u64 fault_ipa;
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
2025-11-07 11:48 ` Suzuki K Poulose
@ 2025-11-07 11:49 ` Suzuki K Poulose
2025-11-11 2:12 ` Zhou Wang
1 sibling, 0 replies; 25+ messages in thread
From: Suzuki K Poulose @ 2025-11-07 11:49 UTC (permalink / raw)
To: Zhou Wang, catalin.marinas, will, maz, oliver.upton, joey.gouly,
yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5
On 07/11/2025 11:48, Suzuki K Poulose wrote:
> On 07/11/2025 07:21, Zhou Wang wrote:
>> From: Marc Zyngier <maz@kernel.org>
>>
>> The main use of {LD,ST}64B* is to talk to a device, which is hopefully
>> directly assigned to the guest and requires no additional handling.
>>
>> However, this does not preclude a VMM from exposing a virtual device
>> to the guest, and allowing 64-byte accesses as part of the programming
>> interface. A direct consequence of this is that we need to be able
>> to forward such access to userspace.
>>
>> Given that such a contraption is very unlikely to ever exist, we choose
>> to offer a limited service: userspace gets (as part of a new exit reason)
>> the ESR, the IPA, and that's it. It is fully expected to handle the full
>> semantics of the instructions, deal with ACCDATA, the return values and
>> increment PC. Much fun.
>>
>> A canonical implementation can also simply inject an abort and be done
>> with it. Frankly, don't try to do anything else unless you have time
>> to waste.
>>
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
>
> We also need to document this new EXIT reason here :
>
> Documentation/virt/kvm/api.rst
Well, spoke too soon. Please ignore this part ;-)
>
>
>> ---
>> arch/arm64/kvm/mmio.c | 27 ++++++++++++++++++++++++++-
>> include/uapi/linux/kvm.h | 3 ++-
>> 2 files changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
>> index 54f9358c9e0e..2a6261abb647 100644
>> --- a/arch/arm64/kvm/mmio.c
>> +++ b/arch/arm64/kvm/mmio.c
>> @@ -159,6 +159,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu,
>> phys_addr_t fault_ipa)
>> bool is_write;
>> int len;
>> u8 data_buf[8];
>> + u64 esr;
>> +
>> + esr = kvm_vcpu_get_esr(vcpu);
>> /*
>> * No valid syndrome? Ask userspace for help if it has
>> @@ -168,7 +171,7 @@ int io_mem_abort(struct kvm_vcpu *vcpu,
>> phys_addr_t fault_ipa)
>> * though, so directly deliver an exception to the guest.
>> */
>> if (!kvm_vcpu_dabt_isvalid(vcpu)) {
>> - trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
>> + trace_kvm_mmio_nisv(*vcpu_pc(vcpu), esr,
>> kvm_vcpu_get_hfar(vcpu), fault_ipa);
>> if (vcpu_is_protected(vcpu))
>> @@ -185,6 +188,28 @@ int io_mem_abort(struct kvm_vcpu *vcpu,
>> phys_addr_t fault_ipa)
>> return -ENOSYS;
>> }
>> + /*
>> + * When (DFSC == 0b00xxxx || DFSC == 0b10101x) && DFSC != 0b0000xx
>> + * ESR_EL2[12:11] describe the Load/Store Type. This allows us to
>> + * punt the LD64B/ST64B/ST64BV/ST64BV0 instructions to luserspace,
>
> minor nit: typo: s/luserspace/userspace/ ^^^
>
>> + * which will have to provide a full emulation of these 4
>> + * instructions. No, we don't expect this to be fast.
>> + *
>> + * We rely on traps being set if the corresponding features are not
>> + * enabled, so if we get here, userspace has promised us to handle
>> + * it already.
>> + */
>> + switch (kvm_vcpu_trap_get_fault(vcpu)) {
>> + case 0b000100 ... 0b001111:
>> + case 0b101010 ... 0b101011:
>
> Matches Arm ARM.
>
>> + if (FIELD_GET(GENMASK(12, 11), esr)) {
>> + run->exit_reason = KVM_EXIT_ARM_LDST64B;
>> + run->arm_nisv.esr_iss = esr & ~(u64)ESR_ELx_FSC;
>
> Any particular reason why we diverge from the NISV case, where the FSC
> is provided, but not here? Maybe this needs to be documented too.
>
> Suzuki
>
>
>> + run->arm_nisv.fault_ipa = fault_ipa;
>> + return 0;
>> + }
>> + }
>> +
>> /*
>> * Prepare MMIO operation. First decode the syndrome data we get
>> * from the CPU. Then try if some in-kernel emulation feels
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 52f6000ab020..d219946b96be 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -179,6 +179,7 @@ struct kvm_xen_exit {
>> #define KVM_EXIT_LOONGARCH_IOCSR 38
>> #define KVM_EXIT_MEMORY_FAULT 39
>> #define KVM_EXIT_TDX 40
>> +#define KVM_EXIT_ARM_LDST64B 41
>> /* For KVM_EXIT_INTERNAL_ERROR */
>> /* Emulate instruction failed. */
>> @@ -401,7 +402,7 @@ struct kvm_run {
>> } eoi;
>> /* KVM_EXIT_HYPERV */
>> struct kvm_hyperv_exit hyperv;
>> - /* KVM_EXIT_ARM_NISV */
>> + /* KVM_EXIT_ARM_NISV / KVM_EXIT_ARM_LDST64B */
>> struct {
>> __u64 esr_iss;
>> __u64 fault_ipa;
>
>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-07 7:21 ` [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V} Zhou Wang
@ 2025-11-07 12:05 ` Suzuki K Poulose
2025-11-11 3:40 ` Zhou Wang
2025-11-11 11:15 ` Marc Zyngier
1 sibling, 1 reply; 25+ messages in thread
From: Suzuki K Poulose @ 2025-11-07 12:05 UTC (permalink / raw)
To: Zhou Wang, catalin.marinas, will, maz, oliver.upton, joey.gouly,
yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5
On 07/11/2025 07:21, Zhou Wang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> Armv8.7 introduces single-copy atomic 64-byte load and store
> instructions and their variants under FEAT_{LS64, LS64_V}.
> These features are identified by ID_AA64ISAR1_EL1.LS64 and the
> use of such instructions in userspace (EL0) can be trapped. In
> order to support the use of the corresponding instructions in userspace:
> - Make ID_AA64ISAR1_EL1.LS64 visible to userspace
> - Add identification and enablement in the cpufeature list
> - Expose the support of these features to userspace through HWCAP3
> and cpuinfo
>
> ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) are intended for
> special memory (device memory) and so require support from the CPU,
> the system and the target memory location (a device that supports
> these instructions). HWCAP3_{LS64, LS64_V} implies support in the CPU
> and the system (there is no identification method for the system, so
> SoC vendors should only advertise support in the CPU if the system
> also supports them).
>
> Otherwise, for ld64b/st64b the atomicity may not be guaranteed or a
> DABT will be generated, so users (probably userspace driver developers)
> should make sure the target memory (device) also has the support.
> For st64bv, 0xffffffffffffffff will be returned as the status result
> for unsupported memory, so users should check it.
>
> Document the restrictions along with HWCAP3_{LS64, LS64_V}.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
> ---
> Documentation/arch/arm64/booting.rst | 12 ++++++
> Documentation/arch/arm64/elf_hwcaps.rst | 14 +++++++
> arch/arm64/include/asm/hwcap.h | 2 +
> arch/arm64/include/uapi/asm/hwcap.h | 2 +
> arch/arm64/kernel/cpufeature.c | 51 +++++++++++++++++++++++++
> arch/arm64/kernel/cpuinfo.c | 2 +
> arch/arm64/tools/cpucaps | 2 +
> 7 files changed, 85 insertions(+)
>
> diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
> index e4f953839f71..2c56d76ecafb 100644
> --- a/Documentation/arch/arm64/booting.rst
> +++ b/Documentation/arch/arm64/booting.rst
> @@ -556,6 +556,18 @@ Before jumping into the kernel, the following conditions must be met:
>
> - MDCR_EL3.TPM (bit 6) must be initialized to 0b0
>
> + For CPUs with support for 64-byte loads and stores without status (FEAT_LS64):
> +
> + - If the kernel is entered at EL1 and EL2 is present:
> +
> + - HCRX_EL2.EnALS (bit 1) must be initialised to 0b1.
> +
> + For CPUs with support for 64-byte stores with status (FEAT_LS64_V):
> +
> + - If the kernel is entered at EL1 and EL2 is present:
> +
> + - HCRX_EL2.EnASR (bit 2) must be initialised to 0b1.
> +
> The requirements described above for CPU mode, caches, MMUs, architected
> timers, coherency and system registers apply to all CPUs. All CPUs must
> enter the kernel in the same exception level. Where the values documented
> diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
> index a15df4956849..b86059bc288b 100644
> --- a/Documentation/arch/arm64/elf_hwcaps.rst
> +++ b/Documentation/arch/arm64/elf_hwcaps.rst
> @@ -444,6 +444,20 @@ HWCAP3_MTE_STORE_ONLY
> HWCAP3_LSFE
> Functionality implied by ID_AA64ISAR3_EL1.LSFE == 0b0001
>
> +HWCAP3_LS64
> + Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0001. Note that
> + ld64b/st64b require support from the CPU, the system and the target
> + (device) memory location; HWCAP3_LS64 only implies support in the CPU.
> + Users should only use ld64b/st64b on supported target (device) memory
> + locations, and otherwise fall back to the non-atomic alternatives.
> +
> +HWCAP3_LS64_V
> + Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0010. As with
> + HWCAP3_LS64, HWCAP3_LS64_V implies the CPU's support for the st64bv
> + instruction, but it also requires support from the system and the
> + target (device) memory location. st64bv returns a status result;
> + 0xFFFFFFFFFFFFFFFF is returned for an unsupported memory location.
> +
>
> 4. Unused AT_HWCAP bits
> -----------------------
> diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
> index 6d567265467c..3c0804fb3435 100644
> --- a/arch/arm64/include/asm/hwcap.h
> +++ b/arch/arm64/include/asm/hwcap.h
> @@ -179,6 +179,8 @@
> #define KERNEL_HWCAP_MTE_FAR __khwcap3_feature(MTE_FAR)
> #define KERNEL_HWCAP_MTE_STORE_ONLY __khwcap3_feature(MTE_STORE_ONLY)
> #define KERNEL_HWCAP_LSFE __khwcap3_feature(LSFE)
> +#define KERNEL_HWCAP_LS64 __khwcap3_feature(LS64)
> +#define KERNEL_HWCAP_LS64_V __khwcap3_feature(LS64_V)
>
> /*
> * This yields a mask that user programs can use to figure out what
> diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
> index 575564ecdb0b..79bc77425b82 100644
> --- a/arch/arm64/include/uapi/asm/hwcap.h
> +++ b/arch/arm64/include/uapi/asm/hwcap.h
> @@ -146,5 +146,7 @@
> #define HWCAP3_MTE_FAR (1UL << 0)
> #define HWCAP3_MTE_STORE_ONLY (1UL << 1)
> #define HWCAP3_LSFE (1UL << 2)
> +#define HWCAP3_LS64 (1UL << 3)
> +#define HWCAP3_LS64_V (1UL << 4)
>
> #endif /* _UAPI__ASM_HWCAP_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 5ed401ff79e3..dcc5ba620a7e 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -239,6 +239,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
> };
>
> static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
> + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_LS64_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_XS_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_I8MM_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_DGH_SHIFT, 4, 0),
> @@ -2259,6 +2260,38 @@ static void cpu_enable_e0pd(struct arm64_cpu_capabilities const *cap)
> }
> #endif /* CONFIG_ARM64_E0PD */
>
> +static bool has_ls64(const struct arm64_cpu_capabilities *entry, int __unused)
> +{
> + u64 ls64;
> +
> + ls64 = cpuid_feature_extract_field(__read_sysreg_by_encoding(entry->sys_reg),
> + entry->field_pos, entry->sign);
Why are we always reading from the "local" CPU ? Shouldn't this be based
on the SCOPE ?
i.e., read from the sanitised feature state for SCOPE_SYSTEM (given that
is the SCOPE for the capability)
and read from the local CPU for SCOPE_LOCAL (for checks in late CPUs).
> +
> + if (ls64 == ID_AA64ISAR1_EL1_LS64_NI ||
> + ls64 > ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA)
Given this is FTR_LOWER_SAFE, why do we skip anything that is HIGHER
than a particular value ? You must be able to fall back to the
has_cpuid_feature() check for both these CAPs.
> + return false;
> +
---8>---
> + if (entry->capability == ARM64_HAS_LS64 &&
> + ls64 >= ID_AA64ISAR1_EL1_LS64_LS64)
> + return true;
> +
> + if (entry->capability == ARM64_HAS_LS64_V &&
> + ls64 >= ID_AA64ISAR1_EL1_LS64_LS64_V)
> + return true;
> +
> + return false;
--<8---
minor nit: You could simplify this to:
return (ls64 >= entry->min_field_value)
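With that, has_ls64() could go away completely and both capabilities become
plain table entries; an untested sketch of what that might look like
(assuming the usual ARM64_CPUID_FIELDS() helper and the LS64/LS64_V field
values from the generated sysreg header):

	{
		.desc = "LS64",
		.capability = ARM64_HAS_LS64,
		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
		.matches = has_cpuid_feature,
		/* plus the cpu_enable hook from this patch */
		ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64)
	},
	{
		.desc = "LS64_V",
		.capability = ARM64_HAS_LS64_V,
		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
		.matches = has_cpuid_feature,
		/* plus the cpu_enable hook from this patch */
		ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64_V)
	},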
Suzuki
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 6/7] KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
2025-11-07 7:21 ` [PATCH v7 6/7] KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest Zhou Wang
@ 2025-11-07 18:53 ` Oliver Upton
2025-11-11 3:43 ` Zhou Wang
0 siblings, 1 reply; 25+ messages in thread
From: Oliver Upton @ 2025-11-07 18:53 UTC (permalink / raw)
To: Zhou Wang
Cc: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd, linux-arm-kernel, kvmarm,
yangyccccc, prime.zeng, xuwei5
Hi Zhou,
On Fri, Nov 07, 2025 at 03:21:26PM +0800, Zhou Wang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> Using FEAT_{LS64, LS64_V} instructions in a guest is also controlled
> by HCRX_EL2.{EnALS, EnASR}. Enable it if guest has related feature.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
The ordering of this patch is incorrect as patch #5 has the side-effect
of exposing ID_AA64ISAR1_EL1.LS64 to KVM guests. This one should come
first instead.
Thanks,
Oliver
> ---
> arch/arm64/include/asm/kvm_emulate.h | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index bab967d65715..29291e25ecfd 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -695,6 +695,12 @@ static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu)
>
> if (kvm_has_sctlr2(kvm))
> vcpu->arch.hcrx_el2 |= HCRX_EL2_SCTLR2En;
> +
> + if (kvm_has_feat(kvm, ID_AA64ISAR1_EL1, LS64, LS64))
> + vcpu->arch.hcrx_el2 |= HCRX_EL2_EnALS;
> +
> + if (kvm_has_feat(kvm, ID_AA64ISAR1_EL1, LS64, LS64_V))
> + vcpu->arch.hcrx_el2 |= HCRX_EL2_EnASR;
> }
> }
> #endif /* __ARM64_KVM_EMULATE_H__ */
> --
> 2.33.0
>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
` (7 preceding siblings ...)
2025-11-07 9:23 ` [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Arnd Bergmann
@ 2025-11-07 18:57 ` Oliver Upton
8 siblings, 0 replies; 25+ messages in thread
From: Oliver Upton @ 2025-11-07 18:57 UTC (permalink / raw)
To: Zhou Wang
Cc: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd, linux-arm-kernel, kvmarm,
yangyccccc, prime.zeng, xuwei5
On Fri, Nov 07, 2025 at 03:21:20PM +0800, Zhou Wang wrote:
> Armv8.7 introduces single-copy atomic 64-byte loads and stores
> instructions and its variants named under FEAT_{LS64, LS64_V}.
> Add support for Armv8.7 FEAT_{LS64, LS64_V}:
> - Add identifying and enabling in the cpufeature list
> - Expose the support of these features to userspace through HWCAP3 and cpuinfo
> - Add related hwcap test
> - Handle the trap of unsupported memory (normal/uncacheable) access in a VM
>
> A real scenario for this feature is that the userspace driver can make use of
> this to implement direct WQE (workqueue entry) - a mechanism to fill WQE
> directly into the hardware.
>
> Picked Marc's 2 patches from [1] for handling the LS64 trap in a VM on emulated
> MMIO and the introduction of KVM_EXIT_ARM_LDST64B.
Besides the ordering issue the KVM bits of this look fine to me. If
these patches go through the kvmarm tree then I'd be happy to fix that
up when applying.
Will / Catalin, any preferences on which tree this goes in? If you guys
take it:
Acked-by: Oliver Upton <oupton@kernel.org>
Thanks,
Oliver
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots
2025-11-07 11:48 ` Suzuki K Poulose
2025-11-07 11:49 ` Suzuki K Poulose
@ 2025-11-11 2:12 ` Zhou Wang
1 sibling, 0 replies; 25+ messages in thread
From: Zhou Wang @ 2025-11-11 2:12 UTC (permalink / raw)
To: Suzuki K Poulose, catalin.marinas, will, maz, oliver.upton,
joey.gouly, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5
On 2025/11/7 19:48, Suzuki K Poulose wrote:
> On 07/11/2025 07:21, Zhou Wang wrote:
>> From: Marc Zyngier <maz@kernel.org>
>>
>> The main use of {LD,ST}64B* is to talk to a device, which is hopefully
>> directly assigned to the guest and requires no additional handling.
>>
>> However, this does not preclude a VMM from exposing a virtual device
>> to the guest, and to allow 64 byte accesses as part of the programming
>> interface. A direct consequence of this is that we need to be able
>> to forward such access to userspace.
>>
>> Given that such a contraption is very unlikely to ever exist, we choose
>> to offer a limited service: userspace gets (as part of a new exit reason)
>> the ESR, the IPA, and that's it. It is fully expected to handle the full
>> semantics of the instructions, deal with ACCDATA, the return values and
>> increment PC. Much fun.
>>
>> A canonical implementation can also simply inject an abort and be done
>> with it. Frankly, don't try to do anything else unless you have time
>> to waste.
>>
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
>
> We also need to document this new EXIT reason here :
>
> Documentation/virt/kvm/api.rst
>
>
>> ---
>> arch/arm64/kvm/mmio.c | 27 ++++++++++++++++++++++++++-
>> include/uapi/linux/kvm.h | 3 ++-
>> 2 files changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
>> index 54f9358c9e0e..2a6261abb647 100644
>> --- a/arch/arm64/kvm/mmio.c
>> +++ b/arch/arm64/kvm/mmio.c
>> @@ -159,6 +159,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>> bool is_write;
>> int len;
>> u8 data_buf[8];
>> + u64 esr;
>> +
>> + esr = kvm_vcpu_get_esr(vcpu);
>> /*
>> * No valid syndrome? Ask userspace for help if it has
>> @@ -168,7 +171,7 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>> * though, so directly deliver an exception to the guest.
>> */
>> if (!kvm_vcpu_dabt_isvalid(vcpu)) {
>> - trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
>> + trace_kvm_mmio_nisv(*vcpu_pc(vcpu), esr,
>> kvm_vcpu_get_hfar(vcpu), fault_ipa);
>> if (vcpu_is_protected(vcpu))
>> @@ -185,6 +188,28 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>> return -ENOSYS;
>> }
>> + /*
>> + * When (DFSC == 0b00xxxx || DFSC == 0b10101x) && DFSC != 0b0000xx
>> + * ESR_EL2[12:11] describe the Load/Store Type. This allows us to
>> + * punt the LD64B/ST64B/ST64BV/ST64BV0 instructions to luserspace,
>
> minor nit: typo: s/luserspace/userspace/
Will fix this in next version.
>
>> + * which will have to provide a full emulation of these 4
>> + * instructions. No, we don't expect this to be fast.
>> + *
>> + * We rely on traps being set if the corresponding features are not
>> + * enabled, so if we get here, userspace has promised us to handle
>> + * it already.
>> + */
>> + switch (kvm_vcpu_trap_get_fault(vcpu)) {
>> + case 0b000100 ... 0b001111:
>> + case 0b101010 ... 0b101011:
>
> Matches Arm ARM.
This is described in the Arm ARM (revision L.b), D24.2.40 (page 7526). It does not
include 0b0000xx, so the first case in the code above is "case 0b000100 ... 0b001111",
which simply skips 0b0000xx.
>
>> + if (FIELD_GET(GENMASK(12, 11), esr)) {
>> + run->exit_reason = KVM_EXIT_ARM_LDST64B;
>> + run->arm_nisv.esr_iss = esr & ~(u64)ESR_ELx_FSC;
>
> Any particular reason why we diverge from the NISV case, where the FSC is provided,
> but not here? Maybe this needs to be documented too.
The NISV case and this case are different: NISV indicates whether the syndrome
information in ISS[23:14] is valid, while bits [12:11] (LST) indicate which LS64
instruction generated the data abort. I am not sure why the FSC is masked here;
it seems that LST already provides the related information.
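On a related note, for the "inject an abort and be done with it" option mentioned
in the commit message, the VMM side can stay very small. A rough sketch (assuming
the KVM_EXIT_ARM_LDST64B definition from this patch and the existing
KVM_CAP_ARM_INJECT_EXT_DABT / KVM_SET_VCPU_EVENTS interface):

#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Resolve a KVM_EXIT_ARM_LDST64B exit by making the guest take an
 * external data abort, instead of emulating the 64-byte access. */
static int handle_ldst64b(int vcpu_fd, struct kvm_run *run)
{
        struct kvm_vcpu_events events;

        if (run->exit_reason != KVM_EXIT_ARM_LDST64B)
                return -EINVAL;

        /* run->arm_nisv.esr_iss and run->arm_nisv.fault_ipa describe the
         * faulting access; a full emulation would decode them instead. */
        memset(&events, 0, sizeof(events));
        events.exception.ext_dabt_pending = 1;

        return ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &events);
}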
Best,
Zhou
>
> Suzuki
>
>
>> + run->arm_nisv.fault_ipa = fault_ipa;
>> + return 0;
>> + }
>> + }
>> +
>> /*
>> * Prepare MMIO operation. First decode the syndrome data we get
>> * from the CPU. Then try if some in-kernel emulation feels
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 52f6000ab020..d219946b96be 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -179,6 +179,7 @@ struct kvm_xen_exit {
>> #define KVM_EXIT_LOONGARCH_IOCSR 38
>> #define KVM_EXIT_MEMORY_FAULT 39
>> #define KVM_EXIT_TDX 40
>> +#define KVM_EXIT_ARM_LDST64B 41
>> /* For KVM_EXIT_INTERNAL_ERROR */
>> /* Emulate instruction failed. */
>> @@ -401,7 +402,7 @@ struct kvm_run {
>> } eoi;
>> /* KVM_EXIT_HYPERV */
>> struct kvm_hyperv_exit hyperv;
>> - /* KVM_EXIT_ARM_NISV */
>> + /* KVM_EXIT_ARM_NISV / KVM_EXIT_ARM_LDST64B */
>> struct {
>> __u64 esr_iss;
>> __u64 fault_ipa;
>
>
>
>
> .
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-07 12:05 ` Suzuki K Poulose
@ 2025-11-11 3:40 ` Zhou Wang
0 siblings, 0 replies; 25+ messages in thread
From: Zhou Wang @ 2025-11-11 3:40 UTC (permalink / raw)
To: Suzuki K Poulose, catalin.marinas, will, maz, oliver.upton,
joey.gouly, yuzenghui, arnd
Cc: linux-arm-kernel, kvmarm, yangyccccc, prime.zeng, xuwei5
On 2025/11/7 20:05, Suzuki K Poulose wrote:
> On 07/11/2025 07:21, Zhou Wang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> Armv8.7 introduces single-copy atomic 64-byte loads and stores
>> instructions and its variants named under FEAT_{LS64, LS64_V}.
>> These features are identified by ID_AA64ISAR1_EL1.LS64 and the
>> use of such instructions in userspace (EL0) can be trapped. In
>> order to support the use of corresponding instructions in userspace:
>> - Make ID_AA64ISAR1_EL1.LS64 visible to userspace
>> - Add identifying and enabling in the cpufeature list
>> - Expose the support of these features to userspace through HWCAP3
>> and cpuinfo
>>
>> ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) are intended for
>> special memory (device memory) and so require support from the CPU,
>> the system and the target memory location (a device that supports
>> these instructions). HWCAP3_{LS64, LS64_V} implies support from the
>> CPU and the system (since there is no way to identify this from the
>> system, SoC vendors should only advertise support in the CPU if the
>> system supports them as well).
>>
>> Otherwise, for ld64b/st64b the atomicity may not be guaranteed or a
>> DABT will be generated, so users (probably userspace driver developers)
>> should make sure the target memory (device) also has the support.
>> For st64bv, 0xffffffffffffffff will be returned as the status result
>> for unsupported memory, so users should check it.
>>
>> Document the restrictions along with HWCAP3_{LS64, LS64_V}.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
>> ---
>> Documentation/arch/arm64/booting.rst | 12 ++++++
>> Documentation/arch/arm64/elf_hwcaps.rst | 14 +++++++
>> arch/arm64/include/asm/hwcap.h | 2 +
>> arch/arm64/include/uapi/asm/hwcap.h | 2 +
>> arch/arm64/kernel/cpufeature.c | 51 +++++++++++++++++++++++++
>> arch/arm64/kernel/cpuinfo.c | 2 +
>> arch/arm64/tools/cpucaps | 2 +
>> 7 files changed, 85 insertions(+)
>>
>> diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
>> index e4f953839f71..2c56d76ecafb 100644
>> --- a/Documentation/arch/arm64/booting.rst
>> +++ b/Documentation/arch/arm64/booting.rst
>> @@ -556,6 +556,18 @@ Before jumping into the kernel, the following conditions must be met:
>> - MDCR_EL3.TPM (bit 6) must be initialized to 0b0
>> + For CPUs with support for 64-byte loads and stores without status (FEAT_LS64):
>> +
>> + - If the kernel is entered at EL1 and EL2 is present:
>> +
>> + - HCRX_EL2.EnALS (bit 1) must be initialised to 0b1.
>> +
>> + For CPUs with support for 64-byte stores with status (FEAT_LS64_V):
>> +
>> + - If the kernel is entered at EL1 and EL2 is present:
>> +
>> + - HCRX_EL2.EnASR (bit 2) must be initialised to 0b1.
>> +
>> The requirements described above for CPU mode, caches, MMUs, architected
>> timers, coherency and system registers apply to all CPUs. All CPUs must
>> enter the kernel in the same exception level. Where the values documented
>> diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
>> index a15df4956849..b86059bc288b 100644
>> --- a/Documentation/arch/arm64/elf_hwcaps.rst
>> +++ b/Documentation/arch/arm64/elf_hwcaps.rst
>> @@ -444,6 +444,20 @@ HWCAP3_MTE_STORE_ONLY
>> HWCAP3_LSFE
>> Functionality implied by ID_AA64ISAR3_EL1.LSFE == 0b0001
>> +HWCAP3_LS64
>> + Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0001. Note that
>> + using the ld64b/st64b instructions requires support from the CPU, the
>> + system and the target (device) memory location; HWCAP3_LS64 only
>> + implies CPU support. Users should only use ld64b/st64b on target
>> + (device) memory locations that support them, and otherwise fall back
>> + to the non-atomic alternatives.
>> +
>> +HWCAP3_LS64_V
>> + Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0010. As with
>> + HWCAP3_LS64, HWCAP3_LS64_V only implies CPU support of the st64bv
>> + instruction; support from the system and the target (device) memory
>> + location is also required. st64bv returns a status result, and
>> + 0xFFFFFFFFFFFFFFFF is returned for an unsupported memory location.
>> +
>> 4. Unused AT_HWCAP bits
>> -----------------------
>> diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
>> index 6d567265467c..3c0804fb3435 100644
>> --- a/arch/arm64/include/asm/hwcap.h
>> +++ b/arch/arm64/include/asm/hwcap.h
>> @@ -179,6 +179,8 @@
>> #define KERNEL_HWCAP_MTE_FAR __khwcap3_feature(MTE_FAR)
>> #define KERNEL_HWCAP_MTE_STORE_ONLY __khwcap3_feature(MTE_STORE_ONLY)
>> #define KERNEL_HWCAP_LSFE __khwcap3_feature(LSFE)
>> +#define KERNEL_HWCAP_LS64 __khwcap3_feature(LS64)
>> +#define KERNEL_HWCAP_LS64_V __khwcap3_feature(LS64_V)
>> /*
>> * This yields a mask that user programs can use to figure out what
>> diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
>> index 575564ecdb0b..79bc77425b82 100644
>> --- a/arch/arm64/include/uapi/asm/hwcap.h
>> +++ b/arch/arm64/include/uapi/asm/hwcap.h
>> @@ -146,5 +146,7 @@
>> #define HWCAP3_MTE_FAR (1UL << 0)
>> #define HWCAP3_MTE_STORE_ONLY (1UL << 1)
>> #define HWCAP3_LSFE (1UL << 2)
>> +#define HWCAP3_LS64 (1UL << 3)
>> +#define HWCAP3_LS64_V (1UL << 4)
>> #endif /* _UAPI__ASM_HWCAP_H */
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 5ed401ff79e3..dcc5ba620a7e 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -239,6 +239,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
>> };
>> static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
>> + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_LS64_SHIFT, 4, 0),
>> ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_XS_SHIFT, 4, 0),
>> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_I8MM_SHIFT, 4, 0),
>> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_DGH_SHIFT, 4, 0),
>> @@ -2259,6 +2260,38 @@ static void cpu_enable_e0pd(struct arm64_cpu_capabilities const *cap)
>> }
>> #endif /* CONFIG_ARM64_E0PD */
>> +static bool has_ls64(const struct arm64_cpu_capabilities *entry, int __unused)
>> +{
>> + u64 ls64;
>> +
>> + ls64 = cpuid_feature_extract_field(__read_sysreg_by_encoding(entry->sys_reg),
>> + entry->field_pos, entry->sign);
>
> Why are we always reading from the "local" CPU ? Shouldn't this be based on the SCOPE ?
>
> i.e., read from the sanitised feature state for SCOPE_SYSTEM (given that is the SCOPE for the capability)
>
> and read from the local CPU for SCOPE_LOCAL (for checks in late CPUs).
I am not sure if there was more to consider here, but it seems that directly using
has_cpuid_feature() is enough, just as you mentioned below.
>
>> +
>> + if (ls64 == ID_AA64ISAR1_EL1_LS64_NI ||
>> + ls64 > ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA)
>
> Given this is FTR_LOWER_SAFE, why do we skip anything that is HIGHER than a particular value ? You must be able to fall back to the has_cpuid_feature() check for both these CAPs.
>
>
>> + return false;
>> +
>
>
>
> ---8>---
>
>> + if (entry->capability == ARM64_HAS_LS64 &&
>> + ls64 >= ID_AA64ISAR1_EL1_LS64_LS64)
>> + return true;
>> +
>> + if (entry->capability == ARM64_HAS_LS64_V &&
>> + ls64 >= ID_AA64ISAR1_EL1_LS64_LS64_V)
>> + return true;
>> +
>> + return false;
>
> --<8---
>
>
> minor nit: You could simplify this to:
>
> return (ls64 >= entry->min_field_value)
Same as above.
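As a side note on the elf_hwcaps.rst text quoted further up: userspace detection
of these bits is a plain getauxval() query, something like the sketch below
(AT_HWCAP3 is assumed to be 29 as in include/uapi/linux/auxvec.h; the bit values
are the ones added by this patch, defined here only as fallbacks for older headers):

#include <stdio.h>
#include <sys/auxv.h>

#ifndef AT_HWCAP3
#define AT_HWCAP3       29
#endif
#ifndef HWCAP3_LS64
#define HWCAP3_LS64     (1UL << 3)
#define HWCAP3_LS64_V   (1UL << 4)
#endif

int main(void)
{
        unsigned long hwcap3 = getauxval(AT_HWCAP3);

        /* CPU/system support only; the target (device) memory location must
         * still support 64-byte accesses, otherwise fall back to normal stores. */
        printf("LS64:   %s\n", (hwcap3 & HWCAP3_LS64) ? "yes" : "no");
        printf("LS64_V: %s\n", (hwcap3 & HWCAP3_LS64_V) ? "yes" : "no");

        return 0;
}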
Best,
Zhou
>
>
> Suzuki
>
> .
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 6/7] KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest
2025-11-07 18:53 ` Oliver Upton
@ 2025-11-11 3:43 ` Zhou Wang
0 siblings, 0 replies; 25+ messages in thread
From: Zhou Wang @ 2025-11-11 3:43 UTC (permalink / raw)
To: Oliver Upton
Cc: catalin.marinas, will, maz, oliver.upton, joey.gouly,
suzuki.poulose, yuzenghui, arnd, linux-arm-kernel, kvmarm,
yangyccccc, prime.zeng, xuwei5
On 2025/11/8 2:53, Oliver Upton wrote:
> Hi Zhou,
>
> On Fri, Nov 07, 2025 at 03:21:26PM +0800, Zhou Wang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> Using FEAT_{LS64, LS64_V} instructions in a guest is also controlled
>> by HCRX_EL2.{EnALS, EnASR}. Enable it if guest has related feature.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
>
> The ordering of this patch is incorrect as patch #5 has the side-effect
> of exposing ID_AA64ISAR1_EL1.LS64 to KVM guests. This one should come
> first instead.
Hi Oliver,
I got the point and will change the order in the next version.
Thanks,
Zhou
>
> Thanks,
> Oliver
>
>> ---
>> arch/arm64/include/asm/kvm_emulate.h | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
>> index bab967d65715..29291e25ecfd 100644
>> --- a/arch/arm64/include/asm/kvm_emulate.h
>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>> @@ -695,6 +695,12 @@ static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu)
>>
>> if (kvm_has_sctlr2(kvm))
>> vcpu->arch.hcrx_el2 |= HCRX_EL2_SCTLR2En;
>> +
>> + if (kvm_has_feat(kvm, ID_AA64ISAR1_EL1, LS64, LS64))
>> + vcpu->arch.hcrx_el2 |= HCRX_EL2_EnALS;
>> +
>> + if (kvm_has_feat(kvm, ID_AA64ISAR1_EL1, LS64, LS64_V))
>> + vcpu->arch.hcrx_el2 |= HCRX_EL2_EnASR;
>> }
>> }
>> #endif /* __ARM64_KVM_EMULATE_H__ */
>> --
>> 2.33.0
>>
>>
>
> .
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-07 7:21 ` [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V} Zhou Wang
2025-11-07 12:05 ` Suzuki K Poulose
@ 2025-11-11 11:15 ` Marc Zyngier
2025-11-13 14:40 ` Zhou Wang
1 sibling, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2025-11-11 11:15 UTC (permalink / raw)
To: Zhou Wang
Cc: catalin.marinas, will, oliver.upton, joey.gouly, suzuki.poulose,
yuzenghui, arnd, linux-arm-kernel, kvmarm, yangyccccc, prime.zeng,
xuwei5
On Fri, 07 Nov 2025 07:21:25 +0000,
Zhou Wang <wangzhou1@hisilicon.com> wrote:
>
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> Armv8.7 introduces single-copy atomic 64-byte loads and stores
> instructions and its variants named under FEAT_{LS64, LS64_V}.
> These features are identified by ID_AA64ISAR1_EL1.LS64 and the
> use of such instructions in userspace (EL0) can be trapped. In
> order to support the use of corresponding instructions in userspace:
> - Make ID_AA64ISAR1_EL1.LS64 visible to userspace
> - Add identifying and enabling in the cpufeature list
> - Expose the support of these features to userspace through HWCAP3
> and cpuinfo
>
> ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) are intended for
> special memory (device memory) and so require support from the CPU,
> the system and the target memory location (a device that supports
> these instructions). HWCAP3_{LS64, LS64_V} implies support from the
> CPU and the system (since there is no way to identify this from the
> system, SoC vendors should only advertise support in the CPU if the
> system supports them as well).
But this doesn't mean that the system actually supports this. It is
also trivial for EL0 to spoof a PASID using ST64BV, by populating the
bottom 32bit with whatever it wants (hiding ST64BV0 doesn't prevent
this).
In all honesty, I'm starting to think that we cannot safely expose
this to userspace, at least not without strong guarantees coming from
the system itself.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-11 11:15 ` Marc Zyngier
@ 2025-11-13 14:40 ` Zhou Wang
2025-11-13 16:26 ` Arnd Bergmann
0 siblings, 1 reply; 25+ messages in thread
From: Zhou Wang @ 2025-11-13 14:40 UTC (permalink / raw)
To: Marc Zyngier
Cc: catalin.marinas, will, oliver.upton, joey.gouly, suzuki.poulose,
yuzenghui, arnd, linux-arm-kernel, kvmarm, yangyccccc, prime.zeng,
xuwei5
On 2025/11/11 19:15, Marc Zyngier wrote:
> On Fri, 07 Nov 2025 07:21:25 +0000,
> Zhou Wang <wangzhou1@hisilicon.com> wrote:
>>
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> Armv8.7 introduces single-copy atomic 64-byte loads and stores
>> instructions and its variants named under FEAT_{LS64, LS64_V}.
>> These features are identified by ID_AA64ISAR1_EL1.LS64 and the
>> use of such instructions in userspace (EL0) can be trapped. In
>> order to support the use of corresponding instructions in userspace:
>> - Make ID_AA64ISAR1_EL1.LS64 visible to userspace
>> - Add identifying and enabling in the cpufeature list
>> - Expose the support of these features to userspace through HWCAP3
>> and cpuinfo
>>
>> ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) are intended for
>> special memory (device memory) and so require support from the CPU,
>> the system and the target memory location (a device that supports
>> these instructions). HWCAP3_{LS64, LS64_V} implies support from the
>> CPU and the system (since there is no way to identify this from the
>> system, SoC vendors should only advertise support in the CPU if the
>> system supports them as well).
>
> But this doesn't mean that the system actually supports this. It is
> also trivial for EL0 to spoof a PASID using ST64BV, by populating the
> bottom 32bit with whatever it wants (hiding ST64BV0 doesn't prevent
> this).
I am confused here: we enable FEAT_LS64 and FEAT_LS64_V in this patch,
so only LD64B/ST64B/ST64BV are involved.
Sending the value of ACCDATA (maybe a PASID) is defined for ST64BV0, which
is not enabled currently.
If a bad system implements ST64BV wrongly, isn't that the fault of the bad
system?
Best,
Zhou
>
> In all honesty, I'm starting to think that we cannot safely expose
> this to userspace, at least not without strong guarantees coming from
> the system itself.
>
> Thanks,
>
> M.
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-13 14:40 ` Zhou Wang
@ 2025-11-13 16:26 ` Arnd Bergmann
2025-11-14 9:25 ` Zhou Wang
0 siblings, 1 reply; 25+ messages in thread
From: Arnd Bergmann @ 2025-11-13 16:26 UTC (permalink / raw)
To: Zhou Wang, Marc Zyngier
Cc: Catalin Marinas, Will Deacon, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, linux-arm-kernel, kvmarm,
Yicong Yang, prime.zeng, xuwei5
On Thu, Nov 13, 2025, at 15:40, Zhou Wang wrote:
> On 2025/11/11 19:15, Marc Zyngier wrote:
>> On Fri, 07 Nov 2025 07:21:25 +0000,
>>
>> But this doesn't mean that the system actually supports this. It is
>> also trivial for EL0 to spoof a PASID using ST64BV, by populating the
>> bottom 32bit with whatever it wants (hiding ST64BV0 doesn't prevent
>> this).
>
> I am confused here: we enable FEAT_LS64 and FEAT_LS64_V in this patch,
> so only LD64B/ST64B/ST64BV are involved.
>
> Sending the value of ACCDATA (maybe a PASID) is defined for ST64BV0, which
> is not enabled currently.
>
> If a bad system implements ST64BV wrongly, isn't that the fault of the bad
> system?
As far as I can tell, the design of ST64BV/ST64BV0 is a bit vague
on this, both on the Arm architecture and the PCI side [1], which each
leave the meaning of ACCDATA open to system design.
However, when the intention is to implement a shared hardware workqueue
in the style of drivers/dma/idxd/ [2], the only sensible implementation
is to follow the way that the corresponding Intel instructions work:
- movdir64b/st64b is a nonprivileged instruction to produce a posted
atomic 64-byte write TLP to a PCIe device, which can only be used
with a dedicated workqueue that is preconfigured to a fixed PASID
- enqcmds/st64bv is a privileged instruction to produce a non-posted
atomic 64-byte write Deferrable Memory Write (DMWr) TLP, which can be
used on a shared workqueue from kernel space, using an arbitrary PASID
value per transaction, which would normally correspond to the
physical address space when the kernel initiates DMA.
- enqcmd/st64bv0 is a nonprivileged instruction like enqcmds/st64bv
using the pasid from accdata that corresponds to
current->mm->iommu_mm->pasid value [3] in the kernel for the task
that initiates the transaction in userspace.
A PCIe device can tell the difference between a posted write and a
non-posted DMWr, but it cannot tell the difference between st64bv
and st64bv0, so the kernel must disallow st64bv from userspace if
it can be used on a device that expects the PASID value in the
low bits.
Things would be different if there is a PCIe device that expects
a DMWr transaction but does not use it for a shared hardware
workqueue with the PASID in that field.
Are you using a particular device, or are you trying to enable
the support in general? If you have a specific device you are
working on, does it use the PASID data or not?
Things will clearly get a lot harder if we want to support a
system that can have devices with conflicting interpretations
of the ACCDATA value. Ideally we could always use the
iommu_mm->pasid value here and enable st64bv0 globally like
idxd does. If any devices requires the use of st64bv, that
would have to be mutually exclusive with another driver
handing out access to shared hardware workqueues.
Arnd
[1] https://members.pcisig.com/wg/PCI-SIG/document/14237
[2] https://www.kernel.org/doc/html/v6.1/x86/sva.html
[3] drivers/iommu/iommu-sva.c
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-13 16:26 ` Arnd Bergmann
@ 2025-11-14 9:25 ` Zhou Wang
2025-11-14 9:37 ` Arnd Bergmann
0 siblings, 1 reply; 25+ messages in thread
From: Zhou Wang @ 2025-11-14 9:25 UTC (permalink / raw)
To: Arnd Bergmann, Marc Zyngier
Cc: Catalin Marinas, Will Deacon, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, linux-arm-kernel, kvmarm,
Yicong Yang, prime.zeng, xuwei5
On 2025/11/14 0:26, Arnd Bergmann wrote:
> On Thu, Nov 13, 2025, at 15:40, Zhou Wang wrote:
>> On 2025/11/11 19:15, Marc Zyngier wrote:
>>> On Fri, 07 Nov 2025 07:21:25 +0000,
>>>
>>> But this doesn't mean that the system actually supports this. It is
>>> also trivial for EL0 to spoof a PASID using ST64BV, by populating the
>>> bottom 32bit with whatever it wants (hiding ST64BV0 doesn't prevent
>>> this).
>>
>> I am confused here: we enable FEAT_LS64 and FEAT_LS64_V in this patch,
>> so only LD64B/ST64B/ST64BV are involved.
>>
>> Sending the value of ACCDATA (maybe a PASID) is defined for ST64BV0, which
>> is not enabled currently.
>>
>> If a bad system implements ST64BV wrongly, isn't that the fault of the bad
>> system?
>
> As far as I can tell, the design of ST64BV/ST64BV0 is a bit vague
> on this, both on the Arm architecture and the PCI side [1], which each
> leave the meaning of ACCDATA open to system design.
>
> However, when the intention is to implement a shared hardware workqueue
> in the style of drivers/dma/idxd/ [2], the only sensible implementation
> is to follow the way that the corresponding Intel instructions work:
>
> - movdir64b/st64b is a nonprivileged instruction to produce a posted
> atomic 64-byte write TLP to a PCIe device, which can only be used
> with a dedicated workqueue that is preconfigured to a fixed PASID
>
> - enqcmds/st64bv is a privileged instruction to produce a non-posted
> atomic 64-byte write Deferrable Memory Write (DMWr) TLP, which can be
> used on a shared workqueue from kernel space, using an arbitrary PASID
> value per transaction, which would normally correspond to the
> physical address space when the kernel initiates DMA.
>
> - enqcmd/st64bv0 is a nonprivileged instruction like enqcmds/st64bv
> using the pasid from accdata that corresponds to
> current->mm->iommu_mm->pasid value [3] in the kernel for the task
> that initiates the transaction in userspace.
>
> A PCIe device can tell the difference between a posted write and a
> non-posted DMWr, but it cannot tell the difference between st64bv
> and st64bv0, so the kernel must disallow st64bv from userspace if
> it can be used on a device that expects the PASID value in the
> low bits.
>
> Things would be different if there is a PCIe device that expects
> a DMWr transaction but does not use it for a shared hardware
> workqueue with the PASID in that field.
>
> Are you using a particular device, or are you trying to enable
> the support in general? If you have a specific device you are
> working on, does it use the PASID data or not?
Hi Arnd,
Many thanks for your careful explanation! I get the point now.
We have a real device in our SoC which supports LD64B/ST64B/ST64BV.
For ST64BV, our device just receives the 64 bytes of data atomically and
does not interpret them as PASID data.
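For what it's worth, on the userspace driver side the access then boils down to
one 64-byte store plus a status check; a rough sketch, assuming a toolchain that
provides the ACLE LS64 intrinsics (data512_t and __arm_st64bv() from arm_acle.h,
built with something like -march=armv8.7-a+ls64):

#include <stdint.h>
#include <string.h>
#include <arm_acle.h>

/* Push one 64-byte WQE to a doorbell page mapped into userspace.
 * Returns 0 on success, -1 if the target location rejected the store. */
static int push_wqe(void *doorbell, const uint64_t wqe[8])
{
        data512_t data;

        memcpy(&data, wqe, sizeof(data));

        /* st64bv returns all ones if the target memory location does not
         * support 64-byte stores; callers should fall back in that case. */
        if (__arm_st64bv(doorbell, data) == ~0ULL)
                return -1;

        return 0;
}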
Best,
Zhou
>
> Things will clearly get a lot harder if we want to support a
> system that can have devices with conflicting interpretations
> of the ACCDATA value. Ideally we could always use the
> iommu_mm->pasid value here and enable st64bv0 globally like
> idxd does. If any devices requires the use of st64bv, that
> would have to be mutually exclusive with another driver
> handing out access to shared hardware workqueues.
>
> Arnd
>
> [1] https://members.pcisig.com/wg/PCI-SIG/document/14237
> [2] https://www.kernel.org/doc/html/v6.1/x86/sva.html
> [3] drivers/iommu/iommu-sva.c
> .
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-14 9:25 ` Zhou Wang
@ 2025-11-14 9:37 ` Arnd Bergmann
2025-11-18 2:31 ` Zhou Wang
0 siblings, 1 reply; 25+ messages in thread
From: Arnd Bergmann @ 2025-11-14 9:37 UTC (permalink / raw)
To: Zhou Wang, Marc Zyngier
Cc: Catalin Marinas, Will Deacon, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, linux-arm-kernel, kvmarm,
Yicong Yang, prime.zeng, xuwei5
On Fri, Nov 14, 2025, at 10:25, Zhou Wang wrote:
> On 2025/11/14 0:26, Arnd Bergmann wrote:
>>
>> Are you using a particular device, or are you trying to enable
>> the support in general? If you have a specific device you are
>> working on, does it use the PASID data or not?
>
> Many thanks for your careful explanation! I get the point now.
>
> We have a real device in our SoC which supports LD64B/ST64B/ST64BV.
> For ST64BV, our device just receives the 64 bytes of data atomically and
> does not interpret them as PASID data.
Ok, I see. So I assume this is either a kind of dedicated work queue
where the IOMMU PASID is set up in advance for the user MMIO area,
or it is a device that does not do any DMA at all, correct?
In this case, would the device also work correctly with ST64BV0 if
the ACCDATA register is fixed to a value of zero and you can only
use the upper 480 bits? Would it also work if there is an
unpredictable value in ACCDATA that may match the PASID of another
device used by the same process?
Arnd
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-14 9:37 ` Arnd Bergmann
@ 2025-11-18 2:31 ` Zhou Wang
2025-11-18 7:36 ` Arnd Bergmann
0 siblings, 1 reply; 25+ messages in thread
From: Zhou Wang @ 2025-11-18 2:31 UTC (permalink / raw)
To: Arnd Bergmann, Marc Zyngier
Cc: Catalin Marinas, Will Deacon, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, linux-arm-kernel, kvmarm,
Yicong Yang, prime.zeng, xuwei5
On 2025/11/14 17:37, Arnd Bergmann wrote:
> On Fri, Nov 14, 2025, at 10:25, Zhou Wang wrote:
>> On 2025/11/14 0:26, Arnd Bergmann wrote:
>>>
>>> Are you using a particular device, or are you trying to enable
>>> the support in general? If you have a specific device you are
>>> working on, does it use the PASID data or not?
>>
>> Many thanks for your careful explanation! I get the point now.
>>
>> We have a real device in our SoC which supports LD64B/ST64B/ST64BV.
>> For ST64BV, our device just receives the 64 bytes of data atomically and
>> does not interpret them as PASID data.
>
> Ok, I see. So I assume this is either a kind of dedicated work queue
> where the IOMMU PASID is set up in advance for the user MMIO area,
Yeah, it is something like what you mentioned above. The MMIO area is bound
to a work queue, and a PASID is set up in advance for that work queue.
> or it is a device that does not do any DMA at all, correct?
>
> In this case, would the device also work correctly with ST64BV0 if
> the ACCDATA register is fixed to a value of zero and you can only
> use the upper 480 bits? Would it also work if there is an
> unpredictable value in ACCDATA that may match the PASID of another
> device used by the same process?
We do not support ST64BV0, so the above case will not happen. I think ST64BV0
will trigger an illegal instruction exception on our system.
Best,
Zhou
>
> Arnd
> .
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V}
2025-11-18 2:31 ` Zhou Wang
@ 2025-11-18 7:36 ` Arnd Bergmann
0 siblings, 0 replies; 25+ messages in thread
From: Arnd Bergmann @ 2025-11-18 7:36 UTC (permalink / raw)
To: Zhou Wang, Marc Zyngier
Cc: Catalin Marinas, Will Deacon, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, linux-arm-kernel, kvmarm,
Yicong Yang, prime.zeng, xuwei5
On Tue, Nov 18, 2025, at 03:31, Zhou Wang wrote:
> On 2025/11/14 17:37, Arnd Bergmann wrote:
>> On Fri, Nov 14, 2025, at 10:25, Zhou Wang wrote:
>>> On 2025/11/14 0:26, Arnd Bergmann wrote:
>>>>
>>>> Are you using a particular device, or are you trying to enable
>>>> the support in general? If you have a specific device you are
>>>> working on, does it use the PASID data or not?
>>>
>>> Many thanks for your careful explanation! I get the point now.
>>>
>>> We have a real device in our SoC which supports LD64B/ST64B/ST64BV.
>>> For ST64BV, our device just receives the 64 bytes of data atomically and
>>> does not interpret them as PASID data.
>>
>> Ok, I see. So I assume this is either a kind of dedicated work queue
>> where the IOMMU PASID is set up in advance for the user MMIO area,
>
>> Yeah, it is something like what you mentioned above. The MMIO area is bound
>> to a work queue, and a PASID is set up in advance for that work queue.
Ok, thanks for confirming.
>> or it is a device that does not do any DMA at all, correct?
>>
>> In this case, would the device also work correctly with ST64BV0 if
>> the ACCDATA register is fixed to a value of zero and you can only
>> use the upper 480 bits? Would it also work if there is an
>> unpredictable value in ACCDATA that may match the PASID of another
>> device used by the same process?
>
> We do not support ST64BV0, so the above case will not happen. I think ST64BV0
> will trigger an illegal instruction exception on our system.
At least this does make it easier because on your system you would
never run into a situation where you want to support both your
internal device with st64bv and another device with st64bv0.
The easiest setup I can think of that still supports your machine
would look something like:
- have the kernel choose between st64bv and st64bv0 at early boot,
based on platform configuration, use st64bv0 by default if
available in hardware and not disabled in EL3, EL2 or kernel
command line.
- Change pasid handling in iommu_sva_bind_device() so drivers
have to explicitly request one of the modes before establishing
a pasid, refuse this on incompatible systems, update the
three existing callers (idxd, amdxdna, uacce) accordingly.
- on systems with st64bv but no st64bv0, enable st64bv for
all CPUs at boot time to avoid context switch overhead, but
forbid mapping shared hardware workqueues into userspace.
- postpone full support for st64bv0 until we have a device that
actually uses this and can be tested. I think most of it is
already there in the iommu code, but the ACCDATA setup needs
to be integrated into the switch_mm()/__switch_to() code
and possibly a trap handler like on x86.
In the current architecture, both FEAT_LS64_V and
FEAT_LS64_ACCDATA are optional, so you can continue to
produce CPUs that have the former but not the latter, and
the logic above will keep that working. However this breaks
if you ever want to support FEAT_LS64_ACCDATA in a later
CPU but keep the existing driver working with st64bv.
Arnd
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread
Thread overview: 25+ messages
2025-11-07 7:21 [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Zhou Wang
2025-11-07 7:21 ` [PATCH v7 1/7] KVM: arm64: Add exit to userspace on {LD,ST}64B* outside of memslots Zhou Wang
2025-11-07 11:48 ` Suzuki K Poulose
2025-11-07 11:49 ` Suzuki K Poulose
2025-11-11 2:12 ` Zhou Wang
2025-11-07 7:21 ` [PATCH v7 2/7] KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B Zhou Wang
2025-11-07 7:21 ` [PATCH v7 3/7] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory Zhou Wang
2025-11-07 7:21 ` [PATCH v7 4/7] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1 Zhou Wang
2025-11-07 7:21 ` [PATCH v7 5/7] arm64: Add support for FEAT_{LS64, LS64_V} Zhou Wang
2025-11-07 12:05 ` Suzuki K Poulose
2025-11-11 3:40 ` Zhou Wang
2025-11-11 11:15 ` Marc Zyngier
2025-11-13 14:40 ` Zhou Wang
2025-11-13 16:26 ` Arnd Bergmann
2025-11-14 9:25 ` Zhou Wang
2025-11-14 9:37 ` Arnd Bergmann
2025-11-18 2:31 ` Zhou Wang
2025-11-18 7:36 ` Arnd Bergmann
2025-11-07 7:21 ` [PATCH v7 6/7] KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest Zhou Wang
2025-11-07 18:53 ` Oliver Upton
2025-11-11 3:43 ` Zhou Wang
2025-11-07 7:21 ` [PATCH v7 7/7] kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V} Zhou Wang
2025-11-07 9:21 ` Arnd Bergmann
2025-11-07 9:23 ` [PATCH v7 0/7] Add support for FEAT_{LS64, LS64_V} and related tests Arnd Bergmann
2025-11-07 18:57 ` Oliver Upton