linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/5] Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} and related tests
@ 2024-12-02 13:54 Yicong Yang
  2024-12-02 13:55 ` [PATCH 1/5] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V, LS64_ACCDATA} usage at EL0/1 Yicong Yang
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Yicong Yang @ 2024-12-02 13:54 UTC (permalink / raw)
  To: catalin.marinas, will, maz, oliver.upton, corbet,
	linux-arm-kernel, kvmarm, linux-kselftest, linux-doc
  Cc: joey.gouly, suzuki.poulose, yuzenghui, shuah, jonathan.cameron,
	shameerali.kolothum.thodi, linuxarm, prime.zeng, xuwei5,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

Armv8.7 introduces single-copy atomic 64-byte load and store
instructions and their variants, named FEAT_{LS64, LS64_V,
LS64_ACCDATA}. Add support for these Armv8.7 features:
- Add identification and enablement in the cpufeature list
- Expose support for these features to userspace through HWCAP3
  and cpuinfo
- Add a related hwcap test
- Handle the trap taken in a VM when these instructions access an
  unsupported (e.g. Normal cacheable) memory type

A real-world use case for these features is direct WQE (workqueue entry):
a userspace driver can use the 64-byte atomic stores to fill WQEs directly
into the hardware, as sketched below.
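
A sketch only (wqe_write64(), its arguments and the mmio mapping are
hypothetical; the .inst encoding for "ST64B x0, [x8]" is the one used by
the hwcap selftest in patch 3):

#include <stdint.h>

/*
 * Sketch: atomically store one 64-byte WQE to a device (or
 * non-cacheable) mapping. 'dst' must be 64-byte aligned.
 */
static inline void wqe_write64(volatile void *dst, const uint64_t wqe[8])
{
	register uint64_t x0 asm("x0") = wqe[0];
	register uint64_t x1 asm("x1") = wqe[1];
	register uint64_t x2 asm("x2") = wqe[2];
	register uint64_t x3 asm("x3") = wqe[3];
	register uint64_t x4 asm("x4") = wqe[4];
	register uint64_t x5 asm("x5") = wqe[5];
	register uint64_t x6 asm("x6") = wqe[6];
	register uint64_t x7 asm("x7") = wqe[7];
	register volatile void *x8 asm("x8") = dst;

	/* ST64B x0, [x8]: single-copy atomic 64-byte store of x0..x7 */
	asm volatile(".inst 0xf83f9100"
		     :
		     : "r"(x0), "r"(x1), "r"(x2), "r"(x3),
		       "r"(x4), "r"(x5), "r"(x6), "r"(x7), "r"(x8)
		     : "memory");
}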

This patchset also depends on Marc's patchset[1] for enabling related
features in a VM, HCRX trap handling, etc.

[1] https://lore.kernel.org/linux-arm-kernel/20240815125959.2097734-1-maz@kernel.org/

Tested with the updated hwcap test:
On host:
root@localhost:/# dmesg | grep "All CPU(s) started"
[    1.600263] CPU: All CPU(s) started at EL2
root@localhost:/# ./hwcap
[snip...]
# LS64 present          
ok 169 cpuinfo_match_LS64
ok 170 sigill_LS64
ok 171 # SKIP sigbus_LS64
# LS64_V present
ok 172 cpuinfo_match_LS64_V
ok 173 sigill_LS64_V
ok 174 # SKIP sigbus_LS64_V
# LS64_ACCDATA present
ok 175 cpuinfo_match_LS64_ACCDATA
ok 176 sigill_LS64_ACCDATA
ok 177 # SKIP sigbus_LS64_ACCDATA
# Totals: pass:92 fail:0 xfail:0 xpass:0 skip:85 error:0

On guest:
root@localhost:/# dmesg | grep "All CPU(s) started"
[    1.375272] CPU: All CPU(s) started at EL1 
root@localhost:/# ./hwcap
[snip...]
# LS64 present
ok 169 cpuinfo_match_LS64
ok 170 sigill_LS64
ok 171 # SKIP sigbus_LS64
# LS64_V present
ok 172 cpuinfo_match_LS64_V
ok 173 sigill_LS64_V
ok 174 # SKIP sigbus_LS64_V
# LS64_ACCDATA present
ok 175 cpuinfo_match_LS64_ACCDATA
ok 176 sigill_LS64_ACCDATA
ok 177 # SKIP sigbus_LS64_ACCDATA
# Totals: pass:92 fail:0 xfail:0 xpass:0 skip:85 error:0

Yicong Yang (5):
  arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V, LS64_ACCDATA}
    usage at EL0/1
  arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA}
  kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V, LS64_ACCDATA}
  arm64: Add ESR.DFSC definition of unsupported exclusive or atomic
    access
  KVM: arm64: Handle DABT caused by LS64* instructions on unsupported
    memory

 Documentation/arch/arm64/booting.rst      |  28 +++++
 Documentation/arch/arm64/elf_hwcaps.rst   |   9 ++
 arch/arm64/include/asm/el2_setup.h        |  26 ++++-
 arch/arm64/include/asm/esr.h              |   8 ++
 arch/arm64/include/asm/hwcap.h            |   3 +
 arch/arm64/include/uapi/asm/hwcap.h       |   3 +
 arch/arm64/kernel/cpufeature.c            |  70 +++++++++++-
 arch/arm64/kernel/cpuinfo.c               |   3 +
 arch/arm64/kvm/mmu.c                      |  14 +++
 arch/arm64/tools/cpucaps                  |   3 +
 tools/testing/selftests/arm64/abi/hwcap.c | 127 ++++++++++++++++++++++
 11 files changed, 291 insertions(+), 3 deletions(-)

-- 
2.24.0




* [PATCH 1/5] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V, LS64_ACCDATA} usage at EL0/1
  2024-12-02 13:54 [PATCH 0/5] Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} and related tests Yicong Yang
@ 2024-12-02 13:55 ` Yicong Yang
  2024-12-02 13:55 ` [PATCH 2/5] arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} Yicong Yang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Yicong Yang @ 2024-12-02 13:55 UTC (permalink / raw)
  To: catalin.marinas, will, maz, oliver.upton, corbet,
	linux-arm-kernel, kvmarm, linux-kselftest, linux-doc
  Cc: joey.gouly, suzuki.poulose, yuzenghui, shuah, jonathan.cameron,
	shameerali.kolothum.thodi, linuxarm, prime.zeng, xuwei5,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

The instructions introduced by FEAT_{LS64, LS64_V, LS64_ACCDATA} are
controlled by HCRX_EL2.{EnALS, EnASR, EnAS0}. Additionally, access to
ACCDATA_EL1 for FEAT_LS64_ACCDATA is also subject to fine-grained
traps (FGT). Configure all of these to allow usage at EL0/1.

This doesn't mean these instructions are always available at EL0/1
when implemented: the hypervisor still has control at runtime.
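
For reference, the level-to-bit mapping done by the assembly below can be
summarised by this rough C sketch (illustration only, not part of the
patch; constant names follow the generated sysreg definitions):

static u64 hcrx_ls64_bits(u64 id_aa64isar1)
{
	u64 ls64 = (id_aa64isar1 >> ID_AA64ISAR1_EL1_LS64_SHIFT) & 0xf;
	u64 bits = 0;

	if (ls64 >= ID_AA64ISAR1_EL1_LS64_LS64)		/* 0b0001 */
		bits |= HCRX_EL2_EnALS;
	if (ls64 >= ID_AA64ISAR1_EL1_LS64_LS64_V)	/* 0b0010 */
		bits |= HCRX_EL2_EnASR;
	if (ls64 >= ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA)	/* 0b0011 */
		bits |= HCRX_EL2_EnAS0;

	return bits;
}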

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/include/asm/el2_setup.h | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index 85ef966c08cd..446d3663840b 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -31,9 +31,22 @@
         /* Enable GCS if supported */
 	mrs_s	x1, SYS_ID_AA64PFR1_EL1
 	ubfx	x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4
-	cbz	x1, .Lset_hcrx_\@
+	cbz	x1, .Lskip_gcs_hcrx_\@
 	orr	x0, x0, #HCRX_EL2_GCSEn
 
+.Lskip_gcs_hcrx_\@:
+	/* Enable LS64, LS64_V, LS64_ACCDATA if supported */
+	mrs_s	x1, SYS_ID_AA64ISAR1_EL1
+	ubfx	x1, x1, #ID_AA64ISAR1_EL1_LS64_SHIFT, #4
+	cbz	x1, .Lset_hcrx_\@
+	orr	x0, x0, #HCRX_EL2_EnALS
+	cmp	x1, #ID_AA64ISAR1_EL1_LS64_LS64_V
+	b.lt	.Lset_hcrx_\@
+	orr	x0, x0, #HCRX_EL2_EnASR
+	cmp	x1, #ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA
+	b.lt	.Lset_hcrx_\@
+	orr	x0, x0, #HCRX_EL2_EnAS0
+
 .Lset_hcrx_\@:
 	msr_s	SYS_HCRX_EL2, x0
 .Lskip_hcrx_\@:
@@ -211,12 +224,21 @@
 	/* GCS depends on PIE so we don't check it if PIE is absent */
 	mrs_s	x1, SYS_ID_AA64PFR1_EL1
 	ubfx	x1, x1, #ID_AA64PFR1_EL1_GCS_SHIFT, #4
-	cbz	x1, .Lset_fgt_\@
+	cbz	x1, .Lskip_gcs_fgt_\@
 
 	/* Disable traps of access to GCS registers at EL0 and EL1 */
 	orr	x0, x0, #HFGxTR_EL2_nGCS_EL1_MASK
 	orr	x0, x0, #HFGxTR_EL2_nGCS_EL0_MASK
 
+.Lskip_gcs_fgt_\@:
+	mrs_s	x1, SYS_ID_AA64ISAR1_EL1
+	ubfx	x1, x1, #ID_AA64ISAR1_EL1_LS64_SHIFT, #4
+	cmp	x1, #ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA
+	b.ne	.Lset_fgt_\@
+
+	/* Disable the trapping of ACCDATA_EL1 */
+	orr	x0, x0, #HFGxTR_EL2_nACCDATA_EL1
+
 .Lset_fgt_\@:
 	msr_s	SYS_HFGRTR_EL2, x0
 	msr_s	SYS_HFGWTR_EL2, x0
-- 
2.24.0




* [PATCH 2/5] arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA}
  2024-12-02 13:54 [PATCH 0/5] Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} and related tests Yicong Yang
  2024-12-02 13:55 ` [PATCH 1/5] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V, LS64_ACCDATA} usage at EL0/1 Yicong Yang
@ 2024-12-02 13:55 ` Yicong Yang
  2024-12-03  9:38   ` Marc Zyngier
  2024-12-02 13:55 ` [PATCH 3/5] kselftest/arm64: Add HWCAP test " Yicong Yang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Yicong Yang @ 2024-12-02 13:55 UTC (permalink / raw)
  To: catalin.marinas, will, maz, oliver.upton, corbet,
	linux-arm-kernel, kvmarm, linux-kselftest, linux-doc
  Cc: joey.gouly, suzuki.poulose, yuzenghui, shuah, jonathan.cameron,
	shameerali.kolothum.thodi, linuxarm, prime.zeng, xuwei5,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

Armv8.7 introduces single-copy atomic 64-byte load and store
instructions and their variants, named FEAT_{LS64, LS64_V,
LS64_ACCDATA}. These features are identified by ID_AA64ISAR1_EL1.LS64
and the use of such instructions in userspace (EL0) can be trapped.
In order to support the use of the corresponding instructions in userspace:
- Make ID_AA64ISAR1_EL1.LS64 visible to userspace
- Add identification and enablement in the cpufeature list
- Expose support for these features to userspace through HWCAP3
  and cpuinfo (see the detection sketch below)
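
A minimal userspace detection sketch (illustration only, assuming a kernel
and libc new enough to expose AT_HWCAP3 via the auxiliary vector and
getauxval()):

#include <stdio.h>
#include <sys/auxv.h>
#include <linux/auxvec.h>	/* AT_HWCAP3 */
#include <asm/hwcap.h>		/* HWCAP3_LS64* */

int main(void)
{
	unsigned long hwcap3 = getauxval(AT_HWCAP3);

	printf("ls64:         %s\n", hwcap3 & HWCAP3_LS64 ? "yes" : "no");
	printf("ls64_v:       %s\n", hwcap3 & HWCAP3_LS64_V ? "yes" : "no");
	printf("ls64_accdata: %s\n",
	       hwcap3 & HWCAP3_LS64_ACCDATA ? "yes" : "no");

	return 0;
}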

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 Documentation/arch/arm64/booting.rst    | 28 ++++++++++
 Documentation/arch/arm64/elf_hwcaps.rst |  9 ++++
 arch/arm64/include/asm/hwcap.h          |  3 ++
 arch/arm64/include/uapi/asm/hwcap.h     |  3 ++
 arch/arm64/kernel/cpufeature.c          | 70 ++++++++++++++++++++++++-
 arch/arm64/kernel/cpuinfo.c             |  3 ++
 arch/arm64/tools/cpucaps                |  3 ++
 7 files changed, 118 insertions(+), 1 deletion(-)

diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
index 3278fb4bf219..c35cfe9da501 100644
--- a/Documentation/arch/arm64/booting.rst
+++ b/Documentation/arch/arm64/booting.rst
@@ -449,6 +449,34 @@ Before jumping into the kernel, the following conditions must be met:
 
     - HFGWTR_EL2.nGCS_EL0 (bit 52) must be initialised to 0b1.
 
+  For CPUs support for 64-byte loads and stores without status (FEAT_LS64):
+
+  - If the kernel is entered at EL1 and EL2 is present:
+
+    - HCRX_EL2.EnALS (bit 1) must be initialised to 0b1.
+
+  For CPUs support for 64-byte loads and stores with status (FEAT_LS64_V):
+
+  - If the kernel is entered at EL1 and EL2 is present:
+
+    - HCRX_EL2.EnASR (bit 2) must be initialised to 0b1.
+
+  For CPUs support for 64-byte EL0 stores with status (FEAT_LS64_ACCDATA):
+
+  - If EL3 is present:
+
+    - SCR_EL3.EnAS0 (bit 36) must be initialised to 0b1.
+
+    - SCR_EL3.ADEn (bit 37) must be initialised to 0b1.
+
+  - If the kernel is entered at EL1 and EL2 is present:
+
+    - HCRX_EL2.EnAS0 (bit 0) must be initialised to 0b1.
+
+    - HFGRTR_EL2.nACCDATA_EL1 (bit 50) must be initialised to 0b1.
+
+    - HFGWTR_EL2.nACCDATA_EL1 (bit 50) must be initialised to 0b1.
+
 The requirements described above for CPU mode, caches, MMUs, architected
 timers, coherency and system registers apply to all CPUs.  All CPUs must
 enter the kernel in the same exception level.  Where the values documented
diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
index 2ff922a406ad..6cb2594f0803 100644
--- a/Documentation/arch/arm64/elf_hwcaps.rst
+++ b/Documentation/arch/arm64/elf_hwcaps.rst
@@ -372,6 +372,15 @@ HWCAP2_SME_SF8DP4
 HWCAP2_POE
     Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
 
+HWCAP3_LS64
+    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0001.
+
+HWCAP3_LS64_V
+    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0010.
+
+HWCAP3_LS64_ACCDATA
+    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0011.
+
 4. Unused AT_HWCAP bits
 -----------------------
 
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 2b6c61c608e2..ac1d576dfcd4 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -161,6 +161,9 @@
 #define KERNEL_HWCAP_POE		__khwcap2_feature(POE)
 
 #define __khwcap3_feature(x)		(const_ilog2(HWCAP3_ ## x) + 128)
+#define KERNEL_HWCAP_LS64              __khwcap3_feature(LS64)
+#define KERNEL_HWCAP_LS64_V            __khwcap3_feature(LS64_V)
+#define KERNEL_HWCAP_LS64_ACCDATA      __khwcap3_feature(LS64_ACCDATA)
 
 /*
  * This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 48d46b768eae..e70f91889fd6 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -128,5 +128,8 @@
 /*
  * HWCAP3 flags - for AT_HWCAP3
  */
+#define HWCAP3_LS64		(1UL << 0)
+#define HWCAP3_LS64_V		(1UL << 1)
+#define HWCAP3_LS64_ACCDATA	(1UL << 2)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 15f175e458c3..9e45c1b54372 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -229,7 +229,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
-	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_LS64_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_LS64_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_XS_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_I8MM_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_DGH_SHIFT, 4, 0),
@@ -2246,6 +2246,47 @@ static void cpu_enable_e0pd(struct arm64_cpu_capabilities const *cap)
 }
 #endif /* CONFIG_ARM64_E0PD */
 
+static bool has_ls64(const struct arm64_cpu_capabilities *entry, int __unused)
+{
+	u64 ls64;
+
+	ls64 = cpuid_feature_extract_field(__read_sysreg_by_encoding(entry->sys_reg),
+					   entry->field_pos, entry->sign);
+
+	if (ls64 == ID_AA64ISAR1_EL1_LS64_NI ||
+	    ls64 > ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA)
+		return false;
+
+	if (entry->capability == ARM64_HAS_LS64 &&
+	    ls64 >= ID_AA64ISAR1_EL1_LS64_LS64)
+		return true;
+
+	if (entry->capability == ARM64_HAS_LS64_V &&
+	    ls64 >= ID_AA64ISAR1_EL1_LS64_LS64_V)
+		return true;
+
+	if (entry->capability == ARM64_HAS_LS64_ACCDATA &&
+	    ls64 >= ID_AA64ISAR1_EL1_LS64_LS64_ACCDATA)
+		return true;
+
+	return false;
+}
+
+static void cpu_enable_ls64(struct arm64_cpu_capabilities const *cap)
+{
+	sysreg_clear_set(sctlr_el1, SCTLR_EL1_EnALS, SCTLR_EL1_EnALS);
+}
+
+static void cpu_enable_ls64_v(struct arm64_cpu_capabilities const *cap)
+{
+	sysreg_clear_set(sctlr_el1, SCTLR_EL1_EnASR, SCTLR_EL1_EnASR);
+}
+
+static void cpu_enable_ls64_accdata(struct arm64_cpu_capabilities const *cap)
+{
+	sysreg_clear_set(sctlr_el1, SCTLR_EL1_EnAS0, SCTLR_EL1_EnAS0);
+}
+
 #ifdef CONFIG_ARM64_PSEUDO_NMI
 static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
 				   int scope)
@@ -2991,6 +3032,30 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		ARM64_CPUID_FIELDS(ID_AA64PFR1_EL1, GCS, IMP)
 	},
 #endif
+	{
+		.desc = "LS64",
+		.capability = ARM64_HAS_LS64,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_ls64,
+		.cpu_enable = cpu_enable_ls64,
+		ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64)
+	},
+	{
+		.desc = "LS64_V",
+		.capability = ARM64_HAS_LS64_V,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_ls64,
+		.cpu_enable = cpu_enable_ls64_v,
+		ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64_V)
+	},
+	{
+		.desc = "LS64_ACCDATA",
+		.capability = ARM64_HAS_LS64_ACCDATA,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_ls64,
+		.cpu_enable = cpu_enable_ls64_accdata,
+		ARM64_CPUID_FIELDS(ID_AA64ISAR1_EL1, LS64, LS64_ACCDATA)
+	},
 	{},
 };
 
@@ -3088,6 +3153,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(ID_AA64ISAR1_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_EBF16),
 	HWCAP_CAP(ID_AA64ISAR1_EL1, DGH, IMP, CAP_HWCAP, KERNEL_HWCAP_DGH),
 	HWCAP_CAP(ID_AA64ISAR1_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_I8MM),
+	HWCAP_CAP(ID_AA64ISAR1_EL1, LS64, LS64, CAP_HWCAP, KERNEL_HWCAP_LS64),
+	HWCAP_CAP(ID_AA64ISAR1_EL1, LS64, LS64_V, CAP_HWCAP, KERNEL_HWCAP_LS64_V),
+	HWCAP_CAP(ID_AA64ISAR1_EL1, LS64, LS64_ACCDATA, CAP_HWCAP, KERNEL_HWCAP_LS64_ACCDATA),
 	HWCAP_CAP(ID_AA64ISAR2_EL1, LUT, IMP, CAP_HWCAP, KERNEL_HWCAP_LUT),
 	HWCAP_CAP(ID_AA64ISAR3_EL1, FAMINMAX, IMP, CAP_HWCAP, KERNEL_HWCAP_FAMINMAX),
 	HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT),
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index d79e88fccdfc..b140852955be 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -81,6 +81,9 @@ static const char *const hwcap_str[] = {
 	[KERNEL_HWCAP_PACA]		= "paca",
 	[KERNEL_HWCAP_PACG]		= "pacg",
 	[KERNEL_HWCAP_GCS]		= "gcs",
+	[KERNEL_HWCAP_LS64]		= "ls64",
+	[KERNEL_HWCAP_LS64_V]		= "ls64_v",
+	[KERNEL_HWCAP_LS64_ACCDATA]	= "ls64_accdata",
 	[KERNEL_HWCAP_DCPODP]		= "dcpodp",
 	[KERNEL_HWCAP_SVE2]		= "sve2",
 	[KERNEL_HWCAP_SVEAES]		= "sveaes",
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index eb17f59e543c..0454d5e64cbe 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -42,6 +42,9 @@ HAS_HCX
 HAS_LDAPR
 HAS_LPA2
 HAS_LSE_ATOMICS
+HAS_LS64
+HAS_LS64_V
+HAS_LS64_ACCDATA
 HAS_MOPS
 HAS_NESTED_VIRT
 HAS_PAN
-- 
2.24.0




* [PATCH 3/5] kselftest/arm64: Add HWCAP test for FEAT_{LS64, LS64_V, LS64_ACCDATA}
  2024-12-02 13:54 [PATCH 0/5] Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} and related tests Yicong Yang
  2024-12-02 13:55 ` [PATCH 1/5] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V, LS64_ACCDATA} usage at EL0/1 Yicong Yang
  2024-12-02 13:55 ` [PATCH 2/5] arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} Yicong Yang
@ 2024-12-02 13:55 ` Yicong Yang
  2024-12-02 13:55 ` [PATCH 4/5] arm64: Add ESR.DFSC definition of unsupported exclusive or atomic access Yicong Yang
  2024-12-02 13:55 ` [PATCH 5/5] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory Yicong Yang
  4 siblings, 0 replies; 10+ messages in thread
From: Yicong Yang @ 2024-12-02 13:55 UTC (permalink / raw)
  To: catalin.marinas, will, maz, oliver.upton, corbet,
	linux-arm-kernel, kvmarm, linux-kselftest, linux-doc
  Cc: joey.gouly, suzuki.poulose, yuzenghui, shuah, jonathan.cameron,
	shameerali.kolothum.thodi, linuxarm, prime.zeng, xuwei5,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

Add tests for FEAT_{LS64, LS64_V, LS64_ACCDATA}. Issue the related
instructions if the feature is present; no SIGILL should be received.
Since such instructions can only operate on Device or non-cacheable
memory, we may receive a SIGBUS during the test. Just ignore it,
since we only test whether the instruction itself can be issued as
expected on platforms declaring support for these features.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 tools/testing/selftests/arm64/abi/hwcap.c | 127 ++++++++++++++++++++++
 1 file changed, 127 insertions(+)

diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c
index 0029ed9c5c9a..6c826d4bb056 100644
--- a/tools/testing/selftests/arm64/abi/hwcap.c
+++ b/tools/testing/selftests/arm64/abi/hwcap.c
@@ -11,6 +11,8 @@
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <linux/auxvec.h>
+#include <linux/compiler.h>
 #include <sys/auxv.h>
 #include <sys/prctl.h>
 #include <asm/hwcap.h>
@@ -452,6 +454,107 @@ static void lrcpc3_sigill(void)
 	              : "=r" (data0), "=r" (data1) : "r" (src) :);
 }
 
+static void ignore_signal(int sig, siginfo_t *info, void *context)
+{
+	ucontext_t *uc = context;
+
+	uc->uc_mcontext.pc += 4;
+}
+
+static void ls64_sigill(void)
+{
+	struct sigaction ign, old;
+	char src[64] __aligned(64) = { 1 };
+
+	/*
+	 * LS64, LS64_V, LS64_ACCDATA require target memory to be
+	 * Device/Non-cacheable and the completer supports these
+	 * instructions, otherwise we'll receive a SIGBUS. Since
+	 * we are only testing the ABI here, so just ignore the
+	 * SIGBUS and see if we can execute the instructions
+	 * without receiving a SIGILL. Restore the handler of
+	 * SIGBUS after this test.
+	 */
+	ign.sa_sigaction = ignore_signal;
+	ign.sa_flags = SA_SIGINFO | SA_RESTART;
+	sigemptyset(&ign.sa_mask);
+	sigaction(SIGBUS, &ign, &old);
+
+	register void *xn asm ("x8") = src;
+	register u64 xt_1 asm ("x0");
+	register u64 __maybe_unused xt_2 asm ("x1");
+	register u64 __maybe_unused xt_3 asm ("x2");
+	register u64 __maybe_unused xt_4 asm ("x3");
+	register u64 __maybe_unused xt_5 asm ("x4");
+	register u64 __maybe_unused xt_6 asm ("x5");
+	register u64 __maybe_unused xt_7 asm ("x6");
+	register u64 __maybe_unused xt_8 asm ("x7");
+
+	/* LD64B x0, [x8] */
+	asm volatile(".inst 0xf83fd100" : "=r" (xt_1) : "r" (xn));
+
+	/* ST64B x0, [x8] */
+	asm volatile(".inst 0xf83f9100" : : "r" (xt_1), "r" (xn));
+
+	sigaction(SIGBUS, &old, NULL);
+}
+
+static void ls64_v_sigill(void)
+{
+	struct sigaction ign, old;
+	char dst[64] __aligned(64);
+
+	/* See comment in ls64_sigill() */
+	ign.sa_sigaction = ignore_signal;
+	ign.sa_flags = SA_SIGINFO | SA_RESTART;
+	sigemptyset(&ign.sa_mask);
+	sigaction(SIGBUS, &ign, &old);
+
+	register void *xn asm ("x8") = dst;
+	register u64 xt_1 asm ("x0") = 1;
+	register u64 __maybe_unused xt_2 asm ("x1") = 2;
+	register u64 __maybe_unused xt_3 asm ("x2") = 3;
+	register u64 __maybe_unused xt_4 asm ("x3") = 4;
+	register u64 __maybe_unused xt_5 asm ("x4") = 5;
+	register u64 __maybe_unused xt_6 asm ("x5") = 6;
+	register u64 __maybe_unused xt_7 asm ("x6") = 7;
+	register u64 __maybe_unused xt_8 asm ("x7") = 8;
+	register u64 st   asm ("x9");
+
+	/* ST64BV x9, x0, [x8] */
+	asm volatile(".inst 0xf829b100" : "=r" (st) : "r" (xt_1), "r" (xn));
+
+	sigaction(SIGBUS, &old, NULL);
+}
+
+static void ls64_accdata_sigill(void)
+{
+	struct sigaction ign, old;
+	char dst[64] __aligned(64);
+
+	/* See comment in ls64_sigill() */
+	ign.sa_sigaction = ignore_signal;
+	ign.sa_flags = SA_SIGINFO | SA_RESTART;
+	sigemptyset(&ign.sa_mask);
+	sigaction(SIGBUS, &ign, &old);
+
+	register void *xn asm ("x8") = dst;
+	register u64 xt_1 asm ("x0") = 1;
+	register u64 __maybe_unused xt_2 asm ("x1") = 2;
+	register u64 __maybe_unused xt_3 asm ("x2") = 3;
+	register u64 __maybe_unused xt_4 asm ("x3") = 4;
+	register u64 __maybe_unused xt_5 asm ("x4") = 5;
+	register u64 __maybe_unused xt_6 asm ("x5") = 6;
+	register u64 __maybe_unused xt_7 asm ("x6") = 7;
+	register u64 __maybe_unused xt_8 asm ("x7") = 8;
+	register u64 st asm ("x9");
+
+	/* ST64BV0 x9, x0, [x8] */
+	asm volatile(".inst 0xf829a100" : "=r" (st) : "r" (xt_1), "r" (xn));
+
+	sigaction(SIGBUS, &old, NULL);
+}
+
 static const struct hwcap_data {
 	const char *name;
 	unsigned long at_hwcap;
@@ -867,6 +970,30 @@ static const struct hwcap_data {
 		.sigill_fn = hbc_sigill,
 		.sigill_reliable = true,
 	},
+	{
+		.name = "LS64",
+		.at_hwcap = AT_HWCAP3,
+		.hwcap_bit = HWCAP3_LS64,
+		.cpuinfo = "ls64",
+		.sigill_fn = ls64_sigill,
+		.sigill_reliable = true,
+	},
+	{
+		.name = "LS64_V",
+		.at_hwcap = AT_HWCAP3,
+		.hwcap_bit = HWCAP3_LS64_V,
+		.cpuinfo = "ls64_v",
+		.sigill_fn = ls64_v_sigill,
+		.sigill_reliable = true,
+	},
+	{
+		.name = "LS64_ACCDATA",
+		.at_hwcap = AT_HWCAP3,
+		.hwcap_bit = HWCAP3_LS64_ACCDATA,
+		.cpuinfo = "ls64_accdata",
+		.sigill_fn = ls64_accdata_sigill,
+		.sigill_reliable = true,
+	},
 };
 
 typedef void (*sighandler_fn)(int, siginfo_t *, void *);
-- 
2.24.0




* [PATCH 4/5] arm64: Add ESR.DFSC definition of unsupported exclusive or atomic access
  2024-12-02 13:54 [PATCH 0/5] Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} and related tests Yicong Yang
                   ` (2 preceding siblings ...)
  2024-12-02 13:55 ` [PATCH 3/5] kselftest/arm64: Add HWCAP test " Yicong Yang
@ 2024-12-02 13:55 ` Yicong Yang
  2024-12-02 13:55 ` [PATCH 5/5] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory Yicong Yang
  4 siblings, 0 replies; 10+ messages in thread
From: Yicong Yang @ 2024-12-02 13:55 UTC (permalink / raw)
  To: catalin.marinas, will, maz, oliver.upton, corbet,
	linux-arm-kernel, kvmarm, linux-kselftest, linux-doc
  Cc: joey.gouly, suzuki.poulose, yuzenghui, shuah, jonathan.cameron,
	shameerali.kolothum.thodi, linuxarm, prime.zeng, xuwei5,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

DFSC value 0x35 indicates an IMPLEMENTATION DEFINED fault for an Unsupported
Exclusive or Atomic access. Add the ESR_ELx_FSC definition and a corresponding
wrapper.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/include/asm/esr.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 94c9d537a7ad..1639e99f2bd6 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -122,6 +122,7 @@
 #define ESR_ELx_FSC_SEA_TTW(n)	(0x14 + (n))
 #define ESR_ELx_FSC_SECC	(0x18)
 #define ESR_ELx_FSC_SECC_TTW(n)	(0x1c + (n))
+#define ESR_ELx_FSC_EXCL_ATOMIC	(0x35)
 
 /* Status codes for individual page table levels */
 #define ESR_ELx_FSC_ACCESS_L(n)	(ESR_ELx_FSC_ACCESS + (n))
@@ -470,6 +471,13 @@ static inline bool esr_fsc_is_access_flag_fault(unsigned long esr)
 	       (esr == ESR_ELx_FSC_ACCESS_L(0));
 }
 
+static inline bool esr_fsc_is_excl_atomic_fault(unsigned long esr)
+{
+	esr = esr & ESR_ELx_FSC;
+
+	return esr == ESR_ELx_FSC_EXCL_ATOMIC;
+}
+
 /* Indicate whether ESR.EC==0x1A is for an ERETAx instruction */
 static inline bool esr_iss_is_eretax(unsigned long esr)
 {
-- 
2.24.0




* [PATCH 5/5] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
  2024-12-02 13:54 [PATCH 0/5] Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} and related tests Yicong Yang
                   ` (3 preceding siblings ...)
  2024-12-02 13:55 ` [PATCH 4/5] arm64: Add ESR.DFSC definition of unsupported exclusive or atomic access Yicong Yang
@ 2024-12-02 13:55 ` Yicong Yang
  2024-12-03  9:38   ` Marc Zyngier
  4 siblings, 1 reply; 10+ messages in thread
From: Yicong Yang @ 2024-12-02 13:55 UTC (permalink / raw)
  To: catalin.marinas, will, maz, oliver.upton, corbet,
	linux-arm-kernel, kvmarm, linux-kselftest, linux-doc
  Cc: joey.gouly, suzuki.poulose, yuzenghui, shuah, jonathan.cameron,
	shameerali.kolothum.thodi, linuxarm, prime.zeng, xuwei5,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

FEAT_LS64* instructions only support accesses to Device/Uncacheable
memory; otherwise a data abort for an unsupported Exclusive or atomic
access (0x35) is generated per the spec. It is implementation defined
to which exception level the fault is routed, and it may be routed to
EL2 when taken from a (VHE) VM. Per DDI0487K.a Section
C3.2.12.2 Single-copy atomic 64-byte load/store:

  The check is performed against the resulting memory type after all
  enabled stages of translation. In this case the fault is reported
  at the final enabled stage of translation.

If the implementation reports the DABT at the final enabled stage
(stage-2 here), inject a DABT into the guest to handle it.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/kvm/mmu.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c9d46ad57e52..b7e6f0a27537 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1787,6 +1787,20 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		return 1;
 	}
 
+	/*
+	 * If instructions of FEAT_{LS64, LS64_V, LS64_ACCDATA} operated on
+	 * unsupported memory regions, a DABT for unsupported Exclusive or
+	 * atomic access is generated. It's implementation defined whether
+	 * the exception will be taken to, a stage-1 DABT or the final enabled
+	 * stage of translation (stage-2 in this case as we hit here). Inject
+	 * a DABT to the guest to handle it if it's implemented as a stage-2
+	 * DABT.
+	 */
+	if (esr_fsc_is_excl_atomic_fault(esr)) {
+		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
+		return 1;
+	}
+
 	trace_kvm_guest_fault(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
 			      kvm_vcpu_get_hfar(vcpu), fault_ipa);
 
-- 
2.24.0




* Re: [PATCH 5/5] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
  2024-12-02 13:55 ` [PATCH 5/5] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory Yicong Yang
@ 2024-12-03  9:38   ` Marc Zyngier
  2024-12-04  9:08     ` Yicong Yang
  0 siblings, 1 reply; 10+ messages in thread
From: Marc Zyngier @ 2024-12-03  9:38 UTC (permalink / raw)
  To: Yicong Yang
  Cc: catalin.marinas, will, oliver.upton, corbet, linux-arm-kernel,
	kvmarm, linux-kselftest, linux-doc, joey.gouly, suzuki.poulose,
	yuzenghui, shuah, jonathan.cameron, shameerali.kolothum.thodi,
	linuxarm, prime.zeng, xuwei5, yangyicong

On Mon, 02 Dec 2024 13:55:04 +0000,
Yicong Yang <yangyicong@huawei.com> wrote:
> 
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> FEAT_LS64* instructions only support accesses to Device/Uncacheable
> memory; otherwise a data abort for an unsupported Exclusive or atomic

Not quite. FEAT_LS64WB explicitly supports Write-Back mappings.

> access (0x35) is generated per the spec. It is implementation defined
> to which exception level the fault is routed, and it may be routed to
> EL2 when taken from a (VHE) VM. Per DDI0487K.a Section
> C3.2.12.2 Single-copy atomic 64-byte load/store:
> 
>   The check is performed against the resulting memory type after all
>   enabled stages of translation. In this case the fault is reported
>   at the final enabled stage of translation.
> 
> If the implementation reports the DABT at the final enabled stage
> (stage-2 here), inject a DABT into the guest to handle it.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  arch/arm64/kvm/mmu.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index c9d46ad57e52..b7e6f0a27537 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1787,6 +1787,20 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  		return 1;
>  	}
>  
> +	/*
> +	 * If instructions of FEAT_{LS64, LS64_V, LS64_ACCDATA} operated on
> +	 * unsupported memory regions, a DABT for unsupported Exclusive or
> +	 * atomic access is generated. It's implementation defined whether
> +	 * the exception will be taken to, a stage-1 DABT or the final enabled
> +	 * stage of translation (stage-2 in this case as we hit here). Inject
> +	 * a DABT to the guest to handle it if it's implemented as a stage-2
> +	 * DABT.
> +	 */
> +	if (esr_fsc_is_excl_atomic_fault(esr)) {
> +		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> +		return 1;
> +	}

This doesn't seem quite right. This is injecting an *External* Data
Abort, which is not what the spec says happens, as you are emulating
the *first* acceptable behaviour:

  "The check is performed at each enabled stage of translation, and
   the fault is reported for the first stage of translation that
   provides an inappropriate memory type. In this case, the value of
   the HCR_EL2.DC bit does not cause accesses generated by the
   instructions to generate a stage 1 Data abort,"

So while the exception is reported at a different EL, the fault should
still be an "unsupported Exclusive or atomic access". But that's also
assuming that S2 has a device mapping, and it is EL1 that did
something wrong. Surely you should check the IPA against its memory
type?

Further questions: what happens when an L2 guest triggers such a fault?
I don't think you can arbitrarily route it back to L2 without
looking at why it faulted.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH 2/5] arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA}
  2024-12-02 13:55 ` [PATCH 2/5] arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} Yicong Yang
@ 2024-12-03  9:38   ` Marc Zyngier
  2024-12-04  9:13     ` Yicong Yang
  0 siblings, 1 reply; 10+ messages in thread
From: Marc Zyngier @ 2024-12-03  9:38 UTC (permalink / raw)
  To: Yicong Yang
  Cc: catalin.marinas, will, oliver.upton, corbet, linux-arm-kernel,
	kvmarm, linux-kselftest, linux-doc, joey.gouly, suzuki.poulose,
	yuzenghui, shuah, jonathan.cameron, shameerali.kolothum.thodi,
	linuxarm, prime.zeng, xuwei5, yangyicong

On Mon, 02 Dec 2024 13:55:01 +0000,
Yicong Yang <yangyicong@huawei.com> wrote:
> 
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> Armv8.7 introduces single-copy atomic 64-byte load and store
> instructions and their variants, named FEAT_{LS64, LS64_V,
> LS64_ACCDATA}. These features are identified by ID_AA64ISAR1_EL1.LS64
> and the use of such instructions in userspace (EL0) can be trapped.
> In order to support the use of the corresponding instructions in userspace:
> - Make ID_AA64ISAR1_EL1.LS64 visible to userspace
> - Add identification and enablement in the cpufeature list
> - Expose support for these features to userspace through HWCAP3
>   and cpuinfo
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  Documentation/arch/arm64/booting.rst    | 28 ++++++++++
>  Documentation/arch/arm64/elf_hwcaps.rst |  9 ++++
>  arch/arm64/include/asm/hwcap.h          |  3 ++
>  arch/arm64/include/uapi/asm/hwcap.h     |  3 ++
>  arch/arm64/kernel/cpufeature.c          | 70 ++++++++++++++++++++++++-
>  arch/arm64/kernel/cpuinfo.c             |  3 ++
>  arch/arm64/tools/cpucaps                |  3 ++
>  7 files changed, 118 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
> index 3278fb4bf219..c35cfe9da501 100644
> --- a/Documentation/arch/arm64/booting.rst
> +++ b/Documentation/arch/arm64/booting.rst
> @@ -449,6 +449,34 @@ Before jumping into the kernel, the following conditions must be met:
>  
>      - HFGWTR_EL2.nGCS_EL0 (bit 52) must be initialised to 0b1.
>  
> +  For CPUs support for 64-byte loads and stores without status (FEAT_LS64):
> +
> +  - If the kernel is entered at EL1 and EL2 is present:
> +
> +    - HCRX_EL2.EnALS (bit 1) must be initialised to 0b1.
> +
> +  For CPUs support for 64-byte loads and stores with status (FEAT_LS64_V):
> +
> +  - If the kernel is entered at EL1 and EL2 is present:
> +
> +    - HCRX_EL2.EnASR (bit 2) must be initialised to 0b1.
> +
> +  For CPUs support for 64-byte EL0 stores with status (FEAT_LS64_ACCDATA):
> +
> +  - If EL3 is present:
> +
> +    - SCR_EL3.EnAS0 (bit 36) must be initialised to 0b1.
> +
> +    - SCR_EL3.ADEn (bit 37) must be initialised to 0b1.
> +
> +  - If the kernel is entered at EL1 and EL2 is present:
> +
> +    - HCRX_EL2.EnAS0 (bit 0) must be initialised to 0b1.
> +
> +    - HFGRTR_EL2.nACCDATA_EL1 (bit 50) must be initialised to 0b1.
> +
> +    - HFGWTR_EL2.nACCDATA_EL1 (bit 50) must be initialised to 0b1.
> +
>  The requirements described above for CPU mode, caches, MMUs, architected
>  timers, coherency and system registers apply to all CPUs.  All CPUs must
>  enter the kernel in the same exception level.  Where the values documented
> diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
> index 2ff922a406ad..6cb2594f0803 100644
> --- a/Documentation/arch/arm64/elf_hwcaps.rst
> +++ b/Documentation/arch/arm64/elf_hwcaps.rst
> @@ -372,6 +372,15 @@ HWCAP2_SME_SF8DP4
>  HWCAP2_POE
>      Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
>  
> +HWCAP3_LS64
> +    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0001.
> +
> +HWCAP3_LS64_V
> +    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0010.
> +
> +HWCAP3_LS64_ACCDATA
> +    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0011.
> +

I don't mind the two others, but I seriously question exposing ST64BV0
to userspace. How is ACCDATA_EL1 populated? How is it context-switched?

As it stands, this either does the wrong thing by always having the
low 32 bits set to an UNKNOWN value, or actively leaks kernel data.
TBH, I don't see it being usable in practice (the more I read this
part of the architecture, the more broken it looks).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH 5/5] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory
  2024-12-03  9:38   ` Marc Zyngier
@ 2024-12-04  9:08     ` Yicong Yang
  0 siblings, 0 replies; 10+ messages in thread
From: Yicong Yang @ 2024-12-04  9:08 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: yangyicong, catalin.marinas, will, oliver.upton, corbet,
	linux-arm-kernel, kvmarm, linux-kselftest, linux-doc, joey.gouly,
	suzuki.poulose, yuzenghui, shuah, jonathan.cameron,
	shameerali.kolothum.thodi, linuxarm, prime.zeng, xuwei5

On 2024/12/3 17:38, Marc Zyngier wrote:
> On Mon, 02 Dec 2024 13:55:04 +0000,
> Yicong Yang <yangyicong@huawei.com> wrote:
>>
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> FEAT_LS64* instructions only support accesses to Device/Uncacheable
>> memory; otherwise a data abort for an unsupported Exclusive or atomic
> 
> Not quite. FEAT_LS64WB explicitly supports Write-Back mappings.

Thanks for the information. I didn't know this since it's not listed in
DDI0487K.a; I found this feature in [1]. I will refine the comments and
mention that the behaviour only applies when FEAT_LS64WB is not implemented.

[1] https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-6-architecture-extension

> 
>> access (0x35) is generated per the spec. It is implementation defined
>> to which exception level the fault is routed, and it may be routed to
>> EL2 when taken from a (VHE) VM. Per DDI0487K.a Section
>> C3.2.12.2 Single-copy atomic 64-byte load/store:
>>
>>   The check is performed against the resulting memory type after all
>>   enabled stages of translation. In this case the fault is reported
>>   at the final enabled stage of translation.
>>
>> If the implementation reports the DABT at the final enabled stage
>> (stage-2 here), inject a DABT into the guest to handle it.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  arch/arm64/kvm/mmu.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index c9d46ad57e52..b7e6f0a27537 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -1787,6 +1787,20 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>>  		return 1;
>>  	}
>>  
>> +	/*
>> +	 * If instructions of FEAT_{LS64, LS64_V, LS64_ACCDATA} operated on
>> +	 * unsupported memory regions, a DABT for unsupported Exclusive or
>> +	 * atomic access is generated. It's implementation defined whether
>> +	 * the exception will be taken to, a stage-1 DABT or the final enabled
>> +	 * stage of translation (stage-2 in this case as we hit here). Inject
>> +	 * a DABT to the guest to handle it if it's implemented as a stage-2
>> +	 * DABT.
>> +	 */
>> +	if (esr_fsc_is_excl_atomic_fault(esr)) {
>> +		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
>> +		return 1;
>> +	}
> 
> This doesn't seem quite right. This is injecting an *External* Data
> Abort, which is not what the spec says happens, as you are emulating
> the *first* acceptable behaviour:
> 
>   "The check is performed at each enabled stage of translation, and
>    the fault is reported for the first stage of translation that
>    provides an inappropriate memory type. In this case, the value of
>    the HCR_EL2.DC bit does not cause accesses generated by the
>    instructions to generate a stage 1 Data abort,"
> 
> So while the exception is reported at a different EL, the fault should
> still be an "unsupported Exclusive or atomic access". 

Yes, it's trying to emulate the first behaviour so that the guest OS can
handle it the same way the host OS does - by sending a SIGBUS to the EL0
task if it performs the access on an unsupported memory type. Currently,
if such a case happens, the VM is killed without the handling here. Keeping
the injected FSC as "unsupported Exclusive or atomic access" would be more
proper; I will do that in the next version.

> But that's also
> assuming that S2 has a device mapping, and it is EL1 that did
> something wrong. Surely you should check the IPA against its memory
> type?

In my case it happens when both the S1 and S2 mappings are Normal memory,
for example when running the hwcap test (PATCH 3/5) in the VM. But yes,
the memory type should be checked here.

> 
> Further questions: what happens when a L2 guest triggers such fault?
> I don't think you can't arbitrarily route it back to L2 without
> looking at why it faulted.
> 

I will see how this affects the nested VM case. But as you point out, it's
too early to inject the abort here; at the very least we should check the
memory type. Will address it.

Thanks.







* Re: [PATCH 2/5] arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA}
  2024-12-03  9:38   ` Marc Zyngier
@ 2024-12-04  9:13     ` Yicong Yang
  0 siblings, 0 replies; 10+ messages in thread
From: Yicong Yang @ 2024-12-04  9:13 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: yangyicong, catalin.marinas, will, oliver.upton, corbet,
	linux-arm-kernel, kvmarm, linux-kselftest, linux-doc, joey.gouly,
	suzuki.poulose, yuzenghui, shuah, jonathan.cameron,
	shameerali.kolothum.thodi, linuxarm, prime.zeng, xuwei5

On 2024/12/3 17:38, Marc Zyngier wrote:
> On Mon, 02 Dec 2024 13:55:01 +0000,
> Yicong Yang <yangyicong@huawei.com> wrote:
>>
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> Armv8.7 introduces single-copy atomic 64-byte load and store
>> instructions and their variants, named FEAT_{LS64, LS64_V,
>> LS64_ACCDATA}. These features are identified by ID_AA64ISAR1_EL1.LS64
>> and the use of such instructions in userspace (EL0) can be trapped.
>> In order to support the use of the corresponding instructions in userspace:
>> - Make ID_AA64ISAR1_EL1.LS64 visible to userspace
>> - Add identification and enablement in the cpufeature list
>> - Expose support for these features to userspace through HWCAP3
>>   and cpuinfo
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  Documentation/arch/arm64/booting.rst    | 28 ++++++++++
>>  Documentation/arch/arm64/elf_hwcaps.rst |  9 ++++
>>  arch/arm64/include/asm/hwcap.h          |  3 ++
>>  arch/arm64/include/uapi/asm/hwcap.h     |  3 ++
>>  arch/arm64/kernel/cpufeature.c          | 70 ++++++++++++++++++++++++-
>>  arch/arm64/kernel/cpuinfo.c             |  3 ++
>>  arch/arm64/tools/cpucaps                |  3 ++
>>  7 files changed, 118 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/arch/arm64/booting.rst b/Documentation/arch/arm64/booting.rst
>> index 3278fb4bf219..c35cfe9da501 100644
>> --- a/Documentation/arch/arm64/booting.rst
>> +++ b/Documentation/arch/arm64/booting.rst
>> @@ -449,6 +449,34 @@ Before jumping into the kernel, the following conditions must be met:
>>  
>>      - HFGWTR_EL2.nGCS_EL0 (bit 52) must be initialised to 0b1.
>>  
>> +  For CPUs support for 64-byte loads and stores without status (FEAT_LS64):
>> +
>> +  - If the kernel is entered at EL1 and EL2 is present:
>> +
>> +    - HCRX_EL2.EnALS (bit 1) must be initialised to 0b1.
>> +
>> +  For CPUs support for 64-byte loads and stores with status (FEAT_LS64_V):
>> +
>> +  - If the kernel is entered at EL1 and EL2 is present:
>> +
>> +    - HCRX_EL2.EnASR (bit 2) must be initialised to 0b1.
>> +
>> +  For CPUs support for 64-byte EL0 stores with status (FEAT_LS64_ACCDATA):
>> +
>> +  - If EL3 is present:
>> +
>> +    - SCR_EL3.EnAS0 (bit 36) must be initialised to 0b1.
>> +
>> +    - SCR_EL3.ADEn (bit 37) must be initialised to 0b1.
>> +
>> +  - If the kernel is entered at EL1 and EL2 is present:
>> +
>> +    - HCRX_EL2.EnAS0 (bit 0) must be initialised to 0b1.
>> +
>> +    - HFGRTR_EL2.nACCDATA_EL1 (bit 50) must be initialised to 0b1.
>> +
>> +    - HFGWTR_EL2.nACCDATA_EL1 (bit 50) must be initialised to 0b1.
>> +
>>  The requirements described above for CPU mode, caches, MMUs, architected
>>  timers, coherency and system registers apply to all CPUs.  All CPUs must
>>  enter the kernel in the same exception level.  Where the values documented
>> diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
>> index 2ff922a406ad..6cb2594f0803 100644
>> --- a/Documentation/arch/arm64/elf_hwcaps.rst
>> +++ b/Documentation/arch/arm64/elf_hwcaps.rst
>> @@ -372,6 +372,15 @@ HWCAP2_SME_SF8DP4
>>  HWCAP2_POE
>>      Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
>>  
>> +HWCAP3_LS64
>> +    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0001.
>> +
>> +HWCAP3_LS64_V
>> +    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0010.
>> +
>> +HWCAP3_LS64_ACCDATA
>> +    Functionality implied by ID_AA64ISAR1_EL1.LS64 == 0b0011.
>> +
> 
> I don't mind the two others, but I seriously question exposing ST64BV0
> to userspace. How is ACCDATA_EL1 populated? How is it context-switched?
> 
> As it stands, this either does the wrong thing by always having the
> low 32bit set to an UNKNOWN value, or actively leaks kernel data.
> TBH, I don't see it being usable in practice (the more I read this
> part of the architecture, the more broken it looks).
> 

You're right, exposing LS64_ACCDATA alone to userspace won't make it
usable since ACCDATA_EL1 cannot be accessed from EL0. I will drop it at
this stage.

Thanks.




end of thread, other threads:[~2024-12-04  9:14 UTC | newest]

Thread overview: 10+ messages
2024-12-02 13:54 [PATCH 0/5] Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} and related tests Yicong Yang
2024-12-02 13:55 ` [PATCH 1/5] arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V, LS64_ACCDATA} usage at EL0/1 Yicong Yang
2024-12-02 13:55 ` [PATCH 2/5] arm64: Add support for FEAT_{LS64, LS64_V, LS64_ACCDATA} Yicong Yang
2024-12-03  9:38   ` Marc Zyngier
2024-12-04  9:13     ` Yicong Yang
2024-12-02 13:55 ` [PATCH 3/5] kselftest/arm64: Add HWCAP test " Yicong Yang
2024-12-02 13:55 ` [PATCH 4/5] arm64: Add ESR.DFSC definition of unsupported exclusive or atomic access Yicong Yang
2024-12-02 13:55 ` [PATCH 5/5] KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory Yicong Yang
2024-12-03  9:38   ` Marc Zyngier
2024-12-04  9:08     ` Yicong Yang
