linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
@ 2023-02-16 18:21 James Morse
  2023-02-16 18:21 ` [RFC PATCH 1/3] firmware: smccc: Add support for erratum discovery API James Morse
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: James Morse @ 2023-02-16 18:21 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Lorenzo Pieralisi,
	Sudeep Holla, Marc Zyngier, Oliver Upton, James Morse

Hello!

When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
attributes used for instruction fetch; one of the options results in a
non-cacheable access. A whole host of CPUs missed the FWB override
in this case, meaning a KVM guest could fetch stale/junk data instead of
instructions.

The workaround is to disable FWB, and do the required cache maintenance
instead.

The good news is, this isn't a problem for systems using Arm's
interconnect IP. The bad news is: Linux can't know this. Arm knows of
at least one platform that is affected by this erratum.


This series adds support for the 'Errata Management Firmware Interface', [0]
and queries that to determine if the CPU is affected or not.

Unfortunately, no-one has firmware that supports this new interface yet,
and the least surprising thing to do is to enable the workaround by default,
meaning FWB is disabled on all these cores, even for unaffected platforms.
Platforms that are not-affected can either take a firmware-update to support
the interface, or if the kernel they run will only run on hardware that is
unaffected, disable the workaround at build time.

The trusted firmware series to implement the interface has not yet been
posted. I'll include a link once it is.

This series is an RFC as I anticipate a wider discussion around how we add
workarounds that depend on firmware for detection.

The SDEN documents that describe this are:
Cortex-A78:
https://developer.arm.com/documentation/SDEN1401784/1800/?lang=en
Cortex-A78C:
https://developer.arm.com/documentation/SDEN1707916/1300/?lang=en
https://developer.arm.com/documentation/SDEN2004089/0700/?lang=en
(yes, there are two!)
Cortex-A710:
https://developer.arm.com/documentation/SDEN1775101/1500/?lang=en
Cortex-X1:
https://developer.arm.com/documentation/SDEN1401782/1800/?lang=en
Cortex-X2:
https://developer.arm.com/documentation/SDEN1775100/1500/?lang=en
Cortex-X3:
https://developer.arm.com/documentation/SDEN2055130/1000/?lang=en
Neoverse-V1:
https://developer.arm.com/documentation/SDEN1401781/1600/?lang=en
Neoverse-V2:
https://developer.arm.com/documentation/SDEN2332927/0500/?lang=en
Neoverse-N2:
https://developer.arm.com/documentation/SDEN1982442/1200/?lang=en

Thanks,

James

[0] https://developer.arm.com/documentation/den0100/1-0/?lang=en

James Morse (3):
  firmware: smccc: Add support for erratum discovery API
  arm64: cputype: Add new part numbers for Cortex-X3, and Neoverse-V2
  arm64: errata: Disable FWB on parts with non-ARM interconnects

 Documentation/arm64/silicon-errata.rst | 18 ++++++
 arch/arm64/Kconfig                     | 27 +++++++++
 arch/arm64/include/asm/cputype.h       |  4 ++
 arch/arm64/kernel/cpufeature.c         | 77 ++++++++++++++++++++++++-
 drivers/firmware/smccc/Kconfig         |  8 +++
 drivers/firmware/smccc/Makefile        |  1 +
 drivers/firmware/smccc/em.c            | 78 ++++++++++++++++++++++++++
 include/linux/arm-smccc.h              | 28 +++++++++
 8 files changed, 240 insertions(+), 1 deletion(-)
 create mode 100644 drivers/firmware/smccc/em.c

-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [RFC PATCH 1/3] firmware: smccc: Add support for erratum discovery API
  2023-02-16 18:21 [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
@ 2023-02-16 18:21 ` James Morse
  2023-02-16 18:22 ` [RFC PATCH 2/3] arm64: cputype: Add new part numbers for Cortex-X3, and Neoverse-V2 James Morse
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: James Morse @ 2023-02-16 18:21 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Lorenzo Pieralisi,
	Sudeep Holla, Marc Zyngier, Oliver Upton, James Morse

It is not always possible for the OS to determine if a CPU is affected by
a particular erratum. For example, it may depend on an integration choice
the chip designer made, or whether firmware has enabled some particular
feature.

Add support for the SMCCC 'Errata Management Firmware Interface' that lets
the OS query firmware for this information.

Link: https://developer.arm.com/documentation/den0100/1-0/?lang=en
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/cpufeature.c  |  6 +++
 drivers/firmware/smccc/Kconfig  |  8 ++++
 drivers/firmware/smccc/Makefile |  1 +
 drivers/firmware/smccc/em.c     | 78 +++++++++++++++++++++++++++++++++
 include/linux/arm-smccc.h       | 28 ++++++++++++
 5 files changed, 121 insertions(+)
 create mode 100644 drivers/firmware/smccc/em.c

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a77315b338e6..2eb4d38e491a 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -62,6 +62,7 @@
 
 #define pr_fmt(fmt) "CPU features: " fmt
 
+#include <linux/arm_smccc_em.h>
 #include <linux/bsearch.h>
 #include <linux/cpumask.h>
 #include <linux/crash_dump.h>
@@ -1036,6 +1037,11 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 	 */
 	init_cpu_hwcaps_indirect_list();
 
+	/*
+	 * Early erratum workarounds may need to be discovered from firmware.
+	 */
+	arm_smccc_em_init();
+
 	/*
 	 * Detect and enable early CPU capabilities based on the boot CPU,
 	 * after we have initialised the CPU feature infrastructure.
diff --git a/drivers/firmware/smccc/Kconfig b/drivers/firmware/smccc/Kconfig
index 15e7466179a6..a10a150d49bb 100644
--- a/drivers/firmware/smccc/Kconfig
+++ b/drivers/firmware/smccc/Kconfig
@@ -23,3 +23,11 @@ config ARM_SMCCC_SOC_ID
 	help
 	  Include support for the SoC bus on the ARM SMCCC firmware based
 	  platforms providing some sysfs information about the SoC variant.
+
+config ARM_SMCCC_EM
+	bool "Errata discovery by ARM SMCCC"
+	depends on HAVE_ARM_SMCCC_DISCOVERY
+	default y
+	help
+	  Include support for querying firmware via SMCCC to determine whether
+	  the CPU is affected by a specific erratum.
diff --git a/drivers/firmware/smccc/Makefile b/drivers/firmware/smccc/Makefile
index 40d19144a860..39ed128b59b5 100644
--- a/drivers/firmware/smccc/Makefile
+++ b/drivers/firmware/smccc/Makefile
@@ -2,3 +2,4 @@
 #
 obj-$(CONFIG_HAVE_ARM_SMCCC_DISCOVERY)	+= smccc.o kvm_guest.o
 obj-$(CONFIG_ARM_SMCCC_SOC_ID)	+= soc_id.o
+obj-$(CONFIG_ARM_SMCCC_EM)	+= em.o
diff --git a/drivers/firmware/smccc/em.c b/drivers/firmware/smccc/em.c
new file mode 100644
index 000000000000..2c66240d8707
--- /dev/null
+++ b/drivers/firmware/smccc/em.c
@@ -0,0 +1,78 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Arm Errata Management firmware interface.
+ *
+ * This firmware interface advertises support for firmware mitigations for CPU
+ * errata. It can also be used to discover errata where the 'configurations
+ * affected' depends on the integration.
+ *
+ * Copyright (C) 2022 ARM Limited
+ */
+
+#define pr_fmt(fmt) "arm_smccc_em: " fmt
+
+#include <linux/arm_smccc_em.h>
+#include <linux/arm-smccc.h>
+#include <linux/errno.h>
+#include <linux/printk.h>
+
+#include <asm/alternative.h>
+
+#include <uapi/linux/psci.h>
+
+static u32 supported;
+
+int arm_smccc_em_cpu_features(u32 erratum_id)
+{
+	struct arm_smccc_res res;
+
+	if (!READ_ONCE(supported))
+		return -EOPNOTSUPP;
+
+	arm_smccc_1_1_invoke(ARM_SMCCC_EM_CPU_ERRATUM_FEATURES, erratum_id, 0, &res);
+	switch (res.a0) {
+	case SMCCC_RET_NOT_SUPPORTED:
+		return -EOPNOTSUPP;
+	case SMCCC_EM_RET_INVALID_PARAMTER:
+		return -EINVAL;
+	case SMCCC_EM_RET_UNKNOWN:
+		return -ENOENT;
+	case SMCCC_EM_RET_HIGHER_EL_MITIGATION:
+	case SMCCC_EM_RET_NOT_AFFECTED:
+	case SMCCC_EM_RET_AFFECTED:
+		return res.a0;
+	}
+
+	return -EIO;
+}
+
+int __init arm_smccc_em_init(void)
+{
+	u32 major_ver, minor_ver;
+	struct arm_smccc_res res;
+	enum arm_smccc_conduit conduit = arm_smccc_1_1_get_conduit();
+
+	if (conduit == SMCCC_CONDUIT_NONE)
+		return -EOPNOTSUPP;
+
+	arm_smccc_1_1_invoke(ARM_SMCCC_EM_VERSION, &res);
+	if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
+		return -EOPNOTSUPP;
+
+	major_ver = PSCI_VERSION_MAJOR(res.a0);
+	minor_ver = PSCI_VERSION_MINOR(res.a0);
+	if (major_ver != 1)
+		return -EIO;
+
+	arm_smccc_1_1_invoke(ARM_SMCCC_EM_FEATURES,
+			     ARM_SMCCC_EM_CPU_ERRATUM_FEATURES, &res);
+	if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
+		return -EOPNOTSUPP;
+
+	pr_info("SMCCC Errata Management Interface v%u.%u\n",
+		major_ver, minor_ver);
+
+	WRITE_ONCE(supported, 1);
+
+	return 0;
+}
diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
index 220c8c60e021..cc2e38ce8707 100644
--- a/include/linux/arm-smccc.h
+++ b/include/linux/arm-smccc.h
@@ -182,6 +182,25 @@
 			   ARM_SMCCC_OWNER_STANDARD,		\
 			   0x53)
 
+/* Errata Management calls (defined by ARM DEN0100) */
+#define ARM_SMCCC_EM_VERSION					\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,			\
+			   ARM_SMCCC_SMC_32,			\
+			   ARM_SMCCC_OWNER_STANDARD,		\
+			   0xF0)
+
+#define ARM_SMCCC_EM_FEATURES					\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,			\
+			   ARM_SMCCC_SMC_32,			\
+			   ARM_SMCCC_OWNER_STANDARD,		\
+			   0xF1)
+
+#define ARM_SMCCC_EM_CPU_ERRATUM_FEATURES			\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,			\
+			   ARM_SMCCC_SMC_32,			\
+			   ARM_SMCCC_OWNER_STANDARD,		\
+			   0xF2)
+
 /*
  * Return codes defined in ARM DEN 0070A
  * ARM DEN 0070A is now merged/consolidated into ARM DEN 0028 C
@@ -191,6 +210,15 @@
 #define SMCCC_RET_NOT_REQUIRED			-2
 #define SMCCC_RET_INVALID_PARAMETER		-3
 
+/*
+ * Return codes defined in ARM DEN 0100
+ */
+#define	SMCCC_EM_RET_HIGHER_EL_MITIGATION	3
+#define	SMCCC_EM_RET_NOT_AFFECTED		2
+#define	SMCCC_EM_RET_AFFECTED			1
+#define	SMCCC_EM_RET_INVALID_PARAMTER		-2
+#define	SMCCC_EM_RET_UNKNOWN			-3
+
 #ifndef __ASSEMBLY__
 
 #include <linux/linkage.h>
-- 
2.30.2
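
As a cross-check of the three function IDs added to arm-smccc.h above, the
sketch below re-derives them outside the kernel. It assumes the usual SMCCC
fast-call layout (bit 31 fast call, bit 30 calling convention, bits 29:24
owning entity, bits 15:0 function number) and that the standard-service owner
is 4. The em_* names are made up for illustration and are not part of the
patch.

```c
#include <stdint.h>

/* Assumed SMCCC field values, restated here so the sketch is standalone. */
#define EM_FAST_CALL		1u	/* bit 31 set: fast call */
#define EM_SMC_32		0u	/* bit 30 clear: SMC32 convention */
#define EM_OWNER_STANDARD	4u	/* bits 29:24: standard secure service */

/* Hypothetical helper mirroring what ARM_SMCCC_CALL_VAL() computes. */
static inline uint32_t em_call_val(uint32_t type, uint32_t conv,
				   uint32_t owner, uint32_t func)
{
	return (type << 31) | (conv << 30) |
	       ((owner & 0x3fu) << 24) | (func & 0xffffu);
}

/* Function numbers 0xF0..0xF2 are the ones used by the patch. */
static inline uint32_t em_version_id(void)
{
	return em_call_val(EM_FAST_CALL, EM_SMC_32, EM_OWNER_STANDARD, 0xF0);
}

static inline uint32_t em_features_id(void)
{
	return em_call_val(EM_FAST_CALL, EM_SMC_32, EM_OWNER_STANDARD, 0xF1);
}

static inline uint32_t em_cpu_erratum_features_id(void)
{
	return em_call_val(EM_FAST_CALL, EM_SMC_32, EM_OWNER_STANDARD, 0xF2);
}
```

Under those assumptions the version call encodes to 0x840000F0, which is a
quick way to sanity-check a firmware trace against the header.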



* [RFC PATCH 2/3] arm64: cputype: Add new part numbers for Cortex-X3, and Neoverse-V2
  2023-02-16 18:21 [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
  2023-02-16 18:21 ` [RFC PATCH 1/3] firmware: smccc: Add support for erratum discovery API James Morse
@ 2023-02-16 18:22 ` James Morse
  2023-02-16 18:22 ` [RFC PATCH 3/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: James Morse @ 2023-02-16 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Lorenzo Pieralisi,
	Sudeep Holla, Marc Zyngier, Oliver Upton, James Morse

New CPUs have new errata. Add the new part numbers.

Signed-off-by: James Morse <james.morse@arm.com>
---
Cortex-X3:
https://developer.arm.com/documentation/101593/0102/?lang=en
Neoverse-V2:
https://developer.arm.com/documentation/102375/0002/?lang=en
---
 arch/arm64/include/asm/cputype.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 683ca3af4084..1a2c55e172e8 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -84,6 +84,8 @@
 #define ARM_CPU_PART_CORTEX_X2		0xD48
 #define ARM_CPU_PART_NEOVERSE_N2	0xD49
 #define ARM_CPU_PART_CORTEX_A78C	0xD4B
+#define ARM_CPU_PART_CORTEX_X3		0xD4E
+#define ARM_CPU_PART_NEOVERSE_V2	0xD4F
 
 #define APM_CPU_PART_POTENZA		0x000
 
@@ -149,6 +151,8 @@
 #define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
 #define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
 #define MIDR_CORTEX_A78C	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
+#define MIDR_CORTEX_X3		MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X3)
+#define MIDR_NEOVERSE_V2	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
 #define MIDR_THUNDERX	MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
-- 
2.30.2
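
For readers unfamiliar with how a part number turns into a MIDR match, the
sketch below shows the arithmetic in plain C. It assumes the MIDR_EL1 layout
(implementer [31:24], variant [23:20], architecture [19:16], part number
[15:4], revision [3:0]) and compares variant and revision as one packed
value, which is how I understand the kernel's range check to work. The
function names are hypothetical, not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMP_ARM		0x41u	/* Arm Ltd implementer code */
#define PART_CORTEX_X3	0xD4Eu	/* part number added by the patch above */

/* Build a model value: implementer, architecture 0xf, part number. */
static inline uint32_t midr_model(uint32_t imp, uint32_t part)
{
	return (imp << 24) | (0xfu << 16) | (part << 4);
}

/* Pack variant and revision (the "rXpY" of a core) into one value. */
static inline uint32_t midr_var_rev(uint32_t var, uint32_t rev)
{
	return (var << 20) | rev;
}

/* True if midr is the given model and its rXpY lies in [rv_min, rv_max]. */
static inline bool midr_in_range(uint32_t midr, uint32_t model,
				 uint32_t rv_min, uint32_t rv_max)
{
	uint32_t m = midr & 0xff0ffff0u;	/* implementer, arch, part */
	uint32_t rv = midr & 0x00f0000fu;	/* variant, revision */

	return m == model && rv >= rv_min && rv <= rv_max;
}
```

With these definitions a Cortex-X3 model value is 0x410FD4E0, and a range of
r0p0 to r1p1 accepts a MIDR of 0x411FD4E1 (r1p1) but rejects 0x411FD4E2
(r1p2).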



* [RFC PATCH 3/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
  2023-02-16 18:21 [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
  2023-02-16 18:21 ` [RFC PATCH 1/3] firmware: smccc: Add support for erratum discovery API James Morse
  2023-02-16 18:22 ` [RFC PATCH 2/3] arm64: cputype: Add new part numbers for Cortex-X3, and Neoverse-V2 James Morse
@ 2023-02-16 18:22 ` James Morse
  2023-02-16 18:46   ` Marc Zyngier
  2023-02-16 18:52 ` [RFC PATCH 0/3] " Oliver Upton
  2023-02-21 14:38 ` Ard Biesheuvel
  4 siblings, 1 reply; 11+ messages in thread
From: James Morse @ 2023-02-16 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Lorenzo Pieralisi,
	Sudeep Holla, Marc Zyngier, Oliver Upton, James Morse

Force Write Back (FWB) allows the hypervisor to force non-cacheable
accesses made by a guest to be cacheable. This saves the hypervisor
from doing cache maintenance on all pages the guest can access, to
ensure the guest doesn't see stale (and possibly sensitive) data when
making a non-cacheable access.

When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
attributes used for instruction fetch; one of the options results in a
non-cacheable access. A whole host of CPUs missed the FWB override
in this case, meaning a KVM guest could fetch stale/junk data instead of
instructions.

The workaround is to always do the cache maintenance. These parts don't
have fine-grained-traps, so it isn't feasible to detect the guest
disabling the MMU. Instead, disable FWB on the host.

While the CPUs are affected, this erratum doesn't occur on parts using
Arm's CMN interconnects. Use the Errata Management API to discover whether
this CPU is affected.

Because guest execution is compromised, the workaround is enabled by
default. If the Errata Management API isn't implemented by firmware, the
workaround will be enabled. If a target platform is not affected, and it
isn't possible to add support for the Errata Management API, the erratum
can be disabled in Kconfig.

Signed-off-by: James Morse <james.morse@arm.com>
---
 Documentation/arm64/silicon-errata.rst | 18 +++++++
 arch/arm64/Kconfig                     | 27 ++++++++++
 arch/arm64/kernel/cpufeature.c         | 71 +++++++++++++++++++++++++-
 3 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index ec5f889d7681..d6ca86ebc7af 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -106,6 +106,10 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A77      | #1508412        | ARM64_ERRATUM_1508412       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A78      | #2712571        | ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A78C     | #2712575,2712572| ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A510     | #2051678        | ARM64_ERRATUM_2051678       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A510     | #2077057        | ARM64_ERRATUM_2077057       |
@@ -120,12 +124,20 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A710     | #2224489        | ARM64_ERRATUM_2224489       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A710     | #2701952        | ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A715     | #2645198        | ARM64_ERRATUM_2645198       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-X1       | #2712571        | ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-X2       | #2119858        | ARM64_ERRATUM_2119858       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-X2       | #2224489        | ARM64_ERRATUM_2224489       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-X2       | #2701952        | ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-X3       | #2701951        | ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1188873,1418040| ARM64_ERRATUM_1418040       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1349291        | N/A                         |
@@ -138,6 +150,12 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N2     | #2253138        | ARM64_ERRATUM_2253138       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Neoverse-N2     | #2728475        | ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Neoverse-V1     | #2701953        | ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Neoverse-V2     | #2719103        | ARM64_ERRATUM_2701951       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | MMU-500         | #841119,826419  | N/A                         |
 +----------------+-----------------+-----------------+-----------------------------+
 +----------------+-----------------+-----------------+-----------------------------+
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c5ccca26a408..adc46e82cee6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -986,6 +986,33 @@ config ARM64_ERRATUM_2645198
 
 	  If unsure, say Y.
 
+config ARM64_ERRATUM_2701951
+	bool "ARM CPUs: 2701951: disable FWB on affected parts"
+	select ARM_SMCCC_EM
+	default y
+	help
+	  This option adds the workaround for multiple ARM errata titled
+	  "The core might fetch stale instruction from memory when both Stage 1
+	   Translation and Instruction Cache are Disabled with Stage 2 forced
+	   Write-Back".
+	  This affects Cortex cores: A78, A78C, A710, X1, X2, X3, and Neoverse
+	  cores: V1, V2 and N2.
+
+	  Affected cores fail to apply the FWB override to instruction fetch
+	  when stage1 translation is disabled, and SCTLR_EL1.I is clear. This
+	  results in stale data being fetched and executed. Only CPUs that are
+	  connected to a non-Arm interconnect will exhibit symptoms due to this
+	  erratum.
+
+	  Work around this problem in the kernel by disabling FWB on affected
+	  parts. The SMCCC Errata Management API is used to query firmware to
+	  learn if the part is affected.
+
+	  If the SMCCC Errata Management API is not implemented on a platform
+	  with an affected core, the workaround will be applied.
+
+	  If unsure, say Y.
+
 config CAVIUM_ERRATUM_22375
 	bool "Cavium erratum 22375, 24313"
 	default y
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 2eb4d38e491a..1d7156e75468 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1574,6 +1574,75 @@ static bool has_cache_dic(const struct arm64_cpu_capabilities *entry,
 	return ctr & BIT(CTR_EL0_DIC_SHIFT);
 }
 
+static bool has_stage2_fwb(const struct arm64_cpu_capabilities *entry,
+			   int scope)
+{
+	bool has_feature = has_cpuid_feature(entry, scope);
+
+	/* List of CPUs which may have broken FWB support. */
+	static const struct midr_range cpus[] = {
+#ifdef CONFIG_ARM64_ERRATUM_2701951
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
+		MIDR_RANGE(MIDR_CORTEX_X3, 0, 0, 1, 1),
+		MIDR_RANGE(MIDR_NEOVERSE_V1, 0, 0, 1, 1),
+		MIDR_RANGE(MIDR_NEOVERSE_V2, 0, 0, 0, 1),
+		MIDR_RANGE(MIDR_NEOVERSE_N2, 0, 0, 0, 2),
+#endif
+		{ /* sentinel */ },
+	};
+
+	if (!has_feature)
+		return false;
+
+	if (is_midr_in_range_list(read_cpuid_id(), cpus)) {
+		int i;
+		bool fwb_broken = true;
+
+		/*
+		 * List of erratum numbers for these CPUs.
+		 * It isn't possible to match these to their CPUs, as A78C has
+		 * two erratum numbers. The errata management API will return
+		 * 'UNKNOWN' for an erratum it doesn't recognise.
+		 */
+		static const u32 erratum_nums[] = {
+			2701951,
+			2701952,
+			2701953,
+			2712571,
+			2712572,
+			2712575,
+			2719103,
+			2728475,
+		};
+
+		/*
+		 * The CPU is affected, but what about this configuration?
+		 * Only firmware has the answer. Assume the part is affected,
+		 * and query firmware for the set of erratum numbers. If one
+		 * returns not-affected, the workaround isn't needed.
+		 */
+		for (i = 0; i < ARRAY_SIZE(erratum_nums); i++) {
+			int state = arm_smccc_em_cpu_features(erratum_nums[i]);
+
+			if (state == SMCCC_EM_RET_NOT_AFFECTED) {
+				fwb_broken = false;
+				break;
+			}
+		}
+
+		if (fwb_broken) {
+			pr_info_once("%s disabled due to erratum #2701951\n", entry->desc);
+			return false;
+		}
+	}
+
+	return has_feature;
+}
+
 static bool __maybe_unused
 has_useable_cnp(const struct arm64_cpu_capabilities *entry, int scope)
 {
@@ -2365,7 +2434,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.field_pos = ID_AA64MMFR2_EL1_FWB_SHIFT,
 		.field_width = 4,
 		.min_field_value = 1,
-		.matches = has_cpuid_feature,
+		.matches = has_stage2_fwb,
 	},
 	{
 		.desc = "ARMv8.4 Translation Table Level",
-- 
2.30.2
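
The policy in has_stage2_fwb() can be exercised outside the kernel with a
stubbed firmware query. The sketch below is only a model of that loop: the
em_query_fn callback, the fw_* stubs and fwb_broken() are hypothetical names,
and the errno values stand in for what arm_smccc_em_cpu_features() returns in
the patch. It shows that a missing interface or an UNKNOWN answer leaves the
workaround on, and that a single NOT_AFFECTED answer turns it off.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Positive return codes mirrored from the series (ARM DEN 0100). */
#define EM_RET_AFFECTED		1
#define EM_RET_NOT_AFFECTED	2

/* Same erratum numbers as the list in the patch. */
static const unsigned int erratum_ids[] = {
	2701951, 2701952, 2701953, 2712571,
	2712572, 2712575, 2719103, 2728475,
};
#define N_IDS (sizeof(erratum_ids) / sizeof(erratum_ids[0]))

/* Hypothetical stand-in for the firmware query. */
typedef int (*em_query_fn)(unsigned int erratum_id);

/* Assume broken; only an explicit NOT_AFFECTED answer clears it. */
static bool fwb_broken(const unsigned int *ids, size_t n, em_query_fn query)
{
	for (size_t i = 0; i < n; i++) {
		if (query(ids[i]) == EM_RET_NOT_AFFECTED)
			return false;
	}
	return true;
}

/* Stub: firmware without the Errata Management interface. */
static int fw_absent(unsigned int id)
{
	(void)id;
	return -EOPNOTSUPP;
}

/* Stub: firmware that only knows 2719103 and reports it not affected. */
static int fw_not_affected(unsigned int id)
{
	return id == 2719103 ? EM_RET_NOT_AFFECTED : -ENOENT;
}

/* Stub: firmware that only knows 2719103 and reports it affected. */
static int fw_affected(unsigned int id)
{
	return id == 2719103 ? EM_RET_AFFECTED : -ENOENT;
}
```

One consequence of this design is that the kernel never needs to know which
erratum number belongs to which CPU: firmware answers UNKNOWN for the numbers
that do not apply, and those answers are simply ignored.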



* Re: [RFC PATCH 3/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
  2023-02-16 18:22 ` [RFC PATCH 3/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
@ 2023-02-16 18:46   ` Marc Zyngier
  2023-02-21 17:48     ` James Morse
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2023-02-16 18:46 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Rutland,
	Lorenzo Pieralisi, Sudeep Holla, Oliver Upton

On Thu, 16 Feb 2023 18:22:01 +0000,
James Morse <james.morse@arm.com> wrote:
> 
> Force Write Back (FWB) allows the hypervisor to force non-cacheable
> accesses made by a guest to be cacheable. This saves the hypervisor
> from doing cache maintenance on all pages the guest can access, to
> ensure the guest doesn't see stale (and possibly sensitive) data when
> making a non-cacheable access.
> 
> When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
> attributes used for instruction fetch, one of the options results in a
> non-cacheable access. A whole host of CPUs missed the FWB override
> in this case, meaning a KVM guest could fetch stale/junk data instead of
> instructions.
> 
> The workaround is to always do the cache maintenance. These parts don't
> have fine-grained-traps, so it isn't feasible to detect the guest
> disabling the MMU. Instead, disable FWB on the host.
> 
> While the CPUs are affected, this erratum doesn't occur on parts using
> Arm's CMN interconnects. Use the Errata Management API to discover whether
> this CPU is affected.
> 
> Because guest execution is compromised, the workaround is enabled by
> default. If the Errata Management API isn't implemented by firmware, the
> workaround will be enabled. If a target platform is not affected, and it
> isn't possible to add support for the Errata Management API, the erratum
> can be disabled in Kconfig.

I'm feeling a bit sick...

My main concern is that hardly anyone implements this errata management
API, if at all. We should:

- give people an option to disable this from the command-line if they
  know they are on an unaffected system

- have some form of DT property that indicates the HW isn't affected

Thoughts?

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
  2023-02-16 18:21 [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
                   ` (2 preceding siblings ...)
  2023-02-16 18:22 ` [RFC PATCH 3/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
@ 2023-02-16 18:52 ` Oliver Upton
  2023-02-21 17:41   ` James Morse
  2023-02-21 14:38 ` Ard Biesheuvel
  4 siblings, 1 reply; 11+ messages in thread
From: Oliver Upton @ 2023-02-16 18:52 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Rutland,
	Lorenzo Pieralisi, Sudeep Holla, Marc Zyngier

Hi James,

On Thu, Feb 16, 2023 at 06:21:58PM +0000, James Morse wrote:
> Hello!
> 
> When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
> attributes used for instruction fetch, one of the options results in a
> non-cacheable access. A whole host of CPUs missed the FWB override
> in this case, meaning a KVM guest could fetch stale/junk data instead of
> instructions.
> 
> The workaround is to disable FWB, and do the required cache maintenance
> instead.
> 
> The good news is, this isn't a problem for systems using Arm's
> interconnect IP. The bad news is: linux can't know this. Arm knows of
> at least one platform that is affected by this erratum.
> 
> 
> This series adds support for the 'Errata Management Firmware Interface', [0]
> and queries that to determine if the CPU is affected or not.
> 
> Unfortunately, no-one has firmware that supports this new interface yet,
> and the least surprising thing to do is to enable the workaround by default,
> meaning FWB is disabled on all these cores, even for unaffected platforms.
> Platforms that are not-affected can either take a firmware-update to support
> the interface, or if the kernel they run will only run on hardware that is
> unaffected, disable the workaround at build time.

Wait, what? Is there a legitimate concern that affected systems are in
the wild today, or is there enough time for affected platforms to go and
implement the necessary firmware interface? Requiring correctly
implemented systems to explicitly opt-out seems like quite a lot more
work (w/ low likelihood) than having the one known platform go about
this the right way.

I'm rather troubled by the idea of enabling this by default on systems
that use these cores unless there really is no opportunity to
course-correct.

-- 
Thanks,
Oliver


* Re: [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
  2023-02-16 18:21 [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
                   ` (3 preceding siblings ...)
  2023-02-16 18:52 ` [RFC PATCH 0/3] " Oliver Upton
@ 2023-02-21 14:38 ` Ard Biesheuvel
  2023-02-21 17:41   ` James Morse
  4 siblings, 1 reply; 11+ messages in thread
From: Ard Biesheuvel @ 2023-02-21 14:38 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Rutland,
	Lorenzo Pieralisi, Sudeep Holla, Marc Zyngier, Oliver Upton

On Thu, 16 Feb 2023 at 19:23, James Morse <james.morse@arm.com> wrote:
>
> Hello!
>
> When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
> attributes used for instruction fetch, one of the options results in a
> non-cacheable access. A whole host of CPUs missed the FWB override
> in this case, meaning a KVM guest could fetch stale/junk data instead of
> instructions.
>
> The workaround is to disable FWB, and do the required cache maintenance
> instead.
>

So the system should behave as if SCTLR_EL1.I==1 when FWB is enabled,
but it doesn't, right? Couldn't we just force SCTLR_EL1.I to 1 when
FWB is enabled? I.e., trap writes and override the I bit - and if we
want to pretend it is 0 we could trap reads and lie to the guest as
well, but I doubt we'd even need that.
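
To make the suggestion above concrete, here is a minimal sketch of the
bookkeeping such a trap handler would need. It is not KVM code: the struct
and function names are invented, and it assumes only that SCTLR_EL1.I is
bit 12. It says nothing about whether trapping these accesses is practical
on the affected parts.

```c
#include <stdint.h>

#define SCTLR_EL1_I	(UINT64_C(1) << 12)	/* instruction cacheability bit */

/* Hypothetical per-vCPU state: what the guest wrote vs. what hardware gets. */
struct sctlr_shadow {
	uint64_t guest_view;	/* value returned to the guest on reads */
	uint64_t hw_value;	/* value programmed into the real register */
};

/* Trapped write: remember the guest's value, force I to 1 in hardware. */
static inline struct sctlr_shadow emulate_sctlr_write(uint64_t guest_val)
{
	struct sctlr_shadow s = {
		.guest_view = guest_val,
		.hw_value = guest_val | SCTLR_EL1_I,
	};

	return s;
}

/* Trapped read: optionally "lie" and return what the guest last wrote. */
static inline uint64_t emulate_sctlr_read(const struct sctlr_shadow *s)
{
	return s->guest_view;
}
```

The read side is only needed if the guest must keep seeing I == 0; dropping
it gives the simpler write-only variant mentioned above.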


* Re: [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
  2023-02-16 18:52 ` [RFC PATCH 0/3] " Oliver Upton
@ 2023-02-21 17:41   ` James Morse
  2023-02-24 19:00     ` Oliver Upton
  0 siblings, 1 reply; 11+ messages in thread
From: James Morse @ 2023-02-21 17:41 UTC (permalink / raw)
  To: Oliver Upton
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Rutland,
	Lorenzo Pieralisi, Sudeep Holla, Marc Zyngier

Hi Oliver,

On 16/02/2023 18:52, Oliver Upton wrote:
> On Thu, Feb 16, 2023 at 06:21:58PM +0000, James Morse wrote:
>> When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
>> attributes used for instruction fetch, one of the options results in a
>> non-cacheable access. A whole host of CPUs missed the FWB override
>> in this case, meaning a KVM guest could fetch stale/junk data instead of
>> instructions.
>>
>> The workaround is to disable FWB, and do the required cache maintenance
>> instead.
>>
>> The good news is, this isn't a problem for systems using Arm's
>> interconnect IP. The bad news is: linux can't know this. Arm knows of
>> at least one platform that is affected by this erratum.
>>
>>
>> This series adds support for the 'Errata Management Firmware Interface', [0]
>> and queries that to determine if the CPU is affected or not.
>>
>> Unfortunately, no-one has firmware that supports this new interface yet,
>> and the least surprising thing to do is to enable the workaround by default,
>> meaning FWB is disabled on all these cores, even for unaffected platforms.
>> Platforms that are not affected can either take a firmware update to support
>> the interface, or if the kernel they run will only run on hardware that is
>> unaffected, disable the workaround at build time.

> Wait, what? Is there a legitimate concern that affected systems are in
> the wild today, or is there enough time for affected platforms to go and
> implement the necessary firmware interface?

The one platform that Arm is aware of isn't shipping yet - I assume it will implement the
firmware interface.

But I don't think Arm always knows what people are building ... it certainly doesn't
reach me. This affects a whole host of CPUs; I wouldn't be surprised if there is an
existing part out there that is affected.


> Requiring correctly
> implemented systems to explicitly opt-out seems like quite a lot more
> work (w/ low likelihood) than having the one known platform go about
> this the right way.

Sure, but it's safe by default.


> I'm rather troubled by the idea of enabling this by default on systems
> that use these cores unless there really is no opportunity to
> course-correct.

It's the choice between correctness and performance. Probability says unless the CPU is
Neoverse-V2 (which is that one platform), you're not affected. But how much does
correctness matter? I'd hate to have to debug "1 in 100 times the guest doesn't boot".


Thanks,

James

* Re: [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
  2023-02-21 14:38 ` Ard Biesheuvel
@ 2023-02-21 17:41   ` James Morse
  0 siblings, 0 replies; 11+ messages in thread
From: James Morse @ 2023-02-21 17:41 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Rutland,
	Lorenzo Pieralisi, Sudeep Holla, Marc Zyngier, Oliver Upton

Hi Ard,

On 21/02/2023 14:38, Ard Biesheuvel wrote:
> On Thu, 16 Feb 2023 at 19:23, James Morse <james.morse@arm.com> wrote:
>> When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
>> attributes used for instruction fetch, one of the options results in a
>> non-cacheable access. A whole host of CPUs missed the FWB override
>> in this case, meaning a KVM guest could fetch stale/junk data instead of
>> instructions.
>>
>> The workaround is to disable FWB, and do the required cache maintenance
>> instead.

> So the system should behave as if SCTLR_EL1.I==1 when FWB is enabled,
> but it doesn't, right? Couldn't we just force SCTLR_EL1.I to 1 when
> FWB is enabled? I.e., trap writes and override the I bit - and if we
> want to pretend it is 0 we could trap reads and lie to the guest as
> well, but I doubt we'd even need that.

The affected parts don't have fine-grained traps, so we'd need to set HCR_EL2.TVM, which
traps loads of things. We'd only need it while the guest has the MMU disabled, and KVM
already has code that uses this trap to try and spot this ...

... but it only works until the first time you enable SCTLR_EL1.M, as the trap is too
costly to leave enabled. If you put the workaround in there, it would work the first time
a guest booted, but a subsequent kexec, or anything else that turns the MMU off, is
exposed. It's an incomplete fix, I'd hate to have to debug it!
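
To illustrate the failure mode described above, here is a toy model in C. It
is not KVM code: the struct, the function names and the reset state are all
hypothetical, and it only assumes what the message states, namely that the
HCR_EL2.TVM trap is dropped the first time the guest enables the MMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_vcpu {
	bool tvm_trap;   /* models HCR_EL2.TVM being set for this vCPU */
	bool mmu_on;     /* models the guest's SCTLR_EL1.M */
	bool icache_on;  /* models SCTLR_EL1.I as the hardware sees it */
};

/* Assumed reset state: MMU off, trap armed, I already forced to 1. */
static void toy_vcpu_reset(struct toy_vcpu *v)
{
	v->tvm_trap = true;
	v->mmu_on = false;
	v->icache_on = true;
}

/*
 * Guest writes SCTLR_EL1 with the given M and I bits. While the trap is
 * armed the hypervisor sees the write and can force I=1 whenever M=0.
 * Once the guest enables the MMU the trap is dropped (too costly to keep),
 * so later writes reach the hardware unmodified.
 */
static void toy_guest_write_sctlr(struct toy_vcpu *v, bool m, bool i)
{
	bool trapped = v->tvm_trap;

	v->mmu_on = m;
	v->icache_on = (trapped && !m) ? true : i;
	if (trapped && m)
		v->tvm_trap = false;
}

/* The erratum window: stage 1 off and instruction fetch non-cacheable. */
static bool toy_exposed(const struct toy_vcpu *v)
{
	assert(v);
	return !v->mmu_on && !v->icache_on;
}
```

In this model a first boot is covered, because the write with M=0 is trapped
and I is forced. After the guest has enabled the MMU once, a later write with
M=0 and I=0 (the kexec case) is no longer seen by the hypervisor, and
toy_exposed() returns true.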


Thanks,

James

* Re: [RFC PATCH 3/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
  2023-02-16 18:46   ` Marc Zyngier
@ 2023-02-21 17:48     ` James Morse
  0 siblings, 0 replies; 11+ messages in thread
From: James Morse @ 2023-02-21 17:48 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Rutland,
	Lorenzo Pieralisi, Sudeep Holla, Oliver Upton

Hi Marc,

On 16/02/2023 18:46, Marc Zyngier wrote:
> On Thu, 16 Feb 2023 18:22:01 +0000,
> James Morse <james.morse@arm.com> wrote:
>>
>> Force Write Back (FWB) allows the hypervisor to force non-cacheable
>> accesses made by a guest to be cacheable. This saves the hypervisor
>> from doing cache maintenance on all pages the guest can access, to
>> ensure the guest doesn't see stale (and possibly sensitive) data when
>> making a non-cacheable access.
>>
>> When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
>> attributes used for instruction fetch, one of the options results in a
>> non-cacheable access. A whole host of CPUs missed the FWB override
>> in this case, meaning a KVM guest could fetch stale/junk data instead of
>> instructions.
>>
>> The workaround is to always do the cache maintenance. These parts don't
>> have fine-grained traps, so it isn't feasible to detect the guest
>> disabling the MMU. Instead, disable FWB on the host.
>>
>> While the CPUs are affected, this erratum doesn't occur on parts using
>> Arm's CMN interconnects. Use the Errata Management API to discover whether
>> this CPU is affected.
>>
>> Because guest execution is compromised, the workaround is enabled by
>> default. If the Errata Management API isn't implemented by firmware, the
>> workaround will be enabled. If a target platform is not affected, and it
>> isn't possible to add support for the Errata Management API, the erratum
>> can be disabled in Kconfig.

> I'm feeling a bit sick...

> My main concern is hardly anyone implements this errata management
> API, if at all. We should:

If anyone? Today no-one implements it!

We've always had to update one of the firmware or kernel for any errata workaround. I
agree this 'both' option is annoying, but if half the story was missing, you already had a
problem.


> - give people an option to disable this from the command-line if they
>   know they are on an unaffected system

(my least favourite)


> - have some form of DT property that indicates the HW isn't affected

All perfectly valid options. The one part Arm is aware of that is affected uses Neoverse-V2,
which is much more likely to appear in ACPI machines. The firmware discovery is preferable
to trying to match the 'OEM id' of some random ACPI table to determine if the part is affected -
that whole model falls down if the SoC is OEM'd. (Dell, HP, Lenovo, etc)

I think it's fair to say you have to support the firmware discovery API if you use ACPI,
and it's optional for DT.
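
The default-on policy from the quoted commit message could be sketched in C
roughly as below. This is not the code from the series: the enum, the function
name and the 'built_in' flag (standing in for the Kconfig option) are
hypothetical, and the firmware answer is reduced to three outcomes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of asking firmware about this erratum. */
enum toy_fw_answer {
	FW_NO_INTERFACE,  /* firmware doesn't implement the discovery API */
	FW_AFFECTED,
	FW_NOT_AFFECTED,
};

/*
 * Decide whether the host should disable FWB.
 *
 * built_in:             models the Kconfig option; false means the
 *                       workaround was compiled out.
 * cpu_in_affected_list: the CPU part is one of the affected cores.
 * fw:                   what firmware reported, if anything.
 */
static bool toy_must_disable_fwb(bool built_in, bool cpu_in_affected_list,
				 enum toy_fw_answer fw)
{
	assert(fw == FW_NO_INTERFACE || fw == FW_AFFECTED ||
	       fw == FW_NOT_AFFECTED);

	if (!built_in || !cpu_in_affected_list)
		return false;

	/* Only an explicit "not affected" lets FWB stay enabled. */
	return fw != FW_NOT_AFFECTED;
}
```

The point of the sketch is the last line: a missing firmware interface is
treated the same as "affected", which is what makes the workaround safe by
default and costs unaffected platforms FWB until they opt out.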


Thanks,

James

* Re: [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects
  2023-02-21 17:41   ` James Morse
@ 2023-02-24 19:00     ` Oliver Upton
  0 siblings, 0 replies; 11+ messages in thread
From: Oliver Upton @ 2023-02-24 19:00 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Rutland,
	Lorenzo Pieralisi, Sudeep Holla, Marc Zyngier

James,

I realize I didn't send my reply earlier this week and came back to this
when looking for your reply on another thread.

Sorry about that.

On Tue, Feb 21, 2023 at 05:41:35PM +0000, James Morse wrote:

[...]

> > Wait, what? Is there a legitimate concern that affected systems are in
> > the wild today, or is there enough time for affected platforms to go and
> > implement the necessary firmware interface?
> 
> The one platform that Arm is aware of isn't shipping yet - I assume it will implement the
> firmware interface.
> 
> But I don't think Arm always knows what people are building ... it certainly doesn't
> reach me. This affects a whole host of CPUs; I wouldn't be surprised if there is an
> existing part out there that is affected.

I was only thinking of the V2 system that is unobtainable at this point.
Actually looking at the laundry list of affected cores it does make more
sense that this problem already exists today.

> > I'm rather troubled by the idea of enabling this by default on systems
> > that use these cores unless there really is no opportunity to
> > course-correct.
> 
> It's the choice between correctness and performance. Probability says unless the CPU is
> Neoverse-V2 (which is that one platform), you're not affected. But how much does
> correctness matter? I'd hate to have to debug "1 in 100 times the guest doesn't boot".

Oh, not looking to make that tradeoff with my line of questioning :) I
was more curious if there was still an opportunity for affected systems
to explicitly opt-in to the mitigation. Seems that the answer is "no",
sadly.

-- 
Thanks,
Oliver

Thread overview: 11+ messages
2023-02-16 18:21 [RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
2023-02-16 18:21 ` [RFC PATCH 1/3] firmware: smccc: Add support for erratum discovery API James Morse
2023-02-16 18:22 ` [RFC PATCH 2/3] arm64: cputype: Add new part numbers for Cortex-X3, and Neoverse-V2 James Morse
2023-02-16 18:22 ` [RFC PATCH 3/3] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
2023-02-16 18:46   ` Marc Zyngier
2023-02-21 17:48     ` James Morse
2023-02-16 18:52 ` [RFC PATCH 0/3] " Oliver Upton
2023-02-21 17:41   ` James Morse
2023-02-24 19:00     ` Oliver Upton
2023-02-21 14:38 ` Ard Biesheuvel
2023-02-21 17:41   ` James Morse
