linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v4 0/3] Initial BBML2 support for contpte_convert()
@ 2025-03-19 15:05 Mikołaj Lenczewski
  2025-03-19 15:05 ` [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature Mikołaj Lenczewski
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-19 15:05 UTC (permalink / raw)
  To: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
	jean-philippe, robin.murphy, joro, akpm, ardb, mark.rutland,
	joey.gouly, maz, james.morse, broonie, oliver.upton, baohua,
	david, ioworker0, jgg, nicolinc, mshavit, jsnitsel, smostafa,
	linux-doc, linux-kernel, linux-arm-kernel, iommu
  Cc: Mikołaj Lenczewski

Hi All,

This patch series adds initial support for eliding break-before-make
requirements on systems that support BBML2 and additionally guarantee
to never raise a conflict abort.

This support elides a TLB invalidation in contpte_convert(). This
leads to a 12% improvement when executing a microbenchmark designed
to force the pathological path where contpte_convert() gets called.
This represents an 80% reduction in the cost of calling
contpte_convert().
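
As a rough cross-check of the two figures above, here is a minimal sketch
of the arithmetic. It assumes the 12% is a reduction in total benchmark
runtime and that nothing other than contpte_convert() got cheaper; neither
assumption is stated above, and toy_routine_share() is a hypothetical
helper, not part of the series. Under those assumptions contpte_convert()
would account for roughly 15% of the baseline runtime.

```c
#include <assert.h>

/*
 * If a routine takes `share` of the baseline runtime and its cost drops by
 * `routine_saving` (0..1], the whole run gets shorter by
 *     total_saving = share * routine_saving
 * so the share can be recovered by dividing the two reported figures.
 */
static inline double toy_routine_share(double total_saving,
				       double routine_saving)
{
	assert(routine_saving > 0.0 && routine_saving <= 1.0);
	return total_saving / routine_saving;
}
```

Calling toy_routine_share(0.12, 0.80) gives about 0.15.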

This series is based on v6.14-rc5 (7eb172143d55).

Notes
======

Patch 1 implements an allow-list of cpus that support BBML2, but with
the additional constraint of never causing TLB conflict aborts. We
settled on this constraint because we will use the feature for kernel
mappings in the future, for which we cannot handle conflict aborts
safely.

Yang Shi has a series at [1] that aims to use BBML2 to enable splitting
the linear map at runtime. This series partially overlaps with it to add
the cpu feature. We believe this series is fully compatible with Yang's
requirements and could go first.

Due to constraints with the current design of the cpufeature framework
and the fact that our has_bbml2_noabort() check relies on both a MIDR
allowlist and the exposed mmfr2 register value, if an implementation
supports our desired bbml2+noabort semantics but fails to declare
support for base bbml2 via the id_aa64mmfr2.bbm field, the check will
fail.

Not declaring base support for bbml2 when supporting bbml2+noabort
should be considered an erratum [2], and a workaround can be applied in
__cpuinfo_store_cpu() to patch in support for bbml2 for the sanitised
register view used by SCOPE_SYSTEM. However, SCOPE_LOCAL_CPU bypasses
this sanitised view and reads the MSRs directly by design, and so an
additional workaround can be applied in __read_sysreg_by_encoding()
for the MMFR2 case.
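
To make the failure mode concrete, here is a userspace sketch of the
combined check (MIDR allowlist plus id_aa64mmfr2.bbm >= 2). It is a model
only, not the kernel code: the toy_* names are hypothetical, the BBM field
is taken to be bits [55:52] of ID_AA64MMFR2_EL1, and the part numbers for
Cortex-X4 (0xd82) and Neoverse V3 (0xd84) are what I believe
asm/cputype.h uses, so please double check them before relying on this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: ID_AA64MMFR2_EL1.BBM lives in bits [55:52]. */
#define TOY_MMFR2_BBM_SHIFT	52

/* MIDR_EL1: implementer [31:24], variant [23:20], arch [19:16],
 * partnum [15:4], revision [3:0]. The model mask drops variant/revision. */
#define TOY_MIDR_MODEL_MASK	UINT32_C(0xff0ffff0)

struct toy_midr_range {
	uint32_t model;		/* MIDR with variant and revision cleared */
	uint32_t rv_min;	/* (variant << 4) | revision, inclusive */
	uint32_t rv_max;	/* inclusive */
};

static inline uint32_t toy_midr_rv(uint32_t midr)
{
	return (((midr >> 20) & 0xf) << 4) | (midr & 0xf);
}

static inline bool toy_midr_in_allowlist(uint32_t midr)
{
	/* Mirrors the two MIDR_REV_RANGE() entries in patch 1. */
	static const struct toy_midr_range list[] = {
		{ UINT32_C(0x410fd820), 0x03, 0x0f },	/* Cortex-X4 r0p3+ (assumed part 0xd82) */
		{ UINT32_C(0x410fd840), 0x02, 0x0f },	/* Neoverse V3 r0p2+ (assumed part 0xd84) */
	};
	uint32_t rv = toy_midr_rv(midr);

	for (size_t i = 0; i < sizeof(list) / sizeof(list[0]); i++) {
		if ((midr & TOY_MIDR_MODEL_MASK) == list[i].model &&
		    rv >= list[i].rv_min && rv <= list[i].rv_max)
			return true;
	}
	return false;
}

/* Both conditions must hold, which is why an allowlisted CPU that reports
 * BBM == 0 in its ID register still fails the check. */
static inline bool toy_has_bbml2_noabort(uint32_t midr, uint64_t mmfr2)
{
	unsigned int bbm = (unsigned int)((mmfr2 >> TOY_MMFR2_BBM_SHIFT) & 0xf);

	return toy_midr_in_allowlist(midr) && bbm >= 2;
}
```

With this model, an allowlisted MIDR such as 0x410fd823 passes when the BBM
field reads 2 and fails when it reads 0, which is the erratum case described
above.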

[1]:
  https://lore.kernel.org/linux-arm-kernel/20250304222018.615808-1-yang@os.amperecomputing.com/

[2]:
  https://lore.kernel.org/linux-arm-kernel/3bba7adb-392b-4024-984f-b6f0f0f88629@arm.com/

Changelog
=========

v4:
  - rebase onto v6.14-rc5
  - switch from arm64 sw feature override to hw feature override
  - reintroduce has_cpuid_feature() check in addition to MIDR check

v3:
  - https://lore.kernel.org/all/20250313104111.24196-2-miko.lenczewski@arm.com/
  - rebase onto v6.14-rc4
  - add arm64.nobbml2 commandline override
  - squash "delay tlbi" and "elide tlbi" patches

v2:
  - https://lore.kernel.org/all/20250228182403.6269-2-miko.lenczewski@arm.com/
  - fix buggy MIDR check to properly account for all boot+late cpus
  - add smmu bbml2 feature check

v1:
  - https://lore.kernel.org/all/20250219143837.44277-3-miko.lenczewski@arm.com/
  - rebase onto v6.14-rc3
  - remove kvm bugfix patches from series
  - strip out conflict abort handler code
  - switch from blocklist to allowlist of bbml2+noabort implementations
  - remove has_cpuid_feature() in favour of MIDR check

rfc-v1:
  - https://lore.kernel.org/all/20241211154611.40395-1-miko.lenczewski@arm.com/
  - https://lore.kernel.org/all/20241211160218.41404-1-miko.lenczewski@arm.com/

Mikołaj Lenczewski (3):
  arm64: Add BBM Level 2 cpu feature
  iommu/arm: Add BBM Level 2 smmu feature
  arm64/mm: Elide tlbi in contpte_convert() under BBML2

 .../admin-guide/kernel-parameters.txt         |  3 +
 arch/arm64/Kconfig                            | 11 +++
 arch/arm64/include/asm/cpucaps.h              |  2 +
 arch/arm64/include/asm/cpufeature.h           |  5 ++
 arch/arm64/kernel/cpufeature.c                | 68 +++++++++++++++++++
 arch/arm64/kernel/pi/idreg-override.c         |  2 +
 arch/arm64/mm/contpte.c                       |  3 +-
 arch/arm64/tools/cpucaps                      |  1 +
 .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  3 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  3 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  4 ++
 11 files changed, 104 insertions(+), 1 deletion(-)

-- 
2.48.1




* [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature
  2025-03-19 15:05 [PATCH v4 0/3] Initial BBML2 support for contpte_convert() Mikołaj Lenczewski
@ 2025-03-19 15:05 ` Mikołaj Lenczewski
  2025-03-20 13:24   ` Suzuki K Poulose
  2025-03-20 13:37   ` Ard Biesheuvel
  2025-03-19 15:05 ` [PATCH v4 2/3] iommu/arm: Add BBM Level 2 smmu feature Mikołaj Lenczewski
  2025-03-19 15:05 ` [PATCH v4 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2 Mikołaj Lenczewski
  2 siblings, 2 replies; 11+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-19 15:05 UTC (permalink / raw)
  To: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
	jean-philippe, robin.murphy, joro, akpm, ardb, mark.rutland,
	joey.gouly, maz, james.morse, broonie, oliver.upton, baohua,
	david, ioworker0, jgg, nicolinc, mshavit, jsnitsel, smostafa,
	linux-doc, linux-kernel, linux-arm-kernel, iommu
  Cc: Mikołaj Lenczewski

The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
and this commit adds a dedicated BBML2 cpufeature to test for, as well
as a kernel command-line parameter to optionally disable BBML2
altogether.
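
For readers unfamiliar with the idreg override mechanism, the effect of
arm64.nobbml2 (an alias for id_aa64mmfr2.bbm=0) can be pictured with the
sketch below. It is a simplified userspace model: the toy_* names are
hypothetical, the val/mask pair only imitates the shape of an ID register
override, and the real code additionally checks that the overridden value
is safe before applying it.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MMFR2_BBM_SHIFT	52	/* assumed: BBM is bits [55:52] */
#define TOY_MMFR2_BBM_MASK	(UINT64_C(0xf) << TOY_MMFR2_BBM_SHIFT)

/* Simplified override: bits set in mask are replaced by the bits in val. */
struct toy_ftr_override {
	uint64_t val;
	uint64_t mask;
};

static inline struct toy_ftr_override toy_nobbml2_override(void)
{
	/* "id_aa64mmfr2.bbm=0": force the whole BBM field to zero. */
	struct toy_ftr_override o = { .val = 0, .mask = TOY_MMFR2_BBM_MASK };

	return o;
}

static inline uint64_t toy_apply_override(uint64_t reg,
					  struct toy_ftr_override o)
{
	return (reg & ~o.mask) | (o.val & o.mask);
}

static inline unsigned int toy_mmfr2_bbm(uint64_t reg)
{
	return (unsigned int)((reg & TOY_MMFR2_BBM_MASK) >> TOY_MMFR2_BBM_SHIFT);
}
```

After toy_apply_override() the BBM field reads 0 while every other field is
untouched, so a "BBM >= 2" test against the overridden view fails.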

This is a system feature as we might have a big.LITTLE architecture
where some cores support BBML2 and some don't, but we want all cores to
be available and BBM to default to level 0 (as opposed to having cores
without BBML2 not coming online).
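
The intended system-wide behaviour can be summarised by the following
sketch. It is a userspace model with hypothetical toy_* names, not the
cpufeature framework itself: the capability is set only if every CPU
present at boot qualifies, and a late (hot-plugged) CPU is refused only
when the system already relies on the capability and that CPU lacks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_late_cpu_verdict {
	TOY_LATE_CPU_ACCEPT,
	TOY_LATE_CPU_REJECT,
};

/* Boot time: the capability is enabled only if all boot CPUs qualify.
 * cpu_ok[i] stands for "CPU i supports bbml2+noabort". */
static inline bool toy_system_cap(const bool *cpu_ok, size_t nr_cpus)
{
	assert(nr_cpus > 0);

	for (size_t i = 0; i < nr_cpus; i++) {
		if (!cpu_ok[i])
			return false;	/* one insufficient CPU disables it for all */
	}
	return true;
}

/* Hotplug time: a CPU without the feature cannot join a system that is
 * already using it; every other combination is safe. */
static inline enum toy_late_cpu_verdict
toy_check_late_cpu(bool system_has_cap, bool cpu_ok)
{
	if (system_has_cap && !cpu_ok)
		return TOY_LATE_CPU_REJECT;
	return TOY_LATE_CPU_ACCEPT;
}
```

In the mixed big.LITTLE case toy_system_cap() returns false, so all cores
stay usable and BBM level 0 behaviour is kept.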

To support BBML2 in as wide a range of contexts as we can, we want not
only the architectural guarantees that BBML2 makes, but additionally
want BBML2 to not create TLB conflict aborts. Not causing aborts avoids
us having to prove that no recursive faults can be induced in any path
that uses BBML2, allowing its use for arbitrary kernel mappings.
Support detection of such CPUs.

Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
---
 .../admin-guide/kernel-parameters.txt         |  3 +
 arch/arm64/Kconfig                            | 11 +++
 arch/arm64/include/asm/cpucaps.h              |  2 +
 arch/arm64/include/asm/cpufeature.h           |  5 ++
 arch/arm64/kernel/cpufeature.c                | 68 +++++++++++++++++++
 arch/arm64/kernel/pi/idreg-override.c         |  2 +
 arch/arm64/tools/cpucaps                      |  1 +
 7 files changed, 92 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb8752b42ec8..3e4cc917a07e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -453,6 +453,9 @@
 	arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
 			32 bit applications.
 
+	arm64.nobbml2	[ARM64] Unconditionally disable Break-Before-Make Level
+			2 support
+
 	arm64.nobti	[ARM64] Unconditionally disable Branch Target
 			Identification support
 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 940343beb3d4..49deda2b22ae 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
 	  The feature introduces new assembly instructions, and they were
 	  support when binutils >= 2.30.
 
+config ARM64_BBML2_NOABORT
+	bool "Enable support for Break-Before-Make Level 2 detection and usage"
+	default y
+	help
+	  FEAT_BBM provides detection of support levels for break-before-make
+	  sequences. If BBM level 2 is supported, some TLB maintenance requirements
+	  can be relaxed to improve performance. We additionally require the
+	  property that the implementation cannot ever raise TLB Conflict Aborts.
+	  Selecting N causes the kernel to fall back to BBM level 0 behaviour
+	  even if the system supports BBM level 2.
+
 endmenu # "ARMv8.4 architectural features"
 
 menu "ARMv8.5 architectural features"
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 0b5ca6e0eb09..2d6db33d4e45 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -23,6 +23,8 @@ cpucap_is_possible(const unsigned int cap)
 		return IS_ENABLED(CONFIG_ARM64_PAN);
 	case ARM64_HAS_EPAN:
 		return IS_ENABLED(CONFIG_ARM64_EPAN);
+	case ARM64_HAS_BBML2_NOABORT:
+		return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT);
 	case ARM64_SVE:
 		return IS_ENABLED(CONFIG_ARM64_SVE);
 	case ARM64_SME:
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index e0e4478f5fb5..108ef3fbbc00 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -866,6 +866,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
 	return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
 }
 
+static inline bool system_supports_bbml2_noabort(void)
+{
+	return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
+}
+
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d561cf3b8ac7..1a4adcda267b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2176,6 +2176,67 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
 	return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
 }
 
+static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
+{
+	/* We want to allow usage of bbml2 in as wide a range of kernel contexts
+	 * as possible. This list is therefore an allow-list of known-good
+	 * implementations that both support bbml2 and additionally, fulfill the
+	 * extra constraint of never generating TLB conflict aborts when using
+	 * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
+	 * kernel contexts difficult to prove safe against recursive aborts).
+	 *
+	 * Note that implementations can only be considered "known-good" if their
+	 * implementors attest to the fact that the implementation never raises
+	 * TLB conflict aborts for bbml2 mapping granularity changes.
+	 */
+	static const struct midr_range supports_bbml2_noabort_list[] = {
+		MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
+		MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
+		{}
+	};
+
+	return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
+}
+
+static inline unsigned int cpu_read_midr(int cpu)
+{
+	WARN_ON_ONCE(!cpu_online(cpu));
+
+	return per_cpu(cpu_data, cpu).reg_midr;
+}
+
+static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
+{
+	if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
+		return false;
+
+	if (scope & SCOPE_SYSTEM) {
+		int cpu;
+
+		/* We are a boot CPU, and must verify that all enumerated boot
+		 * CPUs have MIDR values within our allowlist. Otherwise, we do
+		 * not allow the BBML2 feature to avoid potential faults when
+		 * the insufficient CPUs access memory regions using BBML2
+		 * semantics.
+		 */
+		for_each_online_cpu(cpu) {
+			if (!cpu_has_bbml2_noabort(cpu_read_midr(cpu)))
+				return false;
+		}
+	} else if (scope & SCOPE_LOCAL_CPU) {
+		/* We are a hot-plugged CPU, so must only check our MIDR.
+		 * If we have the correct MIDR, but the kernel booted on an
+		 * insufficient CPU, we will not use BBML2 (this is safe). If
+		 * we have an incorrect MIDR, but the kernel booted on a
+		 * sufficient CPU, we will not bring up this CPU.
+		 */
+		if (!cpu_has_bbml2_noabort(read_cpuid_id()))
+			return false;
+	}
+
+	return has_cpuid_feature(caps, scope);
+}
+
 #ifdef CONFIG_ARM64_PAN
 static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused)
 {
@@ -2926,6 +2987,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_cpuid_feature,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
 	},
+	{
+		.desc = "BBM Level 2 without conflict abort",
+		.capability = ARM64_HAS_BBML2_NOABORT,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_bbml2_noabort,
+		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, BBM, 2)
+	},
 	{
 		.desc = "52-bit Virtual Addressing for KVM (LPA2)",
 		.capability = ARM64_HAS_LPA2,
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index c6b185b885f7..803a0c99f7b4 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -102,6 +102,7 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
 	.override	= &id_aa64mmfr2_override,
 	.fields		= {
 		FIELD("varange", ID_AA64MMFR2_EL1_VARange_SHIFT, mmfr2_varange_filter),
+		FIELD("bbm", ID_AA64MMFR2_EL1_BBM_SHIFT, NULL),
 		{}
 	},
 };
@@ -246,6 +247,7 @@ static const struct {
 	{ "rodata=off",			"arm64_sw.rodataoff=1" },
 	{ "arm64.nolva",		"id_aa64mmfr2.varange=0" },
 	{ "arm64.no32bit_el0",		"id_aa64pfr0.el0=1" },
+	{ "arm64.nobbml2",		"id_aa64mmfr2.bbm=0" },
 };
 
 static int __init parse_hexdigit(const char *p, u64 *v)
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 1e65f2fb45bd..b03a375e5507 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -14,6 +14,7 @@ HAS_ADDRESS_AUTH_ARCH_QARMA5
 HAS_ADDRESS_AUTH_IMP_DEF
 HAS_AMU_EXTN
 HAS_ARMv8_4_TTL
+HAS_BBML2_NOABORT
 HAS_CACHE_DIC
 HAS_CACHE_IDC
 HAS_CNP
-- 
2.48.1




* [PATCH v4 2/3] iommu/arm: Add BBM Level 2 smmu feature
  2025-03-19 15:05 [PATCH v4 0/3] Initial BBML2 support for contpte_convert() Mikołaj Lenczewski
  2025-03-19 15:05 ` [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature Mikołaj Lenczewski
@ 2025-03-19 15:05 ` Mikołaj Lenczewski
  2025-03-19 15:05 ` [PATCH v4 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2 Mikołaj Lenczewski
  2 siblings, 0 replies; 11+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-19 15:05 UTC (permalink / raw)
  To: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
	jean-philippe, robin.murphy, joro, akpm, ardb, mark.rutland,
	joey.gouly, maz, james.morse, broonie, oliver.upton, baohua,
	david, ioworker0, jgg, nicolinc, mshavit, jsnitsel, smostafa,
	linux-doc, linux-kernel, linux-arm-kernel, iommu
  Cc: Mikołaj Lenczewski

To support BBM Level 2 for userspace mappings, we want to ensure that
the smmu also supports its own version of BBM Level 2. Luckily, the
smmu spec (IHI 0070G 3.21.1.3) is stricter than the aarch64 spec (DDI
0487K.a D8.16.2), so it already guarantees that no aborts are raised
when BBM level 2 is claimed.

Add the feature and test for it in arm_smmu_sva_supported().
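
A minimal userspace sketch of the probe and the SVA check follows, assuming
the BBML field sits in IDR3 bits [12:11] as in the defines added by this
patch. The toy_* names are hypothetical. Note that a FIELD_GET()-style
extraction yields the field shifted down to bit 0, so the extracted value is
compared against plain 2 rather than against a constant that is still
shifted into position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_IDR3_BBML_SHIFT	11
#define TOY_IDR3_BBML_MASK	(UINT32_C(0x3) << TOY_IDR3_BBML_SHIFT)

#define TOY_SMMU_FEAT_BBML2	(UINT32_C(1) << 24)

/* Extract the BBML level (0..3) from a raw IDR3 value. */
static inline unsigned int toy_idr3_bbml(uint32_t idr3)
{
	return (unsigned int)((idr3 & TOY_IDR3_BBML_MASK) >> TOY_IDR3_BBML_SHIFT);
}

/* Probe: only an exact level of 2 sets the feature bit in this model. */
static inline bool toy_smmu_has_bbml2(uint32_t idr3)
{
	return toy_idr3_bbml(idr3) == 2;
}

/* SVA check: if the CPUs use bbml2+noabort, the SMMU must offer BBML2 too;
 * otherwise the SMMU feature is not required. */
static inline bool toy_sva_supported(uint32_t smmu_features,
				     bool cpus_use_bbml2_noabort)
{
	uint32_t feat_mask = 0;

	if (cpus_use_bbml2_noabort)
		feat_mask |= TOY_SMMU_FEAT_BBML2;

	return (smmu_features & feat_mask) == feat_mask;
}
```

The second helper shows why the requirement is one-directional: an SMMU
without BBML2 is still acceptable for SVA as long as the CPUs do not use the
relaxed semantics.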

Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 +++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c     | 3 +++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h     | 4 ++++
 3 files changed, 10 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 9ba596430e7c..6ba182572788 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -222,6 +222,9 @@ bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
 		feat_mask |= ARM_SMMU_FEAT_VAX;
 	}
 
+	if (system_supports_bbml2_noabort())
+		feat_mask |= ARM_SMMU_FEAT_BBML2;
+
 	if ((smmu->features & feat_mask) != feat_mask)
 		return false;
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 358072b4e293..dcee0bdec924 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4406,6 +4406,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
 	if (FIELD_GET(IDR3_RIL, reg))
 		smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
 
+	if ((reg & IDR3_BBML) == IDR3_BBML2)
+		smmu->features |= ARM_SMMU_FEAT_BBML2;
+
 	/* IDR5 */
 	reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
 
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index bd9d7c85576a..85eaf3ab88c2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -60,6 +60,9 @@ struct arm_smmu_device;
 #define ARM_SMMU_IDR3			0xc
 #define IDR3_FWB			(1 << 8)
 #define IDR3_RIL			(1 << 10)
+#define IDR3_BBML			GENMASK(12, 11)
+#define IDR3_BBML1			(1 << 11)
+#define IDR3_BBML2			(2 << 11)
 
 #define ARM_SMMU_IDR5			0x14
 #define IDR5_STALL_MAX			GENMASK(31, 16)
@@ -754,6 +757,7 @@ struct arm_smmu_device {
 #define ARM_SMMU_FEAT_HA		(1 << 21)
 #define ARM_SMMU_FEAT_HD		(1 << 22)
 #define ARM_SMMU_FEAT_S2FWB		(1 << 23)
+#define ARM_SMMU_FEAT_BBML2		(1 << 24)
 	u32				features;
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH	(1 << 0)
-- 
2.48.1




* [PATCH v4 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2
  2025-03-19 15:05 [PATCH v4 0/3] Initial BBML2 support for contpte_convert() Mikołaj Lenczewski
  2025-03-19 15:05 ` [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature Mikołaj Lenczewski
  2025-03-19 15:05 ` [PATCH v4 2/3] iommu/arm: Add BBM Level 2 smmu feature Mikołaj Lenczewski
@ 2025-03-19 15:05 ` Mikołaj Lenczewski
  2 siblings, 0 replies; 11+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-19 15:05 UTC (permalink / raw)
  To: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
	jean-philippe, robin.murphy, joro, akpm, ardb, mark.rutland,
	joey.gouly, maz, james.morse, broonie, oliver.upton, baohua,
	david, ioworker0, jgg, nicolinc, mshavit, jsnitsel, smostafa,
	linux-doc, linux-kernel, linux-arm-kernel, iommu
  Cc: Mikołaj Lenczewski

When converting a region via contpte_convert() to use mTHP, we have two
different goals. We have to mark each entry as contiguous, and we would
like to smear the dirty and young (access) bits across all entries in
the contiguous block. Currently, we do this by first accumulating the
dirty and young bits in the block, using an atomic
__ptep_get_and_clear() and the relevant pte_{dirty,young}() calls,
performing a tlbi, and finally smearing the correct bits across the
block using __set_ptes().

This approach works fine for BBM level 0, but with support for BBM level
2 we are allowed to reorder the tlbi to after setting the pagetable
entries. We expect the time cost of a tlbi to be much greater than the
cost of clearing and resetting the PTEs. As such, this reordering of the
tlbi outside the window where our PTEs are invalid greatly reduces the
duration the PTEs are visibly invalid for other threads. This reduces the
likelihood of a concurrent page walk finding an invalid PTE, reducing
the likelihood of a fault in other threads, and improving performance
(more so when there are more threads).

Because the allowlist only admits bbml2 implementations that never
raise conflict aborts and instead invalidate the tlb entries
automatically in hardware, we can avoid the final flush altogether.
Avoiding flushes is a win.
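
The sequence described above can be modelled in userspace as follows. This
is a toy, not the kernel implementation: PTEs are plain integers, the
toy_* names and bit positions are made up for illustration, and the "tlbi"
is just a counter. It only shows the ordering (clear and accumulate,
optional flush, then write back with the contiguous bit) and that the flush
is skipped when bbml2+noabort is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CONT_PTES	16

#define TOY_PTE_VALID	(UINT32_C(1) << 0)
#define TOY_PTE_YOUNG	(UINT32_C(1) << 1)
#define TOY_PTE_DIRTY	(UINT32_C(1) << 2)
#define TOY_PTE_CONT	(UINT32_C(1) << 3)

struct toy_stats {
	unsigned int tlb_flushes;	/* stands in for __flush_tlb_range() */
};

static inline void toy_contpte_convert(uint32_t ptes[TOY_CONT_PTES],
				       bool bbml2_noabort,
				       struct toy_stats *stats)
{
	uint32_t accumulated = 0;

	/* Step 1: clear every entry, collecting young/dirty on the way. */
	for (size_t i = 0; i < TOY_CONT_PTES; i++) {
		accumulated |= ptes[i] & (TOY_PTE_YOUNG | TOY_PTE_DIRTY);
		ptes[i] = 0;
	}

	/* Step 2: the flush is only needed without bbml2+noabort. */
	if (!bbml2_noabort)
		stats->tlb_flushes++;

	/* Step 3: write the block back, contiguous, with smeared bits. */
	for (size_t i = 0; i < TOY_CONT_PTES; i++)
		ptes[i] = TOY_PTE_VALID | TOY_PTE_CONT | accumulated;
}
```

In the model a single dirty entry makes the whole block dirty after
conversion, and the flush counter stays at zero on the bbml2+noabort path.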

Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/mm/contpte.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 55107d27d3f8..77ed03b30b72 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -68,7 +68,8 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
 			pte = pte_mkyoung(pte);
 	}
 
-	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
+	if (!system_supports_bbml2_noabort())
+		__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
 
 	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
 }
-- 
2.48.1




* Re: [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature
  2025-03-19 15:05 ` [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature Mikołaj Lenczewski
@ 2025-03-20 13:24   ` Suzuki K Poulose
  2025-03-20 17:09     ` Mikołaj Lenczewski
  2025-03-20 13:37   ` Ard Biesheuvel
  1 sibling, 1 reply; 11+ messages in thread
From: Suzuki K Poulose @ 2025-03-20 13:24 UTC (permalink / raw)
  To: Mikołaj Lenczewski, ryan.roberts, yang, corbet,
	catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
	ardb, mark.rutland, joey.gouly, maz, james.morse, broonie,
	oliver.upton, baohua, david, ioworker0, jgg, nicolinc, mshavit,
	jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
	iommu

On 19/03/2025 15:05, Mikołaj Lenczewski wrote:
> The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
> and this commit adds a dedicated BBML2 cpufeature to test against
> support for, as well as a kernel commandline parameter to optionally
> disable BBML2 altogether.
> 
> This is a system feature as we might have a big.LITTLE architecture
> where some cores support BBML2 and some don't, but we want all cores to
> be available and BBM to default to level 0 (as opposed to having cores
> without BBML2 not coming online).
> 
> To support BBML2 in as wide a range of contexts as we can, we want not
> only the architectural guarantees that BBML2 makes, but additionally
> want BBML2 to not create TLB conflict aborts. Not causing aborts avoids
> us having to prove that no recursive faults can be induced in any path
> that uses BBML2, allowing its use for arbitrary kernel mappings.
> Support detection of such CPUs.
> 
> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
> ---
>   .../admin-guide/kernel-parameters.txt         |  3 +
>   arch/arm64/Kconfig                            | 11 +++
>   arch/arm64/include/asm/cpucaps.h              |  2 +
>   arch/arm64/include/asm/cpufeature.h           |  5 ++
>   arch/arm64/kernel/cpufeature.c                | 68 +++++++++++++++++++
>   arch/arm64/kernel/pi/idreg-override.c         |  2 +
>   arch/arm64/tools/cpucaps                      |  1 +
>   7 files changed, 92 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index fb8752b42ec8..3e4cc917a07e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -453,6 +453,9 @@
>   	arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
>   			32 bit applications.
>   
> +	arm64.nobbml2	[ARM64] Unconditionally disable Break-Before-Make Level
> +			2 support
> +
>   	arm64.nobti	[ARM64] Unconditionally disable Branch Target
>   			Identification support
>   
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 940343beb3d4..49deda2b22ae 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
>   	  The feature introduces new assembly instructions, and they were
>   	  support when binutils >= 2.30.
>   
> +config ARM64_BBML2_NOABORT
> +	bool "Enable support for Break-Before-Make Level 2 detection and usage"
> +	default y
> +	help
> +	  FEAT_BBM provides detection of support levels for break-before-make
> +	  sequences. If BBM level 2 is supported, some TLB maintenance requirements
> +	  can be relaxed to improve performance. We additionally require the
> +	  property that the implementation cannot ever raise TLB Conflict Aborts.
> +	  Selecting N causes the kernel to fall back to BBM level 0 behaviour
> +	  even if the system supports BBM level 2.

minor nit: Should we mention that the feature can be disabled at runtime
using a kernel parameter?

> +
>   endmenu # "ARMv8.4 architectural features"
>   
>   menu "ARMv8.5 architectural features"
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index 0b5ca6e0eb09..2d6db33d4e45 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -23,6 +23,8 @@ cpucap_is_possible(const unsigned int cap)
>   		return IS_ENABLED(CONFIG_ARM64_PAN);
>   	case ARM64_HAS_EPAN:
>   		return IS_ENABLED(CONFIG_ARM64_EPAN);
> +	case ARM64_HAS_BBML2_NOABORT:
> +		return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT);
>   	case ARM64_SVE:
>   		return IS_ENABLED(CONFIG_ARM64_SVE);
>   	case ARM64_SME:
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index e0e4478f5fb5..108ef3fbbc00 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -866,6 +866,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
>   	return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
>   }
>   
> +static inline bool system_supports_bbml2_noabort(void)
> +{
> +	return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
> +}
> +
>   int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
>   bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
>   
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index d561cf3b8ac7..1a4adcda267b 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2176,6 +2176,67 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
>   	return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
>   }
>   
> +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
> +{
> +	/* We want to allow usage of bbml2 in as wide a range of kernel contexts
> +	 * as possible. This list is therefore an allow-list of known-good
> +	 * implementations that both support bbml2 and additionally, fulfill the
> +	 * extra constraint of never generating TLB conflict aborts when using
> +	 * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
> +	 * kernel contexts difficult to prove safe against recursive aborts).
> +	 *
> +	 * Note that implementations can only be considered "known-good" if their
> +	 * implementors attest to the fact that the implementation never raises
> +	 * TLB conflict aborts for bbml2 mapping granularity changes.
> +	 */
> +	static const struct midr_range supports_bbml2_noabort_list[] = {
> +		MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
> +		MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
> +		{}
> +	};
> +
> +	return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
> +}
> +
> +static inline unsigned int cpu_read_midr(int cpu)
> +{
> +	WARN_ON_ONCE(!cpu_online(cpu));
> +
> +	return per_cpu(cpu_data, cpu).reg_midr;
> +}
> +
> +static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
> +{
> +	if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
> +		return false;
> +
> +	if (scope & SCOPE_SYSTEM) {
> +		int cpu;
> +
> +		/* We are a boot CPU, and must verify that all enumerated boot

minor nit: See Documentation/process/coding-style.rst,
Section 8 Commenting.

		/*
		 * <multi-line comment>
		 */

> +		 * CPUs have MIDR values within our allowlist. Otherwise, we do
> +		 * not allow the BBML2 feature to avoid potential faults when
> +		 * the insufficient CPUs access memory regions using BBML2
> +		 * semantics.
> +		 */
> +		for_each_online_cpu(cpu) {
> +			if (!cpu_has_bbml2_noabort(cpu_read_midr(cpu)))
> +				return false;
> +		}
> +	} else if (scope & SCOPE_LOCAL_CPU) {
> +		/* We are a hot-plugged CPU, so must only check our MIDR.

minor nit: same as above

Rest looks good to me.

Suzuki



> +		 * If we have the correct MIDR, but the kernel booted on an
> +		 * insufficient CPU, we will not use BBML2 (this is safe). If
> +		 * we have an incorrect MIDR, but the kernel booted on a
> +		 * sufficient CPU, we will not bring up this CPU.
> +		 */
> +		if (!cpu_has_bbml2_noabort(read_cpuid_id()))
> +			return false;
> +	}
> +
> +	return has_cpuid_feature(caps, scope);
> +}
> +
>   #ifdef CONFIG_ARM64_PAN
>   static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused)
>   {
> @@ -2926,6 +2987,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>   		.matches = has_cpuid_feature,
>   		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
>   	},
> +	{
> +		.desc = "BBM Level 2 without conflict abort",
> +		.capability = ARM64_HAS_BBML2_NOABORT,
> +		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> +		.matches = has_bbml2_noabort,
> +		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, BBM, 2)
> +	},
>   	{
>   		.desc = "52-bit Virtual Addressing for KVM (LPA2)",
>   		.capability = ARM64_HAS_LPA2,
> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> index c6b185b885f7..803a0c99f7b4 100644
> --- a/arch/arm64/kernel/pi/idreg-override.c
> +++ b/arch/arm64/kernel/pi/idreg-override.c
> @@ -102,6 +102,7 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
>   	.override	= &id_aa64mmfr2_override,
>   	.fields		= {
>   		FIELD("varange", ID_AA64MMFR2_EL1_VARange_SHIFT, mmfr2_varange_filter),
> +		FIELD("bbm", ID_AA64MMFR2_EL1_BBM_SHIFT, NULL),
>   		{}
>   	},
>   };
> @@ -246,6 +247,7 @@ static const struct {
>   	{ "rodata=off",			"arm64_sw.rodataoff=1" },
>   	{ "arm64.nolva",		"id_aa64mmfr2.varange=0" },
>   	{ "arm64.no32bit_el0",		"id_aa64pfr0.el0=1" },
> +	{ "arm64.nobbml2",		"id_aa64mmfr2.bbm=0" },
>   };
>   
>   static int __init parse_hexdigit(const char *p, u64 *v)
> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> index 1e65f2fb45bd..b03a375e5507 100644
> --- a/arch/arm64/tools/cpucaps
> +++ b/arch/arm64/tools/cpucaps
> @@ -14,6 +14,7 @@ HAS_ADDRESS_AUTH_ARCH_QARMA5
>   HAS_ADDRESS_AUTH_IMP_DEF
>   HAS_AMU_EXTN
>   HAS_ARMv8_4_TTL
> +HAS_BBML2_NOABORT
>   HAS_CACHE_DIC
>   HAS_CACHE_IDC
>   HAS_CNP




* Re: [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature
  2025-03-19 15:05 ` [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature Mikołaj Lenczewski
  2025-03-20 13:24   ` Suzuki K Poulose
@ 2025-03-20 13:37   ` Ard Biesheuvel
  2025-03-20 14:01     ` Ryan Roberts
  2025-03-20 17:06     ` Mikołaj Lenczewski
  1 sibling, 2 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2025-03-20 13:37 UTC (permalink / raw)
  To: Mikołaj Lenczewski
  Cc: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
	jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
	maz, james.morse, broonie, oliver.upton, baohua, david, ioworker0,
	jgg, nicolinc, mshavit, jsnitsel, smostafa, linux-doc,
	linux-kernel, linux-arm-kernel, iommu

On Wed, 19 Mar 2025 at 16:06, Mikołaj Lenczewski
<miko.lenczewski@arm.com> wrote:
>
> The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
> and this commit adds a dedicated BBML2 cpufeature to test against
> support for, as well as a kernel commandline parameter to optionally
> disable BBML2 altogether.
>
> This is a system feature as we might have a big.LITTLE architecture
> where some cores support BBML2 and some don't, but we want all cores to
> be available and BBM to default to level 0 (as opposed to having cores
> without BBML2 not coming online).
>
> To support BBML2 in as wide a range of contexts as we can, we want not
> only the architectural guarantees that BBML2 makes, but additionally
> want BBML2 to not create TLB conflict aborts. Not causing aborts avoids
> us having to prove that no recursive faults can be induced in any path
> that uses BBML2, allowing its use for arbitrary kernel mappings.
> Support detection of such CPUs.
>
> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
> ---
>  .../admin-guide/kernel-parameters.txt         |  3 +
>  arch/arm64/Kconfig                            | 11 +++
>  arch/arm64/include/asm/cpucaps.h              |  2 +
>  arch/arm64/include/asm/cpufeature.h           |  5 ++
>  arch/arm64/kernel/cpufeature.c                | 68 +++++++++++++++++++
>  arch/arm64/kernel/pi/idreg-override.c         |  2 +
>  arch/arm64/tools/cpucaps                      |  1 +
>  7 files changed, 92 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index fb8752b42ec8..3e4cc917a07e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -453,6 +453,9 @@
>         arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
>                         32 bit applications.
>
> +       arm64.nobbml2   [ARM64] Unconditionally disable Break-Before-Make Level
> +                       2 support
> +
>         arm64.nobti     [ARM64] Unconditionally disable Branch Target
>                         Identification support
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 940343beb3d4..49deda2b22ae 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
>           The feature introduces new assembly instructions, and they were
>           support when binutils >= 2.30.
>
> +config ARM64_BBML2_NOABORT
> +       bool "Enable support for Break-Before-Make Level 2 detection and usage"
> +       default y
> +       help
> +         FEAT_BBM provides detection of support levels for break-before-make
> +         sequences. If BBM level 2 is supported, some TLB maintenance requirements
> +         can be relaxed to improve performance. We additonally require the
> +         property that the implementation cannot ever raise TLB Conflict Aborts.
> +         Selecting N causes the kernel to fallback to BBM level 0 behaviour
> +         even if the system supports BBM level 2.
> +
>  endmenu # "ARMv8.4 architectural features"
>
>  menu "ARMv8.5 architectural features"
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index 0b5ca6e0eb09..2d6db33d4e45 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -23,6 +23,8 @@ cpucap_is_possible(const unsigned int cap)
>                 return IS_ENABLED(CONFIG_ARM64_PAN);
>         case ARM64_HAS_EPAN:
>                 return IS_ENABLED(CONFIG_ARM64_EPAN);
> +       case ARM64_HAS_BBML2_NOABORT:
> +               return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT);
>         case ARM64_SVE:
>                 return IS_ENABLED(CONFIG_ARM64_SVE);
>         case ARM64_SME:
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index e0e4478f5fb5..108ef3fbbc00 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -866,6 +866,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
>         return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
>  }
>
> +static inline bool system_supports_bbml2_noabort(void)
> +{
> +       return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
> +}
> +
>  int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
>  bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index d561cf3b8ac7..1a4adcda267b 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2176,6 +2176,67 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
>         return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
>  }
>
> +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
> +{

We generally start these block comments with just /* on the first line

> +       /* We want to allow usage of bbml2 in as wide a range of kernel contexts
> +        * as possible. This list is therefore an allow-list of known-good
> +        * implementations that both support bbml2 and additionally, fulfill the
> +        * extra constraint of never generating TLB conflict aborts when using
> +        * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
> +        * kernel contexts difficult to prove safe against recursive aborts).
> +        *
> +        * Note that implementations can only be considered "known-good" if their
> +        * implementors attest to the fact that the implementation never raises
> +        * TLBI conflict aborts for bbml2 mapping granularity changes.
> +        */
> +       static const struct midr_range supports_bbml2_noabort_list[] = {
> +               MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
> +               MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
> +               {}
> +       };
> +

Why on earth is this needed? Is there nothing in the architecture that
can inform us about this? That seems like a huge oversight to me ...



* Re: [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature
  2025-03-20 13:37   ` Ard Biesheuvel
@ 2025-03-20 14:01     ` Ryan Roberts
  2025-03-20 17:06     ` Mikołaj Lenczewski
  1 sibling, 0 replies; 11+ messages in thread
From: Ryan Roberts @ 2025-03-20 14:01 UTC (permalink / raw)
  To: Ard Biesheuvel, Mikołaj Lenczewski
  Cc: suzuki.poulose, yang, corbet, catalin.marinas, will,
	jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
	maz, james.morse, broonie, oliver.upton, baohua, david, ioworker0,
	jgg, nicolinc, mshavit, jsnitsel, smostafa, linux-doc,
	linux-kernel, linux-arm-kernel, iommu

On 20/03/2025 13:37, Ard Biesheuvel wrote:
> On Wed, 19 Mar 2025 at 16:06, Mikołaj Lenczewski
> <miko.lenczewski@arm.com> wrote:
>>
>> The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
>> and this commit adds a dedicated BBML2 cpufeature to test against
>> support for, as well as a kernel commandline parameter to optionally
>> disable BBML2 altogether.
>>
>> This is a system feature as we might have a big.LITTLE architecture
>> where some cores support BBML2 and some don't, but we want all cores to
>> be available and BBM to default to level 0 (as opposed to having cores
>> without BBML2 not coming online).
>>
>> To support BBML2 in as wide a range of contexts as we can, we want not
>> only the architectural guarantees that BBML2 makes, but additionally
>> want BBML2 to not create TLB conflict aborts. Not causing aborts avoids
>> us having to prove that no recursive faults can be induced in any path
>> that uses BBML2, allowing its use for arbitrary kernel mappings.
>> Support detection of such CPUs.
>>
>> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
>> ---
>>  .../admin-guide/kernel-parameters.txt         |  3 +
>>  arch/arm64/Kconfig                            | 11 +++
>>  arch/arm64/include/asm/cpucaps.h              |  2 +
>>  arch/arm64/include/asm/cpufeature.h           |  5 ++
>>  arch/arm64/kernel/cpufeature.c                | 68 +++++++++++++++++++
>>  arch/arm64/kernel/pi/idreg-override.c         |  2 +
>>  arch/arm64/tools/cpucaps                      |  1 +
>>  7 files changed, 92 insertions(+)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index fb8752b42ec8..3e4cc917a07e 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -453,6 +453,9 @@
>>         arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
>>                         32 bit applications.
>>
>> +       arm64.nobbml2   [ARM64] Unconditionally disable Break-Before-Make Level
>> +                       2 support
>> +
>>         arm64.nobti     [ARM64] Unconditionally disable Branch Target
>>                         Identification support
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 940343beb3d4..49deda2b22ae 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
>>           The feature introduces new assembly instructions, and they were
>>           support when binutils >= 2.30.
>>
>> +config ARM64_BBML2_NOABORT
>> +       bool "Enable support for Break-Before-Make Level 2 detection and usage"
>> +       default y
>> +       help
>> +         FEAT_BBM provides detection of support levels for break-before-make
>> +         sequences. If BBM level 2 is supported, some TLB maintenance requirements
>> +         can be relaxed to improve performance. We additonally require the
>> +         property that the implementation cannot ever raise TLB Conflict Aborts.
>> +         Selecting N causes the kernel to fallback to BBM level 0 behaviour
>> +         even if the system supports BBM level 2.
>> +
>>  endmenu # "ARMv8.4 architectural features"
>>
>>  menu "ARMv8.5 architectural features"
>> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
>> index 0b5ca6e0eb09..2d6db33d4e45 100644
>> --- a/arch/arm64/include/asm/cpucaps.h
>> +++ b/arch/arm64/include/asm/cpucaps.h
>> @@ -23,6 +23,8 @@ cpucap_is_possible(const unsigned int cap)
>>                 return IS_ENABLED(CONFIG_ARM64_PAN);
>>         case ARM64_HAS_EPAN:
>>                 return IS_ENABLED(CONFIG_ARM64_EPAN);
>> +       case ARM64_HAS_BBML2_NOABORT:
>> +               return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT);
>>         case ARM64_SVE:
>>                 return IS_ENABLED(CONFIG_ARM64_SVE);
>>         case ARM64_SME:
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index e0e4478f5fb5..108ef3fbbc00 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -866,6 +866,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
>>         return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
>>  }
>>
>> +static inline bool system_supports_bbml2_noabort(void)
>> +{
>> +       return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
>> +}
>> +
>>  int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
>>  bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
>>
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index d561cf3b8ac7..1a4adcda267b 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -2176,6 +2176,67 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
>>         return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
>>  }
>>
>> +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
>> +{
> 
> We generally start these block comments with just /* on the first line
> 
>> +       /* We want to allow usage of bbml2 in as wide a range of kernel contexts
>> +        * as possible. This list is therefore an allow-list of known-good
>> +        * implementations that both support bbml2 and additionally, fulfill the
>> +        * extra constraint of never generating TLB conflict aborts when using
>> +        * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
>> +        * kernel contexts difficult to prove safe against recursive aborts).
>> +        *
>> +        * Note that implementations can only be considered "known-good" if their
>> +        * implementors attest to the fact that the implementation never raises
>> +        * TLBI conflict aborts for bbml2 mapping granularity changes.
>> +        */
>> +       static const struct midr_range supports_bbml2_noabort_list[] = {
>> +               MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
>> +               MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
>> +               {}
>> +       };
>> +
> 
> Why on earth is this needed? Is there nothing in the architecture that
> can inform us about this? That seems like a huge oversight to me ...

Currently the architecture can only tell us about the BBM support level.
Level 2 is the highest level defined so far, and it permits an implementation
to raise a conflict abort instead of handling the conflict in HW.

Since this series only relies on BBML2 for user space memory, we believe we
could contain and handle any conflict abort safely (the first version of this
series handled the abort). But Yang Shi has a series on list that aims to use
BBML2 to enable dynamically splitting the linear map. It becomes much harder to
reason about the safety of any conflict abort in that case.

Will was keen to take this approach, where we decide whether the HW supports
"BBML2+NOABORT" semantics based on the MIDR. Plans are in flight to fix the
arch so we can tidy this up long term, but Will didn't want to hold up Ampere.
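
The MIDR-based decision described above can be sketched in user-space C as
follows. This is only an illustration, not the kernel code: the names
`allow_entry`, `midr_in_allow_list` and `make_midr` are hypothetical, and the
Arm part numbers 0xd82 (Cortex-X4) and 0xd84 (Neoverse V3) are assumptions
that should be checked against the kernel's cputype definitions. The revision
bounds mirror the two MIDR_REV_RANGE() entries quoted from the patch, read
here as "variant 0, revisions 3..0xf" and "variant 0, revisions 2..0xf".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MIDR_EL1 field extraction (architectural bit positions). */
#define MIDR_REVISION(midr)    ((midr) & 0xfu)
#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfffu)
#define MIDR_VARIANT(midr)     (((midr) >> 20) & 0xfu)
#define MIDR_IMPLEMENTER(midr) (((midr) >> 24) & 0xffu)

/* Hypothetical allow-list entry: one CPU model, one variant, a revision range. */
struct allow_entry {
	uint8_t  implementer;
	uint16_t partnum;
	uint8_t  variant;
	uint8_t  rev_min;
	uint8_t  rev_max;
};

/* Part numbers below are assumptions made for this sketch. */
static const struct allow_entry bbml2_noabort_allow[] = {
	{ 0x41, 0xd82, 0, 3, 0xf },	/* Cortex-X4, variant 0, rev 3..0xf */
	{ 0x41, 0xd84, 0, 2, 0xf },	/* Neoverse V3, variant 0, rev 2..0xf */
};

/* Build a MIDR value from its fields; the architecture field is set to 0xf. */
static uint32_t make_midr(uint32_t impl, uint32_t variant,
			  uint32_t part, uint32_t rev)
{
	return (impl << 24) | (variant << 20) | (0xfu << 16) | (part << 4) | rev;
}

/* Return true only if the MIDR matches an entry and falls in its revision range. */
static bool midr_in_allow_list(uint32_t midr)
{
	size_t n = sizeof(bbml2_noabort_allow) / sizeof(bbml2_noabort_allow[0]);

	for (size_t i = 0; i < n; i++) {
		const struct allow_entry *e = &bbml2_noabort_allow[i];

		if (MIDR_IMPLEMENTER(midr) == e->implementer &&
		    MIDR_PARTNUM(midr) == e->partnum &&
		    MIDR_VARIANT(midr) == e->variant &&
		    MIDR_REVISION(midr) >= e->rev_min &&
		    MIDR_REVISION(midr) <= e->rev_max)
			return true;
	}
	return false;
}
```

Anything not on the list is treated as plain BBML2 at best, so a CPU that
advertises level 2 but might raise a conflict abort is never trusted with the
relaxed sequence.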

Thanks,
Ryan





* Re: [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature
  2025-03-20 13:37   ` Ard Biesheuvel
  2025-03-20 14:01     ` Ryan Roberts
@ 2025-03-20 17:06     ` Mikołaj Lenczewski
  1 sibling, 0 replies; 11+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-20 17:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
	jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
	maz, james.morse, broonie, oliver.upton, baohua, david, ioworker0,
	jgg, nicolinc, mshavit, jsnitsel, smostafa, linux-doc,
	linux-kernel, linux-arm-kernel, iommu

On Thu, Mar 20, 2025 at 02:37:08PM +0100, Ard Biesheuvel wrote:
> On Wed, 19 Mar 2025 at 16:06, Mikołaj Lenczewski
> <miko.lenczewski@arm.com> wrote:
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index d561cf3b8ac7..1a4adcda267b 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -2176,6 +2176,67 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
> >         return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
> >  }
> >
> > +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
> > +{
> 
> We generally start these block comments with just /* on the first line

My bad for the oversight. Will fix this, thanks.
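
For reference, the comment layout being asked for can be sketched as below.
The function `rev_in_range` is a hypothetical placeholder that exists only to
give the comment something to document; the layout itself follows the
multi-line style described in Documentation/process/coding-style.rst.

```c
#include <stdbool.h>

/*
 * Preferred multi-line style: the opening marker stands alone on the
 * first line, each following line begins with an aligned asterisk, and
 * the closing marker stands alone on the last line.
 */
static bool rev_in_range(unsigned int rev, unsigned int min, unsigned int max)
{
	/* A one-line comment may open and close on the same line. */
	return rev >= min && rev <= max;
}
```

The comment blocks quoted from the patch differ only in that their text
starts on the same line as the opening marker.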

-- 
Kind regards,
Mikołaj Lenczewski



* Re: [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature
  2025-03-20 13:24   ` Suzuki K Poulose
@ 2025-03-20 17:09     ` Mikołaj Lenczewski
  2025-03-20 17:11       ` Suzuki K Poulose
  0 siblings, 1 reply; 11+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-20 17:09 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: ryan.roberts, yang, corbet, catalin.marinas, will, jean-philippe,
	robin.murphy, joro, akpm, ardb, mark.rutland, joey.gouly, maz,
	james.morse, broonie, oliver.upton, baohua, david, ioworker0, jgg,
	nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
	linux-arm-kernel, iommu

On Thu, Mar 20, 2025 at 01:24:25PM +0000, Suzuki K Poulose wrote:
> On 19/03/2025 15:05, Mikołaj Lenczewski wrote:
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index fb8752b42ec8..3e4cc917a07e 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -453,6 +453,9 @@
> >   	arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
> >   			32 bit applications.
> > +	arm64.nobbml2	[ARM64] Unconditionally disable Break-Before-Make Level
> > +			2 support
> > +
> >   	arm64.nobti	[ARM64] Unconditionally disable Branch Target
> >   			Identification support
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 940343beb3d4..49deda2b22ae 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
> >   	  The feature introduces new assembly instructions, and they were
> >   	  support when binutils >= 2.30.
> > +config ARM64_BBML2_NOABORT
> > +	bool "Enable support for Break-Before-Make Level 2 detection and usage"
> > +	default y
> > +	help
> > +	  FEAT_BBM provides detection of support levels for break-before-make
> > +	  sequences. If BBM level 2 is supported, some TLB maintenance requirements
> > +	  can be relaxed to improve performance. We additonally require the
> > +	  property that the implementation cannot ever raise TLB Conflict Aborts.
> > +	  Selecting N causes the kernel to fallback to BBM level 0 behaviour
> > +	  even if the system supports BBM level 2.
> 
> minor nit: Should we mention that the feature can be disabled at runtime
> using a kernel parameter ?

Yes, this sounds very reasonable, I should have thought of that. Will
mention the commandline parameter in the kconfig option documentation.
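
To make the interaction between the Kconfig option, the command-line
parameter and the per-CPU check concrete, here is a rough model in plain C.
It is a simplification under stated assumptions: `struct bbml2_policy` and
`system_uses_bbml2` are hypothetical names, and the quoted text does not show
how arm64.nobbml2 is actually wired in (the diffstat only indicates that
idreg-override.c is touched), so the boolean flag below merely stands in for
that mechanism.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the build-time and boot-time switches. */
struct bbml2_policy {
	bool config_enabled;	/* models CONFIG_ARM64_BBML2_NOABORT=y */
	bool cmdline_disabled;	/* models arm64.nobbml2 on the command line */
};

/*
 * Model of a system-wide feature: it is used only if it is built in,
 * not disabled at boot, and every CPU passes the per-CPU check.
 * cpu_ok[i] is assumed to hold the allow-list result for CPU i.
 */
static bool system_uses_bbml2(const struct bbml2_policy *p,
			      const bool *cpu_ok, size_t ncpus)
{
	if (!p->config_enabled || p->cmdline_disabled)
		return false;

	for (size_t i = 0; i < ncpus; i++) {
		if (!cpu_ok[i])
			return false;	/* one unsuitable core disables it for all */
	}

	return ncpus > 0;
}
```

In this model a single core outside the allow-list makes the whole system
fall back to BBM level 0 behaviour while every core still comes online, which
is the big.LITTLE behaviour the commit message asks for.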

> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index d561cf3b8ac7..1a4adcda267b 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -2176,6 +2176,67 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
> >   	return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
> >   }
> > +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
> > +{
> > +	/* We want to allow usage of bbml2 in as wide a range of kernel contexts
> > +	 * as possible. This list is therefore an allow-list of known-good
> > +	 * implementations that both support bbml2 and additionally, fulfill the
> > +	 * extra constraint of never generating TLB conflict aborts when using
> > +	 * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
> > +	 * kernel contexts difficult to prove safe against recursive aborts).
> > +	 *
> > +	 * Note that implementations can only be considered "known-good" if their
> > +	 * implementors attest to the fact that the implementation never raises
> > +	 * TLBI conflict aborts for bbml2 mapping granularity changes.
> > +	 */
> > +	static const struct midr_range supports_bbml2_noabort_list[] = {
> > +		MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
> > +		MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
> > +		{}
> > +	};
> > +
> > +	return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
> > +}
> > +
> > +static inline unsigned int cpu_read_midr(int cpu)
> > +{
> > +	WARN_ON_ONCE(!cpu_online(cpu));
> > +
> > +	return per_cpu(cpu_data, cpu).reg_midr;
> > +}
> > +
> > +static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
> > +{
> > +	if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
> > +		return false;
> > +
> > +	if (scope & SCOPE_SYSTEM) {
> > +		int cpu;
> > +
> > +		/* We are a boot CPU, and must verify that all enumerated boot
> 
> minor nit: See Documentation/process/coding-style.rst,
> Section 8 Commenting.

My bad, had skimmed the coding style at one point but clearly missed or
forgot the section on comments. Will fix up this and all other instances
of the same issue.

-- 
Kind regards,
Mikołaj Lenczewski



* Re: [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature
  2025-03-20 17:09     ` Mikołaj Lenczewski
@ 2025-03-20 17:11       ` Suzuki K Poulose
  2025-03-20 17:13         ` Mikołaj Lenczewski
  0 siblings, 1 reply; 11+ messages in thread
From: Suzuki K Poulose @ 2025-03-20 17:11 UTC (permalink / raw)
  To: Mikołaj Lenczewski
  Cc: ryan.roberts, yang, corbet, catalin.marinas, will, jean-philippe,
	robin.murphy, joro, akpm, ardb, mark.rutland, joey.gouly, maz,
	james.morse, broonie, oliver.upton, baohua, david, ioworker0, jgg,
	nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
	linux-arm-kernel, iommu

On 20/03/2025 17:09, Mikołaj Lenczewski wrote:
> On Thu, Mar 20, 2025 at 01:24:25PM +0000, Suzuki K Poulose wrote:
>> On 19/03/2025 15:05, Mikołaj Lenczewski wrote:
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>> index fb8752b42ec8..3e4cc917a07e 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -453,6 +453,9 @@
>>>    	arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
>>>    			32 bit applications.
>>> +	arm64.nobbml2	[ARM64] Unconditionally disable Break-Before-Make Level
>>> +			2 support
>>> +
>>>    	arm64.nobti	[ARM64] Unconditionally disable Branch Target
>>>    			Identification support
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index 940343beb3d4..49deda2b22ae 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
>>>    	  The feature introduces new assembly instructions, and they were
>>>    	  support when binutils >= 2.30.
>>> +config ARM64_BBML2_NOABORT
>>> +	bool "Enable support for Break-Before-Make Level 2 detection and usage"
>>> +	default y
>>> +	help
>>> +	  FEAT_BBM provides detection of support levels for break-before-make
>>> +	  sequences. If BBM level 2 is supported, some TLB maintenance requirements
>>> +	  can be relaxed to improve performance. We additonally require the
>>> +	  property that the implementation cannot ever raise TLB Conflict Aborts.
>>> +	  Selecting N causes the kernel to fallback to BBM level 0 behaviour
>>> +	  even if the system supports BBM level 2.
>>
>> minor nit: Should we mention that the feature can be disabled at runtime
>> using a kernel parameter ?
> 
> Yes, this sounds very reasonable, I should have thought of that. Will
> mention the commandline parameter in the kconfig option documentation.

Maybe also mention this in the patch description, and add the
rationale for providing this tunable.

Cheers
Suzuki


> 
>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>> index d561cf3b8ac7..1a4adcda267b 100644
>>> --- a/arch/arm64/kernel/cpufeature.c
>>> +++ b/arch/arm64/kernel/cpufeature.c
>>> @@ -2176,6 +2176,67 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
>>>    	return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
>>>    }
>>> +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
>>> +{
>>> +	/* We want to allow usage of bbml2 in as wide a range of kernel contexts
>>> +	 * as possible. This list is therefore an allow-list of known-good
>>> +	 * implementations that both support bbml2 and additionally, fulfill the
>>> +	 * extra constraint of never generating TLB conflict aborts when using
>>> +	 * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
>>> +	 * kernel contexts difficult to prove safe against recursive aborts).
>>> +	 *
>>> +	 * Note that implementations can only be considered "known-good" if their
>>> +	 * implementors attest to the fact that the implementation never raises
>>> +	 * TLBI conflict aborts for bbml2 mapping granularity changes.
>>> +	 */
>>> +	static const struct midr_range supports_bbml2_noabort_list[] = {
>>> +		MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
>>> +		MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
>>> +		{}
>>> +	};
>>> +
>>> +	return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
>>> +}
>>> +
>>> +static inline unsigned int cpu_read_midr(int cpu)
>>> +{
>>> +	WARN_ON_ONCE(!cpu_online(cpu));
>>> +
>>> +	return per_cpu(cpu_data, cpu).reg_midr;
>>> +}
>>> +
>>> +static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
>>> +{
>>> +	if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
>>> +		return false;
>>> +
>>> +	if (scope & SCOPE_SYSTEM) {
>>> +		int cpu;
>>> +
>>> +		/* We are a boot CPU, and must verify that all enumerated boot
>>
>> minor nit: See Documentation/process/coding-style.rst,
>> Section 8 Commenting.
> 
> My bad, had skimmed the coding style at one point but clearly missed or
> forgot the section on comments. Will fix up this and all other instances
> of the same issue.
> 




* Re: [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature
  2025-03-20 17:11       ` Suzuki K Poulose
@ 2025-03-20 17:13         ` Mikołaj Lenczewski
  0 siblings, 0 replies; 11+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-20 17:13 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: ryan.roberts, yang, corbet, catalin.marinas, will, jean-philippe,
	robin.murphy, joro, akpm, ardb, mark.rutland, joey.gouly, maz,
	james.morse, broonie, oliver.upton, baohua, david, ioworker0, jgg,
	nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
	linux-arm-kernel, iommu

On Thu, Mar 20, 2025 at 05:11:25PM +0000, Suzuki K Poulose wrote:
> On 20/03/2025 17:09, Mikołaj Lenczewski wrote:
> > On Thu, Mar 20, 2025 at 01:24:25PM +0000, Suzuki K Poulose wrote:
> > > On 19/03/2025 15:05, Mikołaj Lenczewski wrote:
> > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > index fb8752b42ec8..3e4cc917a07e 100644
> > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > @@ -453,6 +453,9 @@
> > > >    	arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
> > > >    			32 bit applications.
> > > > +	arm64.nobbml2	[ARM64] Unconditionally disable Break-Before-Make Level
> > > > +			2 support
> > > > +
> > > >    	arm64.nobti	[ARM64] Unconditionally disable Branch Target
> > > >    			Identification support
> > > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > > index 940343beb3d4..49deda2b22ae 100644
> > > > --- a/arch/arm64/Kconfig
> > > > +++ b/arch/arm64/Kconfig
> > > > @@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
> > > >    	  The feature introduces new assembly instructions, and they were
> > > >    	  support when binutils >= 2.30.
> > > > +config ARM64_BBML2_NOABORT
> > > > +	bool "Enable support for Break-Before-Make Level 2 detection and usage"
> > > > +	default y
> > > > +	help
> > > > +	  FEAT_BBM provides detection of support levels for break-before-make
> > > > +	  sequences. If BBM level 2 is supported, some TLB maintenance requirements
> > > > +	  can be relaxed to improve performance. We additonally require the
> > > > +	  property that the implementation cannot ever raise TLB Conflict Aborts.
> > > > +	  Selecting N causes the kernel to fallback to BBM level 0 behaviour
> > > > +	  even if the system supports BBM level 2.
> > > 
> > > minor nit: Should we mention that the feature can be disabled at runtime
> > > using a kernel parameter ?
> > 
> > Yes, this sounds very reasonable, I should have thought of that. Will
> > mention the commandline parameter in the kconfig option documentation.
> 
> Maybe also mention this in the patch description, and add the
> rationale for providing this tunable.
> 
> Cheers
> Suzuki
> 

Will do! :)

-- 
Kind regards,
Mikołaj Lenczewski



Thread overview: 11+ messages
2025-03-19 15:05 [PATCH v4 0/3] Initial BBML2 support for contpte_convert() Mikołaj Lenczewski
2025-03-19 15:05 ` [PATCH v4 1/3] arm64: Add BBM Level 2 cpu feature Mikołaj Lenczewski
2025-03-20 13:24   ` Suzuki K Poulose
2025-03-20 17:09     ` Mikołaj Lenczewski
2025-03-20 17:11       ` Suzuki K Poulose
2025-03-20 17:13         ` Mikołaj Lenczewski
2025-03-20 13:37   ` Ard Biesheuvel
2025-03-20 14:01     ` Ryan Roberts
2025-03-20 17:06     ` Mikołaj Lenczewski
2025-03-19 15:05 ` [PATCH v4 2/3] iommu/arm: Add BBM Level 2 smmu feature Mikołaj Lenczewski
2025-03-19 15:05 ` [PATCH v4 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2 Mikołaj Lenczewski
