* [PATCH v3 0/3] Initial BBML2 support for contpte_convert()
From: Mikołaj Lenczewski @ 2025-03-13 10:41 UTC (permalink / raw)
To: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
maz, james.morse, broonie, anshuman.khandual, oliver.upton,
ioworker0, baohua, david, jgg, shameerali.kolothum.thodi,
nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
linux-arm-kernel, iommu
Cc: Mikołaj Lenczewski
Hi All,
This patch series adds initial support for eliding
break-before-make requirements on systems that support BBML2 and
additionally guarantee to never raise a conflict abort.
This support elides a TLB invalidation in contpte_convert(). This
leads to a 12% improvement when executing a microbenchmark designed
to force the pathological path where contpte_convert() gets called.
This represents an 80% reduction in the cost of calling
contpte_convert().
This series is based on v6.14-rc4 (d082ecbc71e9).
Patch 1 implements an allow-list of cpus that support BBML2, but with
the additional constraint of never causing TLB conflict aborts. We
settled on this constraint because we will use the feature for kernel
mappings in the future, for which we cannot handle conflict aborts
safely.
Yang Shi has a series at [1] that aims to use BBML2 to enable splitting
the linear map at runtime. This series partially overlaps with it in adding
the cpu feature. We believe this series is fully compatible with Yang's
requirements and could go in first.
[1]:
https://lore.kernel.org/linux-arm-kernel/20250304222018.615808-1-yang@os.amperecomputing.com/
Mikołaj Lenczewski (3):
arm64: Add BBM Level 2 cpu feature
iommu/arm: Add BBM Level 2 smmu feature
arm64/mm: Elide tlbi in contpte_convert() under BBML2
.../admin-guide/kernel-parameters.txt | 3 +
arch/arm64/Kconfig | 11 +++
arch/arm64/include/asm/cpucaps.h | 2 +
arch/arm64/include/asm/cpufeature.h | 6 ++
arch/arm64/kernel/cpufeature.c | 76 +++++++++++++++++++
arch/arm64/kernel/pi/idreg-override.c | 2 +
arch/arm64/mm/contpte.c | 3 +-
arch/arm64/tools/cpucaps | 1 +
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 +
11 files changed, 113 insertions(+), 1 deletion(-)
--
2.48.1
* [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
From: Mikołaj Lenczewski @ 2025-03-13 10:41 UTC (permalink / raw)
To: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
maz, james.morse, broonie, anshuman.khandual, oliver.upton,
ioworker0, baohua, david, jgg, shameerali.kolothum.thodi,
nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
linux-arm-kernel, iommu
Cc: Mikołaj Lenczewski
The Break-Before-Make cpu feature supports multiple levels (levels 0-2).
This commit adds a dedicated BBML2 cpufeature that kernel code can test
support against, as well as a kernel command-line parameter to optionally
disable BBML2 altogether.
This is a system feature as we might have a big.LITTLE architecture
where some cores support BBML2 and some don't, but we want all cores to
be available and BBM to default to level 0 (as opposed to having cores
without BBML2 not coming online).
To support BBML2 in as wide a range of contexts as we can, we want not
only the architectural guarantees that BBML2 makes, but additionally
want BBML2 to not create TLB conflict aborts. Not causing aborts avoids
us having to prove that no recursive faults can be induced in any path
that uses BBML2, allowing its use for arbitrary kernel mappings.
Support detection of such CPUs.
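As a rough usage sketch (illustrative only, not part of this patch), a
caller that wants to rely on the relaxed semantics would test the new
capability and otherwise fall back to a normal break-before-make sequence;
the helpers in the two branches are hypothetical stand-ins for a caller's
own code paths:
	/*
	 * Illustrative sketch only: system_supports_bbml2_noabort() is the
	 * helper added by this patch; do_relaxed_remap() and do_bbm_remap()
	 * are hypothetical stand-ins, not real kernel functions.
	 */
	if (system_supports_bbml2_noabort()) {
		/* BBML2 without conflict aborts: old and new entries may
		 * briefly coexist in the TLB and hardware resolves them. */
		do_relaxed_remap();
	} else {
		/* Level 0: invalidate the entry, TLBI, then install the
		 * new entry. */
		do_bbm_remap();
	}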
Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
---
.../admin-guide/kernel-parameters.txt | 3 +
arch/arm64/Kconfig | 11 +++
arch/arm64/include/asm/cpucaps.h | 2 +
arch/arm64/include/asm/cpufeature.h | 6 ++
arch/arm64/kernel/cpufeature.c | 76 +++++++++++++++++++
arch/arm64/kernel/pi/idreg-override.c | 2 +
arch/arm64/tools/cpucaps | 1 +
7 files changed, 101 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb8752b42ec8..3e4cc917a07e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -453,6 +453,9 @@
arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
32 bit applications.
+ arm64.nobbml2 [ARM64] Unconditionally disable Break-Before-Make Level
+ 2 support
+
arm64.nobti [ARM64] Unconditionally disable Branch Target
Identification support
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 940343beb3d4..49deda2b22ae 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
The feature introduces new assembly instructions, and they were
support when binutils >= 2.30.
+config ARM64_BBML2_NOABORT
+ bool "Enable support for Break-Before-Make Level 2 detection and usage"
+ default y
+ help
+ FEAT_BBM provides detection of support levels for break-before-make
+ sequences. If BBM level 2 is supported, some TLB maintenance requirements
+ can be relaxed to improve performance. We additionally require the
+ property that the implementation cannot ever raise TLB Conflict Aborts.
+ Selecting N causes the kernel to fall back to BBM level 0 behaviour
+ even if the system supports BBM level 2.
+
endmenu # "ARMv8.4 architectural features"
menu "ARMv8.5 architectural features"
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 0b5ca6e0eb09..2d6db33d4e45 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -23,6 +23,8 @@ cpucap_is_possible(const unsigned int cap)
return IS_ENABLED(CONFIG_ARM64_PAN);
case ARM64_HAS_EPAN:
return IS_ENABLED(CONFIG_ARM64_EPAN);
+ case ARM64_HAS_BBML2_NOABORT:
+ return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT);
case ARM64_SVE:
return IS_ENABLED(CONFIG_ARM64_SVE);
case ARM64_SME:
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index e0e4478f5fb5..7f5b220dacde 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -18,6 +18,7 @@
#define ARM64_SW_FEATURE_OVERRIDE_NOKASLR 0
#define ARM64_SW_FEATURE_OVERRIDE_HVHE 4
#define ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF 8
+#define ARM64_SW_FEATURE_OVERRIDE_NOBBML2 12
#ifndef __ASSEMBLY__
@@ -866,6 +867,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
}
+static inline bool system_supports_bbml2_noabort(void)
+{
+ return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
+}
+
int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d561cf3b8ac7..b936e0805161 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2176,6 +2176,76 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
}
+static inline bool bbml2_possible(void)
+{
+ return !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
+}
+
+static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
+{
+ /* We want to allow usage of bbml2 in as wide a range of kernel contexts
+ * as possible. This list is therefore an allow-list of known-good
+ * implementations that both support bbml2 and additionally, fulfill the
+ * extra constraint of never generating TLB conflict aborts when using
+ * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
+ * kernel contexts difficult to prove safe against recursive aborts).
+ *
+ * Note that implementations can only be considered "known-good" if their
+ * implementors attest to the fact that the implementation never raises
+ * TLBI conflict aborts for bbml2 mapping granularity changes.
+ */
+ static const struct midr_range supports_bbml2_noabort_list[] = {
+ MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
+ MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
+ {}
+ };
+
+ return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
+}
+
+static inline unsigned int __cpu_read_midr(int cpu)
+{
+ WARN_ON_ONCE(!cpu_online(cpu));
+
+ return per_cpu(cpu_data, cpu).reg_midr;
+}
+
+static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
+{
+ if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
+ return false;
+
+ if (!bbml2_possible())
+ return false;
+
+ if (scope & SCOPE_SYSTEM) {
+ int cpu;
+
+ /* We are a boot CPU, and must verify that all enumerated boot
+ * CPUs have MIDR values within our allowlist. Otherwise, we do
+ * not allow the BBML2 feature to avoid potential faults when
+ * the insufficient CPUs access memory regions using BBML2
+ * semantics.
+ */
+ for_each_online_cpu(cpu) {
+ if (!cpu_has_bbml2_noabort(__cpu_read_midr(cpu)))
+ return false;
+ }
+
+ return true;
+ } else if (scope & SCOPE_LOCAL_CPU) {
+ /* We are a hot-plugged CPU, so only need to check our MIDR.
+ * If we have the correct MIDR, but the kernel booted on an
+ * insufficient CPU, we will not use BBML2 (this is safe). If
+ * we have an incorrect MIDR, but the kernel booted on a
+ * sufficient CPU, we will not bring up this CPU.
+ */
+ return cpu_has_bbml2_noabort(read_cpuid_id());
+ }
+
+ return false;
+}
+
#ifdef CONFIG_ARM64_PAN
static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused)
{
@@ -2926,6 +2996,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.matches = has_cpuid_feature,
ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
},
+ {
+ .desc = "BBM Level 2 without conflict abort",
+ .capability = ARM64_HAS_BBML2_NOABORT,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = has_bbml2_noabort,
+ },
{
.desc = "52-bit Virtual Addressing for KVM (LPA2)",
.capability = ARM64_HAS_LPA2,
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index c6b185b885f7..9728faa10390 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
+ FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
{}
},
};
@@ -246,6 +247,7 @@ static const struct {
{ "rodata=off", "arm64_sw.rodataoff=1" },
{ "arm64.nolva", "id_aa64mmfr2.varange=0" },
{ "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
+ { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
};
static int __init parse_hexdigit(const char *p, u64 *v)
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 1e65f2fb45bd..b03a375e5507 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -14,6 +14,7 @@ HAS_ADDRESS_AUTH_ARCH_QARMA5
HAS_ADDRESS_AUTH_IMP_DEF
HAS_AMU_EXTN
HAS_ARMv8_4_TTL
+HAS_BBML2_NOABORT
HAS_CACHE_DIC
HAS_CACHE_IDC
HAS_CNP
--
2.48.1
* [PATCH v3 2/3] iommu/arm: Add BBM Level 2 smmu feature
From: Mikołaj Lenczewski @ 2025-03-13 10:41 UTC (permalink / raw)
To: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
maz, james.morse, broonie, anshuman.khandual, oliver.upton,
ioworker0, baohua, david, jgg, shameerali.kolothum.thodi,
nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
linux-arm-kernel, iommu
Cc: Mikołaj Lenczewski
To support BBM Level 2 for userspace mappings, we want to ensure
that the smmu also supports its own version of BBM Level 2. Luckily, the
smmu spec (IHI 0070G 3.21.1.3) is stricter than the aarch64 spec (DDI
0487K.a D8.16.2), and so already guarantees that no aborts are raised when
BBM level 2 is claimed.
Add the feature and testing for it under arm_smmu_sva_supported().
Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 +++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 ++++
3 files changed, 10 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index 9ba596430e7c..6ba182572788 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -222,6 +222,9 @@ bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
feat_mask |= ARM_SMMU_FEAT_VAX;
}
+ if (system_supports_bbml2_noabort())
+ feat_mask |= ARM_SMMU_FEAT_BBML2;
+
if ((smmu->features & feat_mask) != feat_mask)
return false;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 358072b4e293..dcee0bdec924 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4406,6 +4406,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
if (FIELD_GET(IDR3_RIL, reg))
smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
+ if (FIELD_GET(IDR3_BBML, reg) == IDR3_BBML2)
+ smmu->features |= ARM_SMMU_FEAT_BBML2;
+
/* IDR5 */
reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index bd9d7c85576a..85eaf3ab88c2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -60,6 +60,9 @@ struct arm_smmu_device;
#define ARM_SMMU_IDR3 0xc
#define IDR3_FWB (1 << 8)
#define IDR3_RIL (1 << 10)
+#define IDR3_BBML GENMASK(12, 11)
+#define IDR3_BBML1 (1 << 11)
+#define IDR3_BBML2 (2 << 11)
#define ARM_SMMU_IDR5 0x14
#define IDR5_STALL_MAX GENMASK(31, 16)
@@ -754,6 +757,7 @@ struct arm_smmu_device {
#define ARM_SMMU_FEAT_HA (1 << 21)
#define ARM_SMMU_FEAT_HD (1 << 22)
#define ARM_SMMU_FEAT_S2FWB (1 << 23)
+#define ARM_SMMU_FEAT_BBML2 (1 << 24)
u32 features;
#define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
--
2.48.1
* [PATCH v3 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2
From: Mikołaj Lenczewski @ 2025-03-13 10:41 UTC (permalink / raw)
To: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
maz, james.morse, broonie, anshuman.khandual, oliver.upton,
ioworker0, baohua, david, jgg, shameerali.kolothum.thodi,
nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
linux-arm-kernel, iommu
Cc: Mikołaj Lenczewski
When converting a region via contpte_convert() to use mTHP, we have two
different goals. We have to mark each entry as contiguous, and we would
like to smear the dirty and young (access) bits across all entries in
the contiguous block. Currently, we do this by first accumulating the
dirty and young bits in the block, using an atomic
__ptep_get_and_clear() and the relevant pte_{dirty,young}() calls,
performing a tlbi, and finally smearing the correct bits across the
block using __set_ptes().
This approach works fine for BBM level 0, but with support for BBM level
2 we are allowed to reorder the tlbi to after setting the pagetable
entries. This reordering reduces the likelihood of a concurrent page
walk finding an invalid (not present) PTE. This reduces the likelihood
of a fault in other threads, and improves performance marginally
(more so when there are more threads).
If we support bbml2 without conflict aborts, however, we can avoid the
final flush altogether and have hardware manage the tlb entries for us.
Avoiding flushes is a win.
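For illustration, the three orderings discussed above look roughly as
follows (simplified sketch, not the literal contpte_convert() body;
accumulate_dirty_young() is a hypothetical stand-in for the
__ptep_get_and_clear() / pte_{dirty,young}() loop):
	/* BBM level 0: the entries stay invalid while the TLBI completes. */
	pte = accumulate_dirty_young(start_ptep);
	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
	/* Plain BBML2: the TLBI may move after __set_ptes(), shrinking the
	 * window in which a concurrent walker can see an invalid entry. */
	pte = accumulate_dirty_young(start_ptep);
	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
	/* BBML2 without conflict aborts (this patch): hardware resolves any
	 * conflicting old/new TLB entries itself, so the TLBI is dropped. */
	pte = accumulate_dirty_young(start_ptep);
	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);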
Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
---
arch/arm64/mm/contpte.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 55107d27d3f8..77ed03b30b72 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -68,7 +68,8 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
pte = pte_mkyoung(pte);
}
- __flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
+ if (!system_supports_bbml2_noabort())
+ __flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
}
--
2.48.1
* Re: [PATCH v3 0/3] Initial BBML2 support for contpte_convert()
From: Suzuki K Poulose @ 2025-03-13 11:22 UTC (permalink / raw)
To: Mikołaj Lenczewski, ryan.roberts, yang, corbet,
catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
mark.rutland, joey.gouly, maz, james.morse, broonie,
anshuman.khandual, oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
Hi Miko,
On 13/03/2025 10:41, Mikołaj Lenczewski wrote:
> Hi All,
>
> This patch series adds initial support for eliding
> break-before-make requirements on systems that support BBML2 and
> additionally guarantee to never raise a conflict abort.
>
> This support elides a TLB invalidation in contpte_convert(). This
> leads to a 12% improvement when executing a microbenchmark designed
> to force the pathological path where contpte_convert() gets called.
> This represents an 80% reduction in the cost of calling
> contpte_convert().
>
> This series is based on v6.14-rc4 (d082ecbc71e9).
>
> Patch 1 implements an allow-list of cpus that support BBML2, but with
> the additional constraint of never causing TLB conflict aborts. We
> settled on this constraint because we will use the feature for kernel
> mappings in the future, for which we cannot handle conflict aborts
> safely.
>
> Yang Shi has a series at [1] that aims to use BBML2 to enable splitting
> the linear map at runtime. This series partially overlaps with it to add
> the cpu feature. We believe this series is fully compatible with Yang's
> requirements and could go first.
>
Nothing about the patch functionality, but:
Minor nit: Generally it is a good idea to add "What changed" from the
previous posting. That gives the reviewers an idea of what to look for
in the new version. Something like:
Changes since V2:
{Adding a link to the posting is an added bonus, as we can look up the
discussions easily}
- blah blah
- ..
Cheers
Suzuki
> [1]:
> https://lore.kernel.org/linux-arm-kernel/20250304222018.615808-1-yang@os.amperecomputing.com/
>
> Mikołaj Lenczewski (3):
> arm64: Add BBM Level 2 cpu feature
> iommu/arm: Add BBM Level 2 smmu feature
> arm64/mm: Elide tlbi in contpte_convert() under BBML2
>
> .../admin-guide/kernel-parameters.txt | 3 +
> arch/arm64/Kconfig | 11 +++
> arch/arm64/include/asm/cpucaps.h | 2 +
> arch/arm64/include/asm/cpufeature.h | 6 ++
> arch/arm64/kernel/cpufeature.c | 76 +++++++++++++++++++
> arch/arm64/kernel/pi/idreg-override.c | 2 +
> arch/arm64/mm/contpte.c | 3 +-
> arch/arm64/tools/cpucaps | 1 +
> .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 +
> 11 files changed, 113 insertions(+), 1 deletion(-)
>
* Re: [PATCH v3 0/3] Initial BBML2 support for contpte_convert()
From: Mikołaj Lenczewski @ 2025-03-13 11:30 UTC (permalink / raw)
To: Suzuki K Poulose
Cc: ryan.roberts, yang, corbet, catalin.marinas, will, jean-philippe,
robin.murphy, joro, akpm, mark.rutland, joey.gouly, maz,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On Thu, Mar 13, 2025 at 11:22:07AM +0000, Suzuki K Poulose wrote:
> Hi Miko,
>
> On 13/03/2025 10:41, Mikołaj Lenczewski wrote:
> > Hi All,
> >
> Nothing about the patch functionality, but :
>
> Minor nit: Generally it is a good idea to add "What changed" from the
> previous posting. That gives the reviewers an idea of what to look for
> in the new version. Something like:
>
> Changes since V2:
> {Adding a link to the posting is an added bonus, as we can look up the
> discussions easily}
> - blah blah
> - ..
>
>
> Cheers
> Suzuki
Ah, yes. My apologies. Will make sure to include such a changelog with
links to previous versions in future. I do not know why I did not
include it here, other than it not crossing my mind...
--
Kind regards,
Mikołaj Lenczewski
* Re: [PATCH v3 2/3] iommu/arm: Add BBM Level 2 smmu feature
From: Robin Murphy @ 2025-03-13 11:39 UTC (permalink / raw)
To: Mikołaj Lenczewski, ryan.roberts, suzuki.poulose, yang,
corbet, catalin.marinas, will, jean-philippe, joro, akpm,
mark.rutland, joey.gouly, maz, james.morse, broonie,
anshuman.khandual, oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
On 2025-03-13 10:41 am, Mikołaj Lenczewski wrote:
> For supporting BBM Level 2 for userspace mappings, we want to ensure
> that the smmu also supports its own version of BBM Level 2. Luckily, the
> smmu spec (IHI 0070G 3.21.1.3) is stricter than the aarch64 spec (DDI
> 0487K.a D8.16.2), so already guarantees that no aborts are raised when
> BBM level 2 is claimed.
>
> Add the feature and testing for it under arm_smmu_sva_supported().
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 +++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 ++++
> 3 files changed, 10 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 9ba596430e7c..6ba182572788 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -222,6 +222,9 @@ bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
> feat_mask |= ARM_SMMU_FEAT_VAX;
> }
>
> + if (system_supports_bbml2_noabort())
> + feat_mask |= ARM_SMMU_FEAT_BBML2;
> +
> if ((smmu->features & feat_mask) != feat_mask)
> return false;
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 358072b4e293..dcee0bdec924 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -4406,6 +4406,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
> if (FIELD_GET(IDR3_RIL, reg))
> smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
>
> + if (FIELD_GET(IDR3_BBML, reg) == IDR3_BBML2)
> + smmu->features |= ARM_SMMU_FEAT_BBML2;
> +
> /* IDR5 */
> reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index bd9d7c85576a..85eaf3ab88c2 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -60,6 +60,9 @@ struct arm_smmu_device;
> #define ARM_SMMU_IDR3 0xc
> #define IDR3_FWB (1 << 8)
> #define IDR3_RIL (1 << 10)
> +#define IDR3_BBML GENMASK(12, 11)
> +#define IDR3_BBML1 (1 << 11)
> +#define IDR3_BBML2 (2 << 11)
>
> #define ARM_SMMU_IDR5 0x14
> #define IDR5_STALL_MAX GENMASK(31, 16)
> @@ -754,6 +757,7 @@ struct arm_smmu_device {
> #define ARM_SMMU_FEAT_HA (1 << 21)
> #define ARM_SMMU_FEAT_HD (1 << 22)
> #define ARM_SMMU_FEAT_S2FWB (1 << 23)
> +#define ARM_SMMU_FEAT_BBML2 (1 << 24)
> u32 features;
>
> #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
From: Ryan Roberts @ 2025-03-13 16:13 UTC (permalink / raw)
To: Mikołaj Lenczewski, suzuki.poulose, yang, corbet,
catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
mark.rutland, joey.gouly, maz, james.morse, broonie,
anshuman.khandual, oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
On 13/03/2025 10:41, Mikołaj Lenczewski wrote:
> The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
> and this commit adds a dedicated BBML2 cpufeature to test against
> support for, as well as a kernel commandline parameter to optionally
> disable BBML2 altogether.
>
> This is a system feature as we might have a big.LITTLE architecture
> where some cores support BBML2 and some don't, but we want all cores to
> be available and BBM to default to level 0 (as opposed to having cores
> without BBML2 not coming online).
>
> To support BBML2 in as wide a range of contexts as we can, we want not
> only the architectural guarantees that BBML2 makes, but additionally
> want BBML2 to not create TLB conflict aborts. Not causing aborts avoids
> us having to prove that no recursive faults can be induced in any path
> that uses BBML2, allowing its use for arbitrary kernel mappings.
> Support detection of such CPUs.
>
> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
I have 2 nits below, but with those resolved:
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> .../admin-guide/kernel-parameters.txt | 3 +
> arch/arm64/Kconfig | 11 +++
> arch/arm64/include/asm/cpucaps.h | 2 +
> arch/arm64/include/asm/cpufeature.h | 6 ++
> arch/arm64/kernel/cpufeature.c | 76 +++++++++++++++++++
> arch/arm64/kernel/pi/idreg-override.c | 2 +
> arch/arm64/tools/cpucaps | 1 +
> 7 files changed, 101 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index fb8752b42ec8..3e4cc917a07e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -453,6 +453,9 @@
> arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
> 32 bit applications.
>
> + arm64.nobbml2 [ARM64] Unconditionally disable Break-Before-Make Level
> + 2 support
> +
> arm64.nobti [ARM64] Unconditionally disable Branch Target
> Identification support
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 940343beb3d4..49deda2b22ae 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
> The feature introduces new assembly instructions, and they were
> support when binutils >= 2.30.
>
> +config ARM64_BBML2_NOABORT
> + bool "Enable support for Break-Before-Make Level 2 detection and usage"
> + default y
> + help
> + FEAT_BBM provides detection of support levels for break-before-make
> + sequences. If BBM level 2 is supported, some TLB maintenance requirements
> + can be relaxed to improve performance. We additionally require the
> + property that the implementation cannot ever raise TLB Conflict Aborts.
> + Selecting N causes the kernel to fallback to BBM level 0 behaviour
> + even if the system supports BBM level 2.
> +
> endmenu # "ARMv8.4 architectural features"
>
> menu "ARMv8.5 architectural features"
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index 0b5ca6e0eb09..2d6db33d4e45 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -23,6 +23,8 @@ cpucap_is_possible(const unsigned int cap)
> return IS_ENABLED(CONFIG_ARM64_PAN);
> case ARM64_HAS_EPAN:
> return IS_ENABLED(CONFIG_ARM64_EPAN);
> + case ARM64_HAS_BBML2_NOABORT:
> + return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT);
> case ARM64_SVE:
> return IS_ENABLED(CONFIG_ARM64_SVE);
> case ARM64_SME:
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index e0e4478f5fb5..7f5b220dacde 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -18,6 +18,7 @@
> #define ARM64_SW_FEATURE_OVERRIDE_NOKASLR 0
> #define ARM64_SW_FEATURE_OVERRIDE_HVHE 4
> #define ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF 8
> +#define ARM64_SW_FEATURE_OVERRIDE_NOBBML2 12
>
> #ifndef __ASSEMBLY__
>
> @@ -866,6 +867,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
> return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
> }
>
> +static inline bool system_supports_bbml2_noabort(void)
> +{
> + return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
> +}
> +
> int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
> bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index d561cf3b8ac7..b936e0805161 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2176,6 +2176,76 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
> return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
> }
>
> +static inline bool bbml2_possible(void)
> +{
> + return !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
If you're going to keep this helper, I think it really needs to be:
return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT) &&
!arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
Then you would simplify the caller to remove its own
IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT) check.
But personally I would remove the helper and just fold the test into
has_bbml2_noabort().
Thanks,
Ryan
> +}
> +
> +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
> +{
> + /* We want to allow usage of bbml2 in as wide a range of kernel contexts
> + * as possible. This list is therefore an allow-list of known-good
> + * implementations that both support bbml2 and additionally, fulfill the
> + * extra constraint of never generating TLB conflict aborts when using
> + * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
> + * kernel contexts difficult to prove safe against recursive aborts).
> + *
> + * Note that implementations can only be considered "known-good" if their
> + * implementors attest to the fact that the implementation never raises
> + * TLBI conflict aborts for bbml2 mapping granularity changes.
> + */
> + static const struct midr_range supports_bbml2_noabort_list[] = {
> + MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
> + MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
> + {}
> + };
> +
> + return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
> +}
> +
> +static inline unsigned int __cpu_read_midr(int cpu)
nit: why the double underscore prefix?
> +{
> + WARN_ON_ONCE(!cpu_online(cpu));
> +
> + return per_cpu(cpu_data, cpu).reg_midr;
> +}
> +
> +static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
> +{
> + if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
> + return false;
> +
> + if (!bbml2_possible())
> + return false;
> +
> + if (scope & SCOPE_SYSTEM) {
> + int cpu;
> +
> + /* We are a boot CPU, and must verify that all enumerated boot
> + * CPUs have MIDR values within our allowlist. Otherwise, we do
> + * not allow the BBML2 feature to avoid potential faults when
> + * the insufficient CPUs access memory regions using BBML2
> + * semantics.
> + */
> + for_each_online_cpu(cpu) {
> + if (!cpu_has_bbml2_noabort(__cpu_read_midr(cpu)))
> + return false;
> + }
> +
> + return true;
> + } else if (scope & SCOPE_LOCAL_CPU) {
> + /* We are a hot-plugged CPU, so only need to check our MIDR.
> + * If we have the correct MIDR, but the kernel booted on an
> + * insufficient CPU, we will not use BBML2 (this is safe). If
> + * we have an incorrect MIDR, but the kernel booted on a
> + * sufficient CPU, we will not bring up this CPU.
> + */
> + return cpu_has_bbml2_noabort(read_cpuid_id());
> + }
> +
> + return false;
> +}
> +
> #ifdef CONFIG_ARM64_PAN
> static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused)
> {
> @@ -2926,6 +2996,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> .matches = has_cpuid_feature,
> ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
> },
> + {
> + .desc = "BBM Level 2 without conflict abort",
> + .capability = ARM64_HAS_BBML2_NOABORT,
> + .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> + .matches = has_bbml2_noabort,
> + },
> {
> .desc = "52-bit Virtual Addressing for KVM (LPA2)",
> .capability = ARM64_HAS_LPA2,
> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> index c6b185b885f7..9728faa10390 100644
> --- a/arch/arm64/kernel/pi/idreg-override.c
> +++ b/arch/arm64/kernel/pi/idreg-override.c
> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
> {}
> },
> };
> @@ -246,6 +247,7 @@ static const struct {
> { "rodata=off", "arm64_sw.rodataoff=1" },
> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
> };
>
> static int __init parse_hexdigit(const char *p, u64 *v)
> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> index 1e65f2fb45bd..b03a375e5507 100644
> --- a/arch/arm64/tools/cpucaps
> +++ b/arch/arm64/tools/cpucaps
> @@ -14,6 +14,7 @@ HAS_ADDRESS_AUTH_ARCH_QARMA5
> HAS_ADDRESS_AUTH_IMP_DEF
> HAS_AMU_EXTN
> HAS_ARMv8_4_TTL
> +HAS_BBML2_NOABORT
> HAS_CACHE_DIC
> HAS_CACHE_IDC
> HAS_CNP
* Re: [PATCH v3 2/3] iommu/arm: Add BBM Level 2 smmu feature
From: Ryan Roberts @ 2025-03-13 16:18 UTC (permalink / raw)
To: Mikołaj Lenczewski, suzuki.poulose, yang, corbet,
catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
mark.rutland, joey.gouly, maz, james.morse, broonie,
anshuman.khandual, oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
On 13/03/2025 10:41, Mikołaj Lenczewski wrote:
> For supporting BBM Level 2 for userspace mappings, we want to ensure
> that the smmu also supports its own version of BBM Level 2. Luckily, the
> smmu spec (IHI 0070G 3.21.1.3) is stricter than the aarch64 spec (DDI
> 0487K.a D8.16.2), so already guarantees that no aborts are raised when
> BBM level 2 is claimed.
>
> Add the feature and testing for it under arm_smmu_sva_supported().
>
> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 3 +++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 3 +++
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 ++++
> 3 files changed, 10 insertions(+)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 9ba596430e7c..6ba182572788 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -222,6 +222,9 @@ bool arm_smmu_sva_supported(struct arm_smmu_device *smmu)
> feat_mask |= ARM_SMMU_FEAT_VAX;
> }
>
> + if (system_supports_bbml2_noabort())
> + feat_mask |= ARM_SMMU_FEAT_BBML2;
> +
> if ((smmu->features & feat_mask) != feat_mask)
> return false;
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 358072b4e293..dcee0bdec924 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -4406,6 +4406,9 @@ static int arm_smmu_device_hw_probe(struct arm_smmu_device *smmu)
> if (FIELD_GET(IDR3_RIL, reg))
> smmu->features |= ARM_SMMU_FEAT_RANGE_INV;
>
> + if (FIELD_GET(IDR3_BBML, reg) == IDR3_BBML2)
> + smmu->features |= ARM_SMMU_FEAT_BBML2;
> +
> /* IDR5 */
> reg = readl_relaxed(smmu->base + ARM_SMMU_IDR5);
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index bd9d7c85576a..85eaf3ab88c2 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -60,6 +60,9 @@ struct arm_smmu_device;
> #define ARM_SMMU_IDR3 0xc
> #define IDR3_FWB (1 << 8)
> #define IDR3_RIL (1 << 10)
> +#define IDR3_BBML GENMASK(12, 11)
> +#define IDR3_BBML1 (1 << 11)
> +#define IDR3_BBML2 (2 << 11)
>
> #define ARM_SMMU_IDR5 0x14
> #define IDR5_STALL_MAX GENMASK(31, 16)
> @@ -754,6 +757,7 @@ struct arm_smmu_device {
> #define ARM_SMMU_FEAT_HA (1 << 21)
> #define ARM_SMMU_FEAT_HD (1 << 22)
> #define ARM_SMMU_FEAT_S2FWB (1 << 23)
> +#define ARM_SMMU_FEAT_BBML2 (1 << 24)
> u32 features;
>
> #define ARM_SMMU_OPT_SKIP_PREFETCH (1 << 0)
* Re: [PATCH v3 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2
From: Ryan Roberts @ 2025-03-13 16:28 UTC (permalink / raw)
To: Mikołaj Lenczewski, suzuki.poulose, yang, corbet,
catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
mark.rutland, joey.gouly, maz, james.morse, broonie,
anshuman.khandual, oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
On 13/03/2025 10:41, Mikołaj Lenczewski wrote:
> When converting a region via contpte_convert() to use mTHP, we have two
> different goals. We have to mark each entry as contiguous, and we would
> like to smear the dirty and young (access) bits across all entries in
> the contiguous block. Currently, we do this by first accumulating the
> dirty and young bits in the block, using an atomic
> __ptep_get_and_clear() and the relevant pte_{dirty,young}() calls,
> performing a tlbi, and finally smearing the correct bits across the
> block using __set_ptes().
>
> This approach works fine for BBM level 0, but with support for BBM level
> 2 we are allowed to reorder the tlbi to after setting the pagetable
> entries. This reordering reduces the likelyhood of a concurrent page
> walk finding an invalid (not present) PTE. This reduces the likelyhood
> of a fault in other threads, and improves performance marginally
> (more so when there are more threads).
Just to add some extra clarity for the sake of the review: we expect that the
cost of the tlbi(s) is much greater than the cost of clearing and re-setting the
ptes. So by moving the tlbi(s) outside of the window where the ptes are invalid
(as bbml2 permits us to do) we greatly reduce the duration where the ptes are
invalid and therefore greatly reduce the chances of another thread taking a
fault on the temporarily invalid address...
>
> If we support bbml2 without conflict aborts however, we can avoid the
> final flush altogether and have hardware manage the tlb entries for us.
> Avoiding flushes is a win.
...But because we are only allow-listing bbml2 implementations that don't raise
a conflict abort and instead automatically invalidate the conflicting entries in
HW, we can avoid that tlbi altogether.
>
> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> arch/arm64/mm/contpte.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> index 55107d27d3f8..77ed03b30b72 100644
> --- a/arch/arm64/mm/contpte.c
> +++ b/arch/arm64/mm/contpte.c
> @@ -68,7 +68,8 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
> pte = pte_mkyoung(pte);
> }
>
> - __flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
> + if (!system_supports_bbml2_noabort())
> + __flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
>
> __set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
> }
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
From: Yang Shi @ 2025-03-13 17:21 UTC (permalink / raw)
To: Mikołaj Lenczewski, ryan.roberts, suzuki.poulose, corbet,
catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
mark.rutland, joey.gouly, maz, james.morse, broonie,
anshuman.khandual, oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
On 3/13/25 3:41 AM, Mikołaj Lenczewski wrote:
> The Break-Before-Make cpu feature supports multiple levels (levels 0-2),
> and this commit adds a dedicated BBML2 cpufeature to test against
> support for, as well as a kernel commandline parameter to optionally
> disable BBML2 altogether.
>
> This is a system feature as we might have a big.LITTLE architecture
> where some cores support BBML2 and some don't, but we want all cores to
> be available and BBM to default to level 0 (as opposed to having cores
> without BBML2 not coming online).
>
> To support BBML2 in as wide a range of contexts as we can, we want not
> only the architectural guarantees that BBML2 makes, but additionally
> want BBML2 to not create TLB conflict aborts. Not causing aborts avoids
> us having to prove that no recursive faults can be induced in any path
> that uses BBML2, allowing its use for arbitrary kernel mappings.
> Support detection of such CPUs.
>
> Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
> ---
> .../admin-guide/kernel-parameters.txt | 3 +
> arch/arm64/Kconfig | 11 +++
> arch/arm64/include/asm/cpucaps.h | 2 +
> arch/arm64/include/asm/cpufeature.h | 6 ++
> arch/arm64/kernel/cpufeature.c | 76 +++++++++++++++++++
> arch/arm64/kernel/pi/idreg-override.c | 2 +
> arch/arm64/tools/cpucaps | 1 +
> 7 files changed, 101 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index fb8752b42ec8..3e4cc917a07e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -453,6 +453,9 @@
> arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
> 32 bit applications.
>
> + arm64.nobbml2 [ARM64] Unconditionally disable Break-Before-Make Level
> + 2 support
Hi Miko,
A question about the kernel boot parameter: can this parameter be used
at an early boot stage? A quick look at the code suggests it should be OK; for
example, cpu_has_bti() is called in map_kernel(). But I'd like to double
check because my patchset needs to check this parameter in map_mem() to
determine whether large block mapping can be used or not.
And a nit below.
> +
> arm64.nobti [ARM64] Unconditionally disable Branch Target
> Identification support
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 940343beb3d4..49deda2b22ae 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2057,6 +2057,17 @@ config ARM64_TLB_RANGE
> The feature introduces new assembly instructions, and they were
> support when binutils >= 2.30.
>
> +config ARM64_BBML2_NOABORT
> + bool "Enable support for Break-Before-Make Level 2 detection and usage"
> + default y
> + help
> + FEAT_BBM provides detection of support levels for break-before-make
> + sequences. If BBM level 2 is supported, some TLB maintenance requirements
> + can be relaxed to improve performance. We additionally require the
> + property that the implementation cannot ever raise TLB Conflict Aborts.
> + Selecting N causes the kernel to fallback to BBM level 0 behaviour
> + even if the system supports BBM level 2.
> +
> endmenu # "ARMv8.4 architectural features"
>
> menu "ARMv8.5 architectural features"
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index 0b5ca6e0eb09..2d6db33d4e45 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -23,6 +23,8 @@ cpucap_is_possible(const unsigned int cap)
> return IS_ENABLED(CONFIG_ARM64_PAN);
> case ARM64_HAS_EPAN:
> return IS_ENABLED(CONFIG_ARM64_EPAN);
> + case ARM64_HAS_BBML2_NOABORT:
> + return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT);
> case ARM64_SVE:
> return IS_ENABLED(CONFIG_ARM64_SVE);
> case ARM64_SME:
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index e0e4478f5fb5..7f5b220dacde 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -18,6 +18,7 @@
> #define ARM64_SW_FEATURE_OVERRIDE_NOKASLR 0
> #define ARM64_SW_FEATURE_OVERRIDE_HVHE 4
> #define ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF 8
> +#define ARM64_SW_FEATURE_OVERRIDE_NOBBML2 12
>
> #ifndef __ASSEMBLY__
>
> @@ -866,6 +867,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
> return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
> }
>
> +static inline bool system_supports_bbml2_noabort(void)
> +{
> + return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
> +}
> +
> int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
> bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index d561cf3b8ac7..b936e0805161 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2176,6 +2176,76 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
> return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
> }
>
> +static inline bool bbml2_possible(void)
> +{
> + return !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
> +}
Can this be moved to cpufeature.h? My patch will use this; anyway, I can
do it in my patchset.
Thanks,
Yang
> +
> +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
> +{
> + /* We want to allow usage of bbml2 in as wide a range of kernel contexts
> + * as possible. This list is therefore an allow-list of known-good
> + * implementations that both support bbml2 and additionally, fulfill the
> + * extra constraint of never generating TLB conflict aborts when using
> + * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
> + * kernel contexts difficult to prove safe against recursive aborts).
> + *
> + * Note that implementations can only be considered "known-good" if their
> + * implementors attest to the fact that the implementation never raises
> + * TLBI conflict aborts for bbml2 mapping granularity changes.
> + */
> + static const struct midr_range supports_bbml2_noabort_list[] = {
> + MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
> + MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
> + {}
> + };
> +
> + return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
> +}
> +
> +static inline unsigned int __cpu_read_midr(int cpu)
> +{
> + WARN_ON_ONCE(!cpu_online(cpu));
> +
> + return per_cpu(cpu_data, cpu).reg_midr;
> +}
> +
> +static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
> +{
> + if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
> + return false;
> +
> + if (!bbml2_possible())
> + return false;
> +
> + if (scope & SCOPE_SYSTEM) {
> + int cpu;
> +
> + /* We are a boot CPU, and must verify that all enumerated boot
> + * CPUs have MIDR values within our allowlist. Otherwise, we do
> + * not allow the BBML2 feature to avoid potential faults when
> + * the insufficient CPUs access memory regions using BBML2
> + * semantics.
> + */
> + for_each_online_cpu(cpu) {
> + if (!cpu_has_bbml2_noabort(__cpu_read_midr(cpu)))
> + return false;
> + }
> +
> + return true;
> + } else if (scope & SCOPE_LOCAL_CPU) {
> + /* We are a hot-plugged CPU, so only need to check our MIDR.
> + * If we have the correct MIDR, but the kernel booted on an
> + * insufficient CPU, we will not use BBML2 (this is safe). If
> + * we have an incorrect MIDR, but the kernel booted on a
> + * sufficient CPU, we will not bring up this CPU.
> + */
> + return cpu_has_bbml2_noabort(read_cpuid_id());
> + }
> +
> + return false;
> +}
> +
> #ifdef CONFIG_ARM64_PAN
> static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused)
> {
> @@ -2926,6 +2996,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> .matches = has_cpuid_feature,
> ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
> },
> + {
> + .desc = "BBM Level 2 without conflict abort",
> + .capability = ARM64_HAS_BBML2_NOABORT,
> + .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> + .matches = has_bbml2_noabort,
> + },
> {
> .desc = "52-bit Virtual Addressing for KVM (LPA2)",
> .capability = ARM64_HAS_LPA2,
> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> index c6b185b885f7..9728faa10390 100644
> --- a/arch/arm64/kernel/pi/idreg-override.c
> +++ b/arch/arm64/kernel/pi/idreg-override.c
> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
> {}
> },
> };
> @@ -246,6 +247,7 @@ static const struct {
> { "rodata=off", "arm64_sw.rodataoff=1" },
> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
> };
>
> static int __init parse_hexdigit(const char *p, u64 *v)
> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> index 1e65f2fb45bd..b03a375e5507 100644
> --- a/arch/arm64/tools/cpucaps
> +++ b/arch/arm64/tools/cpucaps
> @@ -14,6 +14,7 @@ HAS_ADDRESS_AUTH_ARCH_QARMA5
> HAS_ADDRESS_AUTH_IMP_DEF
> HAS_AMU_EXTN
> HAS_ARMv8_4_TTL
> +HAS_BBML2_NOABORT
> HAS_CACHE_DIC
> HAS_CACHE_IDC
> HAS_CNP
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
From: Marc Zyngier @ 2025-03-13 17:34 UTC (permalink / raw)
To: Mikołaj Lenczewski
Cc: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On Thu, 13 Mar 2025 10:41:10 +0000,
Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
>
> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> index c6b185b885f7..9728faa10390 100644
> --- a/arch/arm64/kernel/pi/idreg-override.c
> +++ b/arch/arm64/kernel/pi/idreg-override.c
> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
> {}
> },
> };
> @@ -246,6 +247,7 @@ static const struct {
> { "rodata=off", "arm64_sw.rodataoff=1" },
> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
Why is that a SW feature? This looks very much like a HW feature to
me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
with it. Something like:
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index c6b185b885f70..803a0c99f7b46 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -102,6 +102,7 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
.override = &id_aa64mmfr2_override,
.fields = {
FIELD("varange", ID_AA64MMFR2_EL1_VARange_SHIFT, mmfr2_varange_filter),
+ FIELD("bbm", ID_AA64MMFR2_EL1_BBM_SHIFT, NULL),
{}
},
};
@@ -246,6 +247,7 @@ static const struct {
{ "rodata=off", "arm64_sw.rodataoff=1" },
{ "arm64.nolva", "id_aa64mmfr2.varange=0" },
{ "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
+ { "arm64.nobbml2", "id_aa64mmfr2.bbm=0" },
};
static int __init parse_hexdigit(const char *p, u64 *v)
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
From: Mikołaj Lenczewski @ 2025-03-13 18:08 UTC (permalink / raw)
To: Ryan Roberts
Cc: suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
maz, james.morse, broonie, anshuman.khandual, oliver.upton,
ioworker0, baohua, david, jgg, shameerali.kolothum.thodi,
nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
linux-arm-kernel, iommu
On Thu, Mar 13, 2025 at 04:13:22PM +0000, Ryan Roberts wrote:
> On 13/03/2025 10:41, Mikołaj Lenczewski wrote:
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index d561cf3b8ac7..b936e0805161 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -2176,6 +2176,76 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
> > return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
> > }
> >
> > +static inline bool bbml2_possible(void)
> > +{
> > + return !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
>
> If you're going to keep this helper, I think it really needs to be:
>
> return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT) &&
> !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
>
> Then you would simplify the caller to remove its own
> IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT) check.
>
> But personally I would remove the helper and just fold the test into
> has_bbml2_noabort().
>
> Thanks,
> Ryan
I was debating folding it into has_bbml2_noabort(), but went ahead and
implemented it separately to match hvhe_possible(), which was another sw
feature helper.
But I agree, folding it will be simpler and read just as easily (if not
easier). Will do so.
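(Illustrative sketch of the folded form agreed here, reusing the names from
this patch; per-CPU versus system scope handling is omitted, and this is not
the actual respin:)

static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps,
                              int scope)
{
        /* Kconfig gate and arm64.nobbml2 override folded straight in. */
        if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
                return false;
        if (arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2))
                return false;

        /* Allow-list of implementations attested never to raise conflict aborts. */
        return cpu_has_bbml2_noabort(read_cpuid_id());
}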
> > +}
> > +
> > +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
> > +{
> > + /* We want to allow usage of bbml2 in as wide a range of kernel contexts
> > + * as possible. This list is therefore an allow-list of known-good
> > + * implementations that both support bbml2 and additionally, fulfill the
> > + * extra constraint of never generating TLB conflict aborts when using
> > + * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
> > + * kernel contexts difficult to prove safe against recursive aborts).
> > + *
> > + * Note that implementations can only be considered "known-good" if their
> > + * implementors attest to the fact that the implementation never raises
> > + * TLBI conflict aborts for bbml2 mapping granularity changes.
> > + */
> > + static const struct midr_range supports_bbml2_noabort_list[] = {
> > + MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
> > + MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
> > + {}
> > + };
> > +
> > + return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
> > +}
> > +
> > +static inline unsigned int __cpu_read_midr(int cpu)
>
> nit: why the double underscore prefix?
Again copying other helpers I saw that seemed to do similar things.
Didn't know if this was the expected style, so did as other helpers did.
Will remove.
Thank you for the review.
--
Kind regards,
Mikołaj Lenczewski
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-13 17:21 ` Yang Shi
@ 2025-03-13 18:13 ` Mikołaj Lenczewski
2025-03-13 18:17 ` Ryan Roberts
0 siblings, 1 reply; 24+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-13 18:13 UTC (permalink / raw)
To: Yang Shi, ryan.roberts
Cc: suzuki.poulose, corbet, catalin.marinas, will, jean-philippe,
robin.murphy, joro, akpm, mark.rutland, joey.gouly, maz,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On Thu, Mar 13, 2025 at 10:21:51AM -0700, Yang Shi wrote:
> On 3/13/25 3:41 AM, Mikołaj Lenczewski wrote:
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index fb8752b42ec8..3e4cc917a07e 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -453,6 +453,9 @@
> > arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
> > 32 bit applications.
> >
> > + arm64.nobbml2 [ARM64] Unconditionally disable Break-Before-Make Level
> > + 2 support
>
> Hi Miko,
>
> A question about the kernel boot parameter. Can this parameter be used
> in early boot stage? A quick look at the code shows it should be ok, for
> example, cpu_has_bti() is called in map_kernel(). But I'd like to double
> check because my patchset needs to check this parameter in map_mem() to
> determine whether large block mapping can be used or not.
>
> And a nit below.
I will need to double check exactly when the arm64 software overrides
are finalised, but as long as those values are finalised in / before (?)
the early boot stage then it should be fine? Will reply again once I
check and have an answer.
> > +static inline bool bbml2_possible(void)
> > +{
> > + return !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
> > +}
>
> Can this be moved to cpufeature.h? My patch will use this, anyway I can
> do it in my patchset.
>
> Thanks,
> Yang
I can do so. In fact, on second thought, I will probably extend this to
also include the `IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT)` check as well,
and then move it to cpufeature.h, instead of folding said check into
has_bbml2_noabort().
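(Illustrative sketch of the cpufeature.h helper being described, combining the
Kconfig gate and the override check; the real name and placement would follow
the respin:)

static inline bool bbml2_possible(void)
{
        return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT) &&
               !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
}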
--
Kind regards,
Mikołaj Lenczewski
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-13 18:13 ` Mikołaj Lenczewski
@ 2025-03-13 18:17 ` Ryan Roberts
0 siblings, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-03-13 18:17 UTC (permalink / raw)
To: Mikołaj Lenczewski, Yang Shi
Cc: suzuki.poulose, corbet, catalin.marinas, will, jean-philippe,
robin.murphy, joro, akpm, mark.rutland, joey.gouly, maz,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On 13/03/2025 18:13, Mikołaj Lenczewski wrote:
> On Thu, Mar 13, 2025 at 10:21:51AM -0700, Yang Shi wrote:
>> On 3/13/25 3:41 AM, Mikołaj Lenczewski wrote:
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>> index fb8752b42ec8..3e4cc917a07e 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -453,6 +453,9 @@
>>> arm64.no32bit_el0 [ARM64] Unconditionally disable the execution of
>>> 32 bit applications.
>>>
>>> + arm64.nobbml2 [ARM64] Unconditionally disable Break-Before-Make Level
>>> + 2 support
>>
>> Hi Miko,
>>
>> A question about the kernel boot parameter. Can this parameter be used
>> in early boot stage? A quick look at the code shows it should be ok, for
>> example, cpu_has_bti() is called in map_kernel(). But I'd like to double
>> check because my patchset needs to check this parameter in map_mem() to
>> determine whether large block mapping can be used or not.
>>
>> And a nit below.
>
> I will need to double check exactly when the arm64 software overrides
> are finalised, but as long as those values are finalised in / before (?)
> the early boot stage then it should be fine? Will reply again once I
> check and have an answer.
This will work fine. The override is set up in the PI code before start_kernel().
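(Illustrative sketch of the kind of early check this enables from map_mem();
the helper name and the exact flags handling are assumptions here, and the
real check belongs to Yang's series:)

static void __init map_mem(pgd_t *pgdp)
{
        int flags = NO_EXEC_MAPPINGS;
        ...
        /*
         * Without BBML2-noabort we cannot safely split live block mappings
         * later, so fall back to page granularity for the linear map.
         */
        if (!bbml2_possible())
                flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
        ...
}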
>
>>> +static inline bool bbml2_possible(void)
>>> +{
>>> + return !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
>>> +}
>>
>> Can this be moved to cpufeature.h? My patch will use this, anyway I can
>> do it in my patchset.
I'd prefer to do the moving as part of the series that needs it moved.
Thanks,
Ryan
>>
>> Thanks,
>> Yang
>
> I can do so. In fact, on second thought, I will probably extend this to
> also include the `IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT)` check as well,
> and then move it to cpufeature.h, instead of folding said check into
> has_bbml2_noabort().
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-13 17:34 ` Marc Zyngier
@ 2025-03-13 18:20 ` Mikołaj Lenczewski
2025-03-13 18:39 ` Marc Zyngier
2025-03-13 18:22 ` Ryan Roberts
1 sibling, 1 reply; 24+ messages in thread
From: Mikołaj Lenczewski @ 2025-03-13 18:20 UTC (permalink / raw)
To: Marc Zyngier
Cc: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On Thu, Mar 13, 2025 at 05:34:46PM +0000, Marc Zyngier wrote:
> On Thu, 13 Mar 2025 10:41:10 +0000,
> Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
> >
> > diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> > index c6b185b885f7..9728faa10390 100644
> > --- a/arch/arm64/kernel/pi/idreg-override.c
> > +++ b/arch/arm64/kernel/pi/idreg-override.c
> > @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
> > FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
> > FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
> > FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
> > + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
> > {}
> > },
> > };
> > @@ -246,6 +247,7 @@ static const struct {
> > { "rodata=off", "arm64_sw.rodataoff=1" },
> > { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> > { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> > + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
>
> Why is that a SW feature? This looks very much like a HW feature to
> me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
> with it. Something like:
>
> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> index c6b185b885f70..803a0c99f7b46 100644
> --- a/arch/arm64/kernel/pi/idreg-override.c
> +++ b/arch/arm64/kernel/pi/idreg-override.c
> @@ -102,6 +102,7 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
> .override = &id_aa64mmfr2_override,
> .fields = {
> FIELD("varange", ID_AA64MMFR2_EL1_VARange_SHIFT, mmfr2_varange_filter),
> + FIELD("bbm", ID_AA64MMFR2_EL1_BBM_SHIFT, NULL),
> {}
> },
> };
> @@ -246,6 +247,7 @@ static const struct {
> { "rodata=off", "arm64_sw.rodataoff=1" },
> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> + { "arm64.nobbml2", "id_aa64mmfr2.bbm=0" },
> };
>
> static int __init parse_hexdigit(const char *p, u64 *v)
>
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
Thanks for the review.
I think part of this confusion is due to me not including a changelog
(definitely something for the next respin!), but the discussion this
change is based on is found here:
https://lore.kernel.org/all/b46dc626-edc9-4d20-99d2-6cd08a01346c@os.amperecomputing.com/
Essentially, this is a SW feature because we do not check the
id_aa64mmfr2.bbm field as part of the has_bbml2_noabort() cpucap
matches filter. This is because certain hardware implementations
do not actually declare bbml2 via the hardware feature register, despite
implementing our bbml2_noabort feature, and certain hypervisor setups
might result in issues, so we want to have an override to allow
potentially disabling the feature for generic kernels.
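(Illustrative sketch of how the v3 detection is wired as a cpucap entry; the
capability name is assumed from this series' cpucaps addition:)

{
        .desc = "BBM Level 2 without TLB conflict abort",
        .capability = ARM64_HAS_BBML2_NOABORT,          /* name assumed */
        .type = ARM64_CPUCAP_SYSTEM_FEATURE,
        .matches = has_bbml2_noabort,   /* MIDR allow-list + arm64_sw.nobbml2, not MMFR2.BBM */
},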
--
Kind regards,
Mikołaj Lenczewski
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-13 17:34 ` Marc Zyngier
2025-03-13 18:20 ` Mikołaj Lenczewski
@ 2025-03-13 18:22 ` Ryan Roberts
2025-03-13 18:36 ` Marc Zyngier
1 sibling, 1 reply; 24+ messages in thread
From: Ryan Roberts @ 2025-03-13 18:22 UTC (permalink / raw)
To: Marc Zyngier, Mikołaj Lenczewski
Cc: suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On 13/03/2025 17:34, Marc Zyngier wrote:
> On Thu, 13 Mar 2025 10:41:10 +0000,
> Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
>>
>> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
>> index c6b185b885f7..9728faa10390 100644
>> --- a/arch/arm64/kernel/pi/idreg-override.c
>> +++ b/arch/arm64/kernel/pi/idreg-override.c
>> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
>> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
>> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
>> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
>> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
>> {}
>> },
>> };
>> @@ -246,6 +247,7 @@ static const struct {
>> { "rodata=off", "arm64_sw.rodataoff=1" },
>> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
>> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
>> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
>
> Why is that a SW feature? This looks very much like a HW feature to
> me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
> with it. Something like:
I think this implies that we would expect the BBM field to be advertising BBML2
support normally and we would check for that as part of the cpufeature
detection. That's how Miko was doing it in v2, but Yang pointed out that
AmpereOne, which supports BBML2+NOABORT semantics, doesn't actually advertise
BBML2 in its MMFR2. So we don't want to check that field, and instead rely
solely on the MIDR allow-list + a command line override. It was me that
suggested putting that in the SW feature register, and I think that still sounds
like the right solution for this situation?
Thanks,
Ryan
>
> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> index c6b185b885f70..803a0c99f7b46 100644
> --- a/arch/arm64/kernel/pi/idreg-override.c
> +++ b/arch/arm64/kernel/pi/idreg-override.c
> @@ -102,6 +102,7 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
> .override = &id_aa64mmfr2_override,
> .fields = {
> FIELD("varange", ID_AA64MMFR2_EL1_VARange_SHIFT, mmfr2_varange_filter),
> + FIELD("bbm", ID_AA64MMFR2_EL1_BBM_SHIFT, NULL),
> {}
> },
> };
> @@ -246,6 +247,7 @@ static const struct {
> { "rodata=off", "arm64_sw.rodataoff=1" },
> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> + { "arm64.nobbml2", "id_aa64mmfr2.bbm=0" },
> };
>
> static int __init parse_hexdigit(const char *p, u64 *v)
>
>
> Thanks,
>
> M.
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-13 18:22 ` Ryan Roberts
@ 2025-03-13 18:36 ` Marc Zyngier
2025-03-14 9:18 ` Ryan Roberts
0 siblings, 1 reply; 24+ messages in thread
From: Marc Zyngier @ 2025-03-13 18:36 UTC (permalink / raw)
To: Ryan Roberts
Cc: Mikołaj Lenczewski, suzuki.poulose, yang, corbet,
catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
mark.rutland, joey.gouly, james.morse, broonie, anshuman.khandual,
oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
On Thu, 13 Mar 2025 18:22:00 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 13/03/2025 17:34, Marc Zyngier wrote:
> > On Thu, 13 Mar 2025 10:41:10 +0000,
> > Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
> >>
> >> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> >> index c6b185b885f7..9728faa10390 100644
> >> --- a/arch/arm64/kernel/pi/idreg-override.c
> >> +++ b/arch/arm64/kernel/pi/idreg-override.c
> >> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
> >> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
> >> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
> >> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
> >> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
> >> {}
> >> },
> >> };
> >> @@ -246,6 +247,7 @@ static const struct {
> >> { "rodata=off", "arm64_sw.rodataoff=1" },
> >> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> >> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> >> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
> >
> > Why is that a SW feature? This looks very much like a HW feature to
> > me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
> > with it. Something like:
>
> I think this implies that we would expect the BBM field to be advertising BBML2
> support normally and we would check for that as part of the cpufeature
> detection. That's how Miko was doing it in v2, but Yang pointed out that
> AmpereOne, which supports BBML2+NOABORT semantics, doesn't actually advertise
> BBML2 in its MMFR2. So we don't want to check that field, and instead rely
> solely on the MIDR allow-list + a command line override. It was me that
> suggested putting that in the SW feature register, and I think that still sounds
> like the right solution for this situation?
I think this is mixing two different things:
- preventing BBM-L2 from being visible to the kernel: this is what my
suggestion is doing by nuking an architectural feature in the
relevant register
- random HW not correctly advertising what they are doing: this is an
erratum workaround
I'd rather we don't conflate the two things, and make them very
explicitly distinct.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-13 18:20 ` Mikołaj Lenczewski
@ 2025-03-13 18:39 ` Marc Zyngier
0 siblings, 0 replies; 24+ messages in thread
From: Marc Zyngier @ 2025-03-13 18:39 UTC (permalink / raw)
To: Mikołaj Lenczewski
Cc: ryan.roberts, suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On Thu, 13 Mar 2025 18:20:26 +0000,
Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
>
> On Thu, Mar 13, 2025 at 05:34:46PM +0000, Marc Zyngier wrote:
> > On Thu, 13 Mar 2025 10:41:10 +0000,
> > Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
> > >
> > > diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> > > index c6b185b885f7..9728faa10390 100644
> > > --- a/arch/arm64/kernel/pi/idreg-override.c
> > > +++ b/arch/arm64/kernel/pi/idreg-override.c
> > > @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
> > > FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
> > > FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
> > > FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
> > > + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
> > > {}
> > > },
> > > };
> > > @@ -246,6 +247,7 @@ static const struct {
> > > { "rodata=off", "arm64_sw.rodataoff=1" },
> > > { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> > > { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> > > + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
> >
> > Why is that a SW feature? This looks very much like a HW feature to
> > me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
> > with it. Something like:
> >
> > diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> > index c6b185b885f70..803a0c99f7b46 100644
> > --- a/arch/arm64/kernel/pi/idreg-override.c
> > +++ b/arch/arm64/kernel/pi/idreg-override.c
> > @@ -102,6 +102,7 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
> > .override = &id_aa64mmfr2_override,
> > .fields = {
> > FIELD("varange", ID_AA64MMFR2_EL1_VARange_SHIFT, mmfr2_varange_filter),
> > + FIELD("bbm", ID_AA64MMFR2_EL1_BBM_SHIFT, NULL),
> > {}
> > },
> > };
> > @@ -246,6 +247,7 @@ static const struct {
> > { "rodata=off", "arm64_sw.rodataoff=1" },
> > { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> > { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> > + { "arm64.nobbml2", "id_aa64mmfr2.bbm=0" },
> > };
> >
> > static int __init parse_hexdigit(const char *p, u64 *v)
> >
> >
> > Thanks,
> >
> > M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
>
> Thanks for the review.
>
> I think part of this confusion is due to me not including a changelog
> (definitely something for the next respin!), but the discussion this
> change is based on is found here:
>
> https://lore.kernel.org/all/b46dc626-edc9-4d20-99d2-6cd08a01346c@os.amperecomputing.com/
>
> Essentially, this is a SW feature because we do not check the
> id_aa64mmfr2.bbm field as part of the has_bbml2_noabort() cpucap
> matches filter. This is because certain hardware implementations
> do not actually declare bbml2 via the hardware feature register, despite
> implementing our bbml2_noabort feature, and certain hypervisor setups
> might result in issues, so we want to have an override to allow
> potentially disabling the feature for generic kernels.
I replied to Ryan on the same subject: not advertising a feature that
is actually supported is very much an erratum, and we should not
conflate feature control of an architecture feature (which is what the
ID override horror is doing) with implementation-specific workarounds.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-13 18:36 ` Marc Zyngier
@ 2025-03-14 9:18 ` Ryan Roberts
2025-03-14 10:11 ` Marc Zyngier
2025-03-14 12:33 ` Suzuki K Poulose
0 siblings, 2 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-03-14 9:18 UTC (permalink / raw)
To: Marc Zyngier
Cc: Mikołaj Lenczewski, suzuki.poulose, yang, corbet,
catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
mark.rutland, joey.gouly, james.morse, broonie, anshuman.khandual,
oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
On 13/03/2025 18:36, Marc Zyngier wrote:
> On Thu, 13 Mar 2025 18:22:00 +0000,
> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> On 13/03/2025 17:34, Marc Zyngier wrote:
>>> On Thu, 13 Mar 2025 10:41:10 +0000,
>>> Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
>>>>
>>>> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
>>>> index c6b185b885f7..9728faa10390 100644
>>>> --- a/arch/arm64/kernel/pi/idreg-override.c
>>>> +++ b/arch/arm64/kernel/pi/idreg-override.c
>>>> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
>>>> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
>>>> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
>>>> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
>>>> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
>>>> {}
>>>> },
>>>> };
>>>> @@ -246,6 +247,7 @@ static const struct {
>>>> { "rodata=off", "arm64_sw.rodataoff=1" },
>>>> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
>>>> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
>>>> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
>>>
>>> Why is that a SW feature? This looks very much like a HW feature to
>>> me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
>>> with it. Something like:
>>
>> I think this implies that we would expect the BBM field to be advertising BBML2
>> support normally and we would check for that as part of the cpufeature
>> detection. That's how Miko was doing it in v2, but Yang pointed out that
>> AmpereOne, which supports BBML2+NOABORT semantics, doesn't actually advertise
>> BBML2 in its MMFR2. So we don't want to check that field, and instead rely
>> solely on the MIDR allow-list + a command line override. It was me that
>> suggested putting that in the SW feature register, and I think that still sounds
>> like the right solution for this situation?
>
> I think this is mixing two different things:
>
> - preventing BBM-L2 from being visible to the kernel: this is what my
> suggestion is doing by nuking an architectural feature in the
> relevant register
>
> - random HW not correctly advertising what they are doing: this is an
> erratum workaround
>
> I'd rather we don't conflate the two things, and make them very
> explicitly distinct.
It all sounds so obvious when you put it like that! :)
I'm guessing there is a layer where the workaround can be applied to the
sanitised feature registers on a per-cpu basis and that won't affect this global
override which will remain as an overlay on top? If so then that sounds perfect
(you can probably tell I find the whole feature management framework rather
impenetrable). That workaround would be added as part of Yang's series anyway.
So sounds like we are back to testing MMFR2.BBM in the matches function, with
the addition of Maz's proposal above. Sorry for sending you round the houses, Miko.
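(Illustrative sketch of what the revised .matches function could then look
like, with scope handling elided and the AmpereOne case handled separately as
an erratum fixup; accessor names are taken from the existing cpufeature code:)

static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps,
                              int scope)
{
        u64 mmfr2 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
        unsigned int bbm = cpuid_feature_extract_unsigned_field(mmfr2,
                                        ID_AA64MMFR2_EL1_BBM_SHIFT);

        if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
                return false;
        /* arm64.nobbml2 now clears this field via the id_aa64mmfr2.bbm override. */
        if (bbm < 2)    /* need level 2 or better */
                return false;
        /* Still limited to implementations attested never to raise conflict aborts. */
        return cpu_has_bbml2_noabort(read_cpuid_id());
}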
Thanks,
Ryan
>
> Thanks,
>
> M.
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-13 18:08 ` Mikołaj Lenczewski
@ 2025-03-14 9:26 ` Ryan Roberts
0 siblings, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-03-14 9:26 UTC (permalink / raw)
To: Mikołaj Lenczewski
Cc: suzuki.poulose, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
maz, james.morse, broonie, anshuman.khandual, oliver.upton,
ioworker0, baohua, david, jgg, shameerali.kolothum.thodi,
nicolinc, mshavit, jsnitsel, smostafa, linux-doc, linux-kernel,
linux-arm-kernel, iommu
On 13/03/2025 18:08, Mikołaj Lenczewski wrote:
> On Thu, Mar 13, 2025 at 04:13:22PM +0000, Ryan Roberts wrote:
>> On 13/03/2025 10:41, Mikołaj Lenczewski wrote:
>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>> index d561cf3b8ac7..b936e0805161 100644
>>> --- a/arch/arm64/kernel/cpufeature.c
>>> +++ b/arch/arm64/kernel/cpufeature.c
>>> @@ -2176,6 +2176,76 @@ static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
>>> return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
>>> }
>>>
>>> +static inline bool bbml2_possible(void)
>>> +{
>>> + return !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
>>
>> If you're going to keep this helper, I think it really needs to be:
>>
>> return IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT) &&
>> !arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOBBML2);
>>
>> Then you would simplify the caller to remove its own
>> IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT) check.
>>
>> But personally I would remove the helper and just fold the test into
>> has_bbml2_noabort().
>>
>> Thanks,
>> Ryan
>
> I was debating folding it into has_bbml2_noabort(), but went ahead and
> implemented it separately to match hvhe_possible(), which was another sw
> feature helper.
hvhe_possible() is a .matches function, so there is nothing to fold it into.
>
> But I agree, folding it will be simpler and read just as easily (if not
> easier). Will do so.
>
>>> +}
>>> +
>>> +static bool cpu_has_bbml2_noabort(unsigned int cpu_midr)
>>> +{
>>> + /* We want to allow usage of bbml2 in as wide a range of kernel contexts
>>> + * as possible. This list is therefore an allow-list of known-good
>>> + * implementations that both support bbml2 and additionally, fulfill the
>>> + * extra constraint of never generating TLB conflict aborts when using
>>> + * the relaxed bbml2 semantics (such aborts make use of bbml2 in certain
>>> + * kernel contexts difficult to prove safe against recursive aborts).
>>> + *
>>> + * Note that implementations can only be considered "known-good" if their
>>> + * implementors attest to the fact that the implementation never raises
>>> + * TLBI conflict aborts for bbml2 mapping granularity changes.
>>> + */
>>> + static const struct midr_range supports_bbml2_noabort_list[] = {
>>> + MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
>>> + MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
>>> + {}
>>> + };
>>> +
>>> + return is_midr_in_range_list(cpu_midr, supports_bbml2_noabort_list);
>>> +}
>>> +
>>> +static inline unsigned int __cpu_read_midr(int cpu)
>>
>> nit: why the double underscrore prefix?
>
> Again copying other helpers I saw that seemed to do similar things.
> Didn't know if this was the expected style, so did as other helpers did.
> Will remove.
Often those double underscores are used when you have a public function
wrapping a private function, like this:
static void __do_a_thing(bool modify_behaviour_in_some_way);

void do_a_thing(void)
{
        __do_a_thing(false);
}
I'm sure the coding style offers a better explanation.
Thanks,
Ryan
>
> Thank you for the review.
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-14 9:18 ` Ryan Roberts
@ 2025-03-14 10:11 ` Marc Zyngier
2025-03-14 12:33 ` Suzuki K Poulose
1 sibling, 0 replies; 24+ messages in thread
From: Marc Zyngier @ 2025-03-14 10:11 UTC (permalink / raw)
To: Ryan Roberts
Cc: Mikołaj Lenczewski, suzuki.poulose, yang, corbet,
catalin.marinas, will, jean-philippe, robin.murphy, joro, akpm,
mark.rutland, joey.gouly, james.morse, broonie, anshuman.khandual,
oliver.upton, ioworker0, baohua, david, jgg,
shameerali.kolothum.thodi, nicolinc, mshavit, jsnitsel, smostafa,
linux-doc, linux-kernel, linux-arm-kernel, iommu
On Fri, 14 Mar 2025 09:18:43 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 13/03/2025 18:36, Marc Zyngier wrote:
> > On Thu, 13 Mar 2025 18:22:00 +0000,
> > Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> On 13/03/2025 17:34, Marc Zyngier wrote:
> >>> On Thu, 13 Mar 2025 10:41:10 +0000,
> >>> Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
> >>>>
> >>>> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> >>>> index c6b185b885f7..9728faa10390 100644
> >>>> --- a/arch/arm64/kernel/pi/idreg-override.c
> >>>> +++ b/arch/arm64/kernel/pi/idreg-override.c
> >>>> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
> >>>> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
> >>>> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
> >>>> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
> >>>> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
> >>>> {}
> >>>> },
> >>>> };
> >>>> @@ -246,6 +247,7 @@ static const struct {
> >>>> { "rodata=off", "arm64_sw.rodataoff=1" },
> >>>> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
> >>>> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
> >>>> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
> >>>
> >>> Why is that a SW feature? This looks very much like a HW feature to
> >>> me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
> >>> with it. Something like:
> >>
> >> I think this implies that we would expect the BBM field to be advertising BBML2
> >> support normally and we would check for that as part of the cpufeature
> >> detection. That's how Miko was doing it in v2, but Yang pointed out that
> >> AmpereOne, which supports BBML2+NOABORT semantics, doesn't actually advertise
> >> BBML2 in its MMFR2. So we don't want to check that field, and instead rely
> >> solely on the MIDR allow-list + a command line override. It was me that
> >> suggested putting that in the SW feature register, and I think that still sounds
> >> like the right solution for this situation?
> >
> > I think this is mixing two different things:
> >
> > - preventing BBM-L2 from being visible to the kernel: this is what my
> > suggestion is doing by nuking an architectural feature in the
> > relevant register
> >
> > - random HW not correctly advertising what they are doing: this is an
> > erratum workaround
> >
> > I'd rather we don't conflate the two things, and make them very
> > explicitly distinct.
>
> It all sounds so obvious when you put it like that! :)
>
> I'm guessing there is a layer where the workaround can be applied to the
> sanitised feature registers on a per-cpu basis and that won't affect this global
> override which will remain as an overlay on top? If so then that sounds perfect
> (you can probably tell I find the whole feature management framework rather
> impenetrable).
You and I, brother... The only person who actually understands what's
in that file is Suzuki.
> That workaround would be added as part of Yang's series anyway.
Yup, that's what I'd expect. Ideally tied to an erratum number so that
we have an actual promise from the vendor that their implementation is
actually BBM-L2 compliant despite the idreg breakage.
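(Illustrative sketch of the usual cpu_errata.c shape such a workaround would
take; the Kconfig symbol, capability name and erratum number below are
placeholders, not real assignments:)

#ifdef CONFIG_AMPERE_ERRATUM_BBML2              /* placeholder Kconfig symbol */
        {
                .desc = "AmpereOne erratum <number TBD>: BBML2 not advertised in ID_AA64MMFR2_EL1",
                .capability = ARM64_WORKAROUND_AMPERE_BBML2,    /* placeholder cap */
                ERRATA_MIDR_ALL_VERSIONS(MIDR_AMPERE1),
        },
#endif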
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-14 9:18 ` Ryan Roberts
2025-03-14 10:11 ` Marc Zyngier
@ 2025-03-14 12:33 ` Suzuki K Poulose
2025-03-14 13:12 ` Ryan Roberts
1 sibling, 1 reply; 24+ messages in thread
From: Suzuki K Poulose @ 2025-03-14 12:33 UTC (permalink / raw)
To: Ryan Roberts, Marc Zyngier
Cc: Mikołaj Lenczewski, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On 14/03/2025 09:18, Ryan Roberts wrote:
> On 13/03/2025 18:36, Marc Zyngier wrote:
>> On Thu, 13 Mar 2025 18:22:00 +0000,
>> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>
>>> On 13/03/2025 17:34, Marc Zyngier wrote:
>>>> On Thu, 13 Mar 2025 10:41:10 +0000,
>>>> Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
>>>>>
>>>>> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
>>>>> index c6b185b885f7..9728faa10390 100644
>>>>> --- a/arch/arm64/kernel/pi/idreg-override.c
>>>>> +++ b/arch/arm64/kernel/pi/idreg-override.c
>>>>> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features __prel64_initconst = {
>>>>> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
>>>>> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
>>>>> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
>>>>> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
>>>>> {}
>>>>> },
>>>>> };
>>>>> @@ -246,6 +247,7 @@ static const struct {
>>>>> { "rodata=off", "arm64_sw.rodataoff=1" },
>>>>> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
>>>>> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
>>>>> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
>>>>
>>>> Why is that a SW feature? This looks very much like a HW feature to
>>>> me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
>>>> with it. Something like:
>>>
>>> I think this implies that we would expect the BBM field to be advertising BBML2
>>> support normally and we would check for that as part of the cpufeature
>>> detection. That's how Miko was doing it in v2, but Yang pointed out that
>>> AmpereOne, which supports BBML2+NOABORT semantics, doesn't actually advertise
>>> BBML2 in its MMFR2. So we don't want to check that field, and instead rely
>>> solely on the MIDR allow-list + a command line override. It was me that
>>> suggested putting that in the SW feature register, and I think that still sounds
>>> like the right solution for this situation?
>>
>> I think this is mixing two different things:
>>
>> - preventing BBM-L2 from being visible to the kernel: this is what my
>> suggestion is doing by nuking an architectural feature in the
>> relevant register
>>
>> - random HW not correctly advertising what they are doing: this is an
>> erratum workaround
>>
>> I'd rather we don't conflate the two things, and make them very
>> explicitly distinct.
>
> It all sounds so obvious when you put it like that! :)
>
> I'm guessing there is a layer where the workaround can be applied to the
> sanitised feature registers on a per-cpu basis and that won't affect this global
> override which will remain as an overlay on top? If so then that sounds perfect
> (you can probably tell I find the whole feature management framework rather
> impenetrable). That workaround would be added as part of Yang's series anyway.
Unfortunately, there is no easy way to fix this via the normal erratum
workaround "capability". The sanitised feature registers are handled
separately (initialised via init_cpu_features() for boot CPU and
sanitised each time via update_cpu_features()).
Also we do not "enable" any capability (i.e. calling cpu_enable())
until the very end, after the CPUs are all brought up (except for boot
CPUs).
But it may be possible to "fix up" the BBML2 feature in
cpuinfo_store_*cpu(), without using the "enable" callback.
Something like:
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 285d7d538342..8c23adbe29f8 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -479,7 +479,11 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
info->reg_id_aa64mmfr0 = read_cpuid(ID_AA64MMFR0_EL1);
info->reg_id_aa64mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1);
+ /*
+ * The CPU cap is not detected system wide, but we are able to
+ * check if this CPU is affected by the Erratum.
+ */
+ if (this_cpu_has_cap(AMPERE_ONE_ERRATUM_BBML2))
+ // Fixup info->reg_id_aa64mmfr2 with BBML2.
+
info->reg_id_aa64mmfr3 = read_cpuid(ID_AA64MMFR3_EL1);
info->reg_id_aa64mmfr4 = read_cpuid(ID_AA64MMFR4_EL1);
info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);
info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
Suzuki
>
> So sounds like we are back to testing MMFR2.BBM in the matches function, with
> the addition of Maz's proposal above. Sorry for sending you round the houses, Miko.
>
> Thanks,
> Ryan
>
>>
>> Thanks,
>>
>> M.
>>
>
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature
2025-03-14 12:33 ` Suzuki K Poulose
@ 2025-03-14 13:12 ` Ryan Roberts
0 siblings, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-03-14 13:12 UTC (permalink / raw)
To: Suzuki K Poulose, Marc Zyngier
Cc: Mikołaj Lenczewski, yang, corbet, catalin.marinas, will,
jean-philippe, robin.murphy, joro, akpm, mark.rutland, joey.gouly,
james.morse, broonie, anshuman.khandual, oliver.upton, ioworker0,
baohua, david, jgg, shameerali.kolothum.thodi, nicolinc, mshavit,
jsnitsel, smostafa, linux-doc, linux-kernel, linux-arm-kernel,
iommu
On 14/03/2025 12:33, Suzuki K Poulose wrote:
> On 14/03/2025 09:18, Ryan Roberts wrote:
>> On 13/03/2025 18:36, Marc Zyngier wrote:
>>> On Thu, 13 Mar 2025 18:22:00 +0000,
>>> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>>
>>>> On 13/03/2025 17:34, Marc Zyngier wrote:
>>>>> On Thu, 13 Mar 2025 10:41:10 +0000,
>>>>> Mikołaj Lenczewski <miko.lenczewski@arm.com> wrote:
>>>>>>
>>>>>> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/
>>>>>> idreg-override.c
>>>>>> index c6b185b885f7..9728faa10390 100644
>>>>>> --- a/arch/arm64/kernel/pi/idreg-override.c
>>>>>> +++ b/arch/arm64/kernel/pi/idreg-override.c
>>>>>> @@ -209,6 +209,7 @@ static const struct ftr_set_desc sw_features
>>>>>> __prel64_initconst = {
>>>>>> FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
>>>>>> FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
>>>>>> FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
>>>>>> + FIELD("nobbml2", ARM64_SW_FEATURE_OVERRIDE_NOBBML2, NULL),
>>>>>> {}
>>>>>> },
>>>>>> };
>>>>>> @@ -246,6 +247,7 @@ static const struct {
>>>>>> { "rodata=off", "arm64_sw.rodataoff=1" },
>>>>>> { "arm64.nolva", "id_aa64mmfr2.varange=0" },
>>>>>> { "arm64.no32bit_el0", "id_aa64pfr0.el0=1" },
>>>>>> + { "arm64.nobbml2", "arm64_sw.nobbml2=1" },
>>>>>
>>>>> Why is that a SW feature? This looks very much like a HW feature to
>>>>> me, and you should instead mask out ID_AA64MMFR2_EL1.BBM, and be done
>>>>> with it. Something like:
>>>>
>>>> I think this implies that we would expect the BBM field to be advertising BBML2
>>>> support normally and we would check for that as part of the cpufeature
>>>> detection. That's how Miko was doing it in v2, but Yang pointed out that
>>>> AmpereOne, which supports BBML2+NOABORT semantics, doesn't actually advertise
>>>> BBML2 in its MMFR2. So we don't want to check that field, and instead rely
>>>> solely on the MIDR allow-list + a command line override. It was me that
>>>> suggested putting that in the SW feature register, and I think that still
>>>> sounds
>>>> like the right solution for this situation?
>>>
>>> I think this is mixing two different things:
>>>
>>> - preventing BBM-L2 from being visible to the kernel: this is what my
>>> suggestion is doing by nuking an architectural feature in the
>>> relevant register
>>>
>>> - random HW not correctly advertising what they are doing: this is an
>>> erratum workaround
>>>
>>> I'd rather we don't conflate the two things, and make them very
>>> explicitly distinct.
>>
>> It all sounds so obvious when you put it like that! :)
>>
>> I'm guessing there is a layer where the workaround can be applied to the
>> sanitised feature registers on a per-cpu basis and that won't affect this global
>> override which will remain as an overlay on top? If so then that sounds perfect
>> (you can probably tell I find the whole feature management framework rather
>> impenetrable). That workaround would be added as part of Yang's series anyway.
>
> Unfortunately, there is no easy way to fix this via the normal erratum
> workaround "capability". The sanitised feature registers are handled
> separately (initialised via init_cpu_features() for boot CPU and
> sanitised each time via update_cpu_features()).
>
> Also we do not "enable" any capability (i.e. calling cpu_enable())
> until the very end, after the CPUs are all brought up (except for boot CPUs).
>
> But it may be possible to "fix up" the BBML2 feature in
> cpuinfo_store_*cpu(), without using the "enable" callback.
>
> Something like:
>
> diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
> index 285d7d538342..8c23adbe29f8 100644
> --- a/arch/arm64/kernel/cpuinfo.c
> +++ b/arch/arm64/kernel/cpuinfo.c
> @@ -479,7 +479,11 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
> info->reg_id_aa64mmfr0 = read_cpuid(ID_AA64MMFR0_EL1);
> info->reg_id_aa64mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
> info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1);
>
> + /*
> + * The CPU cap is not detected system wide, but we are able to
> + * check if this CPU is affected by the Erratum.
> + */
> + if (this_cpu_has_cap(AMPERE_ONE_ERRATUM_BBML2))
> + // Fixup info->reg_id_aa64mmfr2 with BBML2.
> +
> info->reg_id_aa64mmfr3 = read_cpuid(ID_AA64MMFR3_EL1);
> info->reg_id_aa64mmfr4 = read_cpuid(ID_AA64MMFR4_EL1);
> info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);
> info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
>
This is the type of thing I was imagining, thanks!
> Suzuki
>
>>
>> So sounds like we are back to testing MMFR2.BBM in the matches function, with
>> the addition of Maz's proposal above. Sorry for sending you round the houses,
>> Miko.
>>
>> Thanks,
>> Ryan
>>
>>>
>>> Thanks,
>>>
>>> M.
>>>
>>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads: [~2025-03-14 13:15 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-13 10:41 [PATCH v3 0/3] Initial BBML2 support for contpte_convert() Mikołaj Lenczewski
2025-03-13 10:41 ` [PATCH v3 1/3] arm64: Add BBM Level 2 cpu feature Mikołaj Lenczewski
2025-03-13 16:13 ` Ryan Roberts
2025-03-13 18:08 ` Mikołaj Lenczewski
2025-03-14 9:26 ` Ryan Roberts
2025-03-13 17:21 ` Yang Shi
2025-03-13 18:13 ` Mikołaj Lenczewski
2025-03-13 18:17 ` Ryan Roberts
2025-03-13 17:34 ` Marc Zyngier
2025-03-13 18:20 ` Mikołaj Lenczewski
2025-03-13 18:39 ` Marc Zyngier
2025-03-13 18:22 ` Ryan Roberts
2025-03-13 18:36 ` Marc Zyngier
2025-03-14 9:18 ` Ryan Roberts
2025-03-14 10:11 ` Marc Zyngier
2025-03-14 12:33 ` Suzuki K Poulose
2025-03-14 13:12 ` Ryan Roberts
2025-03-13 10:41 ` [PATCH v3 2/3] iommu/arm: Add BBM Level 2 smmu feature Mikołaj Lenczewski
2025-03-13 11:39 ` Robin Murphy
2025-03-13 16:18 ` Ryan Roberts
2025-03-13 10:41 ` [PATCH v3 3/3] arm64/mm: Elide tlbi in contpte_convert() under BBML2 Mikołaj Lenczewski
2025-03-13 16:28 ` Ryan Roberts
2025-03-13 11:22 ` [PATCH v3 0/3] Initial BBML2 support for contpte_convert() Suzuki K Poulose
2025-03-13 11:30 ` Mikołaj Lenczewski