linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v3 0/5] Support Armv8.9/v9.4 FEAT_HAFT
@ 2024-10-22  9:27 Yicong Yang
  2024-10-22  9:27 ` [PATCH v3 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Yicong Yang @ 2024-10-22  9:27 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

This series adds basic support for FEAT_HAFT introduced in Armv8.9/v9.4
and enables ARCH_HAS_NONLEAF_PMD_YOUNG. The latter will be used in
lru-gen aging. Tested with lru-gen using the steps below:
1. Generate a 1GiB working set with `stress-ng --vm 1`, then hang the task
   so it stops accessing the memory. (The AF bit won't be updated.)
2. Age the memory via /sys/kernel/debug/lru_gen, for example:
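   `echo "+ <memcg_id> <node_id> <max_gen_nr>" > /sys/kernel/debug/lru_gen`
   (placeholder arguments; the exact interface is described in the
   multigen_lru documentation linked below)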

Run the above steps with and without LRU_GEN_NONLEAF_YOUNG (0x4)
(switching via /sys/kernel/mm/lru_gen/enabled). With LRU_GEN_NONLEAF_YOUNG
the aging page walk clears and tests the PMD table AF bit; otherwise it
clears and tests the AF bit of each PTE. In this case LRU_GEN_NONLEAF_YOUNG
improves the efficiency of page scanning since the pages are never accessed
and we don't need to scan each PTE. We observed ~40% of the aging time saved
for 1GiB of memory on our emulated platform with LRU_GEN_NONLEAF_YOUNG.

For lru-gen aging:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/mm/multigen_lru.rst?h=v6.11-rc1#n94

Changes since v2:
- Address comments from Will and Catalin:
  o detect and enable the feature in __cpu_setup()
  o allow onlining a CPU that doesn't have this feature and mismatches with the boot CPU
  o only advertise the feature if it's enabled system wide
  o set the AF bit for kernel page table entries to save later hardware updates
  o warn on unexpected pmdp_test_and_clear_young()
- Update all the new AA64MMFR1_EL1 fields per Mark
Link: https://lore.kernel.org/linux-arm-kernel/20240814092333.7727-1-yangyicong@huawei.com/

Changes since v1:
- Address comments from Marc: improve the comments/Kconfig and clean up the
  code. Thanks for the comments.
Link: https://lore.kernel.org/linux-arm-kernel/20240802093458.32683-1-yangyicong@huawei.com/


Yicong Yang (5):
  arm64/sysreg: Update ID_AA64MMFR1_EL1 register
  arm64: setup: name 'tcr2' register
  arm64: Add support for FEAT_HAFT
  arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG
  arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()

 arch/arm64/Kconfig                     | 16 ++++++++++++++++
 arch/arm64/include/asm/cpufeature.h    | 24 ++++++++++++++++++++++++
 arch/arm64/include/asm/pgalloc.h       |  9 +++++----
 arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
 arch/arm64/include/asm/pgtable.h       | 16 ++++++++++++++--
 arch/arm64/kernel/cpufeature.c         | 23 +++++++++++++++++++++++
 arch/arm64/mm/fixmap.c                 |  9 ++++++---
 arch/arm64/mm/mmu.c                    |  8 ++++----
 arch/arm64/mm/proc.S                   | 20 +++++++++++++++++---
 arch/arm64/tools/cpucaps               |  1 +
 arch/arm64/tools/sysreg                |  4 ++++
 11 files changed, 118 insertions(+), 16 deletions(-)

-- 
2.24.0




* [PATCH v3 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register
  2024-10-22  9:27 [PATCH v3 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
@ 2024-10-22  9:27 ` Yicong Yang
  2024-10-22 17:05   ` Mark Brown
  2024-10-22  9:27 ` [PATCH v3 2/5] arm64: setup: name 'tcr2' register Yicong Yang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-10-22  9:27 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

Update ID_AA64MMFR1_EL1 register fields definition per DDI0601 (ID092424)
2024-09. ID_AA64MMFR1_EL1.ETS adds definition for FEAT_ETS2 and
FEAT_ETS3. ID_AA64MMFR1_EL1.HAFDBS adds definition for FEAT_HAFT and
FEAT_HDBSS.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/tools/sysreg | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 8d637ac4b7c6..ae64ba810298 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1648,6 +1648,8 @@ EndEnum
 UnsignedEnum	39:36	ETS
 	0b0000	NI
 	0b0001	IMP
+	0b0010	ETS2
+	0b0011	ETS3
 EndEnum
 UnsignedEnum	35:32	TWED
 	0b0000	NI
@@ -1688,6 +1690,8 @@ UnsignedEnum	3:0	HAFDBS
 	0b0000	NI
 	0b0001	AF
 	0b0010	DBM
+	0b0011	HAFT
+	0b0100	HDBSS
 EndEnum
 EndSysreg
 
-- 
2.24.0




* [PATCH v3 2/5] arm64: setup: name 'tcr2' register
  2024-10-22  9:27 [PATCH v3 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
  2024-10-22  9:27 ` [PATCH v3 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
@ 2024-10-22  9:27 ` Yicong Yang
  2024-10-22 16:54   ` Catalin Marinas
  2024-10-22  9:27 ` [PATCH v3 3/5] arm64: Add support for FEAT_HAFT Yicong Yang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-10-22  9:27 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

TCR2_EL1 introduces some additional controls besides TCR_EL1. Currently
only PIE is supported and it is enabled by writing TCR2_EL1 directly if
PIE is detected.

Introduce a named register 'tcr2' just like the 'tcr' we already have.
It'll be initialized to 0 and updated if a certain feature is detected and
needs to be enabled. Touch the TCR2_EL1 register at the end with the
updated 'tcr2' value if FEAT_TCR2 is supported, as indicated by
ID_AA64MMFR3_EL1.TCRX. Then we can extend the support of other features
controlled by TCR2_EL1.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/mm/proc.S | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 8abdc7fed321..ccbae4525891 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -465,10 +465,12 @@ SYM_FUNC_START(__cpu_setup)
 	 */
 	mair	.req	x17
 	tcr	.req	x16
+	tcr2	.req	x15
 	mov_q	mair, MAIR_EL1_SET
 	mov_q	tcr, TCR_T0SZ(IDMAP_VA_BITS) | TCR_T1SZ(VA_BITS_MIN) | TCR_CACHE_FLAGS | \
 		     TCR_SHARED | TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
 		     TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS
+	mov	tcr2, xzr
 
 	tcr_clear_errata_bits tcr, x9, x5
 
@@ -525,11 +527,16 @@ alternative_else_nop_endif
 #undef PTE_MAYBE_NG
 #undef PTE_MAYBE_SHARED
 
-	mov	x0, TCR2_EL1x_PIE
-	msr	REG_TCR2_EL1, x0
+	orr	tcr2, tcr2, TCR2_EL1x_PIE
 
 .Lskip_indirection:
 
+	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
+	ubfx	x1, x1, #ID_AA64MMFR3_EL1_TCRX_SHIFT, #4
+	cbz	x1, 1f
+	msr	REG_TCR2_EL1, tcr2
+1:
+
 	/*
 	 * Prepare SCTLR
 	 */
@@ -538,4 +545,5 @@ alternative_else_nop_endif
 
 	.unreq	mair
 	.unreq	tcr
+	.unreq	tcr2
 SYM_FUNC_END(__cpu_setup)
-- 
2.24.0




* [PATCH v3 3/5] arm64: Add support for FEAT_HAFT
  2024-10-22  9:27 [PATCH v3 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
  2024-10-22  9:27 ` [PATCH v3 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
  2024-10-22  9:27 ` [PATCH v3 2/5] arm64: setup: name 'tcr2' register Yicong Yang
@ 2024-10-22  9:27 ` Yicong Yang
  2024-10-22 18:30   ` Catalin Marinas
  2024-10-22  9:27 ` [PATCH v3 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG Yicong Yang
  2024-10-22  9:27 ` [PATCH v3 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young() Yicong Yang
  4 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-10-22  9:27 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

Armv8.9/v9.4 introduces the feature Hardware managed Access Flag
for Table descriptors (FEAT_HAFT). The feature is indicated by
ID_AA64MMFR1_EL1.HAFDBS == 0b0011 and can be enabled by
TCR2_EL1.HAFT so it has a dependency on FEAT_TCR2.

Add the Kconfig option for FEAT_HAFT and support for detecting and
enabling the feature. The feature is detected and enabled in __cpu_setup()
before the MMU is on, just like HA. A CPU capability is added to notify
the user of the feature and of how many CPUs in the system have it.

Add definitions of the P{G,4,U,M}D_TABLE_AF bits and set the AF bit when
creating the kernel page tables, which saves the hardware from having to
update them at runtime. The bit is ignored if FEAT_HAFT is not enabled.

Unlike HA, the AF bit of table descriptors cannot be managed by software
per the spec (no access fault is taken). So users should use this feature
only if it's supported system wide, as reported by system_support_haft().

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/Kconfig                     | 15 +++++++++++++++
 arch/arm64/include/asm/cpufeature.h    | 24 ++++++++++++++++++++++++
 arch/arm64/include/asm/pgalloc.h       |  9 +++++----
 arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
 arch/arm64/kernel/cpufeature.c         | 23 +++++++++++++++++++++++
 arch/arm64/mm/fixmap.c                 |  9 ++++++---
 arch/arm64/mm/mmu.c                    |  8 ++++----
 arch/arm64/mm/proc.S                   |  8 +++++++-
 arch/arm64/tools/cpucaps               |  1 +
 9 files changed, 89 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3e29b44d2d7b..029d7ad89de5 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2176,6 +2176,21 @@ config ARCH_PKEY_BITS
 	int
 	default 3
 
+config ARM64_HAFT
+	bool "Support for Hardware managed Access Flag for Table Descriptor"
+	depends on ARM64_HW_AFDBM
+	default y
+	help
+	  The ARMv8.9/ARMv9.5 introduces the feature Hardware managed Access
+	  Flag for Table descriptors. When enabled an architectural executed
+	  memory access will update the Access Flag in each Table descriptor
+	  which is accessed during the translation table walk and for which
+	  the Access Flag is 0. The Access Flag of the Table descriptor use
+	  the same bit of PTE_AF.
+
+	  The feature will only be enabled if all the CPUs in the system
+	  support this feature. If unsure, say Y.
+
 endmenu # "ARMv8.9 architectural features"
 
 config ARM64_SVE
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 3d261cc123c1..fba2347c0aa6 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -879,6 +879,30 @@ static inline bool cpu_has_hw_af(void)
 						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
 }
 
+/*
+ * Contrary to the page/block access flag, the table access flag
+ * cannot be emulated in software (no access fault will occur).
+ * So users should use this feature only if it's supported system
+ * wide.
+ */
+static inline bool system_support_haft(void)
+{
+	unsigned int hafdbs;
+	u64 mmfr1;
+
+	if (!IS_ENABLED(CONFIG_ARM64_HAFT))
+		return false;
+
+	/*
+	 * Check the sanitised registers to see this feature is supported
+	 * on all the CPUs.
+	 */
+	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
+	hafdbs = cpuid_feature_extract_unsigned_field(mmfr1,
+						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
+	return hafdbs >= ID_AA64MMFR1_EL1_HAFDBS_HAFT;
+}
+
 static inline bool cpu_has_pan(void)
 {
 	u64 mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 8ff5f2a2579e..bc1051d65125 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -30,7 +30,7 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
 {
 	pudval_t pudval = PUD_TYPE_TABLE;
 
-	pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
+	pudval |= (mm == &init_mm) ? PUD_TABLE_AF | PUD_TABLE_UXN : PUD_TABLE_PXN;
 	__pud_populate(pudp, __pa(pmdp), pudval);
 }
 #else
@@ -52,7 +52,7 @@ static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
 {
 	p4dval_t p4dval = P4D_TYPE_TABLE;
 
-	p4dval |= (mm == &init_mm) ? P4D_TABLE_UXN : P4D_TABLE_PXN;
+	p4dval |= (mm == &init_mm) ? P4D_TABLE_AF | P4D_TABLE_UXN : P4D_TABLE_PXN;
 	__p4d_populate(p4dp, __pa(pudp), p4dval);
 }
 
@@ -81,7 +81,7 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
 {
 	pgdval_t pgdval = PGD_TYPE_TABLE;
 
-	pgdval |= (mm == &init_mm) ? PGD_TABLE_UXN : PGD_TABLE_PXN;
+	pgdval |= (mm == &init_mm) ? PGD_TABLE_AF | PGD_TABLE_UXN : PGD_TABLE_PXN;
 	__pgd_populate(pgdp, __pa(p4dp), pgdval);
 }
 
@@ -127,7 +127,8 @@ static inline void
 pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
 {
 	VM_BUG_ON(mm && mm != &init_mm);
-	__pmd_populate(pmdp, __pa(ptep), PMD_TYPE_TABLE | PMD_TABLE_UXN);
+	__pmd_populate(pmdp, __pa(ptep),
+		       PMD_TYPE_TABLE | PMD_TABLE_AF | PMD_TABLE_UXN);
 }
 
 static inline void
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index fd330c1db289..c78a988cca93 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -99,6 +99,7 @@
 #define PGD_TYPE_TABLE		(_AT(pgdval_t, 3) << 0)
 #define PGD_TABLE_BIT		(_AT(pgdval_t, 1) << 1)
 #define PGD_TYPE_MASK		(_AT(pgdval_t, 3) << 0)
+#define PGD_TABLE_AF		(_AT(pgdval_t, 1) << 10)	/* Ignored if no FEAT_HAFT */
 #define PGD_TABLE_PXN		(_AT(pgdval_t, 1) << 59)
 #define PGD_TABLE_UXN		(_AT(pgdval_t, 1) << 60)
 
@@ -110,6 +111,7 @@
 #define P4D_TYPE_MASK		(_AT(p4dval_t, 3) << 0)
 #define P4D_TYPE_SECT		(_AT(p4dval_t, 1) << 0)
 #define P4D_SECT_RDONLY		(_AT(p4dval_t, 1) << 7)		/* AP[2] */
+#define P4D_TABLE_AF		(_AT(p4dval_t, 1) << 10)	/* Ignored if no FEAT_HAFT */
 #define P4D_TABLE_PXN		(_AT(p4dval_t, 1) << 59)
 #define P4D_TABLE_UXN		(_AT(p4dval_t, 1) << 60)
 
@@ -121,6 +123,7 @@
 #define PUD_TYPE_MASK		(_AT(pudval_t, 3) << 0)
 #define PUD_TYPE_SECT		(_AT(pudval_t, 1) << 0)
 #define PUD_SECT_RDONLY		(_AT(pudval_t, 1) << 7)		/* AP[2] */
+#define PUD_TABLE_AF		(_AT(pudval_t, 1) << 10)	/* Ignored if no FEAT_HAFT */
 #define PUD_TABLE_PXN		(_AT(pudval_t, 1) << 59)
 #define PUD_TABLE_UXN		(_AT(pudval_t, 1) << 60)
 
@@ -131,6 +134,7 @@
 #define PMD_TYPE_TABLE		(_AT(pmdval_t, 3) << 0)
 #define PMD_TYPE_SECT		(_AT(pmdval_t, 1) << 0)
 #define PMD_TABLE_BIT		(_AT(pmdval_t, 1) << 1)
+#define PMD_TABLE_AF		(_AT(pmdval_t, 1) << 10)	/* Ignored if no FEAT_HAFT */
 
 /*
  * Section
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 718728a85430..6eeaaa80f6fe 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2046,6 +2046,18 @@ static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
 
 #endif
 
+#if CONFIG_ARM64_HAFT
+
+static struct cpumask haft_cpus;
+
+static void cpu_enable_haft(struct arm64_cpu_capabilities const *cap)
+{
+	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU))
+		cpumask_set_cpu(smp_processor_id(), &haft_cpus);
+}
+
+#endif /* CONFIG_ARM64_HAFT */
+
 #ifdef CONFIG_ARM64_AMU_EXTN
 
 /*
@@ -2590,6 +2602,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.cpus = &dbm_cpus,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, DBM)
 	},
+#endif
+#ifdef CONFIG_ARM64_HAFT
+	{
+		.desc = "Hardware managed Access Flag for Table Descriptor",
+		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+		.capability = ARM64_HAFT,
+		.matches = has_cpuid_feature,
+		.cpu_enable = cpu_enable_haft,
+		.cpus = &haft_cpus,
+		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, HAFT)
+	},
 #endif
 	{
 		.desc = "CRC32 instructions",
diff --git a/arch/arm64/mm/fixmap.c b/arch/arm64/mm/fixmap.c
index de1e09d986ad..c5c5425791da 100644
--- a/arch/arm64/mm/fixmap.c
+++ b/arch/arm64/mm/fixmap.c
@@ -47,7 +47,8 @@ static void __init early_fixmap_init_pte(pmd_t *pmdp, unsigned long addr)
 
 	if (pmd_none(pmd)) {
 		ptep = bm_pte[BM_PTE_TABLE_IDX(addr)];
-		__pmd_populate(pmdp, __pa_symbol(ptep), PMD_TYPE_TABLE);
+		__pmd_populate(pmdp, __pa_symbol(ptep),
+			       PMD_TYPE_TABLE | PMD_TABLE_AF);
 	}
 }
 
@@ -59,7 +60,8 @@ static void __init early_fixmap_init_pmd(pud_t *pudp, unsigned long addr,
 	pmd_t *pmdp;
 
 	if (pud_none(pud))
-		__pud_populate(pudp, __pa_symbol(bm_pmd), PUD_TYPE_TABLE);
+		__pud_populate(pudp, __pa_symbol(bm_pmd),
+			       PUD_TYPE_TABLE | PUD_TABLE_AF);
 
 	pmdp = pmd_offset_kimg(pudp, addr);
 	do {
@@ -86,7 +88,8 @@ static void __init early_fixmap_init_pud(p4d_t *p4dp, unsigned long addr,
 	}
 
 	if (p4d_none(p4d))
-		__p4d_populate(p4dp, __pa_symbol(bm_pud), P4D_TYPE_TABLE);
+		__p4d_populate(p4dp, __pa_symbol(bm_pud),
+			       P4D_TYPE_TABLE | P4D_TABLE_AF);
 
 	pudp = pud_offset_kimg(p4dp, addr);
 	early_fixmap_init_pmd(pudp, addr, end);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e55b02fbddc8..6441a45eaeda 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -201,7 +201,7 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
 
 	BUG_ON(pmd_sect(pmd));
 	if (pmd_none(pmd)) {
-		pmdval_t pmdval = PMD_TYPE_TABLE | PMD_TABLE_UXN;
+		pmdval_t pmdval = PMD_TYPE_TABLE | PMD_TABLE_UXN | PMD_TABLE_AF;
 		phys_addr_t pte_phys;
 
 		if (flags & NO_EXEC_MAPPINGS)
@@ -288,7 +288,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
 	 */
 	BUG_ON(pud_sect(pud));
 	if (pud_none(pud)) {
-		pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_UXN;
+		pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_UXN | PUD_TABLE_AF;
 		phys_addr_t pmd_phys;
 
 		if (flags & NO_EXEC_MAPPINGS)
@@ -333,7 +333,7 @@ static void alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end,
 	pud_t *pudp;
 
 	if (p4d_none(p4d)) {
-		p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_UXN;
+		p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_UXN | P4D_TABLE_AF;
 		phys_addr_t pud_phys;
 
 		if (flags & NO_EXEC_MAPPINGS)
@@ -391,7 +391,7 @@ static void alloc_init_p4d(pgd_t *pgdp, unsigned long addr, unsigned long end,
 	p4d_t *p4dp;
 
 	if (pgd_none(pgd)) {
-		pgdval_t pgdval = PGD_TYPE_TABLE | PGD_TABLE_UXN;
+		pgdval_t pgdval = PGD_TYPE_TABLE | PGD_TABLE_UXN | PGD_TABLE_AF;
 		phys_addr_t p4d_phys;
 
 		if (flags & NO_EXEC_MAPPINGS)
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index ccbae4525891..4a58b9b36eb6 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -495,9 +495,15 @@ alternative_else_nop_endif
 	 * via capabilities.
 	 */
 	mrs	x9, ID_AA64MMFR1_EL1
-	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
+	ubfx	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_SHIFT, #4
 	cbz	x9, 1f
 	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
+
+#ifdef CONFIG_ARM64_HAFT
+	cmp	x9, ID_AA64MMFR1_EL1_HAFDBS_HAFT
+	b.lt	1f
+	orr	tcr2, tcr2, TCR2_EL1x_HAFT
+#endif /* CONFIG_ARM64_HAFT */
 1:
 #endif	/* CONFIG_ARM64_HW_AFDBM */
 	msr	mair_el1, mair
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index eedb5acc21ed..b35004fa8313 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -56,6 +56,7 @@ HAS_TLB_RANGE
 HAS_VA52
 HAS_VIRT_HOST_EXTN
 HAS_WFXT
+HAFT
 HW_DBM
 KVM_HVHE
 KVM_PROTECTED_MODE
-- 
2.24.0




* [PATCH v3 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG
  2024-10-22  9:27 [PATCH v3 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
                   ` (2 preceding siblings ...)
  2024-10-22  9:27 ` [PATCH v3 3/5] arm64: Add support for FEAT_HAFT Yicong Yang
@ 2024-10-22  9:27 ` Yicong Yang
  2024-10-22  9:27 ` [PATCH v3 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young() Yicong Yang
  4 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2024-10-22  9:27 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

With the support of FEAT_HAFT, ARCH_HAS_NONLEAF_PMD_YOUNG can be enabled
on arm64 since the hardware is capable of updating the AF flag for
PMD table descriptors. Since the AF bit of a table descriptor shares
the same bit position as in block descriptors, we only need to
implement arch_has_hw_nonleaf_pmd_young() and select the related
configs. The pmd_young test/update operations remain the same as the
ones already implemented for transparent hugepage support.

Currently ARCH_HAS_NONLEAF_PMD_YOUNG is used to improve the
efficiency of lru-gen aging.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/Kconfig               |  1 +
 arch/arm64/include/asm/pgtable.h | 14 ++++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 029d7ad89de5..8f5fc4f19573 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -38,6 +38,7 @@ config ARM64
 	select ARCH_HAS_MEM_ENCRYPT
 	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+	select ARCH_HAS_NONLEAF_PMD_YOUNG if ARM64_HAFT
 	select ARCH_HAS_PTE_DEVMAP
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_HW_PTE_YOUNG
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index c329ea061dc9..e4712b969aba 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1259,7 +1259,7 @@ static inline int __ptep_clear_flush_young(struct vm_area_struct *vma,
 	return young;
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG)
 #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
 static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 					    unsigned long address,
@@ -1267,7 +1267,7 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 {
 	return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
 }
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
 
 static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address, pte_t *ptep)
@@ -1502,6 +1502,16 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
  */
 #define arch_has_hw_pte_young		cpu_has_hw_af
 
+#ifdef CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG
+
+#define arch_has_hw_nonleaf_pmd_young arch_has_hw_nonleaf_pmd_young
+static inline bool arch_has_hw_nonleaf_pmd_young(void)
+{
+	return system_support_haft();
+}
+
+#endif
+
 /*
  * Experimentally, it's cheap to set the access flag in hardware and we
  * benefit from prefaulting mappings as 'old' to start with.
-- 
2.24.0




* [PATCH v3 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()
  2024-10-22  9:27 [PATCH v3 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
                   ` (3 preceding siblings ...)
  2024-10-22  9:27 ` [PATCH v3 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG Yicong Yang
@ 2024-10-22  9:27 ` Yicong Yang
  4 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2024-10-22  9:27 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

Young bit operations on a PMD table entry are only supported if
FEAT_HAFT is enabled system wide. Add a warning to flag such
unexpected use.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/include/asm/pgtable.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e4712b969aba..bb0600a24016 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1265,6 +1265,8 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 					    unsigned long address,
 					    pmd_t *pmdp)
 {
+	/* Operation applies to PMD table entry only if FEAT_HAFT is enabled */
+	VM_WARN_ON(pmd_table(READ_ONCE(*pmdp)) && !system_support_haft());
 	return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
-- 
2.24.0




* Re: [PATCH v3 2/5] arm64: setup: name 'tcr2' register
  2024-10-22  9:27 ` [PATCH v3 2/5] arm64: setup: name 'tcr2' register Yicong Yang
@ 2024-10-22 16:54   ` Catalin Marinas
  2024-10-23 10:08     ` Yicong Yang
  0 siblings, 1 reply; 16+ messages in thread
From: Catalin Marinas @ 2024-10-22 16:54 UTC (permalink / raw)
  To: Yicong Yang
  Cc: will, maz, mark.rutland, broonie, linux-arm-kernel, oliver.upton,
	ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Tue, Oct 22, 2024 at 05:27:31PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> TCR2_EL1 introduces some additional controls besides TCR_EL1. Currently
> only PIE is supported and it is enabled by writing TCR2_EL1 directly if
> PIE is detected.
> 
> Introduce a named register 'tcr2' just like the 'tcr' we already have.
> It'll be initialized to 0 and updated if a certain feature is detected and
> needs to be enabled. Touch the TCR2_EL1 register at the end with the
> updated 'tcr2' value if FEAT_TCR2 is supported, as indicated by
> ID_AA64MMFR3_EL1.TCRX. Then we can extend the support of other features
> controlled by TCR2_EL1.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  arch/arm64/mm/proc.S | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 8abdc7fed321..ccbae4525891 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -465,10 +465,12 @@ SYM_FUNC_START(__cpu_setup)
>  	 */
>  	mair	.req	x17
>  	tcr	.req	x16
> +	tcr2	.req	x15
>  	mov_q	mair, MAIR_EL1_SET
>  	mov_q	tcr, TCR_T0SZ(IDMAP_VA_BITS) | TCR_T1SZ(VA_BITS_MIN) | TCR_CACHE_FLAGS | \
>  		     TCR_SHARED | TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
>  		     TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS
> +	mov	tcr2, xzr
>  
>  	tcr_clear_errata_bits tcr, x9, x5
>  
> @@ -525,11 +527,16 @@ alternative_else_nop_endif
>  #undef PTE_MAYBE_NG
>  #undef PTE_MAYBE_SHARED
>  
> -	mov	x0, TCR2_EL1x_PIE
> -	msr	REG_TCR2_EL1, x0
> +	orr	tcr2, tcr2, TCR2_EL1x_PIE
>  
>  .Lskip_indirection:
>  
> +	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
> +	ubfx	x1, x1, #ID_AA64MMFR3_EL1_TCRX_SHIFT, #4
> +	cbz	x1, 1f
> +	msr	REG_TCR2_EL1, tcr2
> +1:

It makes sense to mimic the TCR_EL1 configuration here with a single MSR
at the end.

I was wondering whether to simply check if the tcr2 reg (x15) is
non-zero under the assumption that bits in it would only be set if the
features are present (and those features imply TCRX). However, we can
set RES0 bits in here even if the feature is not supported in hardware
(more on the next patch).

So I think this patch is ok as is.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>



* Re: [PATCH v3 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register
  2024-10-22  9:27 ` [PATCH v3 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
@ 2024-10-22 17:05   ` Mark Brown
  2024-10-23 10:06     ` Yicong Yang
  0 siblings, 1 reply; 16+ messages in thread
From: Mark Brown @ 2024-10-22 17:05 UTC (permalink / raw)
  To: Yicong Yang
  Cc: catalin.marinas, will, maz, mark.rutland, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Tue, Oct 22, 2024 at 05:27:30PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> Update ID_AA64MMFR1_EL1 register fields definition per DDI0601 (ID092424)
> 2024-09. ID_AA64MMFR1_EL1.ETS adds definition for FEAT_ETS2 and
> FEAT_ETS3. ID_AA64MMFR1_EL1.HAFDBS adds definition for FEAT_HAFT and
> FEAT_HDBSS.

Reviewed-by: Mark Brown <broonie@kernel.org>

This was also in a recent KVM series (which prompted me to send the same
change since the change there was only partially done); it'd be good to
get this merged.



* Re: [PATCH v3 3/5] arm64: Add support for FEAT_HAFT
  2024-10-22  9:27 ` [PATCH v3 3/5] arm64: Add support for FEAT_HAFT Yicong Yang
@ 2024-10-22 18:30   ` Catalin Marinas
  2024-10-23 10:30     ` Yicong Yang
  0 siblings, 1 reply; 16+ messages in thread
From: Catalin Marinas @ 2024-10-22 18:30 UTC (permalink / raw)
  To: Yicong Yang
  Cc: will, maz, mark.rutland, broonie, linux-arm-kernel, oliver.upton,
	ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Tue, Oct 22, 2024 at 05:27:32PM +0800, Yicong Yang wrote:
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 3e29b44d2d7b..029d7ad89de5 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2176,6 +2176,21 @@ config ARCH_PKEY_BITS
>  	int
>  	default 3
>  
> +config ARM64_HAFT
> +	bool "Support for Hardware managed Access Flag for Table Descriptor"

Super nit: s/Descriptor/Descriptors/

> +	depends on ARM64_HW_AFDBM
> +	default y
> +	help
> +	  The ARMv8.9/ARMv9.5 introduces the feature Hardware managed Access
> +	  Flag for Table descriptors. When enabled an architectural executed
> +	  memory access will update the Access Flag in each Table descriptor
> +	  which is accessed during the translation table walk and for which
> +	  the Access Flag is 0. The Access Flag of the Table descriptor use
> +	  the same bit of PTE_AF.
> +
> +	  The feature will only be enabled if all the CPUs in the system
> +	  support this feature. If unsure, say Y.
> +
>  endmenu # "ARMv8.9 architectural features"
>  
>  config ARM64_SVE
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 3d261cc123c1..fba2347c0aa6 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -879,6 +879,30 @@ static inline bool cpu_has_hw_af(void)
>  						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
>  }
>  
> +/*
> + * Contrary to the page/block access flag, the table access flag
> + * cannot be emulated in software (no access fault will occur).
> + * So users should use this feature only if it's supported system
> + * wide.
> + */
> +static inline bool system_support_haft(void)
> +{
> +	unsigned int hafdbs;
> +	u64 mmfr1;
> +
> +	if (!IS_ENABLED(CONFIG_ARM64_HAFT))
> +		return false;
> +
> +	/*
> +	 * Check the sanitised registers to see this feature is supported
> +	 * on all the CPUs.
> +	 */
> +	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	hafdbs = cpuid_feature_extract_unsigned_field(mmfr1,
> +						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
> +	return hafdbs >= ID_AA64MMFR1_EL1_HAFDBS_HAFT;
> +}

Can we not have just an entry in the arm64_features array with the type
ARM64_CPUCAP_SYSTEM_FEATURE and avoid the explicit checks here?
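
Something along these lines (an untested sketch, assuming ARM64_HAFT is
registered as a system-wide capability so the static branch behind
cpus_have_final_cap() can be used):

	static inline bool system_support_haft(void)
	{
		return IS_ENABLED(CONFIG_ARM64_HAFT) &&
		       cpus_have_final_cap(ARM64_HAFT);
	}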

> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
> index 8ff5f2a2579e..bc1051d65125 100644
> --- a/arch/arm64/include/asm/pgalloc.h
> +++ b/arch/arm64/include/asm/pgalloc.h
> @@ -30,7 +30,7 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>  {
>  	pudval_t pudval = PUD_TYPE_TABLE;
>  
> -	pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
> +	pudval |= (mm == &init_mm) ? PUD_TABLE_AF | PUD_TABLE_UXN : PUD_TABLE_PXN;
>  	__pud_populate(pudp, __pa(pmdp), pudval);
>  }

Why not set the table AF for the task entries? I haven't checked the
core code but normally when we map a pte it's mapped as young. While for
table AF we wouldn't get a fault, I would have thought the core code
follows the same logic.
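
i.e. something like the below (an untested sketch), setting the table AF
unconditionally instead of only for init_mm:

	static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
	{
		/* table AF set for both kernel and user page tables */
		pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_AF;

		pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
		__pud_populate(pudp, __pa(pmdp), pudval);
	}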

> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 718728a85430..6eeaaa80f6fe 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2046,6 +2046,18 @@ static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
>  
>  #endif
>  
> +#if CONFIG_ARM64_HAFT
> +
> +static struct cpumask haft_cpus;
> +
> +static void cpu_enable_haft(struct arm64_cpu_capabilities const *cap)
> +{
> +	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU))
> +		cpumask_set_cpu(smp_processor_id(), &haft_cpus);
> +}
> +
> +#endif /* CONFIG_ARM64_HAFT */
> +
>  #ifdef CONFIG_ARM64_AMU_EXTN
>  
>  /*
> @@ -2590,6 +2602,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.cpus = &dbm_cpus,
>  		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, DBM)
>  	},
> +#endif
> +#ifdef CONFIG_ARM64_HAFT
> +	{
> +		.desc = "Hardware managed Access Flag for Table Descriptor",
> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,

I'd actually use ARM64_CPUCAP_SYSTEM_FEATURE here. We use something
similar for HW DBM but there we get a fault and set the pte dirty. You
combined it with a system_support_haft() that checks the sanitised regs
but I'd rather have a static branch check via cpus_have_cap(). Even with
your approach we can have a race with a late CPU hot-plugged that
doesn't have the feature in the middle of some core code walking the
page tables.

With a system feature type, late CPUs not having the feature won't be
brought online (if feature enabled) but in general I don't have much
sympathy for SoC vendors combining CPUs with incompatible features ;).
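
The entry could then look something like this (a sketch, not tested), with
the cpumask tracking and cpu_enable_haft() dropped:

	{
		.desc = "Hardware managed Access Flag for Table Descriptors",
		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
		.capability = ARM64_HAFT,
		.matches = has_cpuid_feature,
		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, HAFT)
	},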

> +		.capability = ARM64_HAFT,
> +		.matches = has_cpuid_feature,
> +		.cpu_enable = cpu_enable_haft,
> +		.cpus = &haft_cpus,
> +		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, HAFT)
> +	},
[...]
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index ccbae4525891..4a58b9b36eb6 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -495,9 +495,15 @@ alternative_else_nop_endif
>  	 * via capabilities.
>  	 */
>  	mrs	x9, ID_AA64MMFR1_EL1
> -	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
> +	ubfx	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_SHIFT, #4
>  	cbz	x9, 1f
>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
> +
> +#ifdef CONFIG_ARM64_HAFT
> +	cmp	x9, ID_AA64MMFR1_EL1_HAFDBS_HAFT
> +	b.lt	1f
> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
> +#endif /* CONFIG_ARM64_HAFT */

I think we can skip the ID check here and always set the HAFT bit. We do
something similar with MTE (not for TCR_HA though, don't remember why).

-- 
Catalin



* Re: [PATCH v3 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register
  2024-10-22 17:05   ` Mark Brown
@ 2024-10-23 10:06     ` Yicong Yang
  0 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2024-10-23 10:06 UTC (permalink / raw)
  To: Mark Brown
  Cc: yangyicong, catalin.marinas, will, maz, mark.rutland,
	linux-arm-kernel, oliver.upton, ryan.roberts, linuxarm,
	jonathan.cameron, shameerali.kolothum.thodi, prime.zeng, xuwei5,
	wangkefeng.wang

On 2024/10/23 1:05, Mark Brown wrote:
> On Tue, Oct 22, 2024 at 05:27:30PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> Update ID_AA64MMFR1_EL1 register fields definition per DDI0601 (ID092424)
>> 2024-09. ID_AA64MMFR1_EL1.ETS adds definition for FEAT_ETS2 and
>> FEAT_ETS3. ID_AA64MMFR1_EL1.HAFDBS adds definition for FEAT_HAFT and
>> FEAT_HDBSS.
> 
> Reviewed-by: Mark Brown <broonie@kernel.org>
> 
> This was also in a recent KVM series (which prompted me to send the same
> change since the change there was only partially done), it'd be good to
> get this merged.
> 

Thanks. I may have missed that series.




* Re: [PATCH v3 2/5] arm64: setup: name 'tcr2' register
  2024-10-22 16:54   ` Catalin Marinas
@ 2024-10-23 10:08     ` Yicong Yang
  0 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2024-10-23 10:08 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: yangyicong, will, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On 2024/10/23 0:54, Catalin Marinas wrote:
> On Tue, Oct 22, 2024 at 05:27:31PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> TCR2_EL1 introduces some additional controls besides TCR_EL1. Currently
>> only PIE is supported and it is enabled by writing TCR2_EL1 directly if
>> PIE is detected.
>>
>> Introduce a named register 'tcr2' just like the 'tcr' we already have.
>> It'll be initialized to 0 and updated if a certain feature is detected and
>> needs to be enabled. Touch the TCR2_EL1 register at the end with the
>> updated 'tcr2' value if FEAT_TCR2 is supported, as indicated by
>> ID_AA64MMFR3_EL1.TCRX. Then we can extend the support of other features
>> controlled by TCR2_EL1.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  arch/arm64/mm/proc.S | 12 ++++++++++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
>> index 8abdc7fed321..ccbae4525891 100644
>> --- a/arch/arm64/mm/proc.S
>> +++ b/arch/arm64/mm/proc.S
>> @@ -465,10 +465,12 @@ SYM_FUNC_START(__cpu_setup)
>>  	 */
>>  	mair	.req	x17
>>  	tcr	.req	x16
>> +	tcr2	.req	x15
>>  	mov_q	mair, MAIR_EL1_SET
>>  	mov_q	tcr, TCR_T0SZ(IDMAP_VA_BITS) | TCR_T1SZ(VA_BITS_MIN) | TCR_CACHE_FLAGS | \
>>  		     TCR_SHARED | TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
>>  		     TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS
>> +	mov	tcr2, xzr
>>  
>>  	tcr_clear_errata_bits tcr, x9, x5
>>  
>> @@ -525,11 +527,16 @@ alternative_else_nop_endif
>>  #undef PTE_MAYBE_NG
>>  #undef PTE_MAYBE_SHARED
>>  
>> -	mov	x0, TCR2_EL1x_PIE
>> -	msr	REG_TCR2_EL1, x0
>> +	orr	tcr2, tcr2, TCR2_EL1x_PIE
>>  
>>  .Lskip_indirection:
>>  
>> +	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
>> +	ubfx	x1, x1, #ID_AA64MMFR3_EL1_TCRX_SHIFT, #4
>> +	cbz	x1, 1f
>> +	msr	REG_TCR2_EL1, tcr2
>> +1:
> 
> It makes sense to mimic the TCR_EL1 configuration here with a single MSR
> at the end.
> 
> I was wondering whether to simply check if the tcr2 reg (x15) is
> non-zero under the assumption that bits in it would only be set if the
> features are present (and those features imply TCRX). However, we can
> set RES0 bits in here even if the feature is not supported in hardware
> (more on the next patch).
> 
> So I think this patch is ok as is.
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> .

Thanks.




* Re: [PATCH v3 3/5] arm64: Add support for FEAT_HAFT
  2024-10-22 18:30   ` Catalin Marinas
@ 2024-10-23 10:30     ` Yicong Yang
  2024-10-23 12:36       ` Catalin Marinas
  0 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-10-23 10:30 UTC (permalink / raw)
  To: Catalin Marinas, will
  Cc: yangyicong, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On 2024/10/23 2:30, Catalin Marinas wrote:
> On Tue, Oct 22, 2024 at 05:27:32PM +0800, Yicong Yang wrote:
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 3e29b44d2d7b..029d7ad89de5 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -2176,6 +2176,21 @@ config ARCH_PKEY_BITS
>>  	int
>>  	default 3
>>  
>> +config ARM64_HAFT
>> +	bool "Support for Hardware managed Access Flag for Table Descriptor"
> 
> Super nit: s/Descriptor/Descriptors/
> 

sure.

>> +	depends on ARM64_HW_AFDBM
>> +	default y
>> +	help
>> +	  The ARMv8.9/ARMv9.5 introduces the feature Hardware managed Access
>> +	  Flag for Table descriptors. When enabled an architectural executed
>> +	  memory access will update the Access Flag in each Table descriptor
>> +	  which is accessed during the translation table walk and for which
>> +	  the Access Flag is 0. The Access Flag of the Table descriptor use
>> +	  the same bit of PTE_AF.
>> +
>> +	  The feature will only be enabled if all the CPUs in the system
>> +	  support this feature. If unsure, say Y.
>> +
>>  endmenu # "ARMv8.9 architectural features"
>>  
>>  config ARM64_SVE
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index 3d261cc123c1..fba2347c0aa6 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -879,6 +879,30 @@ static inline bool cpu_has_hw_af(void)
>>  						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
>>  }
>>  
>> +/*
>> + * Contrary to the page/block access flag, the table access flag
>> + * cannot be emulated in software (no access fault will occur).
>> + * So users should use this feature only if it's supported system
>> + * wide.
>> + */
>> +static inline bool system_support_haft(void)
>> +{
>> +	unsigned int hafdbs;
>> +	u64 mmfr1;
>> +
>> +	if (!IS_ENABLED(CONFIG_ARM64_HAFT))
>> +		return false;
>> +
>> +	/*
>> +	 * Check the sanitised registers to see this feature is supported
>> +	 * on all the CPUs.
>> +	 */
>> +	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
>> +	hafdbs = cpuid_feature_extract_unsigned_field(mmfr1,
>> +						ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
>> +	return hafdbs >= ID_AA64MMFR1_EL1_HAFDBS_HAFT;
>> +}
> 
> Can we not have just an entry in the arm64_features array with the type
> ARM64_CPUCAP_SYSTEM_FEATURE and avoid the explicit checks here?
> 

reply below...

>> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
>> index 8ff5f2a2579e..bc1051d65125 100644
>> --- a/arch/arm64/include/asm/pgalloc.h
>> +++ b/arch/arm64/include/asm/pgalloc.h
>> @@ -30,7 +30,7 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>>  {
>>  	pudval_t pudval = PUD_TYPE_TABLE;
>>  
>> -	pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
>> +	pudval |= (mm == &init_mm) ? PUD_TABLE_AF | PUD_TABLE_UXN : PUD_TABLE_PXN;
>>  	__pud_populate(pudp, __pa(pmdp), pudval);
>>  }
> 
> Why not set the table AF for the task entries? I haven't checked the
> core code but normally when we map a pte it's mapped as young. While for
> table AF we wouldn't get a fault, I would have thought the core code
> follows the same logic.
> 

I may need to check. If I understand it correctly, in most cases (e.g. a read fault) we should
make the pte young if the hardware AF update is not supported. Otherwise the hardware will help to update it.

>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 718728a85430..6eeaaa80f6fe 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -2046,6 +2046,18 @@ static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
>>  
>>  #endif
>>  
>> +#if CONFIG_ARM64_HAFT
>> +
>> +static struct cpumask haft_cpus;
>> +
>> +static void cpu_enable_haft(struct arm64_cpu_capabilities const *cap)
>> +{
>> +	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU))
>> +		cpumask_set_cpu(smp_processor_id(), &haft_cpus);
>> +}
>> +
>> +#endif /* CONFIG_ARM64_HAFT */
>> +
>>  #ifdef CONFIG_ARM64_AMU_EXTN
>>  
>>  /*
>> @@ -2590,6 +2602,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>>  		.cpus = &dbm_cpus,
>>  		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, DBM)
>>  	},
>> +#endif
>> +#ifdef CONFIG_ARM64_HAFT
>> +	{
>> +		.desc = "Hardware managed Access Flag for Table Descriptor",
>> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> 
> I'd actually use ARM64_CPUCAP_SYSTEM_FEATURE here. We use something
> similar for HW DBM but there we get a fault and set the pte dirty. You
> combined it with a system_support_haft() that checks the sanitised regs
> but I'd rather have a static branch check via cpus_have_cap(). Even with
> your approach we can have a race with a late CPU hot-plugged that
> doesn't have the feature in the middle of some core code walking the
> page tables.
> 
> With a system feature type, late CPUs not having the feature won't be
> brought online (if feature enabled) but in general I don't have much
> sympathy for SoC vendors combining CPUs with incompatible features ;).
> 

OK. If we make it a system feature, we can then use cpus_have_cap() and
drop the system_support_haft() check against the sanitised registers.
It's fine for me.

Will asked not to refuse onlining a CPU due to a mismatch of this feature
in [1], hope we have an agreement :)

[1] https://lore.kernel.org/linux-arm-kernel/20240820161822.GC28750@willie-the-truck/

>> +		.capability = ARM64_HAFT,
>> +		.matches = has_cpuid_feature,
>> +		.cpu_enable = cpu_enable_haft,
>> +		.cpus = &haft_cpus,
>> +		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, HAFT)
>> +	},
> [...]
>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
>> index ccbae4525891..4a58b9b36eb6 100644
>> --- a/arch/arm64/mm/proc.S
>> +++ b/arch/arm64/mm/proc.S
>> @@ -495,9 +495,15 @@ alternative_else_nop_endif
>>  	 * via capabilities.
>>  	 */
>>  	mrs	x9, ID_AA64MMFR1_EL1
>> -	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
>> +	ubfx	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_SHIFT, #4
>>  	cbz	x9, 1f
>>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
>> +
>> +#ifdef CONFIG_ARM64_HAFT
>> +	cmp	x9, ID_AA64MMFR1_EL1_HAFDBS_HAFT
>> +	b.lt	1f
>> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
>> +#endif /* CONFIG_ARM64_HAFT */
> 
> I think we can skip the ID check here and always set the HAFT bit. We do
> something similar with MTE (not for TCR_HA though, don't remember why).
> 

Thanks for the reference to MTE. I'll take a look and test it. But a check
here may seem more reasonable, as we usually detect a feature first and
then enable it?

Thanks.




* Re: [PATCH v3 3/5] arm64: Add support for FEAT_HAFT
  2024-10-23 10:30     ` Yicong Yang
@ 2024-10-23 12:36       ` Catalin Marinas
  2024-10-24 14:45         ` Yicong Yang
  0 siblings, 1 reply; 16+ messages in thread
From: Catalin Marinas @ 2024-10-23 12:36 UTC (permalink / raw)
  To: Yicong Yang
  Cc: will, yangyicong, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On Wed, Oct 23, 2024 at 06:30:18PM +0800, Yicong Yang wrote:
> On 2024/10/23 2:30, Catalin Marinas wrote:
> > On Tue, Oct 22, 2024 at 05:27:32PM +0800, Yicong Yang wrote:
> >> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
> >> index 8ff5f2a2579e..bc1051d65125 100644
> >> --- a/arch/arm64/include/asm/pgalloc.h
> >> +++ b/arch/arm64/include/asm/pgalloc.h
> >> @@ -30,7 +30,7 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
> >>  {
> >>  	pudval_t pudval = PUD_TYPE_TABLE;
> >>  
> >> -	pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
> >> +	pudval |= (mm == &init_mm) ? PUD_TABLE_AF | PUD_TABLE_UXN : PUD_TABLE_PXN;
> >>  	__pud_populate(pudp, __pa(pmdp), pudval);
> >>  }
> > 
> > Why not set the table AF for the task entries? I haven't checked the
> > core code but normally when we map a pte it's mapped as young. While for
> > table AF we wouldn't get a fault, I would have thought the core code
> > follows the same logic.
> 
> I may need to check. If I understand it correctly, for most case (e.g.
> a read fault) we should make pte young if the hardware AF update is
> not supported. Otherwsie hardware will help to update.

On arm64 at least, _PROT_DEFAULT has PTE_AF set. So all accessible
entries in protection_map[] will have it set. I'm not sure how the core
code clears PTE_AF in the table entries. I'd have thought it goes
together with some pte_mkold().

> >> +#ifdef CONFIG_ARM64_HAFT
> >> +	{
> >> +		.desc = "Hardware managed Access Flag for Table Descriptor",
> >> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> > 
> > I'd actually use ARM64_CPUCAP_SYSTEM_FEATURE here. We use something
> > similar for HW DBM but there we get a fault and set the pte dirty. You
> > combined it with a system_support_haft() that checks the sanitised regs
> > but I'd rather have a static branch check via cpus_have_cap(). Even with
> > your approach we can have a race with a late CPU hot-plugged that
> > doesn't have the feature in the middle of some core code walking the
> > page tables.
> > 
> > With a system feature type, late CPUs not having the feature won't be
> > brought online (if feature enabled) but in general I don't have much
> > sympathy for SoC vendors combining CPUs with incompatible features ;).
> 
> ok. If we make it a system feature, we can using cpus_have_cap() then and
> drop the system_support_haft() which is checking with the sanitised registers.
> It's fine for me.
> 
> Will ask to not refuse online a CPU due to mismatch of this feature in [1],
> hope we have an agreement :)
> 
> [1] https://lore.kernel.org/linux-arm-kernel/20240820161822.GC28750@willie-the-truck/

I initially thought this would work but I don't feel easy about having
should_clear_pmd_young() change its polarity at runtime while user space
is running. If that's not a problem, we can go with your current
approach.

> >> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> >> index ccbae4525891..4a58b9b36eb6 100644
> >> --- a/arch/arm64/mm/proc.S
> >> +++ b/arch/arm64/mm/proc.S
> >> @@ -495,9 +495,15 @@ alternative_else_nop_endif
> >>  	 * via capabilities.
> >>  	 */
> >>  	mrs	x9, ID_AA64MMFR1_EL1
> >> -	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
> >> +	ubfx	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_SHIFT, #4
> >>  	cbz	x9, 1f
> >>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
> >> +
> >> +#ifdef CONFIG_ARM64_HAFT
> >> +	cmp	x9, ID_AA64MMFR1_EL1_HAFDBS_HAFT
> >> +	b.lt	1f
> >> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
> >> +#endif /* CONFIG_ARM64_HAFT */
> > 
> > I think we can skip the ID check here and always set the HAFT bit. We do
> > something similar with MTE (not for TCR_HA though, don't remember why).
> 
> Thanks for the reference to MTE. Will see and have a test. But a check
> here may seem more reasonable as we usually detect a feature first
> then enable it?

The behaviour of these RES0 bits is that we can write them and, if the
feature is present, it will be enabled; otherwise the write has no
effect. So it's not necessary to check the ID bits, the result would be
the same. We do this in other places as well.

Of course, we need to check the presence of TCR2_EL1, otherwise it would
undef. Just a bit less code since we want the feature on anyway.

-- 
Catalin



* Re: [PATCH v3 3/5] arm64: Add support for FEAT_HAFT
  2024-10-23 12:36       ` Catalin Marinas
@ 2024-10-24 14:45         ` Yicong Yang
  2024-10-24 15:23           ` Yicong Yang
  2024-10-28 18:33           ` Catalin Marinas
  0 siblings, 2 replies; 16+ messages in thread
From: Yicong Yang @ 2024-10-24 14:45 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: will, yangyicong, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On 2024/10/23 20:36, Catalin Marinas wrote:
> On Wed, Oct 23, 2024 at 06:30:18PM +0800, Yicong Yang wrote:
>> On 2024/10/23 2:30, Catalin Marinas wrote:
>>> On Tue, Oct 22, 2024 at 05:27:32PM +0800, Yicong Yang wrote:
>>>> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
>>>> index 8ff5f2a2579e..bc1051d65125 100644
>>>> --- a/arch/arm64/include/asm/pgalloc.h
>>>> +++ b/arch/arm64/include/asm/pgalloc.h
>>>> @@ -30,7 +30,7 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>>>>  {
>>>>  	pudval_t pudval = PUD_TYPE_TABLE;
>>>>  
>>>> -	pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
>>>> +	pudval |= (mm == &init_mm) ? PUD_TABLE_AF | PUD_TABLE_UXN : PUD_TABLE_PXN;
>>>>  	__pud_populate(pudp, __pa(pmdp), pudval);
>>>>  }
>>>
>>> Why not set the table AF for the task entries? I haven't checked the
>>> core code but normally when we map a pte it's mapped as young. While for
>>> table AF we wouldn't get a fault, I would have thought the core code
>>> follows the same logic.
>>
>> I may need to check. If I understand it correctly, for most case (e.g.
>> a read fault) we should make pte young if the hardware AF update is
>> not supported. Otherwsie hardware will help to update.
> 
> On arm64 at least, _PROT_DEFAULT has PTE_AF set. So all accessible
> entries in protection_map[] will have it set. I'm not sure how the core
> code clears PTE_AF in the table entries. I'd have thought it goes
> together with some pte_mkold().
> 

You're right. I checked that x86 sets the AF bit for the table entries [1][2].
Will set the AF bit for task entries as well.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/pgalloc.h#n84
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/pgtable_types.h#n221

>>>> +#ifdef CONFIG_ARM64_HAFT
>>>> +	{
>>>> +		.desc = "Hardware managed Access Flag for Table Descriptor",
>>>> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
>>>
>>> I'd actually use ARM64_CPUCAP_SYSTEM_FEATURE here. We use something
>>> similar for HW DBM but there we get a fault and set the pte dirty. You
>>> combined it with a system_support_haft() that checks the sanitised regs
>>> but I'd rather have a static branch check via cpus_have_cap(). Even with
>>> your approach we can have a race with a late CPU hot-plugged that
>>> doesn't have the feature in the middle of some core code walking the
>>> page tables.
>>>
>>> With a system feature type, late CPUs not having the feature won't be
>>> brought online (if feature enabled) but in general I don't have much
>>> sympathy for SoC vendors combining CPUs with incompatible features ;).
>>
>> ok. If we make it a system feature, we can using cpus_have_cap() then and
>> drop the system_support_haft() which is checking with the sanitised registers.
>> It's fine for me.
>>
>> Will ask to not refuse online a CPU due to mismatch of this feature in [1],
>> hope we have an agreement :)
>>
>> [1] https://lore.kernel.org/linux-arm-kernel/20240820161822.GC28750@willie-the-truck/
> 
> I initially thought this would work but I don't feel easy about having
> should_clear_pmd_young() change its polarity at runtime while user space
> is running. If that's not a problem, we can go with your current
> approach.

This should be OK as I imagine it. After onlining a CPU without HAFT the
system won't advertise HAFT support, but we don't disable the HAFT update
on the CPUs that do support it, so an ongoing page aging process can still
use the updated table AF information and later processes will fall back to
using the PTE's AF bit. Efficiency may be reduced but the behaviour should
stay correct.

The user-visible setting will change after onlining a CPU without HAFT.
The user won't know this unless they read /sys/kernel/mm/lru_gen/enabled
again (the LRU_GEN_NONLEAF_YOUNG bit 0x4 will be cleared after onlining a
CPU without HAFT since the system no longer declares non-leaf PMD young
support).
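For example (assuming the other enabled bits were already set), reading
/sys/kernel/mm/lru_gen/enabled again would then show 0x0003 instead of
0x0007.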

>>>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
>>>> index ccbae4525891..4a58b9b36eb6 100644
>>>> --- a/arch/arm64/mm/proc.S
>>>> +++ b/arch/arm64/mm/proc.S
>>>> @@ -495,9 +495,15 @@ alternative_else_nop_endif
>>>>  	 * via capabilities.
>>>>  	 */
>>>>  	mrs	x9, ID_AA64MMFR1_EL1
>>>> -	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
>>>> +	ubfx	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_SHIFT, #4
>>>>  	cbz	x9, 1f
>>>>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
>>>> +
>>>> +#ifdef CONFIG_ARM64_HAFT
>>>> +	cmp	x9, ID_AA64MMFR1_EL1_HAFDBS_HAFT
>>>> +	b.lt	1f
>>>> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
>>>> +#endif /* CONFIG_ARM64_HAFT */
>>>
>>> I think we can skip the ID check here and always set the HAFT bit. We do
>>> something similar with MTE (not for TCR_HA though, don't remember why).
>>
>> Thanks for the reference to MTE. Will see and have a test. But a check
>> here may seem more reasonable as we usually detect a feature first
>> then enable it?
> 
> The behaviour of these RES0 bits is that we can write them and if the
> feature is present, it will be enabled, otherwise it won't have any
> effect, so it's not necessary to check the ID bits, the result would be
> the same. We do this in other places as well.
> 
> Of course, we need to check the presence of TCR2_EL1, otherwise it would
> undef. Just a bit less code since we want the feature on anyway.

sure. will drop the check and set the HAFT bit unconditionally.

Thanks.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/5] arm64: Add support for FEAT_HAFT
  2024-10-24 14:45         ` Yicong Yang
@ 2024-10-24 15:23           ` Yicong Yang
  2024-10-28 18:33           ` Catalin Marinas
  1 sibling, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2024-10-24 15:23 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: yangyicong, will, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On 2024/10/24 22:45, Yicong Yang wrote:
> On 2024/10/23 20:36, Catalin Marinas wrote:
>> On Wed, Oct 23, 2024 at 06:30:18PM +0800, Yicong Yang wrote:
>>> On 2024/10/23 2:30, Catalin Marinas wrote:
>>>> On Tue, Oct 22, 2024 at 05:27:32PM +0800, Yicong Yang wrote:
>>>>> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
>>>>> index 8ff5f2a2579e..bc1051d65125 100644
>>>>> --- a/arch/arm64/include/asm/pgalloc.h
>>>>> +++ b/arch/arm64/include/asm/pgalloc.h
>>>>> @@ -30,7 +30,7 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
>>>>>  {
>>>>>  	pudval_t pudval = PUD_TYPE_TABLE;
>>>>>  
>>>>> -	pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
>>>>> +	pudval |= (mm == &init_mm) ? PUD_TABLE_AF | PUD_TABLE_UXN : PUD_TABLE_PXN;
>>>>>  	__pud_populate(pudp, __pa(pmdp), pudval);
>>>>>  }
>>>>
>>>> Why not set the table AF for the task entries? I haven't checked the
>>>> core code but normally when we map a pte it's mapped as young. While for
>>>> table AF we wouldn't get a fault, I would have thought the core code
>>>> follows the same logic.
>>>
>>> I may need to check. If I understand it correctly, in most cases (e.g.
>>> a read fault) we should make the pte young if the hardware AF update is
>>> not supported. Otherwise the hardware will help to update it.
>>
>> On arm64 at least, _PROT_DEFAULT has PTE_AF set. So all accessible
>> entries in protection_map[] will have it set. I'm not sure how the core
>> code clears PTE_AF in the table entries. I'd have thought it goes
>> together with some pte_mkold().
>>
> 
> You're right. I checked that x86 sets the AF bit in the table entries [1][2].
> Will set the table AF for task entries as well.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/pgalloc.h#n84
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/include/asm/pgtable_types.h#n221
> 
>>>>> +#ifdef CONFIG_ARM64_HAFT
>>>>> +	{
>>>>> +		.desc = "Hardware managed Access Flag for Table Descriptor",
>>>>> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
>>>>
>>>> I'd actually use ARM64_CPUCAP_SYSTEM_FEATURE here. We use something
>>>> similar for HW DBM but there we get a fault and set the pte dirty. You
>>>> combined it with a system_support_haft() that checks the sanitised regs
>>>> but I'd rather have a static branch check via cpus_have_cap(). Even with
>>>> your approach we can have a race with a late CPU hot-plugged that
>>>> doesn't have the feature in the middle of some core code walking the
>>>> page tables.
>>>>
>>>> With a system feature type, late CPUs not having the feature won't be
>>>> brought online (if feature enabled) but in general I don't have much
>>>> sympathy for SoC vendors combining CPUs with incompatible features ;).
>>>
>>> Ok. If we make it a system feature, we can use cpus_have_cap() and drop
>>> system_support_haft(), which checks the sanitised registers.
>>> That's fine for me.
>>>
>>> Will asked in [1] not to refuse onlining a CPU due to a mismatch of this
>>> feature, hope we have an agreement :)
>>>
>>> [1] https://lore.kernel.org/linux-arm-kernel/20240820161822.GC28750@willie-the-truck/
>>
>> I initially thought this would work but I don't feel easy about having
>> should_clear_pmd_young() change its polarity at runtime while user space
>> is running. If that's not a problem, we can go with your current
>> approach.
> 
> This should be ok as I imagine it. After onlining a CPU without HAFT the
> system won't advertise HAFT support, but we don't disable the HAFT update on
> the CPUs that do support it, so the ongoing page aging process can still use
> the updated table AF information and later processes will fall back to using
> the PTE's AF bit. Efficiency may be reduced but the behaviour should still be
> correct.
> 

But we may have a chance to hit the VM_WARN_ON() in pmdp_test_and_clear_young()
introduced in Patch 5/5:

CPUx                                         CPUy (without HAFT)
if (should_clear_pmd_young())
                                             online and make system_support_haft() == false;
  pmdp_test_and_clear_young()
    VM_WARN_ON(pmd_table(READ_ONCE(*pmdp)) && !system_support_haft());

We may need to drop the VM_WARN_ON() with the current approach, or make this
feature ARM64_CPUCAP_SYSTEM_FEATURE.
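
For context, the warning from patch 5/5 is essentially the following (a sketch
based on this thread rather than the exact patch text):

static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
					    unsigned long address, pmd_t *pmdp)
{
	/* Clearing AF in a table entry is only meaningful with FEAT_HAFT */
	VM_WARN_ON(pmd_table(READ_ONCE(*pmdp)) && !system_support_haft());
	return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
}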

> The user-visible setting will change after onlining a CPU without HAFT, but
> the user won't know this unless they read /sys/kernel/mm/lru_gen/enabled again
> (the LRU_GEN_NONLEAF_YOUNG bit 0x4 will be cleared after onlining a CPU
> without HAFT, since the system no longer declares non-leaf PMD young support).
> 
>>>>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
>>>>> index ccbae4525891..4a58b9b36eb6 100644
>>>>> --- a/arch/arm64/mm/proc.S
>>>>> +++ b/arch/arm64/mm/proc.S
>>>>> @@ -495,9 +495,15 @@ alternative_else_nop_endif
>>>>>  	 * via capabilities.
>>>>>  	 */
>>>>>  	mrs	x9, ID_AA64MMFR1_EL1
>>>>> -	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
>>>>> +	ubfx	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_SHIFT, #4
>>>>>  	cbz	x9, 1f
>>>>>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
>>>>> +
>>>>> +#ifdef CONFIG_ARM64_HAFT
>>>>> +	cmp	x9, ID_AA64MMFR1_EL1_HAFDBS_HAFT
>>>>> +	b.lt	1f
>>>>> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
>>>>> +#endif /* CONFIG_ARM64_HAFT */
>>>>
>>>> I think we can skip the ID check here and always set the HAFT bit. We do
>>>> something similar with MTE (not for TCR_HA though, don't remember why).
>>>
> >> Thanks for the reference to MTE. Will take a look and have a test. But a
> >> check here seems more reasonable, as we usually detect a feature first and
> >> then enable it?
>>
>> The behaviour of these RES0 bits is that we can write them and if the
>> feature is present, it will be enabled, otherwise it won't have any
>> effect, so it's not necessary to check the ID bits, the result would be
>> the same. We do this in other places as well.
>>
>> Of course, we need to check the presence of TCR2_EL1, otherwise it would
>> undef. Just a bit less code since we want the feature on anyway.
> 
> sure. will drop the check and set the HAFT bit unconditionally.
> 
> Thanks.
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/5] arm64: Add support for FEAT_HAFT
  2024-10-24 14:45         ` Yicong Yang
  2024-10-24 15:23           ` Yicong Yang
@ 2024-10-28 18:33           ` Catalin Marinas
  1 sibling, 0 replies; 16+ messages in thread
From: Catalin Marinas @ 2024-10-28 18:33 UTC (permalink / raw)
  To: Yicong Yang
  Cc: will, yangyicong, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On Thu, Oct 24, 2024 at 10:45:51PM +0800, Yicong Yang wrote:
> On 2024/10/23 20:36, Catalin Marinas wrote:
> > On Wed, Oct 23, 2024 at 06:30:18PM +0800, Yicong Yang wrote:
> >> On 2024/10/23 2:30, Catalin Marinas wrote:
> >>> On Tue, Oct 22, 2024 at 05:27:32PM +0800, Yicong Yang wrote:
> >>>> +#ifdef CONFIG_ARM64_HAFT
> >>>> +	{
> >>>> +		.desc = "Hardware managed Access Flag for Table Descriptor",
> >>>> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> >>>
> >>> I'd actually use ARM64_CPUCAP_SYSTEM_FEATURE here. We use something
> >>> similar for HW DBM but there we get a fault and set the pte dirty. You
> >>> combined it with a system_support_haft() that checks the sanitised regs
> >>> but I'd rather have a static branch check via cpus_have_cap(). Even with
> >>> your approach we can have a race with a late CPU hot-plugged that
> >>> doesn't have the feature in the middle of some core code walking the
> >>> page tables.
> >>>
> >>> With a system feature type, late CPUs not having the feature won't be
> >>> brought online (if feature enabled) but in general I don't have much
> >>> sympathy for SoC vendors combining CPUs with incompatible features ;).
> >>
> >> Ok. If we make it a system feature, we can use cpus_have_cap() and drop
> >> system_support_haft(), which checks the sanitised registers.
> >> That's fine for me.
> >>
> >> Will asked in [1] not to refuse onlining a CPU due to a mismatch of this
> >> feature, hope we have an agreement :)
> >>
> >> [1] https://lore.kernel.org/linux-arm-kernel/20240820161822.GC28750@willie-the-truck/
> > 
> > I initially thought this would work but I don't feel easy about having
> > should_clear_pmd_young() change its polarity at runtime while user space
> > is running. If that's not a problem, we can go with your current
> > approach.
> 
> This should be ok as I imagine it. After onlining a CPU without HAFT the
> system won't advertise HAFT support, but we don't disable the HAFT update on
> the CPUs that do support it, so the ongoing page aging process can still use
> the updated table AF information and later processes will fall back to using
> the PTE's AF bit. Efficiency may be reduced but the behaviour should still be
> correct.

It's more of a theoretical case - walk_pmd_range() for example checks
should_clear_pmd_young() followed by !pmd_young(). Between these two
checks, should_clear_pmd_young() becomes false but the pmd may have been
accessed by a CPU without HAFT. We'd miss this. However, such a race is
benign, I think: it's only used for page aging so it shouldn't matter.

The other thing with your approach is the cost of checking (load, mask,
compare) vs just a static branch. Given that it's only done for pmds,
it's probably lost in the noise but you could check it to be sure.
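
Roughly the two forms being compared (a sketch only, assuming an ARM64_HAFT
system capability as discussed above):

/* (a) sanitised-register check: a load, mask and compare on every call */
static inline bool system_support_haft(void)
{
	u64 mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);

	return cpuid_feature_extract_unsigned_field(mmfr1,
			ID_AA64MMFR1_EL1_HAFDBS_SHIFT) >= ID_AA64MMFR1_EL1_HAFDBS_HAFT;
}

/* (b) system-wide cpucap: compiles down to a patched static branch */
static inline bool system_supports_haft(void)
{
	return IS_ENABLED(CONFIG_ARM64_HAFT) && cpus_have_final_cap(ARM64_HAFT);
}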

-- 
Catalin


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-10-28 18:35 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
2024-10-22  9:27 [PATCH v3 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
2024-10-22  9:27 ` [PATCH v3 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
2024-10-22 17:05   ` Mark Brown
2024-10-23 10:06     ` Yicong Yang
2024-10-22  9:27 ` [PATCH v3 2/5] arm64: setup: name 'tcr2' register Yicong Yang
2024-10-22 16:54   ` Catalin Marinas
2024-10-23 10:08     ` Yicong Yang
2024-10-22  9:27 ` [PATCH v3 3/5] arm64: Add support for FEAT_HAFT Yicong Yang
2024-10-22 18:30   ` Catalin Marinas
2024-10-23 10:30     ` Yicong Yang
2024-10-23 12:36       ` Catalin Marinas
2024-10-24 14:45         ` Yicong Yang
2024-10-24 15:23           ` Yicong Yang
2024-10-28 18:33           ` Catalin Marinas
2024-10-22  9:27 ` [PATCH v3 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG Yicong Yang
2024-10-22  9:27 ` [PATCH v3 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young() Yicong Yang
