linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT
@ 2024-11-02 10:42 Yicong Yang
  2024-11-02 10:42 ` [PATCH v4 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Yicong Yang @ 2024-11-02 10:42 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

This series adds basic support for FEAT_HAFT, introduced in Armv8.9/v9.4,
and enables ARCH_HAS_NONLEAF_PMD_YOUNG. The latter is used by lru-gen
aging. Tested with lru-gen using the steps below:
1. Generate a 1GiB working set with `stress-ng --vm 1`, then stop the task
   so that it no longer accesses the memory (the AF bits won't be updated).
2. Try to age the memory through /sys/kernel/debug/lru_gen.

Run the above steps with and without LRU_GEN_NONLEAF_YOUNG (0x4),
switching through /sys/kernel/mm/lru_gen/enabled. With
LRU_GEN_NONLEAF_YOUNG, aging tests and clears the PMD AF bit during the
page table walk; without it, aging tests and clears the AF bit of each
PTE. In this test LRU_GEN_NONLEAF_YOUNG improves the efficiency of page
scanning: the pages are not accessed, so the PMD AF bits stay clear and
there is no need to scan each PTE. We observed ~40% time saved for 1GiB
of memory on our emulated platform with LRU_GEN_NONLEAF_YOUNG.
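
The saving described above can be illustrated with a toy model. This is
only a sketch and not kernel code: `struct toy_pmd`, `toy_age()` and
`TOY_PTES_PER_PMD` are hypothetical names, and the model assumes one PMD
table entry covering 512 PTEs (4KiB pages).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not kernel code: one PMD entry maps 512 PTEs. */
#define TOY_PTES_PER_PMD 512

struct toy_pmd {
	bool young;                        /* stands in for the table descriptor AF */
	bool pte_young[TOY_PTES_PER_PMD];  /* stands in for each leaf PTE AF */
};

/*
 * Age nr_pmds entries and return how many PTEs had to be visited.
 * With nonleaf_young set, a PMD whose AF is clear is skipped entirely.
 */
static inline size_t toy_age(struct toy_pmd *pmds, size_t nr_pmds,
			     bool nonleaf_young)
{
	size_t visited = 0;

	assert(pmds != NULL || nr_pmds == 0);
	for (size_t i = 0; i < nr_pmds; i++) {
		if (nonleaf_young) {
			bool was_young = pmds[i].young;

			pmds[i].young = false;	/* test and clear the PMD AF */
			if (!was_young)
				continue;	/* nothing below was accessed */
		}
		for (size_t j = 0; j < TOY_PTES_PER_PMD; j++) {
			pmds[i].pte_young[j] = false;	/* clear the PTE AF */
			visited++;
		}
	}
	return visited;
}
```

In this model, with four PMD entries of which only one is young, the
non-leaf walk visits 512 PTEs while the leaf-only walk visits all 2048.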

For lru-gen aging:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/mm/multigen_lru.rst?h=v6.11-rc1#n94

Changes since v3:
Address Catalin's comments. Add tags for patches 1/5 and 2/5.
- Make HAFT an ARM64_CPUCAP_SYSTEM_FEATURE, so that:
  o checking the feature is more efficient
  o there is no race between onlining a non-HAFT CPU and using the HAFT features
- Set table AF for task entries as well
- Set TCR2.HAFT unconditionally
Link: https://lore.kernel.org/linux-arm-kernel/20241022092734.59984-1-yangyicong@huawei.com/

Changes since v2:
- Address comments from Will and Catalin:
  o detect and enable the feature in __cpu_setup()
  o allow onlining a CPU that lacks this feature and so mismatches the boot CPU
  o only advertise the feature if it's enabled system-wide
  o set the AF bit for kernel page table entries to save a later hardware update
  o warn on unexpected pmdp_test_and_clear_young()
- Update all the new AA64MMFR1_EL1 fields per Mark
Link: https://lore.kernel.org/linux-arm-kernel/20240814092333.7727-1-yangyicong@huawei.com/

Changes since v1:
- Address comments from Marc, improve comments/Kconfig, clean up code. Thanks
  for the comments.
Link: https://lore.kernel.org/linux-arm-kernel/20240802093458.32683-1-yangyicong@huawei.com/

Yicong Yang (5):
  arm64/sysreg: Update ID_AA64MMFR1_EL1 register
  arm64: setup: name 'tcr2' register
  arm64: Add support for FEAT_HAFT
  arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG
  arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()

 arch/arm64/Kconfig                     | 16 ++++++++++++++++
 arch/arm64/include/asm/cpufeature.h    |  6 ++++++
 arch/arm64/include/asm/pgalloc.h       | 12 +++++++-----
 arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
 arch/arm64/include/asm/pgtable.h       | 10 ++++++++--
 arch/arm64/kernel/cpufeature.c         | 15 +++++++++++++++
 arch/arm64/mm/fixmap.c                 |  9 ++++++---
 arch/arm64/mm/mmu.c                    |  8 ++++----
 arch/arm64/mm/proc.S                   | 16 ++++++++++++++--
 arch/arm64/tools/cpucaps               |  1 +
 arch/arm64/tools/sysreg                |  4 ++++
 11 files changed, 85 insertions(+), 16 deletions(-)

-- 
2.24.0



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v4 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register
  2024-11-02 10:42 [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
@ 2024-11-02 10:42 ` Yicong Yang
  2024-11-02 10:42 ` [PATCH v4 2/5] arm64: setup: name 'tcr2' register Yicong Yang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2024-11-02 10:42 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

Update the ID_AA64MMFR1_EL1 register field definitions per DDI0601
(ID092424) 2024-09. ID_AA64MMFR1_EL1.ETS adds definitions for FEAT_ETS2
and FEAT_ETS3. ID_AA64MMFR1_EL1.HAFDBS adds definitions for FEAT_HAFT and
FEAT_HDBSS.

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/tools/sysreg | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 8d637ac4b7c6..ae64ba810298 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1648,6 +1648,8 @@ EndEnum
 UnsignedEnum	39:36	ETS
 	0b0000	NI
 	0b0001	IMP
+	0b0010	ETS2
+	0b0011	ETS3
 EndEnum
 UnsignedEnum	35:32	TWED
 	0b0000	NI
@@ -1688,6 +1690,8 @@ UnsignedEnum	3:0	HAFDBS
 	0b0000	NI
 	0b0001	AF
 	0b0010	DBM
+	0b0011	HAFT
+	0b0100	HDBSS
 EndEnum
 EndSysreg
 
-- 
2.24.0
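
For readers unfamiliar with the sysreg table, the sketch below shows how
the field values added in this patch could be decoded from a raw
ID_AA64MMFR1_EL1 value. The bit ranges (ETS 39:36, HAFDBS 3:0) and the
enum values are taken from the hunks above; the `toy_*` names are
hypothetical and this is not the kernel's generated accessor code. The
">=" comparison assumes the usual reading of an UnsignedEnum field, where
a larger value implies the lower ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field values as listed in the sysreg hunks above. */
enum toy_ets    { TOY_ETS_NI = 0, TOY_ETS_IMP = 1, TOY_ETS_ETS2 = 2,
		  TOY_ETS_ETS3 = 3 };
enum toy_hafdbs { TOY_HAFDBS_NI = 0, TOY_HAFDBS_AF = 1, TOY_HAFDBS_DBM = 2,
		  TOY_HAFDBS_HAFT = 3, TOY_HAFDBS_HDBSS = 4 };

/* Extract bits [hi:lo] of a 64-bit ID register value (fields < 32 bits). */
static inline unsigned int toy_id_field(uint64_t reg, unsigned int hi,
					unsigned int lo)
{
	assert(lo <= hi && hi < 64 && hi - lo < 32);
	unsigned int width = hi - lo + 1;
	uint64_t mask = (UINT64_C(1) << width) - 1;

	return (unsigned int)((reg >> lo) & mask);
}

/* HAFDBS is an unsigned field: HAFT or anything above it implies FEAT_HAFT. */
static inline bool toy_cpu_has_haft(uint64_t mmfr1)
{
	return toy_id_field(mmfr1, 3, 0) >= TOY_HAFDBS_HAFT;
}
```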



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 2/5] arm64: setup: name 'tcr2' register
  2024-11-02 10:42 [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
  2024-11-02 10:42 ` [PATCH v4 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
@ 2024-11-02 10:42 ` Yicong Yang
  2024-11-02 10:42 ` [PATCH v4 3/5] arm64: Add support for FEAT_HAFT Yicong Yang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2024-11-02 10:42 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

TCR2_EL1 introduces additional controls beyond those in TCR_EL1. Currently
only PIE is supported, and it is enabled by writing TCR2_EL1 directly if
PIE is detected.

Introduce a named register 'tcr2', just like the existing 'tcr'. It is
initialized to 0 and updated when a feature is detected and needs to be
enabled. Write TCR2_EL1 once at the end with the accumulated 'tcr2'
value, if FEAT_TCR2 is supported according to ID_AA64MMFR3_EL1.TCRX.
This makes it easy to add support for other features controlled by
TCR2_EL1.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/mm/proc.S | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 8abdc7fed321..ccbae4525891 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -465,10 +465,12 @@ SYM_FUNC_START(__cpu_setup)
 	 */
 	mair	.req	x17
 	tcr	.req	x16
+	tcr2	.req	x15
 	mov_q	mair, MAIR_EL1_SET
 	mov_q	tcr, TCR_T0SZ(IDMAP_VA_BITS) | TCR_T1SZ(VA_BITS_MIN) | TCR_CACHE_FLAGS | \
 		     TCR_SHARED | TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
 		     TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS
+	mov	tcr2, xzr
 
 	tcr_clear_errata_bits tcr, x9, x5
 
@@ -525,11 +527,16 @@ alternative_else_nop_endif
 #undef PTE_MAYBE_NG
 #undef PTE_MAYBE_SHARED
 
-	mov	x0, TCR2_EL1x_PIE
-	msr	REG_TCR2_EL1, x0
+	orr	tcr2, tcr2, TCR2_EL1x_PIE
 
 .Lskip_indirection:
 
+	mrs_s	x1, SYS_ID_AA64MMFR3_EL1
+	ubfx	x1, x1, #ID_AA64MMFR3_EL1_TCRX_SHIFT, #4
+	cbz	x1, 1f
+	msr	REG_TCR2_EL1, tcr2
+1:
+
 	/*
 	 * Prepare SCTLR
 	 */
@@ -538,4 +545,5 @@ alternative_else_nop_endif
 
 	.unreq	mair
 	.unreq	tcr
+	.unreq	tcr2
 SYM_FUNC_END(__cpu_setup)
-- 
2.24.0
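
The control flow introduced here (accumulate bits in 'tcr2', then write
the register once if FEAT_TCR2 is present) can be sketched in C. This is
a simplified model of the assembly above, not a replacement for it:
`struct toy_cpu` and the `TOY_*` constants are hypothetical, and the bit
positions are placeholders rather than values taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder encodings for illustration only; see the Arm ARM for real ones. */
#define TOY_TCR2_PIE		(UINT64_C(1) << 1)
#define TOY_MMFR3_TCRX_SHIFT	0

struct toy_cpu {
	uint64_t mmfr3;		/* models ID_AA64MMFR3_EL1 */
	bool has_pie;		/* models the PIE detection done earlier */
	uint64_t tcr2_el1;	/* models the TCR2_EL1 register */
	bool tcr2_written;	/* records whether the register was touched */
};

static inline void toy_cpu_setup(struct toy_cpu *cpu)
{
	uint64_t tcr2 = 0;	/* mov tcr2, xzr */

	assert(cpu != NULL);
	if (cpu->has_pie)
		tcr2 |= TOY_TCR2_PIE;	/* orr tcr2, tcr2, TCR2_EL1x_PIE */

	/* Only touch TCR2_EL1 if the 4-bit TCRX field is non-zero. */
	if ((cpu->mmfr3 >> TOY_MMFR3_TCRX_SHIFT) & 0xf) {
		cpu->tcr2_el1 = tcr2;	/* msr REG_TCR2_EL1, tcr2 */
		cpu->tcr2_written = true;
	}
}
```

A later feature only needs one more "tcr2 |= ..." line before the final
write, which is the extension point this patch sets up.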



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 3/5] arm64: Add support for FEAT_HAFT
  2024-11-02 10:42 [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
  2024-11-02 10:42 ` [PATCH v4 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
  2024-11-02 10:42 ` [PATCH v4 2/5] arm64: setup: name 'tcr2' register Yicong Yang
@ 2024-11-02 10:42 ` Yicong Yang
  2024-11-04 17:28   ` Catalin Marinas
  2024-11-02 10:42 ` [PATCH v4 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG Yicong Yang
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-11-02 10:42 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

Armv8.9/v9.4 introduces the feature Hardware managed Access Flag
for Table descriptors (FEAT_HAFT). The feature is indicated by
ID_AA64MMFR1_EL1.HAFDBS == 0b0011 and is enabled by
TCR2_EL1.HAFT, so it depends on FEAT_TCR2.

Add the Kconfig option for FEAT_HAFT and support for detecting and
enabling the feature. The feature is enabled in __cpu_setup() before
the MMU is turned on, just like HA. A CPU capability is added to
notify the user of the feature.

Add definitions of the P{G,4,U,M}D_TABLE_AF bits and set the AF bit
when creating page tables, which saves the hardware from having to
update it at runtime. The bit is ignored if FEAT_HAFT is not
enabled.

Per the spec, the AF bit of table descriptors cannot be managed by
software, unlike HA. So it should be used only if the feature is
supported system-wide, as reported by system_supports_haft().

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/Kconfig                     | 15 +++++++++++++++
 arch/arm64/include/asm/cpufeature.h    |  6 ++++++
 arch/arm64/include/asm/pgalloc.h       | 12 +++++++-----
 arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
 arch/arm64/kernel/cpufeature.c         | 15 +++++++++++++++
 arch/arm64/mm/fixmap.c                 |  9 ++++++---
 arch/arm64/mm/mmu.c                    |  8 ++++----
 arch/arm64/mm/proc.S                   |  4 ++++
 arch/arm64/tools/cpucaps               |  1 +
 9 files changed, 62 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fd9df6dcc593..7c023f3f55d6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2177,6 +2177,21 @@ config ARCH_PKEY_BITS
 	int
 	default 3
 
+config ARM64_HAFT
+	bool "Support for Hardware managed Access Flag for Table Descriptors"
+	depends on ARM64_HW_AFDBM
+	default y
+	help
+	  ARMv8.9/ARMv9.4 introduces the feature Hardware managed Access
+	  Flag for Table descriptors. When enabled, an architecturally executed
+	  memory access will update the Access Flag in each Table descriptor
+	  which is accessed during the translation table walk and for which
+	  the Access Flag is 0. The Access Flag of the Table descriptor uses
+	  the same bit as PTE_AF.
+
+	  The feature will only be enabled if all the CPUs in the system
+	  support this feature. If unsure, say Y.
+
 endmenu # "ARMv8.9 architectural features"
 
 config ARM64_SVE
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 3d261cc123c1..ed8c784ca082 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -838,6 +838,12 @@ static inline bool system_supports_poe(void)
 		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
 }
 
+static inline bool system_supports_haft(void)
+{
+	return IS_ENABLED(CONFIG_ARM64_HAFT) &&
+		cpus_have_final_cap(ARM64_HAFT);
+}
+
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
 
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 8ff5f2a2579e..e75422864d1b 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -28,7 +28,7 @@ static inline void __pud_populate(pud_t *pudp, phys_addr_t pmdp, pudval_t prot)
 
 static inline void pud_populate(struct mm_struct *mm, pud_t *pudp, pmd_t *pmdp)
 {
-	pudval_t pudval = PUD_TYPE_TABLE;
+	pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_AF;
 
 	pudval |= (mm == &init_mm) ? PUD_TABLE_UXN : PUD_TABLE_PXN;
 	__pud_populate(pudp, __pa(pmdp), pudval);
@@ -50,7 +50,7 @@ static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
 
 static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
 {
-	p4dval_t p4dval = P4D_TYPE_TABLE;
+	p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_AF;
 
 	p4dval |= (mm == &init_mm) ? P4D_TABLE_UXN : P4D_TABLE_PXN;
 	__p4d_populate(p4dp, __pa(pudp), p4dval);
@@ -79,7 +79,7 @@ static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot)
 
 static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
 {
-	pgdval_t pgdval = PGD_TYPE_TABLE;
+	pgdval_t pgdval = PGD_TYPE_TABLE | PGD_TABLE_AF;
 
 	pgdval |= (mm == &init_mm) ? PGD_TABLE_UXN : PGD_TABLE_PXN;
 	__pgd_populate(pgdp, __pa(p4dp), pgdval);
@@ -127,14 +127,16 @@ static inline void
 pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
 {
 	VM_BUG_ON(mm && mm != &init_mm);
-	__pmd_populate(pmdp, __pa(ptep), PMD_TYPE_TABLE | PMD_TABLE_UXN);
+	__pmd_populate(pmdp, __pa(ptep),
+		       PMD_TYPE_TABLE | PMD_TABLE_AF | PMD_TABLE_UXN);
 }
 
 static inline void
 pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
 {
 	VM_BUG_ON(mm == &init_mm);
-	__pmd_populate(pmdp, page_to_phys(ptep), PMD_TYPE_TABLE | PMD_TABLE_PXN);
+	__pmd_populate(pmdp, page_to_phys(ptep),
+		       PMD_TYPE_TABLE | PMD_TABLE_AF | PMD_TABLE_PXN);
 }
 
 #endif
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index fd330c1db289..c78a988cca93 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -99,6 +99,7 @@
 #define PGD_TYPE_TABLE		(_AT(pgdval_t, 3) << 0)
 #define PGD_TABLE_BIT		(_AT(pgdval_t, 1) << 1)
 #define PGD_TYPE_MASK		(_AT(pgdval_t, 3) << 0)
+#define PGD_TABLE_AF		(_AT(pgdval_t, 1) << 10)	/* Ignored if no FEAT_HAFT */
 #define PGD_TABLE_PXN		(_AT(pgdval_t, 1) << 59)
 #define PGD_TABLE_UXN		(_AT(pgdval_t, 1) << 60)
 
@@ -110,6 +111,7 @@
 #define P4D_TYPE_MASK		(_AT(p4dval_t, 3) << 0)
 #define P4D_TYPE_SECT		(_AT(p4dval_t, 1) << 0)
 #define P4D_SECT_RDONLY		(_AT(p4dval_t, 1) << 7)		/* AP[2] */
+#define P4D_TABLE_AF		(_AT(p4dval_t, 1) << 10)	/* Ignored if no FEAT_HAFT */
 #define P4D_TABLE_PXN		(_AT(p4dval_t, 1) << 59)
 #define P4D_TABLE_UXN		(_AT(p4dval_t, 1) << 60)
 
@@ -121,6 +123,7 @@
 #define PUD_TYPE_MASK		(_AT(pudval_t, 3) << 0)
 #define PUD_TYPE_SECT		(_AT(pudval_t, 1) << 0)
 #define PUD_SECT_RDONLY		(_AT(pudval_t, 1) << 7)		/* AP[2] */
+#define PUD_TABLE_AF		(_AT(pudval_t, 1) << 10)	/* Ignored if no FEAT_HAFT */
 #define PUD_TABLE_PXN		(_AT(pudval_t, 1) << 59)
 #define PUD_TABLE_UXN		(_AT(pudval_t, 1) << 60)
 
@@ -131,6 +134,7 @@
 #define PMD_TYPE_TABLE		(_AT(pmdval_t, 3) << 0)
 #define PMD_TYPE_SECT		(_AT(pmdval_t, 1) << 0)
 #define PMD_TABLE_BIT		(_AT(pmdval_t, 1) << 1)
+#define PMD_TABLE_AF		(_AT(pmdval_t, 1) << 10)	/* Ignored if no FEAT_HAFT */
 
 /*
  * Section
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 718728a85430..878712fa0d3b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2590,6 +2590,21 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.cpus = &dbm_cpus,
 		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, DBM)
 	},
+#endif
+#ifdef CONFIG_ARM64_HAFT
+	{
+		.desc = "Hardware managed Access Flag for Table Descriptors",
+		/*
+		 * Contrary to the page/block access flag, the table access flag
+		 * cannot be emulated in software (no access fault will occur).
+		 * Therefore this should be used only if it's supported system
+		 * wide.
+		 */
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.capability = ARM64_HAFT,
+		.matches = has_cpuid_feature,
+		ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HAFDBS, HAFT)
+	},
 #endif
 	{
 		.desc = "CRC32 instructions",
diff --git a/arch/arm64/mm/fixmap.c b/arch/arm64/mm/fixmap.c
index de1e09d986ad..c5c5425791da 100644
--- a/arch/arm64/mm/fixmap.c
+++ b/arch/arm64/mm/fixmap.c
@@ -47,7 +47,8 @@ static void __init early_fixmap_init_pte(pmd_t *pmdp, unsigned long addr)
 
 	if (pmd_none(pmd)) {
 		ptep = bm_pte[BM_PTE_TABLE_IDX(addr)];
-		__pmd_populate(pmdp, __pa_symbol(ptep), PMD_TYPE_TABLE);
+		__pmd_populate(pmdp, __pa_symbol(ptep),
+			       PMD_TYPE_TABLE | PMD_TABLE_AF);
 	}
 }
 
@@ -59,7 +60,8 @@ static void __init early_fixmap_init_pmd(pud_t *pudp, unsigned long addr,
 	pmd_t *pmdp;
 
 	if (pud_none(pud))
-		__pud_populate(pudp, __pa_symbol(bm_pmd), PUD_TYPE_TABLE);
+		__pud_populate(pudp, __pa_symbol(bm_pmd),
+			       PUD_TYPE_TABLE | PUD_TABLE_AF);
 
 	pmdp = pmd_offset_kimg(pudp, addr);
 	do {
@@ -86,7 +88,8 @@ static void __init early_fixmap_init_pud(p4d_t *p4dp, unsigned long addr,
 	}
 
 	if (p4d_none(p4d))
-		__p4d_populate(p4dp, __pa_symbol(bm_pud), P4D_TYPE_TABLE);
+		__p4d_populate(p4dp, __pa_symbol(bm_pud),
+			       P4D_TYPE_TABLE | P4D_TABLE_AF);
 
 	pudp = pud_offset_kimg(p4dp, addr);
 	early_fixmap_init_pmd(pudp, addr, end);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e55b02fbddc8..6441a45eaeda 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -201,7 +201,7 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
 
 	BUG_ON(pmd_sect(pmd));
 	if (pmd_none(pmd)) {
-		pmdval_t pmdval = PMD_TYPE_TABLE | PMD_TABLE_UXN;
+		pmdval_t pmdval = PMD_TYPE_TABLE | PMD_TABLE_UXN | PMD_TABLE_AF;
 		phys_addr_t pte_phys;
 
 		if (flags & NO_EXEC_MAPPINGS)
@@ -288,7 +288,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
 	 */
 	BUG_ON(pud_sect(pud));
 	if (pud_none(pud)) {
-		pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_UXN;
+		pudval_t pudval = PUD_TYPE_TABLE | PUD_TABLE_UXN | PUD_TABLE_AF;
 		phys_addr_t pmd_phys;
 
 		if (flags & NO_EXEC_MAPPINGS)
@@ -333,7 +333,7 @@ static void alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end,
 	pud_t *pudp;
 
 	if (p4d_none(p4d)) {
-		p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_UXN;
+		p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_UXN | P4D_TABLE_AF;
 		phys_addr_t pud_phys;
 
 		if (flags & NO_EXEC_MAPPINGS)
@@ -391,7 +391,7 @@ static void alloc_init_p4d(pgd_t *pgdp, unsigned long addr, unsigned long end,
 	p4d_t *p4dp;
 
 	if (pgd_none(pgd)) {
-		pgdval_t pgdval = PGD_TYPE_TABLE | PGD_TABLE_UXN;
+		pgdval_t pgdval = PGD_TYPE_TABLE | PGD_TABLE_UXN | PGD_TABLE_AF;
 		phys_addr_t p4d_phys;
 
 		if (flags & NO_EXEC_MAPPINGS)
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index ccbae4525891..0bc88df7cb35 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -498,6 +498,10 @@ alternative_else_nop_endif
 	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
 	cbz	x9, 1f
 	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
+
+#ifdef CONFIG_ARM64_HAFT
+	orr	tcr2, tcr2, TCR2_EL1x_HAFT
+#endif /* CONFIG_ARM64_HAFT */
 1:
 #endif	/* CONFIG_ARM64_HW_AFDBM */
 	msr	mair_el1, mair
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index eedb5acc21ed..b35004fa8313 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -56,6 +56,7 @@ HAS_TLB_RANGE
 HAS_VA52
 HAS_VIRT_HOST_EXTN
 HAS_WFXT
+HAFT
 HW_DBM
 KVM_HVHE
 KVM_PROTECTED_MODE
-- 
2.24.0
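
As an illustration of what the *_populate() changes above produce, the
sketch below composes a table descriptor value with the AF bit pre-set.
The bit positions (type 3 << 0, AF at bit 10, PXN at bit 59, UXN at bit
60) are the ones shown in the pgtable-hwdef.h hunks; the `toy_*` names
are hypothetical, and the 4KiB alignment check is an assumption made for
the example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in the pgtable-hwdef.h hunks above. */
#define TOY_TYPE_TABLE	(UINT64_C(3) << 0)
#define TOY_TABLE_AF	(UINT64_C(1) << 10)	/* ignored without FEAT_HAFT */
#define TOY_TABLE_PXN	(UINT64_C(1) << 59)
#define TOY_TABLE_UXN	(UINT64_C(1) << 60)

/*
 * Build a table descriptor pointing at next_table_pa. Kernel tables get
 * UXN and user tables get PXN, mirroring the (mm == &init_mm) checks.
 */
static inline uint64_t toy_table_desc(uint64_t next_table_pa, bool kernel_mm)
{
	uint64_t desc = TOY_TYPE_TABLE | TOY_TABLE_AF;

	assert((next_table_pa & 0xfff) == 0);	/* assumed 4KiB-aligned table */
	desc |= kernel_mm ? TOY_TABLE_UXN : TOY_TABLE_PXN;
	return next_table_pa | desc;
}
```

Because the AF bit is set at creation time, the hardware has no 0 -> 1
update to perform on the first walk through such an entry.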



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG
  2024-11-02 10:42 [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
                   ` (2 preceding siblings ...)
  2024-11-02 10:42 ` [PATCH v4 3/5] arm64: Add support for FEAT_HAFT Yicong Yang
@ 2024-11-02 10:42 ` Yicong Yang
  2024-11-04 17:29   ` Catalin Marinas
  2024-11-02 10:42 ` [PATCH v4 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young() Yicong Yang
  2024-11-05 13:51 ` [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Catalin Marinas
  5 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-11-02 10:42 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

With FEAT_HAFT supported, NONLEAF_PMD_YOUNG can be enabled on arm64,
since the hardware is capable of updating the AF bit of PMD table
descriptors. Since the AF bit of a table descriptor is at the same
bit position as in block descriptors, we only need to implement
arch_has_hw_nonleaf_pmd_young() and select the related configs. The
related pmd_young test/update operations stay the same and are
already implemented for transparent hugepage support.

Currently ARCH_HAS_NONLEAF_PMD_YOUNG is used to improve the
efficiency of lru-gen aging.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/Kconfig               | 1 +
 arch/arm64/include/asm/pgtable.h | 8 ++++++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7c023f3f55d6..0b6c66ec35e5 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -38,6 +38,7 @@ config ARM64
 	select ARCH_HAS_MEM_ENCRYPT
 	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+	select ARCH_HAS_NONLEAF_PMD_YOUNG if ARM64_HAFT
 	select ARCH_HAS_PTE_DEVMAP
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_HW_PTE_YOUNG
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index c329ea061dc9..5182feabd943 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1259,7 +1259,7 @@ static inline int __ptep_clear_flush_young(struct vm_area_struct *vma,
 	return young;
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG)
 #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
 static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 					    unsigned long address,
@@ -1267,7 +1267,7 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 {
 	return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
 }
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
 
 static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address, pte_t *ptep)
@@ -1502,6 +1502,10 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
  */
 #define arch_has_hw_pte_young		cpu_has_hw_af
 
+#ifdef CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG
+#define arch_has_hw_nonleaf_pmd_young	system_supports_haft
+#endif
+
 /*
  * Experimentally, it's cheap to set the access flag in hardware and we
  * benefit from prefaulting mappings as 'old' to start with.
-- 
2.24.0
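
The reason one helper can serve both block and table entries is that the
AF bit sits at bit 10 in either descriptor type. A minimal user-space
sketch of an atomic test-and-clear on such a descriptor follows. It
assumes C11 atomics and uses the hypothetical names `TOY_DESC_AF` and
`toy_test_and_clear_young()`; the kernel's
__ptep_test_and_clear_young() is implemented differently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* AF is bit 10 for both block and table descriptors. */
#define TOY_DESC_AF	(UINT64_C(1) << 10)

/*
 * Atomically clear the AF bit and report whether it was set. The same
 * code works for a block descriptor and for a table descriptor because
 * only bit 10 is inspected.
 */
static inline bool toy_test_and_clear_young(_Atomic uint64_t *desc)
{
	assert(desc != NULL);
	uint64_t old = atomic_fetch_and(desc, ~TOY_DESC_AF);

	return (old & TOY_DESC_AF) != 0;
}
```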



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v4 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()
  2024-11-02 10:42 [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
                   ` (3 preceding siblings ...)
  2024-11-02 10:42 ` [PATCH v4 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG Yicong Yang
@ 2024-11-02 10:42 ` Yicong Yang
  2024-11-04 17:29   ` Catalin Marinas
  2024-11-05 13:51 ` [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Catalin Marinas
  5 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-11-02 10:42 UTC (permalink / raw)
  To: catalin.marinas, will, maz, mark.rutland, broonie,
	linux-arm-kernel
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

From: Yicong Yang <yangyicong@hisilicon.com>

A young bit operation on a PMD table entry is only supported if
FEAT_HAFT is enabled system-wide. Add a warning to flag the
misbehaviour.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/include/asm/pgtable.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 5182feabd943..160ca503c99f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1265,6 +1265,8 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 					    unsigned long address,
 					    pmd_t *pmdp)
 {
+	/* Operation applies to PMD table entry only if FEAT_HAFT is enabled */
+	VM_WARN_ON(pmd_table(READ_ONCE(*pmdp)) && !system_supports_haft());
 	return __ptep_test_and_clear_young(vma, address, (pte_t *)pmdp);
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
-- 
2.24.0
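
The condition guarded by the new VM_WARN_ON() can be written out as a
small predicate. This is a sketch with hypothetical `toy_*` names; the
type encodings (table = 3, section = 1, in the low two bits) follow the
PMD_TYPE_* definitions quoted in patch 3/5, and the two-bit type mask is
assumed to match the PGD/P4D/PUD masks shown there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PMD_TYPE_MASK	(UINT64_C(3) << 0)
#define TOY_PMD_TYPE_TABLE	(UINT64_C(3) << 0)
#define TOY_PMD_TYPE_SECT	(UINT64_C(1) << 0)

static_assert(TOY_PMD_TYPE_TABLE != TOY_PMD_TYPE_SECT,
	      "table and section encodings must differ");

/* True if the entry is a table descriptor rather than a block/section. */
static inline bool toy_pmd_table(uint64_t pmd)
{
	return (pmd & TOY_PMD_TYPE_MASK) == TOY_PMD_TYPE_TABLE;
}

/*
 * Mirrors the warning condition: clearing "young" on a table entry only
 * makes sense when the hardware maintains the table AF system-wide.
 */
static inline bool toy_should_warn(uint64_t pmd, bool system_supports_haft)
{
	return toy_pmd_table(pmd) && !system_supports_haft;
}
```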



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 3/5] arm64: Add support for FEAT_HAFT
  2024-11-02 10:42 ` [PATCH v4 3/5] arm64: Add support for FEAT_HAFT Yicong Yang
@ 2024-11-04 17:28   ` Catalin Marinas
  2024-11-05  2:47     ` Yicong Yang
  2024-11-05  8:35     ` Marc Zyngier
  0 siblings, 2 replies; 16+ messages in thread
From: Catalin Marinas @ 2024-11-04 17:28 UTC (permalink / raw)
  To: Yicong Yang
  Cc: will, maz, mark.rutland, broonie, linux-arm-kernel, oliver.upton,
	ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Sat, Nov 02, 2024 at 06:42:33PM +0800, Yicong Yang wrote:
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 3d261cc123c1..ed8c784ca082 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -838,6 +838,12 @@ static inline bool system_supports_poe(void)
>  		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
>  }
>  
> +static inline bool system_supports_haft(void)
> +{
> +	return IS_ENABLED(CONFIG_ARM64_HAFT) &&
> +		cpus_have_final_cap(ARM64_HAFT);
> +}

I'm fine with this approach. If we ever get hardware with mismatched
FEAT_HAFT and some secondary CPUs don't come up, we can revisit.

> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index ccbae4525891..0bc88df7cb35 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -498,6 +498,10 @@ alternative_else_nop_endif
>  	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
>  	cbz	x9, 1f
>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
> +
> +#ifdef CONFIG_ARM64_HAFT
> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
> +#endif /* CONFIG_ARM64_HAFT */
>  1:
>  #endif	/* CONFIG_ARM64_HW_AFDBM */
>  	msr	mair_el1, mair

If you still want #ifdefs, I'd have left it outside the HW_AFDBM. We
already have a dependency in the Kconfig. Anyway, I can fix this up.

I think as an additional patch we can also remove the ID checks for the
tcr bit in the HW_AFDBM case. But that's unrelated to this series.

-- 
Catalin
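
The system-wide behaviour discussed above can be modelled roughly as
follows. This is a sketch of the policy only, under the assumption that
the capability is set when every boot-time CPU reports HAFDBS >= 0b0011
and that a late CPU lacking an already-enabled system feature is refused.
All `toy_*` names are hypothetical, and the kernel's cpufeature code
works on sanitised register values rather than a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* HAFDBS is bits 3:0 of ID_AA64MMFR1_EL1; 0b0011 and above imply HAFT. */
static inline bool toy_cpu_has_haft(uint64_t mmfr1)
{
	return (mmfr1 & 0xf) >= 0x3;
}

/* The system capability is set only if every boot-time CPU has the feature. */
static inline bool toy_system_supports_haft(const uint64_t *boot_mmfr1,
					     size_t nr_cpus)
{
	assert(boot_mmfr1 != NULL && nr_cpus > 0);
	for (size_t i = 0; i < nr_cpus; i++) {
		if (!toy_cpu_has_haft(boot_mmfr1[i]))
			return false;
	}
	return true;
}

/*
 * A late CPU may come up if the system does not use HAFT, or if the CPU
 * itself supports it. A non-HAFT CPU joining a HAFT system is refused.
 */
static inline bool toy_late_cpu_may_come_up(bool system_has_haft,
					    uint64_t late_mmfr1)
{
	return !system_has_haft || toy_cpu_has_haft(late_mmfr1);
}
```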


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()
  2024-11-02 10:42 ` [PATCH v4 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young() Yicong Yang
@ 2024-11-04 17:29   ` Catalin Marinas
  0 siblings, 0 replies; 16+ messages in thread
From: Catalin Marinas @ 2024-11-04 17:29 UTC (permalink / raw)
  To: Yicong Yang
  Cc: will, maz, mark.rutland, broonie, linux-arm-kernel, oliver.upton,
	ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Sat, Nov 02, 2024 at 06:42:35PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> A young bit operation on a PMD table entry is only supported if
> FEAT_HAFT is enabled system-wide. Add a warning to flag the
> misbehaviour.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG
  2024-11-02 10:42 ` [PATCH v4 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG Yicong Yang
@ 2024-11-04 17:29   ` Catalin Marinas
  0 siblings, 0 replies; 16+ messages in thread
From: Catalin Marinas @ 2024-11-04 17:29 UTC (permalink / raw)
  To: Yicong Yang
  Cc: will, maz, mark.rutland, broonie, linux-arm-kernel, oliver.upton,
	ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Sat, Nov 02, 2024 at 06:42:34PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> With FEAT_HAFT supported, NONLEAF_PMD_YOUNG can be enabled on arm64,
> since the hardware is capable of updating the AF bit of PMD table
> descriptors. Since the AF bit of a table descriptor is at the same
> bit position as in block descriptors, we only need to implement
> arch_has_hw_nonleaf_pmd_young() and select the related configs. The
> related pmd_young test/update operations stay the same and are
> already implemented for transparent hugepage support.
> 
> Currently ARCH_HAS_NONLEAF_PMD_YOUNG is used to improve the
> efficiency of lru-gen aging.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 3/5] arm64: Add support for FEAT_HAFT
  2024-11-04 17:28   ` Catalin Marinas
@ 2024-11-05  2:47     ` Yicong Yang
  2024-11-05 10:38       ` Yicong Yang
  2024-11-05  8:35     ` Marc Zyngier
  1 sibling, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-11-05  2:47 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: yangyicong, will, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On 2024/11/5 1:28, Catalin Marinas wrote:
> On Sat, Nov 02, 2024 at 06:42:33PM +0800, Yicong Yang wrote:
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index 3d261cc123c1..ed8c784ca082 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -838,6 +838,12 @@ static inline bool system_supports_poe(void)
>>  		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
>>  }
>>  
>> +static inline bool system_supports_haft(void)
>> +{
>> +	return IS_ENABLED(CONFIG_ARM64_HAFT) &&
>> +		cpus_have_final_cap(ARM64_HAFT);
>> +}
> 
> I'm fine with this approach. If we ever get hardware with mismatched
> FEAT_HAFT and some secondary CPUs don't come up, we can revisit.
> 
>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
>> index ccbae4525891..0bc88df7cb35 100644
>> --- a/arch/arm64/mm/proc.S
>> +++ b/arch/arm64/mm/proc.S
>> @@ -498,6 +498,10 @@ alternative_else_nop_endif
>>  	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
>>  	cbz	x9, 1f
>>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
>> +
>> +#ifdef CONFIG_ARM64_HAFT
>> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
>> +#endif /* CONFIG_ARM64_HAFT */
>>  1:
>>  #endif	/* CONFIG_ARM64_HW_AFDBM */
>>  	msr	mair_el1, mair
> 
> If you still want #ifdefs, I'd have left it outside the HW_AFDBM. We
> already have a dependency in the Kconfig. Anyway, I can fix this up.

Yes, it already depends on HW_AFDBM. And one instruction won't add much to
the Image size if the user wants CONFIG_ARM64_HAFT=n. I'll drop the #ifdef here.

> I think as an additional patch we can also remove the ID checks for the
> tcr bit in the HW_AFDBM case. But that's unrelated to this series.

OK. I'll post a separate patch to drop this.

Thanks.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v4 3/5] arm64: Add support for FEAT_HAFT
  2024-11-04 17:28   ` Catalin Marinas
  2024-11-05  2:47     ` Yicong Yang
@ 2024-11-05  8:35     ` Marc Zyngier
  2024-11-05  9:58       ` Catalin Marinas
  1 sibling, 1 reply; 16+ messages in thread
From: Marc Zyngier @ 2024-11-05  8:35 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Yicong Yang, will, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Mon, 04 Nov 2024 17:28:48 +0000,
Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> On Sat, Nov 02, 2024 at 06:42:33PM +0800, Yicong Yang wrote:
> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > index 3d261cc123c1..ed8c784ca082 100644
> > --- a/arch/arm64/include/asm/cpufeature.h
> > +++ b/arch/arm64/include/asm/cpufeature.h
> > @@ -838,6 +838,12 @@ static inline bool system_supports_poe(void)
> >  		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
> >  }
> >  
> > +static inline bool system_supports_haft(void)
> > +{
> > +	return IS_ENABLED(CONFIG_ARM64_HAFT) &&
> > +		cpus_have_final_cap(ARM64_HAFT);
> > +}
> 
> I'm fine with this approach. If we ever get hardware with mismatched
> FEAT_HAFT and some secondary CPUs don't come up, we can revisit.
> 
> > diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> > index ccbae4525891..0bc88df7cb35 100644
> > --- a/arch/arm64/mm/proc.S
> > +++ b/arch/arm64/mm/proc.S
> > @@ -498,6 +498,10 @@ alternative_else_nop_endif
> >  	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
> >  	cbz	x9, 1f
> >  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
> > +
> > +#ifdef CONFIG_ARM64_HAFT
> > +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
> > +#endif /* CONFIG_ARM64_HAFT */
> >  1:
> >  #endif	/* CONFIG_ARM64_HW_AFDBM */
> >  	msr	mair_el1, mair
> 
> If you still want #ifdefs, I'd have left it outside the HW_AFDBM. We
> already have a dependency in the Kconfig. Anyway, I can fix this up.
> 
> I think as an additional patch we can also remove the ID checks for the
> tcr bit in the HW_AFDBM case. But that's unrelated to this series.

I think you want to be careful with this one. I know of at least one
implementation that has a broken FEAT_HAFDBS implementation, that
removes it from the ID registers, but where the control bit in TCR_ELx
still takes effect.

Please see 6df696cd9bc1 ("arm64: errata: Mitigate Ampere1 erratum
AC03_CPU_38 at stage-2") which indicates how we actually rely on the
check for S1 translation.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v4 3/5] arm64: Add support for FEAT_HAFT
  2024-11-05  8:35     ` Marc Zyngier
@ 2024-11-05  9:58       ` Catalin Marinas
  2024-11-05 11:52         ` Marc Zyngier
  0 siblings, 1 reply; 16+ messages in thread
From: Catalin Marinas @ 2024-11-05  9:58 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Yicong Yang, will, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Tue, Nov 05, 2024 at 08:35:51AM +0000, Marc Zyngier wrote:
> On Mon, 04 Nov 2024 17:28:48 +0000,
> Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Sat, Nov 02, 2024 at 06:42:33PM +0800, Yicong Yang wrote:
> > > diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> > > index ccbae4525891..0bc88df7cb35 100644
> > > --- a/arch/arm64/mm/proc.S
> > > +++ b/arch/arm64/mm/proc.S
> > > @@ -498,6 +498,10 @@ alternative_else_nop_endif
> > >  	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
> > >  	cbz	x9, 1f
> > >  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
> > > +
> > > +#ifdef CONFIG_ARM64_HAFT
> > > +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
> > > +#endif /* CONFIG_ARM64_HAFT */
> > >  1:
> > >  #endif	/* CONFIG_ARM64_HW_AFDBM */
> > >  	msr	mair_el1, mair
> > 
> > If you still want #ifdefs, I'd have left it outside the HW_AFDBM. We
> > already have a dependency in the Kconfig. Anyway, I can fix this up.
> > 
> > I think as an additional patch we can also remove the ID checks for the
> > tcr bit in the HW_AFDBM case. But that's unrelated to this series.
> 
> I think you want to be careful with this one. I know of at least one
> implementation that has a broken FEAT_HAFDBS implementation, that
> removes it from the ID registers, but where the control bit in TCR_ELx
> still takes effect.
> 
> Please see 6df696cd9bc1 ("arm64: errata: Mitigate Ampere1 erratum
> AC03_CPU_38 at stage-2") which indicates how we actually rely on the
> check for S1 translation.

Ah, thanks for this. So the hardware with the erratum above can still
update the pte after it has been marked invalid, hence we can't turn it
on in TCR_EL1 even if the rest of the kernel considers the feature
disabled. So yes, the HAFDBS code needs to stay as is.
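
To make the point above concrete, here is a minimal sketch in plain C (not
kernel code). The struct, the helper names and the bit position are all
hypothetical and exist only for illustration; the model assumes a CPU that
hides the feature in its ID register while still honouring the control bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position only, standing in for the TCR_HA control bit. */
#define DEMO_TCR_HA (UINT64_C(1) << 39)

/* Hypothetical model of one CPU, not a kernel data structure. */
struct demo_cpu {
	unsigned int id_hafdbs;   /* value advertised in the ID register field */
	bool ha_bit_takes_effect; /* hardware honours the control bit anyway */
};

/* Boot-time setup: with check_id, the bit is only set if the ID field is non-zero. */
static uint64_t demo_setup_tcr(const struct demo_cpu *cpu, uint64_t tcr,
			       bool check_id)
{
	if (!check_id || cpu->id_hafdbs != 0)
		tcr |= DEMO_TCR_HA;
	return tcr;
}

/* Whether the modelled hardware would update the Access flag on its own. */
static bool demo_hw_updates_af(const struct demo_cpu *cpu, uint64_t tcr)
{
	return cpu->ha_bit_takes_effect && (tcr & DEMO_TCR_HA) != 0;
}
```

In this model, a CPU with id_hafdbs == 0 and ha_bit_takes_effect == true
only stays quiet when demo_setup_tcr() is called with check_id set, which is
the reason for keeping the check.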

Let's hope the hardware people learnt and we won't have similar errata
for FEAT_HAFT.

-- 
Catalin



* Re: [PATCH v4 3/5] arm64: Add support for FEAT_HAFT
  2024-11-05  2:47     ` Yicong Yang
@ 2024-11-05 10:38       ` Yicong Yang
  2024-11-05 10:54         ` Catalin Marinas
  0 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2024-11-05 10:38 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: yangyicong, will, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On 2024/11/5 10:47, Yicong Yang wrote:
> On 2024/11/5 1:28, Catalin Marinas wrote:
>> On Sat, Nov 02, 2024 at 06:42:33PM +0800, Yicong Yang wrote:
>>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>>> index 3d261cc123c1..ed8c784ca082 100644
>>> --- a/arch/arm64/include/asm/cpufeature.h
>>> +++ b/arch/arm64/include/asm/cpufeature.h
>>> @@ -838,6 +838,12 @@ static inline bool system_supports_poe(void)
>>>  		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
>>>  }
>>>  
>>> +static inline bool system_supports_haft(void)
>>> +{
>>> +	return IS_ENABLED(CONFIG_ARM64_HAFT) &&
>>> +		cpus_have_final_cap(ARM64_HAFT);
>>> +}
>>
>> I'm fine with this approach. If we ever get hardware with mismatched
>> FEAT_HAFT and some secondary CPUs don't come up, we can revisit.
>>
>>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
>>> index ccbae4525891..0bc88df7cb35 100644
>>> --- a/arch/arm64/mm/proc.S
>>> +++ b/arch/arm64/mm/proc.S
>>> @@ -498,6 +498,10 @@ alternative_else_nop_endif
>>>  	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
>>>  	cbz	x9, 1f
>>>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
>>> +
>>> +#ifdef CONFIG_ARM64_HAFT
>>> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
>>> +#endif /* CONFIG_ARM64_HAFT */
>>>  1:
>>>  #endif	/* CONFIG_ARM64_HW_AFDBM */
>>>  	msr	mair_el1, mair
>>
>> If you still want #ifdefs, I'd have left it outside the HW_AFDBM. We
>> already have a dependency in the Kconfig. Anyway, I can fix this up.
> 
> Yes, it already depends on HW_AFDBM, and one instruction won't add much to the
> Image size if the user sets CONFIG_ARM64_HAFT=n. I'll drop the #ifdef here.
> 

On reflection, we may still need the #ifdef here. Consider the case where the
hardware supports FEAT_HAFT but the user sets CONFIG_ARM64_HAFT=n: without the
CONFIG_ARM64_HAFT guard, HAFT would be enabled unexpectedly.
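
A rough illustration of that case, written as standalone C rather than the
proc.S assembly. Everything here is a hypothetical stand-in: config_haft
models CONFIG_ARM64_HAFT, hw_has_haft models a CPU implementing FEAT_HAFT,
and the bit position is only for demonstration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit position only, standing in for TCR2_EL1x_HAFT. */
#define DEMO_TCR2_HAFT (UINT64_C(1) << 11)

/* No config guard: the bit is set whenever the hardware has the feature. */
static uint64_t demo_tcr2_unguarded(uint64_t tcr2, bool hw_has_haft)
{
	if (hw_has_haft)
		tcr2 |= DEMO_TCR2_HAFT;
	return tcr2;
}

/* With the guard: the bit is set only if the config option is on as well. */
static uint64_t demo_tcr2_guarded(uint64_t tcr2, bool config_haft,
				  bool hw_has_haft)
{
	if (config_haft && hw_has_haft)
		tcr2 |= DEMO_TCR2_HAFT;
	return tcr2;
}
```

With hw_has_haft true and config_haft false, only the guarded variant leaves
the bit clear.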

Thanks.






* Re: [PATCH v4 3/5] arm64: Add support for FEAT_HAFT
  2024-11-05 10:38       ` Yicong Yang
@ 2024-11-05 10:54         ` Catalin Marinas
  0 siblings, 0 replies; 16+ messages in thread
From: Catalin Marinas @ 2024-11-05 10:54 UTC (permalink / raw)
  To: Yicong Yang
  Cc: yangyicong, will, maz, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang

On Tue, Nov 05, 2024 at 06:38:51PM +0800, Yicong Yang wrote:
> On 2024/11/5 10:47, Yicong Yang wrote:
> > On 2024/11/5 1:28, Catalin Marinas wrote:
> >> On Sat, Nov 02, 2024 at 06:42:33PM +0800, Yicong Yang wrote:
> >>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> >>> index 3d261cc123c1..ed8c784ca082 100644
> >>> --- a/arch/arm64/include/asm/cpufeature.h
> >>> +++ b/arch/arm64/include/asm/cpufeature.h
> >>> @@ -838,6 +838,12 @@ static inline bool system_supports_poe(void)
> >>>  		alternative_has_cap_unlikely(ARM64_HAS_S1POE);
> >>>  }
> >>>  
> >>> +static inline bool system_supports_haft(void)
> >>> +{
> >>> +	return IS_ENABLED(CONFIG_ARM64_HAFT) &&
> >>> +		cpus_have_final_cap(ARM64_HAFT);
> >>> +}
> >>
> >> I'm fine with this approach. If we ever get hardware with mismatched
> >> FEAT_HAFT and some secondary CPUs don't come up, we can revisit.
> >>
> >>> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> >>> index ccbae4525891..0bc88df7cb35 100644
> >>> --- a/arch/arm64/mm/proc.S
> >>> +++ b/arch/arm64/mm/proc.S
> >>> @@ -498,6 +498,10 @@ alternative_else_nop_endif
> >>>  	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
> >>>  	cbz	x9, 1f
> >>>  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
> >>> +
> >>> +#ifdef CONFIG_ARM64_HAFT
> >>> +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
> >>> +#endif /* CONFIG_ARM64_HAFT */
> >>>  1:
> >>>  #endif	/* CONFIG_ARM64_HW_AFDBM */
> >>>  	msr	mair_el1, mair
> >>
> >> If you still want #ifdefs, I'd have left it outside the HW_AFDBM. We
> >> already have a dependency in the Kconfig. Anyway, I can fix this up.
> > 
> > Yes, it already depends on HW_AFDBM, and one instruction won't add much to the
> > Image size if the user sets CONFIG_ARM64_HAFT=n. I'll drop the #ifdef here.
> > 
> 
> On reflection, we may still need the #ifdef here. Consider the case where the
> hardware supports FEAT_HAFT but the user sets CONFIG_ARM64_HAFT=n: without the
> CONFIG_ARM64_HAFT guard, HAFT would be enabled unexpectedly.

Yes, still keeping the #ifdef but outside of HW_AFDBM. I can fix it up
myself when applying the patches.

-- 
Catalin



* Re: [PATCH v4 3/5] arm64: Add support for FEAT_HAFT
  2024-11-05  9:58       ` Catalin Marinas
@ 2024-11-05 11:52         ` Marc Zyngier
  0 siblings, 0 replies; 16+ messages in thread
From: Marc Zyngier @ 2024-11-05 11:52 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Yicong Yang, will, mark.rutland, broonie, linux-arm-kernel,
	oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Tue, 05 Nov 2024 09:58:26 +0000,
Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> On Tue, Nov 05, 2024 at 08:35:51AM +0000, Marc Zyngier wrote:
> > On Mon, 04 Nov 2024 17:28:48 +0000,
> > Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Sat, Nov 02, 2024 at 06:42:33PM +0800, Yicong Yang wrote:
> > > > diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> > > > index ccbae4525891..0bc88df7cb35 100644
> > > > --- a/arch/arm64/mm/proc.S
> > > > +++ b/arch/arm64/mm/proc.S
> > > > @@ -498,6 +498,10 @@ alternative_else_nop_endif
> > > >  	and	x9, x9, ID_AA64MMFR1_EL1_HAFDBS_MASK
> > > >  	cbz	x9, 1f
> > > >  	orr	tcr, tcr, #TCR_HA		// hardware Access flag update
> > > > +
> > > > +#ifdef CONFIG_ARM64_HAFT
> > > > +	orr	tcr2, tcr2, TCR2_EL1x_HAFT
> > > > +#endif /* CONFIG_ARM64_HAFT */
> > > >  1:
> > > >  #endif	/* CONFIG_ARM64_HW_AFDBM */
> > > >  	msr	mair_el1, mair
> > > 
> > > If you still want #ifdefs, I'd have left it outside the HW_AFDBM. We
> > > already have a dependency in the Kconfig. Anyway, I can fix this up.
> > > 
> > > I think as an additional patch we can also remove the ID checks for the
> > > tcr bit in the HW_AFDBM case. But that's unrelated to this series.
> > 
> > I think you want to be careful with this one. I know of at least one
> > implementation that has a broken FEAT_HAFDBS implementation, that
> > removes it from the ID registers, but where the control bit in TCR_ELx
> > still takes effect.
> > 
> > Please see 6df696cd9bc1 ("arm64: errata: Mitigate Ampere1 erratum
> > AC03_CPU_38 at stage-2") which indicates how we actually rely on the
> > check for S1 translation.
> 
> Ah, thanks for this. So the hardware with the erratum above can still
> update the pte after it has been marked invalid, hence we can't turn it
> on in TCR_EL1 even if the rest of the kernel considers the feature
> disabled. So yes, the HAFDBS code needs to stay as is.

Indeed. Atomicity is overrated, let's go shopping.

> Let's hope the hardware people learnt and we won't have similar errata
> for FEAT_HAFT.

If I was religious, I'd light a candle. But we've both seen enough HW
to know that they *will* fsck it up. We just don't know how yet.

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT
  2024-11-02 10:42 [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
                   ` (4 preceding siblings ...)
  2024-11-02 10:42 ` [PATCH v4 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young() Yicong Yang
@ 2024-11-05 13:51 ` Catalin Marinas
  5 siblings, 0 replies; 16+ messages in thread
From: Catalin Marinas @ 2024-11-05 13:51 UTC (permalink / raw)
  To: will, maz, mark.rutland, broonie, linux-arm-kernel, Yicong Yang
  Cc: oliver.upton, ryan.roberts, linuxarm, jonathan.cameron,
	shameerali.kolothum.thodi, prime.zeng, xuwei5, wangkefeng.wang,
	yangyicong

On Sat, 02 Nov 2024 18:42:30 +0800, Yicong Yang wrote:
> This series adds basic support for FEAT_HAFT introduced in Armv8.9/v9.4
> and enable ARCH_HAS_NONLEAF_PMD_YOUNG. The latter will be used in
> lru-gen aging. Tested with lru-gen in below steps:
> 1. Generate a 1GiB workingset by `stress-ng --vm 1`. Then hang the task to
>    stop accessing the memory. (AF bit won't be updated)
> 2. try to age the memory by /sys/kernel/debug/lru_gen
> 
> [...]

Applied to arm64 (for-next/haft), thanks!

I added back the ID check as in v3 following Marc pointing out the
Ampere erratum. Who knows, we may get similar bugs for FEAT_HAFT in the
future, so better have it covered.
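
For readers following along, an ID check of this kind can be sketched in
standalone C as below. This is not the applied code: the field layout, the
threshold encoding, the bit position and all demo_* names are assumptions
made for the example.

```c
#include <stdint.h>

/* Assumed layout for illustration: a 4-bit ID field in bits [3:0]. */
#define DEMO_HAFDBS_SHIFT 0
#define DEMO_HAFDBS_MASK  UINT64_C(0xf)
/* Assumed encoding from which the table Access flag feature is advertised. */
#define DEMO_HAFDBS_HAFT  3u
/* Illustrative bit position only, standing in for TCR2_EL1x_HAFT. */
#define DEMO_TCR2_HAFT    (UINT64_C(1) << 11)

/* Extract the ID field from a raw register value. */
static unsigned int demo_hafdbs_field(uint64_t mmfr1)
{
	return (unsigned int)((mmfr1 >> DEMO_HAFDBS_SHIFT) & DEMO_HAFDBS_MASK);
}

/* Set the control bit only when the ID field advertises the feature. */
static uint64_t demo_setup_tcr2(uint64_t mmfr1, uint64_t tcr2)
{
	if (demo_hafdbs_field(mmfr1) >= DEMO_HAFDBS_HAFT)
		tcr2 |= DEMO_TCR2_HAFT;
	return tcr2;
}
```

A CPU that hides the feature from its ID register (field below the
threshold) leaves the control bit untouched in this sketch.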

[1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register
      https://git.kernel.org/arm64/c/aa47dcda2708
[2/5] arm64: setup: name 'tcr2' register
      https://git.kernel.org/arm64/c/926b66e2ebc8
[3/5] arm64: Add support for FEAT_HAFT
      https://git.kernel.org/arm64/c/efe72541355d
[4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG
      https://git.kernel.org/arm64/c/62df5870ebf7
[5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young()
      https://git.kernel.org/arm64/c/b349a5a2b6e2

-- 
Catalin




Thread overview: 16+ messages
2024-11-02 10:42 [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Yicong Yang
2024-11-02 10:42 ` [PATCH v4 1/5] arm64/sysreg: Update ID_AA64MMFR1_EL1 register Yicong Yang
2024-11-02 10:42 ` [PATCH v4 2/5] arm64: setup: name 'tcr2' register Yicong Yang
2024-11-02 10:42 ` [PATCH v4 3/5] arm64: Add support for FEAT_HAFT Yicong Yang
2024-11-04 17:28   ` Catalin Marinas
2024-11-05  2:47     ` Yicong Yang
2024-11-05 10:38       ` Yicong Yang
2024-11-05 10:54         ` Catalin Marinas
2024-11-05  8:35     ` Marc Zyngier
2024-11-05  9:58       ` Catalin Marinas
2024-11-05 11:52         ` Marc Zyngier
2024-11-02 10:42 ` [PATCH v4 4/5] arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG Yicong Yang
2024-11-04 17:29   ` Catalin Marinas
2024-11-02 10:42 ` [PATCH v4 5/5] arm64: pgtable: Warn unexpected pmdp_test_and_clear_young() Yicong Yang
2024-11-04 17:29   ` Catalin Marinas
2024-11-05 13:51 ` [PATCH v4 0/5] Support Armv8.9/v9.4 FEAT_HAFT Catalin Marinas
