* [PATCH v3 01/13] arm64: mm: Re-implement the __tlbi_level macro as a C function
2026-03-02 13:55 [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
@ 2026-03-02 13:55 ` Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 02/13] arm64: mm: Introduce a C wrapper for by-range TLB invalidation Ryan Roberts
` (12 subsequent siblings)
13 siblings, 0 replies; 22+ messages in thread
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel
As part of efforts to reduce our reliance on complex preprocessor macros
for TLB invalidation routines, convert the __tlbi_level macro to a C
function for by-level TLB invalidation.
Each specific tlbi level op is implemented as a C function and the
appropriate function pointer is passed to __tlbi_level(). Since
everything is declared inline and statically resolvable, the compiler
converts the indirect function call into a direct, inlined call.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 67 +++++++++++++++++++++++++------
1 file changed, 54 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 1416e652612b7..a0e3ebe299864 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -97,19 +97,60 @@ static inline unsigned long get_trans_granule(void)
#define TLBI_TTL_UNKNOWN INT_MAX
-#define __tlbi_level(op, addr, level) do { \
- u64 arg = addr; \
- \
- if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) && \
- level >= 0 && level <= 3) { \
- u64 ttl = level & 3; \
- ttl |= get_trans_granule() << 2; \
- arg &= ~TLBI_TTL_MASK; \
- arg |= FIELD_PREP(TLBI_TTL_MASK, ttl); \
- } \
- \
- __tlbi(op, arg); \
-} while(0)
+typedef void (*tlbi_op)(u64 arg);
+
+static __always_inline void vae1is(u64 arg)
+{
+ __tlbi(vae1is, arg);
+}
+
+static __always_inline void vae2is(u64 arg)
+{
+ __tlbi(vae2is, arg);
+}
+
+static __always_inline void vale1(u64 arg)
+{
+ __tlbi(vale1, arg);
+}
+
+static __always_inline void vale1is(u64 arg)
+{
+ __tlbi(vale1is, arg);
+}
+
+static __always_inline void vale2is(u64 arg)
+{
+ __tlbi(vale2is, arg);
+}
+
+static __always_inline void vaale1is(u64 arg)
+{
+ __tlbi(vaale1is, arg);
+}
+
+static __always_inline void ipas2e1(u64 arg)
+{
+ __tlbi(ipas2e1, arg);
+}
+
+static __always_inline void ipas2e1is(u64 arg)
+{
+ __tlbi(ipas2e1is, arg);
+}
+
+static __always_inline void __tlbi_level(tlbi_op op, u64 addr, u32 level)
+{
+ u64 arg = addr;
+
+ if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) && level <= 3) {
+ u64 ttl = level | (get_trans_granule() << 2);
+
+ FIELD_MODIFY(TLBI_TTL_MASK, &arg, ttl);
+ }
+
+ op(arg);
+}
#define __tlbi_user_level(op, arg, level) do { \
if (arm64_kernel_unmapped_at_el0()) \
--
2.43.0
* [PATCH v3 02/13] arm64: mm: Introduce a C wrapper for by-range TLB invalidation
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel
As part of efforts to reduce our reliance on complex preprocessor macros
for TLB invalidation routines, introduce a new C wrapper for by-range
TLB invalidation which can be used instead of the __tlbi() macro and can
additionally be called from C code.
Each specific tlbi range op is implemented as a C function and the
appropriate function pointer is passed to __tlbi_range(). Since
everything is declared inline and statically resolvable, the compiler
converts the indirect function call into a direct, inlined call.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 32 ++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index a0e3ebe299864..b3b86e5f7034e 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -468,6 +468,36 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
* operations can only span an even number of pages. We save this for last to
* ensure 64KB start alignment is maintained for the LPA2 case.
*/
+static __always_inline void rvae1is(u64 arg)
+{
+ __tlbi(rvae1is, arg);
+}
+
+static __always_inline void rvale1(u64 arg)
+{
+ __tlbi(rvale1, arg);
+}
+
+static __always_inline void rvale1is(u64 arg)
+{
+ __tlbi(rvale1is, arg);
+}
+
+static __always_inline void rvaale1is(u64 arg)
+{
+ __tlbi(rvaale1is, arg);
+}
+
+static __always_inline void ripas2e1is(u64 arg)
+{
+ __tlbi(ripas2e1is, arg);
+}
+
+static __always_inline void __tlbi_range(tlbi_op op, u64 arg)
+{
+ op(arg);
+}
+
#define __flush_tlb_range_op(op, start, pages, stride, \
asid, tlb_level, tlbi_user, lpa2) \
do { \
@@ -495,7 +525,7 @@ do { \
if (num >= 0) { \
addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
scale, num, tlb_level); \
- __tlbi(r##op, addr); \
+ __tlbi_range(r##op, addr); \
if (tlbi_user) \
__tlbi_user(r##op, addr); \
__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
--
2.43.0
* [PATCH v3 03/13] arm64: mm: Implicitly invalidate user ASID based on TLBI operation
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel
When kpti is enabled, separate ASIDs are used for userspace and
kernelspace, requiring ASID-qualified TLB invalidation by virtual
address to invalidate both of them.
Push the logic for invalidating the two ASIDs down into the low-level
tlbi-op-specific functions and remove the burden from the caller to
handle the kpti-specific behaviour.
Co-developed-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 30 +++++++++++++-----------------
1 file changed, 13 insertions(+), 17 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index b3b86e5f7034e..e586d9b71ea2d 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -102,6 +102,7 @@ typedef void (*tlbi_op)(u64 arg);
static __always_inline void vae1is(u64 arg)
{
__tlbi(vae1is, arg);
+ __tlbi_user(vae1is, arg);
}
static __always_inline void vae2is(u64 arg)
@@ -112,11 +113,13 @@ static __always_inline void vae2is(u64 arg)
static __always_inline void vale1(u64 arg)
{
__tlbi(vale1, arg);
+ __tlbi_user(vale1, arg);
}
static __always_inline void vale1is(u64 arg)
{
__tlbi(vale1is, arg);
+ __tlbi_user(vale1is, arg);
}
static __always_inline void vale2is(u64 arg)
@@ -152,11 +155,6 @@ static __always_inline void __tlbi_level(tlbi_op op, u64 addr, u32 level)
op(arg);
}
-#define __tlbi_user_level(op, arg, level) do { \
- if (arm64_kernel_unmapped_at_el0()) \
- __tlbi_level(op, (arg | USER_ASID_FLAG), level); \
-} while (0)
-
/*
* This macro creates a properly formatted VA operand for the TLB RANGE. The
* value bit assignments are:
@@ -444,8 +442,6 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
* @stride: Flush granularity
* @asid: The ASID of the task (0 for IPA instructions)
* @tlb_level: Translation Table level hint, if known
- * @tlbi_user: If 'true', call an additional __tlbi_user()
- * (typically for user ASIDs). 'flase' for IPA instructions
* @lpa2: If 'true', the lpa2 scheme is used as set out below
*
* When the CPU does not support TLB range operations, flush the TLB
@@ -471,16 +467,19 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
static __always_inline void rvae1is(u64 arg)
{
__tlbi(rvae1is, arg);
+ __tlbi_user(rvae1is, arg);
}
static __always_inline void rvale1(u64 arg)
{
__tlbi(rvale1, arg);
+ __tlbi_user(rvale1, arg);
}
static __always_inline void rvale1is(u64 arg)
{
__tlbi(rvale1is, arg);
+ __tlbi_user(rvale1is, arg);
}
static __always_inline void rvaale1is(u64 arg)
@@ -499,7 +498,7 @@ static __always_inline void __tlbi_range(tlbi_op op, u64 arg)
}
#define __flush_tlb_range_op(op, start, pages, stride, \
- asid, tlb_level, tlbi_user, lpa2) \
+ asid, tlb_level, lpa2) \
do { \
typeof(start) __flush_start = start; \
typeof(pages) __flush_pages = pages; \
@@ -514,8 +513,6 @@ do { \
(lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) { \
addr = __TLBI_VADDR(__flush_start, asid); \
__tlbi_level(op, addr, tlb_level); \
- if (tlbi_user) \
- __tlbi_user_level(op, addr, tlb_level); \
__flush_start += stride; \
__flush_pages -= stride >> PAGE_SHIFT; \
continue; \
@@ -526,8 +523,6 @@ do { \
addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
scale, num, tlb_level); \
__tlbi_range(r##op, addr); \
- if (tlbi_user) \
- __tlbi_user(r##op, addr); \
__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
} \
@@ -536,7 +531,7 @@ do { \
} while (0)
#define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
- __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());
+ __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled());
static inline bool __flush_tlb_range_limit_excess(unsigned long start,
unsigned long end, unsigned long pages, unsigned long stride)
@@ -576,10 +571,10 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
if (last_level)
__flush_tlb_range_op(vale1is, start, pages, stride, asid,
- tlb_level, true, lpa2_is_enabled());
+ tlb_level, lpa2_is_enabled());
else
__flush_tlb_range_op(vae1is, start, pages, stride, asid,
- tlb_level, true, lpa2_is_enabled());
+ tlb_level, lpa2_is_enabled());
mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
}
@@ -604,7 +599,7 @@ static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
dsb(nshst);
asid = ASID(vma->vm_mm);
__flush_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid,
- 3, true, lpa2_is_enabled());
+ 3, lpa2_is_enabled());
mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
addr + CONT_PTE_SIZE);
dsb(nsh);
@@ -638,7 +633,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
dsb(ishst);
__flush_tlb_range_op(vaale1is, start, pages, stride, 0,
- TLBI_TTL_UNKNOWN, false, lpa2_is_enabled());
+ TLBI_TTL_UNKNOWN, lpa2_is_enabled());
__tlbi_sync_s1ish();
isb();
}
@@ -689,6 +684,7 @@ static inline bool huge_pmd_needs_flush(pmd_t oldpmd, pmd_t newpmd)
}
#define huge_pmd_needs_flush huge_pmd_needs_flush
+#undef __tlbi_user
#endif
#endif
--
2.43.0
* [PATCH v3 04/13] arm64: mm: Push __TLBI_VADDR() into __tlbi_level()
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel, Linu Cherian
From: Will Deacon <will@kernel.org>
The __TLBI_VADDR() macro takes an ASID and an address and converts them
into a single argument formatted correctly for a TLB invalidation
instruction.
Rather than have callers worry about this (especially in the case where
the ASID is zero), push the macro down into __tlbi_level() via a new
__tlbi_level_asid() helper.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Linu Cherian <linu.cherian@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 14 ++++++++++----
arch/arm64/kernel/sys_compat.c | 2 +-
arch/arm64/kvm/hyp/nvhe/mm.c | 2 +-
arch/arm64/kvm/hyp/nvhe/tlb.c | 2 --
arch/arm64/kvm/hyp/pgtable.c | 4 ++--
arch/arm64/kvm/hyp/vhe/tlb.c | 2 --
6 files changed, 14 insertions(+), 12 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index e586d9b71ea2d..2832305606b72 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -142,9 +142,10 @@ static __always_inline void ipas2e1is(u64 arg)
__tlbi(ipas2e1is, arg);
}
-static __always_inline void __tlbi_level(tlbi_op op, u64 addr, u32 level)
+static __always_inline void __tlbi_level_asid(tlbi_op op, u64 addr, u32 level,
+ u16 asid)
{
- u64 arg = addr;
+ u64 arg = __TLBI_VADDR(addr, asid);
if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) && level <= 3) {
u64 ttl = level | (get_trans_granule() << 2);
@@ -155,6 +156,11 @@ static __always_inline void __tlbi_level(tlbi_op op, u64 addr, u32 level)
op(arg);
}
+static inline void __tlbi_level(tlbi_op op, u64 addr, u32 level)
+{
+ __tlbi_level_asid(op, addr, level, 0);
+}
+
/*
* This macro creates a properly formatted VA operand for the TLB RANGE. The
* value bit assignments are:
@@ -511,8 +517,7 @@ do { \
if (!system_supports_tlb_range() || \
__flush_pages == 1 || \
(lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) { \
- addr = __TLBI_VADDR(__flush_start, asid); \
- __tlbi_level(op, addr, tlb_level); \
+ __tlbi_level_asid(op, __flush_start, tlb_level, asid); \
__flush_start += stride; \
__flush_pages -= stride >> PAGE_SHIFT; \
continue; \
@@ -685,6 +690,7 @@ static inline bool huge_pmd_needs_flush(pmd_t oldpmd, pmd_t newpmd)
#define huge_pmd_needs_flush huge_pmd_needs_flush
#undef __tlbi_user
+#undef __TLBI_VADDR
#endif
#endif
diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
index b9d4998c97efa..7e9860143add8 100644
--- a/arch/arm64/kernel/sys_compat.c
+++ b/arch/arm64/kernel/sys_compat.c
@@ -36,7 +36,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end)
* The workaround requires an inner-shareable tlbi.
* We pick the reserved-ASID to minimise the impact.
*/
- __tlbi(aside1is, __TLBI_VADDR(0, 0));
+ __tlbi(aside1is, 0UL);
__tlbi_sync_s1ish();
}
diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index 218976287d3fe..4d8fcc7a3a41e 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -270,7 +270,7 @@ static void fixmap_clear_slot(struct hyp_fixmap_slot *slot)
* https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/T/#mf10dfbaf1eaef9274c581b81c53758918c1d0f03
*/
dsb(ishst);
- __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), level);
+ __tlbi_level(vale2is, addr, level);
__tlbi_sync_s1ish_hyp();
isb();
}
diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
index 3dc1ce0d27fe6..b29140995d484 100644
--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
+++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
@@ -158,7 +158,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
* Instead, we invalidate Stage-2 for this IPA, and the
* whole of Stage-1. Weep...
*/
- ipa >>= 12;
__tlbi_level(ipas2e1is, ipa, level);
/*
@@ -188,7 +187,6 @@ void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
* Instead, we invalidate Stage-2 for this IPA, and the
* whole of Stage-1. Weep...
*/
- ipa >>= 12;
__tlbi_level(ipas2e1, ipa, level);
/*
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 9b480f947da26..30226f2d5564a 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -490,14 +490,14 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
kvm_clear_pte(ctx->ptep);
dsb(ishst);
- __tlbi_level(vae2is, __TLBI_VADDR(ctx->addr, 0), TLBI_TTL_UNKNOWN);
+ __tlbi_level(vae2is, ctx->addr, TLBI_TTL_UNKNOWN);
} else {
if (ctx->end - ctx->addr < granule)
return -EINVAL;
kvm_clear_pte(ctx->ptep);
dsb(ishst);
- __tlbi_level(vale2is, __TLBI_VADDR(ctx->addr, 0), ctx->level);
+ __tlbi_level(vale2is, ctx->addr, ctx->level);
*unmapped += granule;
}
diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
index 35855dadfb1b3..f7b9dfe3f3a5a 100644
--- a/arch/arm64/kvm/hyp/vhe/tlb.c
+++ b/arch/arm64/kvm/hyp/vhe/tlb.c
@@ -104,7 +104,6 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
* Instead, we invalidate Stage-2 for this IPA, and the
* whole of Stage-1. Weep...
*/
- ipa >>= 12;
__tlbi_level(ipas2e1is, ipa, level);
/*
@@ -136,7 +135,6 @@ void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
* Instead, we invalidate Stage-2 for this IPA, and the
* whole of Stage-1. Weep...
*/
- ipa >>= 12;
__tlbi_level(ipas2e1, ipa, level);
/*
--
2.43.0
* [PATCH v3 05/13] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range()
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel, Linu Cherian
From: Will Deacon <will@kernel.org>
The __TLBI_VADDR_RANGE() macro is only used in one place and isn't
something that's generally useful outside of the low-level range
invalidation gubbins.
Inline __TLBI_VADDR_RANGE() into the __tlbi_range() function so that the
macro can be removed entirely.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Linu Cherian <linu.cherian@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 32 +++++++++++++------------------
1 file changed, 13 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 2832305606b72..0f81547470ea2 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -186,19 +186,6 @@ static inline void __tlbi_level(tlbi_op op, u64 addr, u32 level)
#define TLBIR_TTL_MASK GENMASK_ULL(38, 37)
#define TLBIR_BADDR_MASK GENMASK_ULL(36, 0)
-#define __TLBI_VADDR_RANGE(baddr, asid, scale, num, ttl) \
- ({ \
- unsigned long __ta = 0; \
- unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0; \
- __ta |= FIELD_PREP(TLBIR_BADDR_MASK, baddr); \
- __ta |= FIELD_PREP(TLBIR_TTL_MASK, __ttl); \
- __ta |= FIELD_PREP(TLBIR_NUM_MASK, num); \
- __ta |= FIELD_PREP(TLBIR_SCALE_MASK, scale); \
- __ta |= FIELD_PREP(TLBIR_TG_MASK, get_trans_granule()); \
- __ta |= FIELD_PREP(TLBIR_ASID_MASK, asid); \
- __ta; \
- })
-
/* These macros are used by the TLBI RANGE feature. */
#define __TLBI_RANGE_PAGES(num, scale) \
((unsigned long)((num) + 1) << (5 * (scale) + 1))
@@ -498,8 +485,19 @@ static __always_inline void ripas2e1is(u64 arg)
__tlbi(ripas2e1is, arg);
}
-static __always_inline void __tlbi_range(tlbi_op op, u64 arg)
+static __always_inline void __tlbi_range(tlbi_op op, u64 addr,
+ u16 asid, int scale, int num,
+ u32 level, bool lpa2)
{
+ u64 arg = 0;
+
+ arg |= FIELD_PREP(TLBIR_BADDR_MASK, addr >> (lpa2 ? 16 : PAGE_SHIFT));
+ arg |= FIELD_PREP(TLBIR_TTL_MASK, level > 3 ? 0 : level);
+ arg |= FIELD_PREP(TLBIR_NUM_MASK, num);
+ arg |= FIELD_PREP(TLBIR_SCALE_MASK, scale);
+ arg |= FIELD_PREP(TLBIR_TG_MASK, get_trans_granule());
+ arg |= FIELD_PREP(TLBIR_ASID_MASK, asid);
+
op(arg);
}
@@ -510,8 +508,6 @@ do { \
typeof(pages) __flush_pages = pages; \
int num = 0; \
int scale = 3; \
- int shift = lpa2 ? 16 : PAGE_SHIFT; \
- unsigned long addr; \
\
while (__flush_pages > 0) { \
if (!system_supports_tlb_range() || \
@@ -525,9 +521,7 @@ do { \
\
num = __TLBI_RANGE_NUM(__flush_pages, scale); \
if (num >= 0) { \
- addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
- scale, num, tlb_level); \
- __tlbi_range(r##op, addr); \
+ __tlbi_range(r##op, __flush_start, asid, scale, num, tlb_level, lpa2); \
__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
} \
--
2.43.0
* [PATCH v3 06/13] arm64: mm: Re-implement the __flush_tlb_range_op macro in C
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel
The __flush_tlb_range_op() macro is horrible and has been a previous
source of bugs thanks to multiple expansions of its arguments (see
commit f7edb07ad7c6 ("Fix mmu notifiers for range-based invalidates")).
Rewrite the thing in C.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Co-developed-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 84 +++++++++++++++++--------------
1 file changed, 46 insertions(+), 38 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 0f81547470ea2..3c05afdbe3a69 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -429,12 +429,13 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
/*
* __flush_tlb_range_op - Perform TLBI operation upon a range
*
- * @op: TLBI instruction that operates on a range (has 'r' prefix)
+ * @lop: TLBI level operation to perform
+ * @rop: TLBI range operation to perform
* @start: The start address of the range
* @pages: Range as the number of pages from 'start'
* @stride: Flush granularity
* @asid: The ASID of the task (0 for IPA instructions)
- * @tlb_level: Translation Table level hint, if known
+ * @level: Translation Table level hint, if known
* @lpa2: If 'true', the lpa2 scheme is used as set out below
*
* When the CPU does not support TLB range operations, flush the TLB
@@ -501,36 +502,44 @@ static __always_inline void __tlbi_range(tlbi_op op, u64 addr,
op(arg);
}
-#define __flush_tlb_range_op(op, start, pages, stride, \
- asid, tlb_level, lpa2) \
-do { \
- typeof(start) __flush_start = start; \
- typeof(pages) __flush_pages = pages; \
- int num = 0; \
- int scale = 3; \
- \
- while (__flush_pages > 0) { \
- if (!system_supports_tlb_range() || \
- __flush_pages == 1 || \
- (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) { \
- __tlbi_level_asid(op, __flush_start, tlb_level, asid); \
- __flush_start += stride; \
- __flush_pages -= stride >> PAGE_SHIFT; \
- continue; \
- } \
- \
- num = __TLBI_RANGE_NUM(__flush_pages, scale); \
- if (num >= 0) { \
- __tlbi_range(r##op, __flush_start, asid, scale, num, tlb_level, lpa2); \
- __flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
- __flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
- } \
- scale--; \
- } \
-} while (0)
+static __always_inline void __flush_tlb_range_op(tlbi_op lop, tlbi_op rop,
+ u64 start, size_t pages,
+ u64 stride, u16 asid,
+ u32 level, bool lpa2)
+{
+ u64 addr = start, end = start + pages * PAGE_SIZE;
+ int scale = 3;
+
+ while (addr != end) {
+ int num;
+
+ pages = (end - addr) >> PAGE_SHIFT;
+
+ if (!system_supports_tlb_range() || pages == 1)
+ goto invalidate_one;
+
+ if (lpa2 && !IS_ALIGNED(addr, SZ_64K))
+ goto invalidate_one;
+
+ num = __TLBI_RANGE_NUM(pages, scale);
+ if (num >= 0) {
+ __tlbi_range(rop, addr, asid, scale, num, level, lpa2);
+ addr += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT;
+ }
+
+ scale--;
+ continue;
+invalidate_one:
+ __tlbi_level_asid(lop, addr, level, asid);
+ addr += stride;
+ }
+}
+
+#define __flush_s1_tlb_range_op(op, start, pages, stride, asid, tlb_level) \
+ __flush_tlb_range_op(op, r##op, start, pages, stride, asid, tlb_level, lpa2_is_enabled())
#define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
- __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled());
+ __flush_tlb_range_op(op, r##op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled())
static inline bool __flush_tlb_range_limit_excess(unsigned long start,
unsigned long end, unsigned long pages, unsigned long stride)
@@ -569,11 +578,11 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
asid = ASID(mm);
if (last_level)
- __flush_tlb_range_op(vale1is, start, pages, stride, asid,
- tlb_level, lpa2_is_enabled());
+ __flush_s1_tlb_range_op(vale1is, start, pages, stride,
+ asid, tlb_level);
else
- __flush_tlb_range_op(vae1is, start, pages, stride, asid,
- tlb_level, lpa2_is_enabled());
+ __flush_s1_tlb_range_op(vae1is, start, pages, stride,
+ asid, tlb_level);
mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
}
@@ -597,8 +606,7 @@ static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
dsb(nshst);
asid = ASID(vma->vm_mm);
- __flush_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid,
- 3, lpa2_is_enabled());
+ __flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
addr + CONT_PTE_SIZE);
dsb(nsh);
@@ -631,8 +639,8 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
}
dsb(ishst);
- __flush_tlb_range_op(vaale1is, start, pages, stride, 0,
- TLBI_TTL_UNKNOWN, lpa2_is_enabled());
+ __flush_s1_tlb_range_op(vaale1is, start, pages, stride, 0,
+ TLBI_TTL_UNKNOWN);
__tlbi_sync_s1ish();
isb();
}
--
2.43.0
* [PATCH v3 07/13] arm64: mm: Simplify __TLBI_RANGE_NUM() macro
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel
From: Will Deacon <will@kernel.org>
Since commit e2768b798a19 ("arm64/mm: Modify range-based tlbi to
decrement scale"), we don't need to clamp the 'pages' argument to fit
the range for the specified 'scale' as we know that the upper bits will
have been processed in a prior iteration.
Drop the clamping and simplify the __TLBI_RANGE_NUM() macro.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 3c05afdbe3a69..fb7e541cfdfd9 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -199,11 +199,7 @@ static inline void __tlbi_level(tlbi_op op, u64 addr, u32 level)
* range.
*/
#define __TLBI_RANGE_NUM(pages, scale) \
- ({ \
- int __pages = min((pages), \
- __TLBI_RANGE_PAGES(31, (scale))); \
- (__pages >> (5 * (scale) + 1)) - 1; \
- })
+ (((pages) >> (5 * (scale) + 1)) - 1)
#define __repeat_tlbi_sync(op, arg...) \
do { \
--
2.43.0
* [PATCH v3 08/13] arm64: mm: Simplify __flush_tlb_range_limit_excess()
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel
From: Will Deacon <will@kernel.org>
__flush_tlb_range_limit_excess() is unnecessarily complicated:
- It takes a 'start', 'end' and 'pages' argument, whereas it only
needs 'pages' (which the caller has computed from the other two
arguments!).
- It erroneously compares 'pages' with MAX_TLBI_RANGE_PAGES when
the system doesn't support range-based invalidation but the range to
be invalidated would result in fewer than MAX_DVM_OPS invalidations.
Simplify the function so that it no longer takes the 'start' and 'end'
arguments and only considers the MAX_TLBI_RANGE_PAGES threshold on
systems that implement range-based invalidation.
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index fb7e541cfdfd9..fd86647db24bb 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -537,21 +537,19 @@ static __always_inline void __flush_tlb_range_op(tlbi_op lop, tlbi_op rop,
#define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
__flush_tlb_range_op(op, r##op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled())
-static inline bool __flush_tlb_range_limit_excess(unsigned long start,
- unsigned long end, unsigned long pages, unsigned long stride)
+static inline bool __flush_tlb_range_limit_excess(unsigned long pages,
+ unsigned long stride)
{
/*
- * When the system does not support TLB range based flush
- * operation, (MAX_DVM_OPS - 1) pages can be handled. But
- * with TLB range based operation, MAX_TLBI_RANGE_PAGES
- * pages can be handled.
+ * Assume that the worst case number of DVM ops required to flush a
+ * given range on a system that supports tlb-range is 20 (4 scales, 1
+ * final page, 15 for alignment on LPA2 systems), which is much smaller
+ * than MAX_DVM_OPS.
*/
- if ((!system_supports_tlb_range() &&
- (end - start) >= (MAX_DVM_OPS * stride)) ||
- pages > MAX_TLBI_RANGE_PAGES)
- return true;
+ if (system_supports_tlb_range())
+ return pages > MAX_TLBI_RANGE_PAGES;
- return false;
+ return pages >= (MAX_DVM_OPS * stride) >> PAGE_SHIFT;
}
static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
@@ -565,7 +563,7 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
end = round_up(end, stride);
pages = (end - start) >> PAGE_SHIFT;
- if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
+ if (__flush_tlb_range_limit_excess(pages, stride)) {
flush_tlb_mm(mm);
return;
}
@@ -629,7 +627,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
end = round_up(end, stride);
pages = (end - start) >> PAGE_SHIFT;
- if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
+ if (__flush_tlb_range_limit_excess(pages, stride)) {
flush_tlb_all();
return;
}
--
2.43.0
* [PATCH v3 09/13] arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid()
2026-03-02 13:55 [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
` (7 preceding siblings ...)
2026-03-02 13:55 ` [PATCH v3 08/13] arm64: mm: Simplify __flush_tlb_range_limit_excess() Ryan Roberts
@ 2026-03-02 13:55 ` Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 10/13] arm64: mm: Refactor __flush_tlb_range() to take flags Ryan Roberts
` (4 subsequent siblings)
13 siblings, 0 replies; 22+ messages in thread
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel, Linu Cherian
Now that we have __tlbi_level_asid(), let's refactor the
*flush_tlb_page*() variants to use it rather than open coding.
The emitted tlbi(s) are intended to be exactly the same as before; no
TTL hint is provided. Although the spec for flush_tlb_page() allows for
setting the TTL hint to 3, it turns out that
flush_tlb_fix_spurious_fault_pmd() depends on
local_flush_tlb_page_nonotify() to invalidate the level 2 entry. This
will be fixed separately.
Reviewed-by: Linu Cherian <linu.cherian@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index fd86647db24bb..0a49a25a4fdc8 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -346,12 +346,8 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
static inline void __local_flush_tlb_page_nonotify_nosync(struct mm_struct *mm,
unsigned long uaddr)
{
- unsigned long addr;
-
dsb(nshst);
- addr = __TLBI_VADDR(uaddr, ASID(mm));
- __tlbi(vale1, addr);
- __tlbi_user(vale1, addr);
+ __tlbi_level_asid(vale1, uaddr, TLBI_TTL_UNKNOWN, ASID(mm));
}
static inline void local_flush_tlb_page_nonotify(struct vm_area_struct *vma,
@@ -373,12 +369,8 @@ static inline void local_flush_tlb_page(struct vm_area_struct *vma,
static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
unsigned long uaddr)
{
- unsigned long addr;
-
dsb(ishst);
- addr = __TLBI_VADDR(uaddr, ASID(mm));
- __tlbi(vale1is, addr);
- __tlbi_user(vale1is, addr);
+ __tlbi_level_asid(vale1is, uaddr, TLBI_TTL_UNKNOWN, ASID(mm));
mmu_notifier_arch_invalidate_secondary_tlbs(mm, uaddr & PAGE_MASK,
(uaddr & PAGE_MASK) + PAGE_SIZE);
}
--
2.43.0
* [PATCH v3 10/13] arm64: mm: Refactor __flush_tlb_range() to take flags
2026-03-02 13:55 [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
` (8 preceding siblings ...)
2026-03-02 13:55 ` [PATCH v3 09/13] arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid() Ryan Roberts
@ 2026-03-02 13:55 ` Ryan Roberts
2026-03-02 13:55 ` [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range() Ryan Roberts
` (3 subsequent siblings)
13 siblings, 0 replies; 22+ messages in thread
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel, Linu Cherian
We have function variants with "_nosync", "_local", "_nonotify" as well
as the "last_level" parameter. Let's generalize and simplify by using a
flags parameter to encode all these variants.
As a first step, convert the "last_level" boolean parameter to a flags
parameter and create the first flag, TLBF_NOWALKCACHE. When present,
walk cache entries are not evicted, which is the same as the old
last_level=true.
Reviewed-by: Linu Cherian <linu.cherian@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/hugetlb.h | 12 ++++++------
arch/arm64/include/asm/pgtable.h | 4 ++--
arch/arm64/include/asm/tlb.h | 6 +++---
arch/arm64/include/asm/tlbflush.h | 32 +++++++++++++++++++------------
arch/arm64/mm/contpte.c | 5 +++--
arch/arm64/mm/hugetlbpage.c | 4 ++--
arch/arm64/mm/mmu.c | 2 +-
7 files changed, 37 insertions(+), 28 deletions(-)
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index e6f8ff3cc6306..d038ff14d16ca 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -71,23 +71,23 @@ static inline void __flush_hugetlb_tlb_range(struct vm_area_struct *vma,
unsigned long start,
unsigned long end,
unsigned long stride,
- bool last_level)
+ tlbf_t flags)
{
switch (stride) {
#ifndef __PAGETABLE_PMD_FOLDED
case PUD_SIZE:
- __flush_tlb_range(vma, start, end, PUD_SIZE, last_level, 1);
+ __flush_tlb_range(vma, start, end, PUD_SIZE, 1, flags);
break;
#endif
case CONT_PMD_SIZE:
case PMD_SIZE:
- __flush_tlb_range(vma, start, end, PMD_SIZE, last_level, 2);
+ __flush_tlb_range(vma, start, end, PMD_SIZE, 2, flags);
break;
case CONT_PTE_SIZE:
- __flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, 3);
+ __flush_tlb_range(vma, start, end, PAGE_SIZE, 3, flags);
break;
default:
- __flush_tlb_range(vma, start, end, PAGE_SIZE, last_level, TLBI_TTL_UNKNOWN);
+ __flush_tlb_range(vma, start, end, PAGE_SIZE, TLBI_TTL_UNKNOWN, flags);
}
}
@@ -98,7 +98,7 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
{
unsigned long stride = huge_page_size(hstate_vma(vma));
- __flush_hugetlb_tlb_range(vma, start, end, stride, false);
+ __flush_hugetlb_tlb_range(vma, start, end, stride, TLBF_NONE);
}
#endif /* __ASM_HUGETLB_H */
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b3e58735c49bd..88bb9275ac898 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -89,9 +89,9 @@ static inline void arch_leave_lazy_mmu_mode(void)
/* Set stride and tlb_level in flush_*_tlb_range */
#define flush_pmd_tlb_range(vma, addr, end) \
- __flush_tlb_range(vma, addr, end, PMD_SIZE, false, 2)
+ __flush_tlb_range(vma, addr, end, PMD_SIZE, 2, TLBF_NONE)
#define flush_pud_tlb_range(vma, addr, end) \
- __flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
+ __flush_tlb_range(vma, addr, end, PUD_SIZE, 1, TLBF_NONE)
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
/*
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index 8d762607285cc..10869d7731b83 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -53,7 +53,7 @@ static inline int tlb_get_level(struct mmu_gather *tlb)
static inline void tlb_flush(struct mmu_gather *tlb)
{
struct vm_area_struct vma = TLB_FLUSH_VMA(tlb->mm, 0);
- bool last_level = !tlb->freed_tables;
+ tlbf_t flags = tlb->freed_tables ? TLBF_NONE : TLBF_NOWALKCACHE;
unsigned long stride = tlb_get_unmap_size(tlb);
int tlb_level = tlb_get_level(tlb);
@@ -63,13 +63,13 @@ static inline void tlb_flush(struct mmu_gather *tlb)
* reallocate our ASID without invalidating the entire TLB.
*/
if (tlb->fullmm) {
- if (!last_level)
+ if (tlb->freed_tables)
flush_tlb_mm(tlb->mm);
return;
}
__flush_tlb_range(&vma, tlb->start, tlb->end, stride,
- last_level, tlb_level);
+ tlb_level, flags);
}
static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 0a49a25a4fdc8..d134824ea5daa 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -286,16 +286,16 @@ static inline void __tlbi_sync_s1ish_hyp(void)
* CPUs, ensuring that any walk-cache entries associated with the
* translation are also invalidated.
*
- * __flush_tlb_range(vma, start, end, stride, last_level, tlb_level)
+ * __flush_tlb_range(vma, start, end, stride, tlb_level, flags)
* Invalidate the virtual-address range '[start, end)' on all
* CPUs for the user address space corresponding to 'vma->mm'.
* The invalidation operations are issued at a granularity
- * determined by 'stride' and only affect any walk-cache entries
- * if 'last_level' is equal to false. tlb_level is the level at
+ * determined by 'stride'. tlb_level is the level at
* which the invalidation must take place. If the level is wrong,
* no invalidation may take place. In the case where the level
* cannot be easily determined, the value TLBI_TTL_UNKNOWN will
- * perform a non-hinted invalidation.
+ * perform a non-hinted invalidation. flags may be TLBF_NONE (0) or
+ * TLBF_NOWALKCACHE (elide eviction of walk cache entries).
*
* local_flush_tlb_page(vma, addr)
* Local variant of flush_tlb_page(). Stale TLB entries may
@@ -544,10 +544,18 @@ static inline bool __flush_tlb_range_limit_excess(unsigned long pages,
return pages >= (MAX_DVM_OPS * stride) >> PAGE_SHIFT;
}
+typedef unsigned __bitwise tlbf_t;
+
+/* No special behaviour. */
+#define TLBF_NONE ((__force tlbf_t)0)
+
+/* Invalidate tlb entries only, leaving the page table walk cache intact. */
+#define TLBF_NOWALKCACHE ((__force tlbf_t)BIT(0))
+
static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
unsigned long start, unsigned long end,
- unsigned long stride, bool last_level,
- int tlb_level)
+ unsigned long stride, int tlb_level,
+ tlbf_t flags)
{
unsigned long asid, pages;
@@ -563,7 +571,7 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
dsb(ishst);
asid = ASID(mm);
- if (last_level)
+ if (flags & TLBF_NOWALKCACHE)
__flush_s1_tlb_range_op(vale1is, start, pages, stride,
asid, tlb_level);
else
@@ -575,11 +583,11 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
static inline void __flush_tlb_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end,
- unsigned long stride, bool last_level,
- int tlb_level)
+ unsigned long stride, int tlb_level,
+ tlbf_t flags)
{
__flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
- last_level, tlb_level);
+ tlb_level, flags);
__tlbi_sync_s1ish();
}
@@ -607,7 +615,7 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
* Set the tlb_level to TLBI_TTL_UNKNOWN because we can not get enough
* information here.
*/
- __flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
+ __flush_tlb_range(vma, start, end, PAGE_SIZE, TLBI_TTL_UNKNOWN, TLBF_NONE);
}
static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
@@ -648,7 +656,7 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
struct mm_struct *mm, unsigned long start, unsigned long end)
{
- __flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, true, 3);
+ __flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, 3, TLBF_NOWALKCACHE);
}
static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index b929a455103f8..681f22fac52a1 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -225,7 +225,8 @@ static void contpte_convert(struct mm_struct *mm, unsigned long addr,
*/
if (!system_supports_bbml2_noabort())
- __flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);
+ __flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, 3,
+ TLBF_NOWALKCACHE);
__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);
}
@@ -552,7 +553,7 @@ int contpte_clear_flush_young_ptes(struct vm_area_struct *vma,
* eliding the trailing DSB applies here.
*/
__flush_tlb_range_nosync(vma->vm_mm, addr, end,
- PAGE_SIZE, true, 3);
+ PAGE_SIZE, 3, TLBF_NOWALKCACHE);
}
return young;
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index a42c05cf56408..0b7ccd0cbb9ec 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -181,7 +181,7 @@ static pte_t get_clear_contig_flush(struct mm_struct *mm,
struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
unsigned long end = addr + (pgsize * ncontig);
- __flush_hugetlb_tlb_range(&vma, addr, end, pgsize, true);
+ __flush_hugetlb_tlb_range(&vma, addr, end, pgsize, TLBF_NOWALKCACHE);
return orig_pte;
}
@@ -209,7 +209,7 @@ static void clear_flush(struct mm_struct *mm,
if (mm == &init_mm)
flush_tlb_kernel_range(saddr, addr);
else
- __flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, true);
+ __flush_hugetlb_tlb_range(&vma, saddr, addr, pgsize, TLBF_NOWALKCACHE);
}
void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index a6a00accf4f93..054df431846fd 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -2149,7 +2149,7 @@ pte_t modify_prot_start_ptes(struct vm_area_struct *vma, unsigned long addr,
*/
if (pte_accessible(vma->vm_mm, pte) && pte_user_exec(pte))
__flush_tlb_range(vma, addr, nr * PAGE_SIZE,
- PAGE_SIZE, true, 3);
+ PAGE_SIZE, 3, TLBF_NOWALKCACHE);
}
return pte;
--
2.43.0
* [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()
2026-03-02 13:55 [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
` (9 preceding siblings ...)
2026-03-02 13:55 ` [PATCH v3 10/13] arm64: mm: Refactor __flush_tlb_range() to take flags Ryan Roberts
@ 2026-03-02 13:55 ` Ryan Roberts
2026-03-03 9:57 ` Jonathan Cameron
2026-03-02 13:55 ` [PATCH v3 12/13] arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range() Ryan Roberts
` (2 subsequent siblings)
13 siblings, 1 reply; 22+ messages in thread
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel
Refactor function variants with "_nosync", "_local" and "_nonotify" into
a single __always_inline implementation that takes flags and rely on
constant folding to select the parts that are actually needed at any
given callsite, based on the provided flags.
Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
to provide the strongest semantics (i.e. evict from walk cache,
broadcast, synchronise and notify). Each flag reduces the strength in
some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
complement the existing TLBF_NOWALKCACHE.
There are no users that require TLBF_NOBROADCAST without
TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
introduce dead code for vae1 invalidations.
The result is a clearer, simpler, more powerful API.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 95 ++++++++++++++++++-------------
arch/arm64/mm/contpte.c | 9 ++-
2 files changed, 62 insertions(+), 42 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index d134824ea5daa..5509927e45b93 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -295,7 +295,10 @@ static inline void __tlbi_sync_s1ish_hyp(void)
* no invalidation may take place. In the case where the level
* cannot be easily determined, the value TLBI_TTL_UNKNOWN will
* perform a non-hinted invalidation. flags may be TLBF_NONE (0) or
- * TLBF_NOWALKCACHE (elide eviction of walk cache entries).
+ * any combination of TLBF_NOWALKCACHE (elide eviction of walk
+ * cache entries), TLBF_NONOTIFY (don't call mmu notifiers),
+ * TLBF_NOSYNC (don't issue trailing dsb) and TLBF_NOBROADCAST
+ * (only perform the invalidation for the local cpu).
*
* local_flush_tlb_page(vma, addr)
* Local variant of flush_tlb_page(). Stale TLB entries may
@@ -305,12 +308,6 @@ static inline void __tlbi_sync_s1ish_hyp(void)
* Same as local_flush_tlb_page() except MMU notifier will not be
* called.
*
- * local_flush_tlb_contpte(vma, addr)
- * Invalidate the virtual-address range
- * '[addr, addr+CONT_PTE_SIZE)' mapped with contpte on local CPU
- * for the user address space corresponding to 'vma->mm'. Stale
- * TLB entries may remain in remote CPUs.
- *
* Finally, take a look at asm/tlb.h to see how tlb_flush() is implemented
* on top of these routines, since that is our interface to the mmu_gather
* API as used by munmap() and friends.
@@ -552,15 +549,23 @@ typedef unsigned __bitwise tlbf_t;
/* Invalidate tlb entries only, leaving the page table walk cache intact. */
#define TLBF_NOWALKCACHE ((__force tlbf_t)BIT(0))
-static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
- unsigned long start, unsigned long end,
- unsigned long stride, int tlb_level,
- tlbf_t flags)
+/* Skip the trailing dsb after issuing tlbi. */
+#define TLBF_NOSYNC ((__force tlbf_t)BIT(1))
+
+/* Suppress tlb notifier callbacks for this flush operation. */
+#define TLBF_NONOTIFY ((__force tlbf_t)BIT(2))
+
+/* Perform the tlbi locally without broadcasting to other CPUs. */
+#define TLBF_NOBROADCAST ((__force tlbf_t)BIT(3))
+
+static __always_inline void __do_flush_tlb_range(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end,
+ unsigned long stride, int tlb_level,
+ tlbf_t flags)
{
+ struct mm_struct *mm = vma->vm_mm;
unsigned long asid, pages;
- start = round_down(start, stride);
- end = round_up(end, stride);
pages = (end - start) >> PAGE_SHIFT;
if (__flush_tlb_range_limit_excess(pages, stride)) {
@@ -568,17 +573,41 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
return;
}
- dsb(ishst);
+ if (!(flags & TLBF_NOBROADCAST))
+ dsb(ishst);
+ else
+ dsb(nshst);
+
asid = ASID(mm);
- if (flags & TLBF_NOWALKCACHE)
- __flush_s1_tlb_range_op(vale1is, start, pages, stride,
- asid, tlb_level);
- else
+ switch (flags & (TLBF_NOWALKCACHE | TLBF_NOBROADCAST)) {
+ case TLBF_NONE:
__flush_s1_tlb_range_op(vae1is, start, pages, stride,
- asid, tlb_level);
+ asid, tlb_level);
+ break;
+ case TLBF_NOWALKCACHE:
+ __flush_s1_tlb_range_op(vale1is, start, pages, stride,
+ asid, tlb_level);
+ break;
+ case TLBF_NOBROADCAST:
+ /* Combination unused */
+ BUG();
+ break;
+ case TLBF_NOWALKCACHE | TLBF_NOBROADCAST:
+ __flush_s1_tlb_range_op(vale1, start, pages, stride,
+ asid, tlb_level);
+ break;
+ }
+
+ if (!(flags & TLBF_NONOTIFY))
+ mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
- mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
+ if (!(flags & TLBF_NOSYNC)) {
+ if (!(flags & TLBF_NOBROADCAST))
+ __tlbi_sync_s1ish();
+ else
+ dsb(nsh);
+ }
}
static inline void __flush_tlb_range(struct vm_area_struct *vma,
@@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
unsigned long stride, int tlb_level,
tlbf_t flags)
{
- __flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
- tlb_level, flags);
- __tlbi_sync_s1ish();
-}
-
-static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
- unsigned long addr)
-{
- unsigned long asid;
-
- addr = round_down(addr, CONT_PTE_SIZE);
-
- dsb(nshst);
- asid = ASID(vma->vm_mm);
- __flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
- mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
- addr + CONT_PTE_SIZE);
- dsb(nsh);
+ start = round_down(start, stride);
+ end = round_up(end, stride);
+ __do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
}
static inline void flush_tlb_range(struct vm_area_struct *vma,
@@ -656,7 +670,10 @@ static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
struct mm_struct *mm, unsigned long start, unsigned long end)
{
- __flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, 3, TLBF_NOWALKCACHE);
+ struct vm_area_struct vma = { .vm_mm = mm, .vm_flags = 0 };
+
+ __flush_tlb_range(&vma, start, end, PAGE_SIZE, 3,
+ TLBF_NOWALKCACHE | TLBF_NOSYNC);
}
static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index 681f22fac52a1..3f1a3e86353de 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -552,8 +552,8 @@ int contpte_clear_flush_young_ptes(struct vm_area_struct *vma,
* See comment in __ptep_clear_flush_young(); same rationale for
* eliding the trailing DSB applies here.
*/
- __flush_tlb_range_nosync(vma->vm_mm, addr, end,
- PAGE_SIZE, 3, TLBF_NOWALKCACHE);
+ __flush_tlb_range(vma, addr, end, PAGE_SIZE, 3,
+ TLBF_NOWALKCACHE | TLBF_NOSYNC);
}
return young;
@@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
__ptep_set_access_flags(vma, addr, ptep, entry, 0);
if (dirty)
- local_flush_tlb_contpte(vma, start_addr);
+ __flush_tlb_range(vma, start_addr,
+ start_addr + CONT_PTE_SIZE,
+ PAGE_SIZE, 3,
+ TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
} else {
__contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
__ptep_set_access_flags(vma, addr, ptep, entry, dirty);
--
2.43.0
* Re: [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()
2026-03-02 13:55 ` [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range() Ryan Roberts
@ 2026-03-03 9:57 ` Jonathan Cameron
2026-03-03 13:54 ` Ryan Roberts
0 siblings, 1 reply; 22+ messages in thread
From: Jonathan Cameron @ 2026-03-03 9:57 UTC (permalink / raw)
To: Ryan Roberts
Cc: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, linux-arm-kernel, linux-kernel
On Mon, 2 Mar 2026 13:55:58 +0000
Ryan Roberts <ryan.roberts@arm.com> wrote:
> Refactor function variants with "_nosync", "_local" and "_nonotify" into
> a single __always_inline implementation that takes flags and rely on
> constant folding to select the parts that are actually needed at any
> given callsite, based on the provided flags.
>
> Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
> to provide the strongest semantics (i.e. evict from walk cache,
> broadcast, synchronise and notify). Each flag reduces the strength in
> some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
> complement the existing TLBF_NOWALKCACHE.
>
> There are no users that require TLBF_NOBROADCAST without
> TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
> introduce dead code for vae1 invalidations.
>
> The result is a clearer, simpler, more powerful API.
Hi Ryan,
There is one subtle change to rounding that should be called out at least.
Might even be worth pulling it to a precursor patch where you can add an
explanation of why original code was rounding to a larger value than was
ever needed.
Jonathan
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> static inline void __flush_tlb_range(struct vm_area_struct *vma,
> @@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> unsigned long stride, int tlb_level,
> tlbf_t flags)
> {
> - __flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
> - tlb_level, flags);
> - __tlbi_sync_s1ish();
> -}
> -
> -static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
> - unsigned long addr)
> -{
> - unsigned long asid;
> -
> - addr = round_down(addr, CONT_PTE_SIZE);
See below.
> -
> - dsb(nshst);
> - asid = ASID(vma->vm_mm);
> - __flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
> - mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
> - addr + CONT_PTE_SIZE);
> - dsb(nsh);
> + start = round_down(start, stride);
See below.
> + end = round_up(end, stride);
> + __do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
> }
>
> static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> index 681f22fac52a1..3f1a3e86353de 100644
> --- a/arch/arm64/mm/contpte.c
> +++ b/arch/arm64/mm/contpte.c
...
> @@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
> __ptep_set_access_flags(vma, addr, ptep, entry, 0);
>
> if (dirty)
> - local_flush_tlb_contpte(vma, start_addr);
> + __flush_tlb_range(vma, start_addr,
> + start_addr + CONT_PTE_SIZE,
> + PAGE_SIZE, 3,
This results in a different stride to round down.
local_flush_tlb_contpte() did
addr = round_down(addr, CONT_PTE_SIZE);
With this call we have
start = round_down(start, stride); where stride is PAGE_SIZE.
I'm too lazy to figure out if that matters.
> + TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
> } else {
> __contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
> __ptep_set_access_flags(vma, addr, ptep, entry, dirty);
* Re: [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()
2026-03-03 9:57 ` Jonathan Cameron
@ 2026-03-03 13:54 ` Ryan Roberts
2026-03-03 17:34 ` Jonathan Cameron
0 siblings, 1 reply; 22+ messages in thread
From: Ryan Roberts @ 2026-03-03 13:54 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, linux-arm-kernel, linux-kernel
On 03/03/2026 09:57, Jonathan Cameron wrote:
> On Mon, 2 Mar 2026 13:55:58 +0000
> Ryan Roberts <ryan.roberts@arm.com> wrote:
>
>> Refactor function variants with "_nosync", "_local" and "_nonotify" into
>> a single __always_inline implementation that takes flags and rely on
>> constant folding to select the parts that are actually needed at any
>> given callsite, based on the provided flags.
>>
>> Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
>> to provide the strongest semantics (i.e. evict from walk cache,
>> broadcast, synchronise and notify). Each flag reduces the strength in
>> some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
>> complement the existing TLBF_NOWALKCACHE.
>>
>> There are no users that require TLBF_NOBROADCAST without
>> TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
>> introduce dead code for vae1 invalidations.
>>
>> The result is a clearer, simpler, more powerful API.
> Hi Ryan,
>
> There is one subtle change to rounding that should be called out at least.
Thanks for the review. I'm confident that there isn't actually a change to the
rounding here, but the responsibility has moved to the caller. See below...
>
> Might even be worth pulling it to a precursor patch where you can add an
> explanation of why original code was rounding to a larger value than was
> ever needed.
>
> Jonathan
>
>
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>
>
>> static inline void __flush_tlb_range(struct vm_area_struct *vma,
>> @@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
>> unsigned long stride, int tlb_level,
>> tlbf_t flags)
>> {
>> - __flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
>> - tlb_level, flags);
>> - __tlbi_sync_s1ish();
>> -}
>> -
>> -static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
>> - unsigned long addr)
>> -{
>> - unsigned long asid;
>> -
>> - addr = round_down(addr, CONT_PTE_SIZE);
> See below.
>> -
>> - dsb(nshst);
>> - asid = ASID(vma->vm_mm);
>> - __flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
>> - mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
>> - addr + CONT_PTE_SIZE);
>> - dsb(nsh);
>> + start = round_down(start, stride);
> See below.
>> + end = round_up(end, stride);
>> + __do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
>> }
>
>>
>> static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
>> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
>> index 681f22fac52a1..3f1a3e86353de 100644
>> --- a/arch/arm64/mm/contpte.c
>> +++ b/arch/arm64/mm/contpte.c
> ...
>
>> @@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
>> __ptep_set_access_flags(vma, addr, ptep, entry, 0);
>>
>> if (dirty)
>> - local_flush_tlb_contpte(vma, start_addr);
>> + __flush_tlb_range(vma, start_addr,
>> + start_addr + CONT_PTE_SIZE,
>> + PAGE_SIZE, 3,
>
> This results in a different stride to round down.
> local_flush_tlb_contpte() did
> addr = round_down(addr, CONT_PTE_SIZE);
>
> With this call we have
> start = round_down(start, stride); where stride is PAGE_SIZE.
>
> I'm too lazy to figure out if that matters.
contpte_ptep_set_access_flags() is operating on a contpte block of ptes, and as
such, start_addr has already been rounded down to the start of the block, which
is always bigger than (and perfectly divisible by) PAGE_SIZE.
Previously, local_flush_tlb_contpte() allowed passing any VA within the
contpte block and the function would automatically round it down to the start of
the block and invalidate the full block.
After the change, we are explicitly passing the already aligned block;
start_addr is already guaranteed to be at the start of the block and "start_addr
+ CONT_PTE_SIZE" is the end.
So in both cases, the rounding down that is done by local_flush_tlb_contpte() /
__flush_tlb_range() doesn't actually change the value.
Thanks,
Ryan
>
>
>> + TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
>> } else {
>> __contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
>> __ptep_set_access_flags(vma, addr, ptep, entry, dirty);
>
* Re: [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()
2026-03-03 13:54 ` Ryan Roberts
@ 2026-03-03 17:34 ` Jonathan Cameron
0 siblings, 0 replies; 22+ messages in thread
From: Jonathan Cameron @ 2026-03-03 17:34 UTC (permalink / raw)
To: Ryan Roberts
Cc: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, linux-arm-kernel, linux-kernel
On Tue, 3 Mar 2026 13:54:33 +0000
Ryan Roberts <ryan.roberts@arm.com> wrote:
> On 03/03/2026 09:57, Jonathan Cameron wrote:
> > On Mon, 2 Mar 2026 13:55:58 +0000
> > Ryan Roberts <ryan.roberts@arm.com> wrote:
> >
> >> Refactor function variants with "_nosync", "_local" and "_nonotify" into
> >> a single __always_inline implementation that takes flags and rely on
> >> constant folding to select the parts that are actually needed at any
> >> given callsite, based on the provided flags.
> >>
> >> Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
> >> to provide the strongest semantics (i.e. evict from walk cache,
> >> broadcast, synchronise and notify). Each flag reduces the strength in
> >> some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
> >> complement the existing TLBF_NOWALKCACHE.
> >>
> >> There are no users that require TLBF_NOBROADCAST without
> >> TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
> >> introduce dead code for vae1 invalidations.
> >>
> >> The result is a clearer, simpler, more powerful API.
> > Hi Ryan,
> >
> > There is one subtle change to rounding that should be called out at least.
>
> Thanks for the review. I'm confident that there isn't actually a change to the
> rounding here, but the responsibility has moved to the caller. See below...
>
> >
> > Might even be worth pulling it to a precursor patch where you can add an
> > explanation of why original code was rounding to a larger value than was
> > ever needed.
> >
> > Jonathan
> >
> >
> >>
> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> >
> >
> >> static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >> @@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >> unsigned long stride, int tlb_level,
> >> tlbf_t flags)
> >> {
> >> - __flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
> >> - tlb_level, flags);
> >> - __tlbi_sync_s1ish();
> >> -}
> >> -
> >> -static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
> >> - unsigned long addr)
> >> -{
> >> - unsigned long asid;
> >> -
> >> - addr = round_down(addr, CONT_PTE_SIZE);
> > See below.
> >> -
> >> - dsb(nshst);
> >> - asid = ASID(vma->vm_mm);
> >> - __flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
> >> - mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
> >> - addr + CONT_PTE_SIZE);
> >> - dsb(nsh);
> >> + start = round_down(start, stride);
> > See below.
> >> + end = round_up(end, stride);
> >> + __do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
> >> }
> >
> >>
> >> static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
> >> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> >> index 681f22fac52a1..3f1a3e86353de 100644
> >> --- a/arch/arm64/mm/contpte.c
> >> +++ b/arch/arm64/mm/contpte.c
> > ...
> >
> >> @@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
> >> __ptep_set_access_flags(vma, addr, ptep, entry, 0);
> >>
> >> if (dirty)
> >> - local_flush_tlb_contpte(vma, start_addr);
> >> + __flush_tlb_range(vma, start_addr,
> >> + start_addr + CONT_PTE_SIZE,
> >> + PAGE_SIZE, 3,
> >
> > This results in a different stride to round down.
> > local_flush_tlb_contpte() did
> > addr = round_down(addr, CONT_PTE_SIZE);
> >
> > With this call we have
> > start = round_down(start, stride); where stride is PAGE_SIZE.
> >
> > I'm too lazy to figure out if that matters.
>
> contpte_ptep_set_access_flags() is operating on a contpte block of ptes, and as
> such, start_addr has already been rounded down to the start of the block, which
> is always bigger than (and perfectly divisible by) PAGE_SIZE.
>
> Previously, local_flush_tlb_contpte() allowed passing any VA within the
> contpte block and the function would automatically round it down to the start of
> the block and invalidate the full block.
>
> After the change, we are explicitly passing the already aligned block;
> start_addr is already guaranteed to be at the start of the block and "start_addr
> + CONT_PTE_SIZE" is the end.
>
> So in both cases, the rounding down that is done by local_flush_tlb_contpte() /
> __flush_tlb_range() doesn't actually change the value.
Ah ok, so the key is that the round-down in local_flush_tlb_contpte() never
did anything in practice, because the only caller is
contpte_ptep_set_access_flags() and that does the align-down a couple of
lines before the call. I should have spent a few seconds looking! :(
Maybe if you are respinning, just throw in a one-line comment on this in the
commit description.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>
> Thanks,
> Ryan
>
>
> >
> >
> >> + TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
> >> } else {
> >> __contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
> >> __ptep_set_access_flags(vma, addr, ptep, entry, dirty);
> >
>
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v3 12/13] arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range()
2026-03-02 13:55 [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
` (10 preceding siblings ...)
2026-03-02 13:55 ` [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range() Ryan Roberts
@ 2026-03-02 13:55 ` Ryan Roberts
2026-03-03 9:59 ` Jonathan Cameron
2026-03-02 13:56 ` [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page() Ryan Roberts
2026-03-13 19:43 ` [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Catalin Marinas
13 siblings, 1 reply; 22+ messages in thread
From: Ryan Roberts @ 2026-03-02 13:55 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel, Linu Cherian
Flushing a page from the tlb is just a special case of flushing a range.
So let's rework flush_tlb_page() so that it simply wraps
__do_flush_tlb_range(). While at it, let's also update the API to take
the same flags that we use when flushing a range. This allows us to
delete all the ugly "_nosync", "_local" and "_nonotify" variants.
Thanks to constant folding, all of the complex looping and tlbi-by-range
options get eliminated so that the generated code for flush_tlb_page()
looks very similar to the previous version.
Reviewed-by: Linu Cherian <linu.cherian@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/pgtable.h | 6 +--
arch/arm64/include/asm/tlbflush.h | 81 ++++++++++---------------------
arch/arm64/mm/fault.c | 2 +-
3 files changed, 29 insertions(+), 60 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 88bb9275ac898..7039931df4622 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -101,10 +101,10 @@ static inline void arch_leave_lazy_mmu_mode(void)
* entries exist.
*/
#define flush_tlb_fix_spurious_fault(vma, address, ptep) \
- local_flush_tlb_page_nonotify(vma, address)
+ __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
- local_flush_tlb_page_nonotify(vma, address)
+ __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
/*
* ZERO_PAGE is a global shared page that is always zero: used
@@ -1320,7 +1320,7 @@ static inline int __ptep_clear_flush_young(struct vm_area_struct *vma,
* context-switch, which provides a DSB to complete the TLB
* invalidation.
*/
- flush_tlb_page_nosync(vma, address);
+ __flush_tlb_page(vma, address, TLBF_NOSYNC);
}
return young;
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 5509927e45b93..5096ec7ab8650 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -269,10 +269,7 @@ static inline void __tlbi_sync_s1ish_hyp(void)
* unmapping pages from vmalloc/io space.
*
* flush_tlb_page(vma, addr)
- * Invalidate a single user mapping for address 'addr' in the
- * address space corresponding to 'vma->mm'. Note that this
- * operation only invalidates a single, last-level page-table
- * entry and therefore does not affect any walk-caches.
+ * Equivalent to __flush_tlb_page(..., flags=TLBF_NONE)
*
*
* Next, we have some undocumented invalidation routines that you probably
@@ -300,13 +297,14 @@ static inline void __tlbi_sync_s1ish_hyp(void)
* TLBF_NOSYNC (don't issue trailing dsb) and TLBF_NOBROADCAST
* (only perform the invalidation for the local cpu).
*
- * local_flush_tlb_page(vma, addr)
- * Local variant of flush_tlb_page(). Stale TLB entries may
- * remain in remote CPUs.
- *
- * local_flush_tlb_page_nonotify(vma, addr)
- * Same as local_flush_tlb_page() except MMU notifier will not be
- * called.
+ * __flush_tlb_page(vma, addr, flags)
+ * Invalidate a single user mapping for address 'addr' in the
+ * address space corresponding to 'vma->mm'. Note that this
+ * operation only invalidates a single, last-level page-table entry
+ * and therefore does not affect any walk-caches. flags may contain
+ * any combination of TLBF_NONOTIFY (don't call mmu notifiers),
+ * TLBF_NOSYNC (don't issue trailing dsb) and TLBF_NOBROADCAST
+ * (only perform the invalidation for the local cpu).
*
* Finally, take a look at asm/tlb.h to see how tlb_flush() is implemented
* on top of these routines, since that is our interface to the mmu_gather
@@ -340,51 +338,6 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
}
-static inline void __local_flush_tlb_page_nonotify_nosync(struct mm_struct *mm,
- unsigned long uaddr)
-{
- dsb(nshst);
- __tlbi_level_asid(vale1, uaddr, TLBI_TTL_UNKNOWN, ASID(mm));
-}
-
-static inline void local_flush_tlb_page_nonotify(struct vm_area_struct *vma,
- unsigned long uaddr)
-{
- __local_flush_tlb_page_nonotify_nosync(vma->vm_mm, uaddr);
- dsb(nsh);
-}
-
-static inline void local_flush_tlb_page(struct vm_area_struct *vma,
- unsigned long uaddr)
-{
- __local_flush_tlb_page_nonotify_nosync(vma->vm_mm, uaddr);
- mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, uaddr & PAGE_MASK,
- (uaddr & PAGE_MASK) + PAGE_SIZE);
- dsb(nsh);
-}
-
-static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
- unsigned long uaddr)
-{
- dsb(ishst);
- __tlbi_level_asid(vale1is, uaddr, TLBI_TTL_UNKNOWN, ASID(mm));
- mmu_notifier_arch_invalidate_secondary_tlbs(mm, uaddr & PAGE_MASK,
- (uaddr & PAGE_MASK) + PAGE_SIZE);
-}
-
-static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
- unsigned long uaddr)
-{
- return __flush_tlb_page_nosync(vma->vm_mm, uaddr);
-}
-
-static inline void flush_tlb_page(struct vm_area_struct *vma,
- unsigned long uaddr)
-{
- flush_tlb_page_nosync(vma, uaddr);
- __tlbi_sync_s1ish();
-}
-
static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
{
return true;
@@ -632,6 +585,22 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
__flush_tlb_range(vma, start, end, PAGE_SIZE, TLBI_TTL_UNKNOWN, TLBF_NONE);
}
+static inline void __flush_tlb_page(struct vm_area_struct *vma,
+ unsigned long uaddr, tlbf_t flags)
+{
+ unsigned long start = round_down(uaddr, PAGE_SIZE);
+ unsigned long end = start + PAGE_SIZE;
+
+ __do_flush_tlb_range(vma, start, end, PAGE_SIZE, TLBI_TTL_UNKNOWN,
+ TLBF_NOWALKCACHE | flags);
+}
+
+static inline void flush_tlb_page(struct vm_area_struct *vma,
+ unsigned long uaddr)
+{
+ __flush_tlb_page(vma, uaddr, TLBF_NONE);
+}
+
static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
const unsigned long stride = PAGE_SIZE;
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index be9dab2c7d6a8..f91aa686f1428 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -239,7 +239,7 @@ int __ptep_set_access_flags(struct vm_area_struct *vma,
* flush_tlb_fix_spurious_fault().
*/
if (dirty)
- local_flush_tlb_page(vma, address);
+ __flush_tlb_page(vma, address, TLBF_NOBROADCAST);
return 1;
}
--
2.43.0
^ permalink raw reply related	[flat|nested] 22+ messages in thread
* Re: [PATCH v3 12/13] arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range()
2026-03-02 13:55 ` [PATCH v3 12/13] arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range() Ryan Roberts
@ 2026-03-03 9:59 ` Jonathan Cameron
0 siblings, 0 replies; 22+ messages in thread
From: Jonathan Cameron @ 2026-03-03 9:59 UTC (permalink / raw)
To: Ryan Roberts
Cc: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, linux-arm-kernel, linux-kernel
On Mon, 2 Mar 2026 13:55:59 +0000
Ryan Roberts <ryan.roberts@arm.com> wrote:
> Flushing a page from the tlb is just a special case of flushing a range.
> So let's rework flush_tlb_page() so that it simply wraps
> __do_flush_tlb_range(). While at it, let's also update the API to take
> the same flags that we use when flushing a range. This allows us to
> delete all the ugly "_nosync", "_local" and "_nonotify" variants.
>
> Thanks to constant folding, all of the complex looping and tlbi-by-range
> options get eliminated so that the generated code for flush_tlb_page()
> looks very similar to the previous version.
>
> Reviewed-by: Linu Cherian <linu.cherian@arm.com>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page()
2026-03-02 13:55 [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
` (11 preceding siblings ...)
2026-03-02 13:55 ` [PATCH v3 12/13] arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range() Ryan Roberts
@ 2026-03-02 13:56 ` Ryan Roberts
2026-03-02 14:42 ` Mark Rutland
2026-03-13 19:43 ` [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Catalin Marinas
13 siblings, 1 reply; 22+ messages in thread
From: Ryan Roberts @ 2026-03-02 13:56 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
Linus Torvalds, Oliver Upton, Marc Zyngier, Dev Jain,
Linu Cherian, Jonathan Cameron
Cc: Ryan Roberts, linux-arm-kernel, linux-kernel, Linu Cherian
Previously tlb invalidations issued by __flush_tlb_page() did not
contain a level hint. But the function is clearly only ever targeting
level 3 tlb entries and its documentation agrees:
| this operation only invalidates a single, last-level page-table
| entry and therefore does not affect any walk-caches
However, it turns out that the function was actually being used to
invalidate a level 2 mapping via flush_tlb_fix_spurious_fault_pmd(). The
bug was benign: because the level hint was not set, the HW would still
invalidate the PMD mapping, and because the TLBF_NONOTIFY flag was set,
the bounds of the mapping were never used for anything else.
Now that we have the new and improved range-invalidation API, it is
trivial to fix flush_tlb_fix_spurious_fault_pmd() to explicitly flush the
whole range (locally, without notification and last level only). So
let's do that, and then update __flush_tlb_page() to hint level 3.
Reviewed-by: Linu Cherian <linu.cherian@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/pgtable.h | 5 +++--
arch/arm64/include/asm/tlbflush.h | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 7039931df4622..b1a96a8f2b17e 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -103,8 +103,9 @@ static inline void arch_leave_lazy_mmu_mode(void)
#define flush_tlb_fix_spurious_fault(vma, address, ptep) \
__flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
-#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
- __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
+#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
+ __flush_tlb_range(vma, address, address + PMD_SIZE, PMD_SIZE, 2, \
+ TLBF_NOBROADCAST | TLBF_NONOTIFY | TLBF_NOWALKCACHE)
/*
* ZERO_PAGE is a global shared page that is always zero: used
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 5096ec7ab8650..958fe97b744e5 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -591,7 +591,7 @@ static inline void __flush_tlb_page(struct vm_area_struct *vma,
unsigned long start = round_down(uaddr, PAGE_SIZE);
unsigned long end = start + PAGE_SIZE;
- __do_flush_tlb_range(vma, start, end, PAGE_SIZE, TLBI_TTL_UNKNOWN,
+ __do_flush_tlb_range(vma, start, end, PAGE_SIZE, 3,
TLBF_NOWALKCACHE | flags);
}
--
2.43.0
^ permalink raw reply related	[flat|nested] 22+ messages in thread
* Re: [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page()
2026-03-02 13:56 ` [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page() Ryan Roberts
@ 2026-03-02 14:42 ` Mark Rutland
2026-03-02 17:39 ` Ryan Roberts
0 siblings, 1 reply; 22+ messages in thread
From: Mark Rutland @ 2026-03-02 14:42 UTC (permalink / raw)
To: Ryan Roberts
Cc: Will Deacon, Ard Biesheuvel, Catalin Marinas, Linus Torvalds,
Oliver Upton, Marc Zyngier, Dev Jain, Linu Cherian,
Jonathan Cameron, linux-arm-kernel, linux-kernel
Hi Ryan,
On Mon, Mar 02, 2026 at 01:56:00PM +0000, Ryan Roberts wrote:
> Previously tlb invalidations issued by __flush_tlb_page() did not
> contain a level hint. But the function is clearly only ever targeting
> level 3 tlb entries and its documentation agrees:
>
> | this operation only invalidates a single, last-level page-table
> | entry and therefore does not affect any walk-caches
FWIW, I'd have read "last-level" as synonymous with "leaf" (i.e. a Page
or Block entry, which is the last level of walk) rather than level 3
specifically. The architecture uses the term to match the former (e.g.
in the description of TLBI VALE1IS).
If we're tightening up __flush_tlb_page(), I think it'd be worth either
updating the comment to explicitly note that this only applies to level
3 entries, OR update the comment+name to say it applies to leaf entries,
and have it take a level parameter.
> However, it turns out that the function was actually being used to
> invalidate a level 2 mapping via flush_tlb_fix_spurious_fault_pmd().
> The bug was benign because the level hint was not set so the HW would
> still invalidate the PMD mapping, and also because the TLBF_NONOTIFY
> flag was set, the bounds of the mapping were never used for anything
> else.
I suspect (as above) that the current usage was intentional, legitimate
usage, just poorly documented.
> Now that we have the new and improved range-invalidation API, it is
> trival [sic: trivial] to fix flush_tlb_fix_spurious_fault_pmd() to explicitly flush the
> whole range (locally, without notification and last level only). So
> let's do that, and then update __flush_tlb_page() to hint level 3.
Do we never use __flush_tlb_page() to manipulate a level 1 block
mapping? I'd have expected we did the same lazy invalidation for
permission relaxation there, but if that's not the case, then this seems
fine in principle.
> Reviewed-by: Linu Cherian <linu.cherian@arm.com>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> arch/arm64/include/asm/pgtable.h | 5 +++--
> arch/arm64/include/asm/tlbflush.h | 2 +-
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 7039931df4622..b1a96a8f2b17e 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -103,8 +103,9 @@ static inline void arch_leave_lazy_mmu_mode(void)
> #define flush_tlb_fix_spurious_fault(vma, address, ptep) \
> __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
>
> -#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
> - __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
> +#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
> + __flush_tlb_range(vma, address, address + PMD_SIZE, PMD_SIZE, 2, \
> + TLBF_NOBROADCAST | TLBF_NONOTIFY | TLBF_NOWALKCACHE)
Is there a reason to keep __flush_tlb_page(), rather than defining
flush_tlb_fix_spurious_fault() in terms of __flush_tlb_range() with all
the level 3 constants?
Mark.
>
> /*
> * ZERO_PAGE is a global shared page that is always zero: used
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 5096ec7ab8650..958fe97b744e5 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -591,7 +591,7 @@ static inline void __flush_tlb_page(struct vm_area_struct *vma,
> unsigned long start = round_down(uaddr, PAGE_SIZE);
> unsigned long end = start + PAGE_SIZE;
>
> - __do_flush_tlb_range(vma, start, end, PAGE_SIZE, TLBI_TTL_UNKNOWN,
> + __do_flush_tlb_range(vma, start, end, PAGE_SIZE, 3,
> TLBF_NOWALKCACHE | flags);
> }
>
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page()
2026-03-02 14:42 ` Mark Rutland
@ 2026-03-02 17:39 ` Ryan Roberts
2026-03-02 17:56 ` Mark Rutland
0 siblings, 1 reply; 22+ messages in thread
From: Ryan Roberts @ 2026-03-02 17:39 UTC (permalink / raw)
To: Mark Rutland
Cc: Will Deacon, Ard Biesheuvel, Catalin Marinas, Linus Torvalds,
Oliver Upton, Marc Zyngier, Dev Jain, Linu Cherian,
Jonathan Cameron, linux-arm-kernel, linux-kernel
On 02/03/2026 14:42, Mark Rutland wrote:
> Hi Ryan,
>
> On Mon, Mar 02, 2026 at 01:56:00PM +0000, Ryan Roberts wrote:
>> Previously tlb invalidations issued by __flush_tlb_page() did not
>> contain a level hint. But the function is clearly only ever targeting
>> level 3 tlb entries and its documentation agrees:
>>
>> | this operation only invalidates a single, last-level page-table
>> | entry and therefore does not affect any walk-caches
>
> FWIW, I'd have read "last-level" as synonymous with "leaf" (i.e. a Page
> or Block entry, which is the last level of walk) rather than level 3
> specifically. The architecture uses the term to match the former (e.g.
> in the description of TLBI VALE1IS).
Hmm yeah, now that I'm re-reading, I agree that the quoted documentation doesn't
say anything about it being level-3 specific.
But actually that was arm64-specific documentation for flush_tlb_page(), which
is a core-mm function. The generic docs at Documentation/core-api/cachetlb.rst
make it clear that it's intended only for PTE invalidations, I think:
| 4) ``void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)``
|
| This time we need to remove the PAGE_SIZE sized translation
| from the TLB. ...
>
> If we're tightening up __flush_tlb_page(), I think it'd be worth either
> updating the comment to explicitly note that this only applies to level
> 3 entries, OR update the comment+name to say it applies to leaf entries,
> and have it take a level parameter.
I'll fix the arm64-specific docs to align with the generic docs and replace
"last-level" with "level 3" if that works for you.
>
>> However, it turns out that the function was actually being used to
>> invalidate a level 2 mapping via flush_tlb_fix_spurious_fault_pmd().
>> The bug was benign because the level hint was not set so the HW would
>> still invalidate the PMD mapping, and also because the TLBF_NONOTIFY
>> flag was set, the bounds of the mapping were never used for anything
>> else.
>
> I suspect (as above) that the current usage was intentional, legitimate
> usage, just poorly documented.
Before this series flush_tlb_fix_spurious_fault_pmd() was implemented using
local_flush_tlb_page_nonotify() which never even gives an option to set the TTL
hint, so I agree.
But I don't think flush_tlb_fix_spurious_fault_pmd() should be calling any tlb
flush api that has "page" in the name since that implies PTE, not PMD.
I think what I have done is an improvement, but I'm happy to soften/correct this
description in the next version.
>
>> Now that we have the new and improved range-invalidation API, it is
>> trivial to fix flush_tlb_fix_spurious_fault_pmd() to explicitly flush the
>> whole range (locally, without notification and last level only). So
>> let's do that, and then update __flush_tlb_page() to hint level 3.
>
> Do we never use __flush_tlb_page() to manipulate a level 1 block
> mapping? I'd have expected we did the same lazy invalidation for
> permission relazation there, but if that's not the case, then this seems
> fine in principle.
No, there is no flush_tlb_fix_spurious_fault_pud().
(flush_tlb_fix_spurious_fault_pmd() was only added last cycle).
>
>> Reviewed-by: Linu Cherian <linu.cherian@arm.com>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>> arch/arm64/include/asm/pgtable.h | 5 +++--
>> arch/arm64/include/asm/tlbflush.h | 2 +-
>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index 7039931df4622..b1a96a8f2b17e 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -103,8 +103,9 @@ static inline void arch_leave_lazy_mmu_mode(void)
>> #define flush_tlb_fix_spurious_fault(vma, address, ptep) \
>> __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
>>
>> -#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
>> - __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
>> +#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
>> + __flush_tlb_range(vma, address, address + PMD_SIZE, PMD_SIZE, 2, \
>> + TLBF_NOBROADCAST | TLBF_NONOTIFY | TLBF_NOWALKCACHE)
>
> Is there a reason to keep __flush_tlb_page(), rather than defining
> flush_tlb_fix_spurious_fault() in terms of __flush_tlb_range() with all
> the level 3 constants?
__flush_tlb_page() is called by __ptep_clear_flush_young() and
__ptep_set_access_flags() (as well as by flush_tlb_page()). I could replace them
all, but __flush_tlb_page() is a bit less verbose... I have no strong preference.
Thanks,
Ryan
>
> Mark.
>
>>
>> /*
>> * ZERO_PAGE is a global shared page that is always zero: used
>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
>> index 5096ec7ab8650..958fe97b744e5 100644
>> --- a/arch/arm64/include/asm/tlbflush.h
>> +++ b/arch/arm64/include/asm/tlbflush.h
>> @@ -591,7 +591,7 @@ static inline void __flush_tlb_page(struct vm_area_struct *vma,
>> unsigned long start = round_down(uaddr, PAGE_SIZE);
>> unsigned long end = start + PAGE_SIZE;
>>
>> - __do_flush_tlb_range(vma, start, end, PAGE_SIZE, TLBI_TTL_UNKNOWN,
>> + __do_flush_tlb_range(vma, start, end, PAGE_SIZE, 3,
>> TLBF_NOWALKCACHE | flags);
>> }
>>
>> --
>> 2.43.0
>>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page()
2026-03-02 17:39 ` Ryan Roberts
@ 2026-03-02 17:56 ` Mark Rutland
0 siblings, 0 replies; 22+ messages in thread
From: Mark Rutland @ 2026-03-02 17:56 UTC (permalink / raw)
To: Ryan Roberts
Cc: Will Deacon, Ard Biesheuvel, Catalin Marinas, Linus Torvalds,
Oliver Upton, Marc Zyngier, Dev Jain, Linu Cherian,
Jonathan Cameron, linux-arm-kernel, linux-kernel
On Mon, Mar 02, 2026 at 05:39:51PM +0000, Ryan Roberts wrote:
> On 02/03/2026 14:42, Mark Rutland wrote:
> > On Mon, Mar 02, 2026 at 01:56:00PM +0000, Ryan Roberts wrote:
> >> Previously tlb invalidations issued by __flush_tlb_page() did not
> >> contain a level hint. But the function is clearly only ever targeting
> >> level 3 tlb entries and its documentation agrees:
> >>
> >> | this operation only invalidates a single, last-level page-table
> >> | entry and therefore does not affect any walk-caches
> >
> > FWIW, I'd have read "last-level" as synonymous with "leaf" (i.e. a Page
> > or Block entry, which is the last level of walk) rather than level 3
> > specifically. The architecture uses the term to match the former (e.g.
> > in the description of TLBI VALE1IS).
>
> Hmm yeah, now that I'm re-reading, I agree that quoted documentation doesn't say
> anything about it being level 3 specific.
>
> But actually that was arm64-specific documentation for flush_tlb_page(), which
> is a core-mm function. The generic docs at Documentation/core-api/cachetlb.rst
> make it clear that it's intended only for PTE invalidations, I think:
>
> | 4) ``void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)``
> |
> | This time we need to remove the PAGE_SIZE sized translation
> | from the TLB. ...
Ah! I agree that's stronger and clearer.
> > If we're tightening up __flush_tlb_page(), I think it'd be worth either
> > updating the comment to explicitly note that this only applies to level
> > 3 entries, OR update the comment+name to say it applies to leaf entries,
> > and have it take a level parameter.
>
> I'll fix the arm64-specific docs to align with the generic docs and replace
> "last-level" with "level 3" if that works for you.
Yep, saying "level 3" definitely works for me -- I just want this to be
explicit either way to minimize risk of confusion.
[...]
> >> However, it turns out that the function was actually being used to
> >> invalidate a level 2 mapping via flush_tlb_fix_spurious_fault_pmd().
> >> The bug was benign because the level hint was not set so the HW would
> >> still invalidate the PMD mapping, and also because the TLBF_NONOTIFY
> >> flag was set, the bounds of the mapping were never used for anything
> >> else.
> >
> > I suspect (as above) that the current usage was intentional, legitimate
> > usage, just poorly documented.
>
> Before this series flush_tlb_fix_spurious_fault_pmd() was implemented using
> local_flush_tlb_page_nonotify() which never even gives an option to set the TTL
> hint, so I agree.
>
> But I don't think flush_tlb_fix_spurious_fault_pmd() should be calling any tlb
> flush api that has "page" in the name since that implies PTE, not PMD.
>
> I think what I have done is an improvement; but I'm happy to soften/correct this
> description in the next version.
I think if you include the quote from the core API documentation above,
the rest is good as is, and doesn't need to be softened.
> >> Now that we have the new and improved range-invalidation API, it is
> >> trivial to fix flush_tlb_fix_spurious_fault_pmd() to explicitly flush the
> >> whole range (locally, without notification and last level only). So
> >> let's do that, and then update __flush_tlb_page() to hint level 3.
> >
> > Do we never use __flush_tlb_page() to manipulate a level 1 block
> > mapping? I'd have expected we did the same lazy invalidation for
> > permission relaxation there, but if that's not the case, then this seems
> > fine in principle.
>
> No, there is no flush_tlb_fix_spurious_fault_pud().
> (flush_tlb_fix_spurious_fault_pmd() was only added last cycle).
Thanks for confirming; I just wasn't sure whether that didn't exist or
whether it had possibly been missed.
[...]
> >> -#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
> >> - __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
> >> +#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
> >> + __flush_tlb_range(vma, address, address + PMD_SIZE, PMD_SIZE, 2, \
> >> + TLBF_NOBROADCAST | TLBF_NONOTIFY | TLBF_NOWALKCACHE)
> >
> > Is there a reason to keep __flush_tlb_page(), rather than defining
> > flush_tlb_fix_spurious_fault() in terms of __flush_tlb_range() with all
> > the level 3 constants?
>
> __flush_tlb_page() is called by __ptep_clear_flush_young() and
> __ptep_set_access_flags() (as well as by flush_tlb_page()). I could replace them
> all, but __flush_tlb_page() is a bit less verbose... I have no strong preference.
That's reason enough to keep it; thanks for confirming, and sorry for
the noise!
Mark.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation
2026-03-02 13:55 [PATCH v3 00/13] arm64: Refactor TLB invalidation API and implementation Ryan Roberts
` (12 preceding siblings ...)
2026-03-02 13:56 ` [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page() Ryan Roberts
@ 2026-03-13 19:43 ` Catalin Marinas
13 siblings, 0 replies; 22+ messages in thread
From: Catalin Marinas @ 2026-03-13 19:43 UTC (permalink / raw)
To: Will Deacon, Ard Biesheuvel, Mark Rutland, Linus Torvalds,
Marc Zyngier, Dev Jain, Linu Cherian, Jonathan Cameron,
Oliver Upton, Ryan Roberts
Cc: linux-arm-kernel, linux-kernel
On Mon, 02 Mar 2026 13:55:47 +0000, Ryan Roberts wrote:
> This series refactors the TLB invalidation API to make it more general and
> flexible, and refactors the implementation, aiming to make it more robust,
> easier to understand and easier to add new features in future.
>
> It is heavily based on the series posted by Will back in July at [1]; I've
> attempted to maintain correct authorship and tags - apologies if I got any of
> the etiquette wrong.
>
> [...]
Applied to arm64 (for-next/tlbflush), thanks!
Slight tweak to patch 13 around the last-level vs level 3 comment (both
source comment and the commit log).
[01/13] arm64: mm: Re-implement the __tlbi_level macro as a C function
https://git.kernel.org/arm64/c/5b3fb8a6b429
[02/13] arm64: mm: Introduce a C wrapper for by-range TLB invalidation
https://git.kernel.org/arm64/c/d2bf3226952c
[03/13] arm64: mm: Implicitly invalidate user ASID based on TLBI operation
https://git.kernel.org/arm64/c/edc55b7abb25
[04/13] arm64: mm: Push __TLBI_VADDR() into __tlbi_level()
https://git.kernel.org/arm64/c/a3710035604f
[05/13] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range()
https://git.kernel.org/arm64/c/d4b048ca145f
[06/13] arm64: mm: Re-implement the __flush_tlb_range_op macro in C
https://git.kernel.org/arm64/c/5e63b73f3deb
[07/13] arm64: mm: Simplify __TLBI_RANGE_NUM() macro
https://git.kernel.org/arm64/c/057bbd8e0610
[08/13] arm64: mm: Simplify __flush_tlb_range_limit_excess()
https://git.kernel.org/arm64/c/c753d667d959
[09/13] arm64: mm: Refactor flush_tlb_page() to use __tlbi_level_asid()
https://git.kernel.org/arm64/c/64212d689306
[10/13] arm64: mm: Refactor __flush_tlb_range() to take flags
https://git.kernel.org/arm64/c/11f6dd8dd283
[11/13] arm64: mm: More flags for __flush_tlb_range()
https://git.kernel.org/arm64/c/0477fc56960d
[12/13] arm64: mm: Wrap flush_tlb_page() around __do_flush_tlb_range()
https://git.kernel.org/arm64/c/15397e3c3850
[13/13] arm64: mm: Provide level hint for flush_tlb_page()
https://git.kernel.org/arm64/c/752a0d1d483e
--
Catalin