* [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions
@ 2025-07-11 16:17 Will Deacon
  2025-07-11 16:17 ` [PATCH 01/10] arm64: mm: Introduce a C wrapper for by-level TLB invalidation helpers Will Deacon
                   ` (9 more replies)
  0 siblings, 10 replies; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

Hi all,

I cooked up this series following a complaint from Linus back in March
about our range-based TLB invalidation macro, after we fixed an
over-invalidation bug caused by incorrect handling of its arguments:

  https://lore.kernel.org/all/CAHk-=wgiX0q0WCL+SFwVCYtG7JR3=2Rshse-5J3AO2Y4AgT7Jw@mail.gmail.com/

Once I started trying to rework the range macro into a C function, I
spotted a few other opportunities for cleanup and so I've ended up with
this series.

Testing and feedback welcome.

Cheers,

Will

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Marc Zyngier <maz@kernel.org>

--->8

Will Deacon (10):
  arm64: mm: Introduce a C wrapper for by-level TLB invalidation helpers
  arm64: mm: Introduce a C wrapper for by-range TLB invalidation helpers
  arm64: mm: Implicitly invalidate user ASID based on TLBI operation
  arm64: mm: Remove unused 'tlbi_user' argument from
    __flush_tlb_range_op()
  arm64: mm: Re-implement the __tlbi_level macro in C
  arm64: mm: Simplify __TLBI_RANGE_NUM() macro
  arm64: mm: Push __TLBI_VADDR() into __tlbi_level()
  arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range()
  arm64: mm: Simplify __flush_tlb_range_limit_excess()
  arm64: mm: Re-implement the __flush_tlb_range_op macro in C

 arch/arm64/include/asm/tlbflush.h | 230 ++++++++++++++++++------------
 arch/arm64/kernel/sys_compat.c    |   2 +-
 arch/arm64/kvm/hyp/nvhe/mm.c      |   2 +-
 arch/arm64/kvm/hyp/pgtable.c      |   4 +-
 4 files changed, 140 insertions(+), 98 deletions(-)

-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 01/10] arm64: mm: Introduce a C wrapper for by-level TLB invalidation helpers
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-14  8:38   ` Ryan Roberts
  2025-07-11 16:17 ` [PATCH 02/10] arm64: mm: Introduce a C wrapper for by-range " Will Deacon
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

In preparation for reducing our reliance on complex preprocessor macros
for TLB invalidation routines, introduce a new C wrapper for by-level
TLB invalidation helpers which can be used instead of the __tlbi() macro
and can additionally be called from C code.
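
For example, C code could then issue a leaf invalidation directly,
along these lines (illustrative only; 'addr' and 'asid' stand for
whatever the caller already has to hand):

	/* Invalidate the last-level entry for addr/asid, inner-shareable */
	__tlbi_level_op(vale1is, __TLBI_VADDR(addr, asid));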

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 33 ++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index aa9efee17277..1c7548ec6cb7 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -88,6 +88,16 @@ static inline unsigned long get_trans_granule(void)
 	}
 }
 
+enum tlbi_op {
+	vae1is,
+	vae2is,
+	vale1is,
+	vale2is,
+	vaale1is,
+	ipas2e1,
+	ipas2e1is,
+};
+
 /*
  * Level-based TLBI operations.
  *
@@ -105,6 +115,27 @@ static inline unsigned long get_trans_granule(void)
 
 #define TLBI_TTL_UNKNOWN	INT_MAX
 
+#define __GEN_TLBI_OP_CASE(op)						\
+	case op:							\
+		__tlbi(op, arg);					\
+		break
+
+static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
+{
+	switch (op) {
+	__GEN_TLBI_OP_CASE(vae1is);
+	__GEN_TLBI_OP_CASE(vae2is);
+	__GEN_TLBI_OP_CASE(vale1is);
+	__GEN_TLBI_OP_CASE(vale2is);
+	__GEN_TLBI_OP_CASE(vaale1is);
+	__GEN_TLBI_OP_CASE(ipas2e1);
+	__GEN_TLBI_OP_CASE(ipas2e1is);
+	default:
+		BUILD_BUG();
+	}
+}
+#undef __GEN_TLBI_OP_CASE
+
 #define __tlbi_level(op, addr, level) do {				\
 	u64 arg = addr;							\
 									\
@@ -116,7 +147,7 @@ static inline unsigned long get_trans_granule(void)
 		arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);			\
 	}								\
 									\
-	__tlbi(op, arg);						\
+	__tlbi_level_op(op, arg);					\
 } while(0)
 
 #define __tlbi_user_level(op, arg, level) do {				\
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 02/10] arm64: mm: Introduce a C wrapper for by-range TLB invalidation helpers
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
  2025-07-11 16:17 ` [PATCH 01/10] arm64: mm: Introduce a C wrapper for by-level TLB invalidation helpers Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-14  8:26   ` Ryan Roberts
  2025-07-11 16:17 ` [PATCH 03/10] arm64: mm: Implicitly invalidate user ASID based on TLBI operation Will Deacon
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

In preparation for reducing our reliance on complex preprocessor macros
for TLB invalidation routines, introduce a new C wrapper for by-range
TLB invalidation helpers which can be used instead of the __tlbi() macro
and can additionally be called from C code.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 1c7548ec6cb7..4408aeebf4d5 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -418,6 +418,24 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
  *    operations can only span an even number of pages. We save this for last to
  *    ensure 64KB start alignment is maintained for the LPA2 case.
  */
+#define __GEN_TLBI_OP_CASE(op)						\
+	case op:							\
+		__tlbi(r ## op, arg);					\
+		break
+
+static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)
+{
+	switch (op) {
+	__GEN_TLBI_OP_CASE(vae1is);
+	__GEN_TLBI_OP_CASE(vale1is);
+	__GEN_TLBI_OP_CASE(vaale1is);
+	__GEN_TLBI_OP_CASE(ipas2e1is);
+	default:
+		BUILD_BUG();
+	}
+}
+#undef __GEN_TLBI_OP_CASE
+
 #define __flush_tlb_range_op(op, start, pages, stride,			\
 				asid, tlb_level, tlbi_user, lpa2)	\
 do {									\
@@ -445,7 +463,7 @@ do {									\
 		if (num >= 0) {						\
 			addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
 						scale, num, tlb_level);	\
-			__tlbi(r##op, addr);				\
+			__tlbi_range(op, addr);				\
 			if (tlbi_user)					\
 				__tlbi_user(r##op, addr);		\
 			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 03/10] arm64: mm: Implicitly invalidate user ASID based on TLBI operation
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
  2025-07-11 16:17 ` [PATCH 01/10] arm64: mm: Introduce a C wrapper for by-level TLB invalidation helpers Will Deacon
  2025-07-11 16:17 ` [PATCH 02/10] arm64: mm: Introduce a C wrapper for by-range " Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-14  8:44   ` Ryan Roberts
  2025-07-11 16:17 ` [PATCH 04/10] arm64: mm: Remove unused 'tlbi_user' argument from __flush_tlb_range_op() Will Deacon
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

When kpti is enabled, separate ASIDs are used for userspace and
kernelspace, requiring ASID-qualified TLB invalidation by virtual
address to invalidate both of them.

Push the logic for invalidating the two ASIDs down into the low-level
__tlbi_level_op() function based on the TLBI operation and remove the
burden from the caller to handle the kpti-specific behaviour.
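
For reference, the __tlbi_user() helper that the new *_ASID_CASE macros
rely on is already defined earlier in tlbflush.h along these lines
(sketch, not a verbatim quote; it mirrors the __tlbi_user_level() logic
removed below):

	#define __tlbi_user(op, arg) do {				\
		if (arm64_kernel_unmapped_at_el0())			\
			__tlbi(op, (arg) | USER_ASID_FLAG);		\
	} while (0)

i.e. the extra invalidation of the userspace ASID is only issued when
kpti is in effect.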

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 45 ++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 4408aeebf4d5..08e509f37b28 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -115,17 +115,25 @@ enum tlbi_op {
 
 #define TLBI_TTL_UNKNOWN	INT_MAX
 
-#define __GEN_TLBI_OP_CASE(op)						\
+#define ___GEN_TLBI_OP_CASE(op)						\
 	case op:							\
-		__tlbi(op, arg);					\
+		__tlbi(op, arg)
+
+#define __GEN_TLBI_OP_ASID_CASE(op)					\
+	___GEN_TLBI_OP_CASE(op);					\
+		__tlbi_user(op, arg);					\
+		break
+
+#define __GEN_TLBI_OP_CASE(op)						\
+	___GEN_TLBI_OP_CASE(op);					\
 		break
 
 static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
 {
 	switch (op) {
-	__GEN_TLBI_OP_CASE(vae1is);
+	__GEN_TLBI_OP_ASID_CASE(vae1is);
 	__GEN_TLBI_OP_CASE(vae2is);
-	__GEN_TLBI_OP_CASE(vale1is);
+	__GEN_TLBI_OP_ASID_CASE(vale1is);
 	__GEN_TLBI_OP_CASE(vale2is);
 	__GEN_TLBI_OP_CASE(vaale1is);
 	__GEN_TLBI_OP_CASE(ipas2e1);
@@ -134,7 +142,8 @@ static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
 		BUILD_BUG();
 	}
 }
-#undef __GEN_TLBI_OP_CASE
+#undef __GEN_TLBI_OP_ASID_CASE
+#undef ___GEN_TLBI_OP_CASE
 
 #define __tlbi_level(op, addr, level) do {				\
 	u64 arg = addr;							\
@@ -150,11 +159,6 @@ static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
 	__tlbi_level_op(op, arg);					\
 } while(0)
 
-#define __tlbi_user_level(op, arg, level) do {				\
-	if (arm64_kernel_unmapped_at_el0())				\
-		__tlbi_level(op, (arg | USER_ASID_FLAG), level);	\
-} while (0)
-
 /*
  * This macro creates a properly formatted VA operand for the TLB RANGE. The
  * value bit assignments are:
@@ -418,22 +422,28 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
  *    operations can only span an even number of pages. We save this for last to
  *    ensure 64KB start alignment is maintained for the LPA2 case.
  */
-#define __GEN_TLBI_OP_CASE(op)						\
+#define ___GEN_TLBI_OP_CASE(op)						\
 	case op:							\
-		__tlbi(r ## op, arg);					\
+		__tlbi(r ## op, arg)
+
+#define __GEN_TLBI_OP_ASID_CASE(op)					\
+	___GEN_TLBI_OP_CASE(op);					\
+		__tlbi_user(r ## op, arg);				\
 		break
 
 static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)
 {
 	switch (op) {
-	__GEN_TLBI_OP_CASE(vae1is);
-	__GEN_TLBI_OP_CASE(vale1is);
+	__GEN_TLBI_OP_ASID_CASE(vae1is);
+	__GEN_TLBI_OP_ASID_CASE(vale1is);
 	__GEN_TLBI_OP_CASE(vaale1is);
 	__GEN_TLBI_OP_CASE(ipas2e1is);
 	default:
 		BUILD_BUG();
 	}
 }
+#undef __GEN_TLBI_OP_ASID_CASE
+#undef ___GEN_TLBI_OP_CASE
 #undef __GEN_TLBI_OP_CASE
 
 #define __flush_tlb_range_op(op, start, pages, stride,			\
@@ -452,8 +462,6 @@ do {									\
 		    (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) {	\
 			addr = __TLBI_VADDR(__flush_start, asid);	\
 			__tlbi_level(op, addr, tlb_level);		\
-			if (tlbi_user)					\
-				__tlbi_user_level(op, addr, tlb_level);	\
 			__flush_start += stride;			\
 			__flush_pages -= stride >> PAGE_SHIFT;		\
 			continue;					\
@@ -464,8 +472,6 @@ do {									\
 			addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
 						scale, num, tlb_level);	\
 			__tlbi_range(op, addr);				\
-			if (tlbi_user)					\
-				__tlbi_user(r##op, addr);		\
 			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
 			__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
 		}							\
@@ -584,6 +590,7 @@ static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *b
 {
 	__flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, true, 3);
 }
-#endif
 
+#undef __tlbi_user
+#endif
 #endif
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 04/10] arm64: mm: Remove unused 'tlbi_user' argument from __flush_tlb_range_op()
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
                   ` (2 preceding siblings ...)
  2025-07-11 16:17 ` [PATCH 03/10] arm64: mm: Implicitly invalidate user ASID based on TLBI operation Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-11 16:17 ` [PATCH 05/10] arm64: mm: Re-implement the __tlbi_level macro in C Will Deacon
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

The 'tlbi_user' argument to __flush_tlb_range_op() is unused. Drop it.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 08e509f37b28..728b00f3e1f4 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -398,8 +398,6 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
  * @stride:	Flush granularity
  * @asid:	The ASID of the task (0 for IPA instructions)
  * @tlb_level:	Translation Table level hint, if known
- * @tlbi_user:	If 'true', call an additional __tlbi_user()
- *              (typically for user ASIDs). 'flase' for IPA instructions
  * @lpa2:	If 'true', the lpa2 scheme is used as set out below
  *
  * When the CPU does not support TLB range operations, flush the TLB
@@ -447,7 +445,7 @@ static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)
 #undef __GEN_TLBI_OP_CASE
 
 #define __flush_tlb_range_op(op, start, pages, stride,			\
-				asid, tlb_level, tlbi_user, lpa2)	\
+				asid, tlb_level, lpa2)			\
 do {									\
 	typeof(start) __flush_start = start;				\
 	typeof(pages) __flush_pages = pages;				\
@@ -480,7 +478,7 @@ do {									\
 } while (0)
 
 #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
-	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());
+	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled());
 
 static inline bool __flush_tlb_range_limit_excess(unsigned long start,
 		unsigned long end, unsigned long pages, unsigned long stride)
@@ -520,10 +518,10 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
 
 	if (last_level)
 		__flush_tlb_range_op(vale1is, start, pages, stride, asid,
-				     tlb_level, true, lpa2_is_enabled());
+				     tlb_level, lpa2_is_enabled());
 	else
 		__flush_tlb_range_op(vae1is, start, pages, stride, asid,
-				     tlb_level, true, lpa2_is_enabled());
+				     tlb_level, lpa2_is_enabled());
 
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
@@ -566,7 +564,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
 
 	dsb(ishst);
 	__flush_tlb_range_op(vaale1is, start, pages, stride, 0,
-			     TLBI_TTL_UNKNOWN, false, lpa2_is_enabled());
+			     TLBI_TTL_UNKNOWN, lpa2_is_enabled());
 	dsb(ish);
 	isb();
 }
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 05/10] arm64: mm: Re-implement the __tlbi_level macro in C
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
                   ` (3 preceding siblings ...)
  2025-07-11 16:17 ` [PATCH 04/10] arm64: mm: Remove unused 'tlbi_user' argument from __flush_tlb_range_op() Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-14  9:02   ` Ryan Roberts
  2025-07-11 16:17 ` [PATCH 06/10] arm64: mm: Simplify __TLBI_RANGE_NUM() macro Will Deacon
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

__tlbi_level() is just a simple macro around __tlbi_level_op(), so merge
the two into a single C function. Drop the redundant comparison of
'u32 level' against 0 and tidy up the code a little while we're at it.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 728b00f3e1f4..ddd77e92b268 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -128,8 +128,17 @@ enum tlbi_op {
 	___GEN_TLBI_OP_CASE(op);					\
 		break
 
-static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
+static __always_inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 level)
 {
+	u64 arg = addr;
+
+	if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) && level <= 3) {
+		u64 ttl = level | (get_trans_granule() << 2);
+
+		arg &= ~TLBI_TTL_MASK;
+		arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);
+	}
+
 	switch (op) {
 	__GEN_TLBI_OP_ASID_CASE(vae1is);
 	__GEN_TLBI_OP_CASE(vae2is);
@@ -145,20 +154,6 @@ static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
 #undef __GEN_TLBI_OP_ASID_CASE
 #undef ___GEN_TLBI_OP_CASE
 
-#define __tlbi_level(op, addr, level) do {				\
-	u64 arg = addr;							\
-									\
-	if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) &&	\
-	    level >= 0 && level <= 3) {					\
-		u64 ttl = level & 3;					\
-		ttl |= get_trans_granule() << 2;			\
-		arg &= ~TLBI_TTL_MASK;					\
-		arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);			\
-	}								\
-									\
-	__tlbi_level_op(op, arg);					\
-} while(0)
-
 /*
  * This macro creates a properly formatted VA operand for the TLB RANGE. The
  * value bit assignments are:
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 06/10] arm64: mm: Simplify __TLBI_RANGE_NUM() macro
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
                   ` (4 preceding siblings ...)
  2025-07-11 16:17 ` [PATCH 05/10] arm64: mm: Re-implement the __tlbi_level macro in C Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-14  9:06   ` Ryan Roberts
  2025-07-15  5:13   ` Dev Jain
  2025-07-11 16:17 ` [PATCH 07/10] arm64: mm: Push __TLBI_VADDR() into __tlbi_level() Will Deacon
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

Since commit e2768b798a19 ("arm64/mm: Modify range-based tlbi to
decrement scale"), we don't need to clamp the 'pages' argument to fit
the range for the specified 'scale' as we know that the upper bits will
have been processed in a prior iteration.

Drop the clamping and simplify the __TLBI_RANGE_NUM() macro.
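
A quick sketch of why the clamp is redundant, assuming the caller has
already capped 'pages' at MAX_TLBI_RANGE_PAGES via
__flush_tlb_range_limit_excess():

	/*
	 * scale 3: pages <= MAX_TLBI_RANGE_PAGES == 32 << 16, so
	 *          num = (pages >> 16) - 1 <= 31
	 * scale 2: the scale-3 step invalidated (num + 1) << 16 pages,
	 *          leaving pages < 1 << 16, so num = (pages >> 11) - 1 <= 31
	 * ...and likewise for scales 1 and 0.
	 */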

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index ddd77e92b268..a8d21e52ef3a 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -205,11 +205,7 @@ static __always_inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 le
  * range.
  */
 #define __TLBI_RANGE_NUM(pages, scale)					\
-	({								\
-		int __pages = min((pages),				\
-				  __TLBI_RANGE_PAGES(31, (scale)));	\
-		(__pages >> (5 * (scale) + 1)) - 1;			\
-	})
+	(((pages) >> (5 * (scale) + 1)) - 1)
 
 /*
  *	TLB Invalidation
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 07/10] arm64: mm: Push __TLBI_VADDR() into __tlbi_level()
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
                   ` (5 preceding siblings ...)
  2025-07-11 16:17 ` [PATCH 06/10] arm64: mm: Simplify __TLBI_RANGE_NUM() macro Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-11 16:17 ` [PATCH 08/10] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() Will Deacon
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

The __TLBI_VADDR() macro takes an ASID and an address and converts them
into a single argument formatted correctly for a TLB invalidation
instruction.
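
For reference, the macro being pushed down looks roughly like this
(sketch of the existing definition in tlbflush.h, quoted approximately):

	#define __TLBI_VADDR(addr, asid)				\
		({							\
			unsigned long __ta = (addr) >> 12;		\
			__ta &= GENMASK_ULL(43, 0);			\
			__ta |= (unsigned long)(asid) << 48;		\
			__ta;						\
		})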

Rather than have callers worry about this (especially in the case where
the ASID is zero), push the macro down into __tlbi_level() via a new
__tlbi_level_asid() helper.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 14 ++++++++++----
 arch/arm64/kernel/sys_compat.c    |  2 +-
 arch/arm64/kvm/hyp/nvhe/mm.c      |  2 +-
 arch/arm64/kvm/hyp/pgtable.c      |  4 ++--
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index a8d21e52ef3a..434b9fdb340a 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -128,9 +128,10 @@ enum tlbi_op {
 	___GEN_TLBI_OP_CASE(op);					\
 		break
 
-static __always_inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 level)
+static __always_inline void __tlbi_level_asid(const enum tlbi_op op, u64 addr,
+					      u32 level, u16 asid)
 {
-	u64 arg = addr;
+	u64 arg = __TLBI_VADDR(addr, asid);
 
 	if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) && level <= 3) {
 		u64 ttl = level | (get_trans_granule() << 2);
@@ -154,6 +155,11 @@ static __always_inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 le
 #undef __GEN_TLBI_OP_ASID_CASE
 #undef ___GEN_TLBI_OP_CASE
 
+static inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 level)
+{
+	__tlbi_level_asid(op, addr, level, 0);
+}
+
 /*
  * This macro creates a properly formatted VA operand for the TLB RANGE. The
  * value bit assignments are:
@@ -449,8 +455,7 @@ do {									\
 		if (!system_supports_tlb_range() ||			\
 		    __flush_pages == 1 ||				\
 		    (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) {	\
-			addr = __TLBI_VADDR(__flush_start, asid);	\
-			__tlbi_level(op, addr, tlb_level);		\
+			__tlbi_level_asid(op, __flush_start, tlb_level, asid);	\
 			__flush_start += stride;			\
 			__flush_pages -= stride >> PAGE_SHIFT;		\
 			continue;					\
@@ -580,6 +585,7 @@ static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *b
 	__flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, true, 3);
 }
 
+#undef __TLBI_VADDR
 #undef __tlbi_user
 #endif
 #endif
diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
index 4a609e9b65de..ad4857df4830 100644
--- a/arch/arm64/kernel/sys_compat.c
+++ b/arch/arm64/kernel/sys_compat.c
@@ -36,7 +36,7 @@ __do_compat_cache_op(unsigned long start, unsigned long end)
 			 * The workaround requires an inner-shareable tlbi.
 			 * We pick the reserved-ASID to minimise the impact.
 			 */
-			__tlbi(aside1is, __TLBI_VADDR(0, 0));
+			__tlbi(aside1is, 0UL);
 			dsb(ish);
 		}
 
diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index ae8391baebc3..581385b21826 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -270,7 +270,7 @@ static void fixmap_clear_slot(struct hyp_fixmap_slot *slot)
 	 * https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/T/#mf10dfbaf1eaef9274c581b81c53758918c1d0f03
 	 */
 	dsb(ishst);
-	__tlbi_level(vale2is, __TLBI_VADDR(addr, 0), level);
+	__tlbi_level(vale2is, addr, level);
 	dsb(ish);
 	isb();
 }
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index c351b4abd5db..540691987e13 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -472,14 +472,14 @@ static int hyp_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
 
 		kvm_clear_pte(ctx->ptep);
 		dsb(ishst);
-		__tlbi_level(vae2is, __TLBI_VADDR(ctx->addr, 0), TLBI_TTL_UNKNOWN);
+		__tlbi_level(vae2is, ctx->addr, TLBI_TTL_UNKNOWN);
 	} else {
 		if (ctx->end - ctx->addr < granule)
 			return -EINVAL;
 
 		kvm_clear_pte(ctx->ptep);
 		dsb(ishst);
-		__tlbi_level(vale2is, __TLBI_VADDR(ctx->addr, 0), ctx->level);
+		__tlbi_level(vale2is, ctx->addr, ctx->level);
 		*unmapped += granule;
 	}
 
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 08/10] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range()
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
                   ` (6 preceding siblings ...)
  2025-07-11 16:17 ` [PATCH 07/10] arm64: mm: Push __TLBI_VADDR() into __tlbi_level() Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-14  9:17   ` Ryan Roberts
  2025-07-11 16:17 ` [PATCH 09/10] arm64: mm: Simplify __flush_tlb_range_limit_excess() Will Deacon
  2025-07-11 16:17 ` [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C Will Deacon
  9 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

The __TLBI_VADDR_RANGE() macro is only used in one place and isn't
something that's generally useful outside of the low-level range
invalidation gubbins.

Inline __TLBI_VADDR_RANGE() into the __tlbi_range() function so that the
macro can be removed entirely.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 32 +++++++++++++------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 434b9fdb340a..8618a85d5cd3 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -185,19 +185,6 @@ static inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 level)
 #define TLBIR_TTL_MASK		GENMASK_ULL(38, 37)
 #define TLBIR_BADDR_MASK	GENMASK_ULL(36,  0)
 
-#define __TLBI_VADDR_RANGE(baddr, asid, scale, num, ttl)		\
-	({								\
-		unsigned long __ta = 0;					\
-		unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0;	\
-		__ta |= FIELD_PREP(TLBIR_BADDR_MASK, baddr);		\
-		__ta |= FIELD_PREP(TLBIR_TTL_MASK, __ttl);		\
-		__ta |= FIELD_PREP(TLBIR_NUM_MASK, num);		\
-		__ta |= FIELD_PREP(TLBIR_SCALE_MASK, scale);		\
-		__ta |= FIELD_PREP(TLBIR_TG_MASK, get_trans_granule());	\
-		__ta |= FIELD_PREP(TLBIR_ASID_MASK, asid);		\
-		__ta;							\
-	})
-
 /* These macros are used by the TLBI RANGE feature. */
 #define __TLBI_RANGE_PAGES(num, scale)	\
 	((unsigned long)((num) + 1) << (5 * (scale) + 1))
@@ -426,8 +413,19 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		__tlbi_user(r ## op, arg);				\
 		break
 
-static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)
+static __always_inline void __tlbi_range(const enum tlbi_op op, u64 addr,
+					 u16 asid, int scale, int num,
+					 u32 level, bool lpa2)
 {
+	u64 arg = 0;
+
+	arg |= FIELD_PREP(TLBIR_BADDR_MASK, addr >> (lpa2 ? 16 : PAGE_SHIFT));
+	arg |= FIELD_PREP(TLBIR_TTL_MASK, level > 3 ? 0 : level);
+	arg |= FIELD_PREP(TLBIR_NUM_MASK, num);
+	arg |= FIELD_PREP(TLBIR_SCALE_MASK, scale);
+	arg |= FIELD_PREP(TLBIR_TG_MASK, get_trans_granule());
+	arg |= FIELD_PREP(TLBIR_ASID_MASK, asid);
+
 	switch (op) {
 	__GEN_TLBI_OP_ASID_CASE(vae1is);
 	__GEN_TLBI_OP_ASID_CASE(vale1is);
@@ -448,8 +446,6 @@ do {									\
 	typeof(pages) __flush_pages = pages;				\
 	int num = 0;							\
 	int scale = 3;							\
-	int shift = lpa2 ? 16 : PAGE_SHIFT;				\
-	unsigned long addr;						\
 									\
 	while (__flush_pages > 0) {					\
 		if (!system_supports_tlb_range() ||			\
@@ -463,9 +459,7 @@ do {									\
 									\
 		num = __TLBI_RANGE_NUM(__flush_pages, scale);		\
 		if (num >= 0) {						\
-			addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
-						scale, num, tlb_level);	\
-			__tlbi_range(op, addr);				\
+			__tlbi_range(op, __flush_start, asid, scale, num, tlb_level, lpa2); \
 			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
 			__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
 		}							\
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 09/10] arm64: mm: Simplify __flush_tlb_range_limit_excess()
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
                   ` (7 preceding siblings ...)
  2025-07-11 16:17 ` [PATCH 08/10] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-14  9:30   ` Ryan Roberts
  2025-07-11 16:17 ` [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C Will Deacon
  9 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

__flush_tlb_range_limit_excess() is unnecessarily complicated:

  - It takes a 'start', 'end' and 'pages' argument, whereas it only
    needs 'pages' (which the caller has computed from the other two
    arguments!).

  - It erroneously compares 'pages' with MAX_TLBI_RANGE_PAGES when
    the system doesn't support range-based invalidation but the range to
    be invalidated would result in fewer than MAX_DVM_OPS invalidations.

Simplify the function so that it no longer takes the 'start' and 'end'
arguments and only considers the MAX_TLBI_RANGE_PAGES threshold on
systems that implement range-based invalidation.

Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 8618a85d5cd3..2541863721af 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -470,21 +470,13 @@ do {									\
 #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
 	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled());
 
-static inline bool __flush_tlb_range_limit_excess(unsigned long start,
-		unsigned long end, unsigned long pages, unsigned long stride)
+static inline bool __flush_tlb_range_limit_excess(unsigned long pages,
+						  unsigned long stride)
 {
-	/*
-	 * When the system does not support TLB range based flush
-	 * operation, (MAX_DVM_OPS - 1) pages can be handled. But
-	 * with TLB range based operation, MAX_TLBI_RANGE_PAGES
-	 * pages can be handled.
-	 */
-	if ((!system_supports_tlb_range() &&
-	     (end - start) >= (MAX_DVM_OPS * stride)) ||
-	    pages > MAX_TLBI_RANGE_PAGES)
+	if (system_supports_tlb_range() && pages > MAX_TLBI_RANGE_PAGES)
 		return true;
 
-	return false;
+	return pages >= (MAX_DVM_OPS * stride) >> PAGE_SHIFT;
 }
 
 static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
@@ -498,7 +490,7 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
 	end = round_up(end, stride);
 	pages = (end - start) >> PAGE_SHIFT;
 
-	if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
+	if (__flush_tlb_range_limit_excess(pages, stride)) {
 		flush_tlb_mm(mm);
 		return;
 	}
@@ -547,7 +539,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
 	end = round_up(end, stride);
 	pages = (end - start) >> PAGE_SHIFT;
 
-	if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
+	if (__flush_tlb_range_limit_excess(pages, stride)) {
 		flush_tlb_all();
 		return;
 	}
-- 
2.50.0.727.gbf7dc18ff4-goog




* [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C
  2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
                   ` (8 preceding siblings ...)
  2025-07-11 16:17 ` [PATCH 09/10] arm64: mm: Simplify __flush_tlb_range_limit_excess() Will Deacon
@ 2025-07-11 16:17 ` Will Deacon
  2025-07-11 18:16   ` Linus Torvalds
  2025-07-14  9:44   ` Ryan Roberts
  9 siblings, 2 replies; 24+ messages in thread
From: Will Deacon @ 2025-07-11 16:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Will Deacon, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Linus Torvalds, Oliver Upton,
	Marc Zyngier

The __flush_tlb_range_op() macro is horrible and has been a previous
source of bugs thanks to multiple expansions of its arguments (see
commit f7edb07ad7c6 ("Fix mmu notifiers for range-based invalidates")).

Rewrite the thing in C.
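
The classic hazard being alluded to is an argument with side effects
(or one the macro body itself modifies) being evaluated more than once.
A contrived illustration, not taken from this code:

	#define BAD_MAX(a, b)	((a) > (b) ? (a) : (b))

	/* expands 'x++' twice, so x can end up incremented twice */
	y = BAD_MAX(x++, 10);

A real C function evaluates each argument exactly once, so that whole
class of bug goes away.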

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/tlbflush.h | 63 +++++++++++++++++--------------
 1 file changed, 34 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 2541863721af..ee69efdc12ab 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -376,12 +376,12 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 /*
  * __flush_tlb_range_op - Perform TLBI operation upon a range
  *
- * @op:	TLBI instruction that operates on a range (has 'r' prefix)
+ * @op:		TLBI instruction that operates on a range
  * @start:	The start address of the range
  * @pages:	Range as the number of pages from 'start'
  * @stride:	Flush granularity
  * @asid:	The ASID of the task (0 for IPA instructions)
- * @tlb_level:	Translation Table level hint, if known
+ * @level:	Translation Table level hint, if known
  * @lpa2:	If 'true', the lpa2 scheme is used as set out below
  *
  * When the CPU does not support TLB range operations, flush the TLB
@@ -439,33 +439,38 @@ static __always_inline void __tlbi_range(const enum tlbi_op op, u64 addr,
 #undef ___GEN_TLBI_OP_CASE
 #undef __GEN_TLBI_OP_CASE
 
-#define __flush_tlb_range_op(op, start, pages, stride,			\
-				asid, tlb_level, lpa2)			\
-do {									\
-	typeof(start) __flush_start = start;				\
-	typeof(pages) __flush_pages = pages;				\
-	int num = 0;							\
-	int scale = 3;							\
-									\
-	while (__flush_pages > 0) {					\
-		if (!system_supports_tlb_range() ||			\
-		    __flush_pages == 1 ||				\
-		    (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) {	\
-			__tlbi_level_asid(op, __flush_start, tlb_level, asid);	\
-			__flush_start += stride;			\
-			__flush_pages -= stride >> PAGE_SHIFT;		\
-			continue;					\
-		}							\
-									\
-		num = __TLBI_RANGE_NUM(__flush_pages, scale);		\
-		if (num >= 0) {						\
-			__tlbi_range(op, __flush_start, asid, scale, num, tlb_level, lpa2); \
-			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
-			__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
-		}							\
-		scale--;						\
-	}								\
-} while (0)
+static __always_inline void __flush_tlb_range_op(const enum tlbi_op op,
+						 u64 start, size_t pages,
+						 u64 stride, u16 asid,
+						 u32 level, bool lpa2)
+{
+	u64 addr = start, end = start + pages * PAGE_SIZE;
+	int scale = 3;
+
+	while (addr != end) {
+		int num;
+
+		pages = (end - addr) >> PAGE_SHIFT;
+
+		if (!system_supports_tlb_range() || pages == 1)
+			goto invalidate_one;
+
+		if (lpa2 && !IS_ALIGNED(addr, SZ_64K))
+			goto invalidate_one;
+
+		num = __TLBI_RANGE_NUM(pages, scale);
+		if (num >= 0) {
+			__tlbi_range(op, addr, asid, scale, num, level, lpa2);
+			addr += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT;
+		}
+
+		scale--;
+		continue;
+invalidate_one:
+		__tlbi_level_asid(op, addr, level, asid);
+		addr += stride;
+	}
+}
 
 #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
 	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled());
-- 
2.50.0.727.gbf7dc18ff4-goog




* Re: [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C
  2025-07-11 16:17 ` [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C Will Deacon
@ 2025-07-11 18:16   ` Linus Torvalds
  2025-07-13 13:35     ` Will Deacon
  2025-07-14  9:44   ` Ryan Roberts
  1 sibling, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2025-07-11 18:16 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, linux-kernel, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Oliver Upton, Marc Zyngier

On Fri, 11 Jul 2025 at 09:18, Will Deacon <will@kernel.org> wrote:
>
> The __flush_tlb_range_op() macro is horrible and has been a previous
> source of bugs thanks to multiple expansions of its arguments (see
> commit f7edb07ad7c6 ("Fix mmu notifiers for range-based invalidates")).
>
> Rewrite the thing in C.

So I do think this is better than the old case, but I think you could
go one step further...

> +static __always_inline void __flush_tlb_range_op(const enum tlbi_op op,
> +                                                u64 start, size_t pages,
> +                                                u64 stride, u16 asid,
> +                                                u32 level, bool lpa2)

If you replaced that "enum tlbi_op op" with two function pointers
instead (for "the invalidate range" and "invalidate one" cases
respectively), I think you could make this code much more obvious.

And exactly like how you depend on that 'op' value being
constant-folded because all the different levels are inline functions,
the same thing ends up happening with function pointers where inlining
will result in a constant function pointer becoming just a static call
(and in turn inlined as well).

And then the actual *callers* would use the "look up op" thing, but I
suspect that in many cases those could then be in turn also simplified
to not use that op-number at all, but just end up using the op-name.

I didn't try to actually create that series - and I think you'd want
to do it in multiple steps just to make each individual step small and
obvious - but I think it would end up looking nicer.

            Linus



* Re: [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C
  2025-07-11 18:16   ` Linus Torvalds
@ 2025-07-13 13:35     ` Will Deacon
  0 siblings, 0 replies; 24+ messages in thread
From: Will Deacon @ 2025-07-13 13:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-arm-kernel, linux-kernel, Ard Biesheuvel, Catalin Marinas,
	Ryan Roberts, Mark Rutland, Oliver Upton, Marc Zyngier

On Fri, Jul 11, 2025 at 11:16:32AM -0700, Linus Torvalds wrote:
> On Fri, 11 Jul 2025 at 09:18, Will Deacon <will@kernel.org> wrote:
> >
> > The __flush_tlb_range_op() macro is horrible and has been a previous
> > source of bugs thanks to multiple expansions of its arguments (see
> > commit f7edb07ad7c6 ("Fix mmu notifiers for range-based invalidates")).
> >
> > Rewrite the thing in C.
> 
> So I do think this is better than the old case, but I think you could
> go one step further...
> 
> > +static __always_inline void __flush_tlb_range_op(const enum tlbi_op op,
> > +                                                u64 start, size_t pages,
> > +                                                u64 stride, u16 asid,
> > +                                                u32 level, bool lpa2)
> 
> If you replaced that "enum tlbi_op op" with two function pointers
> instead (for "the invalidate range" and "invalidate one" cases
> respectively), I think you could make this code much more obvious.
> 
> And exactly like how you depend on that 'op' value being
> constant-folded because all the different levels are inline functions,
> the same thing ends up happening with function pointers where inlining
> will result in a constant function pointer becoming just a static call
> (and in turn inlined as well).

So I don't _strictly_ rely on the constant-folding: replacing that
BUILD_BUG() with a BUG() would still give functionally correct code if
inlining didn't occur. I just much preferred catching a wonky TLBI op
at compile-time, which is why I ended up with this, but I hadn't
considered that it would also allow us to inline indirect function
calls.

> And then the actual *callers* would use the "look up op" thing, but I
> suspect that in many cases those could then be in turn also simplified
> to not use that op-number at all, but just end up using the op-name.

Right, I think we'd drop the enum entirely if we went down this route.
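
As a rough illustration of that direction (names entirely hypothetical,
not from this series), the low-level invalidators could become tiny
__always_inline functions and __flush_tlb_range_op() could take a pair
of them as parameters; with constant pointers at every call site, the
"indirect" calls fold back into direct, inlined code:

	static __always_inline void tlbi_vale1is_one(u64 arg)
	{
		__tlbi(vale1is, arg);
		__tlbi_user(vale1is, arg);
	}

	static __always_inline void tlbi_vale1is_range(u64 arg)
	{
		__tlbi(rvale1is, arg);
		__tlbi_user(rvale1is, arg);
	}

	static __always_inline void
	__flush_tlb_range_op(void (*one)(u64), void (*range)(u64),
			     u64 start, size_t pages, u64 stride,
			     u16 asid, u32 level, bool lpa2)
	{
		/* same loop as before, building the TLBI argument and
		 * calling one() or range() as appropriate */
	}

	/* caller */
	__flush_tlb_range_op(tlbi_vale1is_one, tlbi_vale1is_range,
			     start, pages, stride, asid, tlb_level,
			     lpa2_is_enabled());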

> I didn't try to actually create that series - and I think you'd want
> to do it in multiple steps just to make each individual step small and
> obvious - but I think it would end up looking nicer.

I'll have a play...

Will



* Re: [PATCH 02/10] arm64: mm: Introduce a C wrapper for by-range TLB invalidation helpers
  2025-07-11 16:17 ` [PATCH 02/10] arm64: mm: Introduce a C wrapper for by-range " Will Deacon
@ 2025-07-14  8:26   ` Ryan Roberts
  0 siblings, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  8:26 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 11/07/2025 17:17, Will Deacon wrote:
> In preparation for reducing our reliance on complex preprocessor macros
> for TLB invalidation routines, introduce a new C wrapper for by-range
> TLB invalidation helpers which can be used instead of the __tlbi() macro
> and can additionally be called from C code.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/tlbflush.h | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 1c7548ec6cb7..4408aeebf4d5 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -418,6 +418,24 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>   *    operations can only span an even number of pages. We save this for last to
>   *    ensure 64KB start alignment is maintained for the LPA2 case.
>   */
> +#define __GEN_TLBI_OP_CASE(op)						\
> +	case op:							\
> +		__tlbi(r ## op, arg);					\
> +		break
> +
> +static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)

nit: you called the level version __tlbi_level_op(). Why not call this
__tlbi_range_op() for consistency?

> +{
> +	switch (op) {
> +	__GEN_TLBI_OP_CASE(vae1is);
> +	__GEN_TLBI_OP_CASE(vale1is);
> +	__GEN_TLBI_OP_CASE(vaale1is);
> +	__GEN_TLBI_OP_CASE(ipas2e1is);
> +	default:
> +		BUILD_BUG();
> +	}
> +}
> +#undef __GEN_TLBI_OP_CASE
> +
>  #define __flush_tlb_range_op(op, start, pages, stride,			\
>  				asid, tlb_level, tlbi_user, lpa2)	\
>  do {									\
> @@ -445,7 +463,7 @@ do {									\
>  		if (num >= 0) {						\
>  			addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
>  						scale, num, tlb_level);	\
> -			__tlbi(r##op, addr);				\
> +			__tlbi_range(op, addr);				\
>  			if (tlbi_user)					\
>  				__tlbi_user(r##op, addr);		\
>  			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \




* Re: [PATCH 01/10] arm64: mm: Introduce a C wrapper for by-level TLB invalidation helpers
  2025-07-11 16:17 ` [PATCH 01/10] arm64: mm: Introduce a C wrapper for by-level TLB invalidation helpers Will Deacon
@ 2025-07-14  8:38   ` Ryan Roberts
  0 siblings, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  8:38 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 11/07/2025 17:17, Will Deacon wrote:
> In preparation for reducing our reliance on complex preprocessor macros
> for TLB invalidation routines, introduce a new C wrapper for by-level
> TLB invalidation helpers which can be used instead of the __tlbi() macro
> and can additionally be called from C code.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/tlbflush.h | 33 ++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index aa9efee17277..1c7548ec6cb7 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -88,6 +88,16 @@ static inline unsigned long get_trans_granule(void)
>  	}
>  }
>  
> +enum tlbi_op {
> +	vae1is,
> +	vae2is,
> +	vale1is,
> +	vale2is,
> +	vaale1is,
> +	ipas2e1,
> +	ipas2e1is,
> +};
> +
>  /*
>   * Level-based TLBI operations.
>   *
> @@ -105,6 +115,27 @@ static inline unsigned long get_trans_granule(void)
>  
>  #define TLBI_TTL_UNKNOWN	INT_MAX
>  
> +#define __GEN_TLBI_OP_CASE(op)						\

nit: my personal preference would be to explicitly pass arg into the macro
instead of implicitly picking it from the parent context.
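
i.e. something along these lines (illustrative only):

	#define __GEN_TLBI_OP_CASE(op, arg)				\
		case op:						\
			__tlbi(op, arg);				\
			break
	...
		__GEN_TLBI_OP_CASE(vae1is, arg);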


> +	case op:							\
> +		__tlbi(op, arg);					\
> +		break
> +
> +static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
> +{
> +	switch (op) {
> +	__GEN_TLBI_OP_CASE(vae1is);
> +	__GEN_TLBI_OP_CASE(vae2is);
> +	__GEN_TLBI_OP_CASE(vale1is);
> +	__GEN_TLBI_OP_CASE(vale2is);
> +	__GEN_TLBI_OP_CASE(vaale1is);
> +	__GEN_TLBI_OP_CASE(ipas2e1);
> +	__GEN_TLBI_OP_CASE(ipas2e1is);
> +	default:
> +		BUILD_BUG();
> +	}
> +}
> +#undef __GEN_TLBI_OP_CASE
> +
>  #define __tlbi_level(op, addr, level) do {				\
>  	u64 arg = addr;							\
>  									\
> @@ -116,7 +147,7 @@ static inline unsigned long get_trans_granule(void)
>  		arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);			\
>  	}								\
>  									\
> -	__tlbi(op, arg);						\
> +	__tlbi_level_op(op, arg);					\
>  } while(0)
>  
>  #define __tlbi_user_level(op, arg, level) do {				\




* Re: [PATCH 03/10] arm64: mm: Implicitly invalidate user ASID based on TLBI operation
  2025-07-11 16:17 ` [PATCH 03/10] arm64: mm: Implicitly invalidate user ASID based on TLBI operation Will Deacon
@ 2025-07-14  8:44   ` Ryan Roberts
  2025-07-14  9:46     ` Ryan Roberts
  0 siblings, 1 reply; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  8:44 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 11/07/2025 17:17, Will Deacon wrote:
> When kpti is enabled, separate ASIDs are used for userspace and
> kernelspace, requiring ASID-qualified TLB invalidation by virtual
> address to invalidate both of them.
> 
> Push the logic for invalidating the two ASIDs down into the low-level
> __tlbi_level_op() function based on the TLBI operation and remove the
> burden from the caller to handle the kpti-specific behaviour.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/tlbflush.h | 45 ++++++++++++++++++-------------
>  1 file changed, 26 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 4408aeebf4d5..08e509f37b28 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -115,17 +115,25 @@ enum tlbi_op {
>  
>  #define TLBI_TTL_UNKNOWN	INT_MAX
>  
> -#define __GEN_TLBI_OP_CASE(op)						\
> +#define ___GEN_TLBI_OP_CASE(op)						\
>  	case op:							\
> -		__tlbi(op, arg);					\
> +		__tlbi(op, arg)
> +
> +#define __GEN_TLBI_OP_ASID_CASE(op)					\
> +	___GEN_TLBI_OP_CASE(op);					\
> +		__tlbi_user(op, arg);					\
> +		break
> +
> +#define __GEN_TLBI_OP_CASE(op)						\
> +	___GEN_TLBI_OP_CASE(op);					\
>  		break
>  
>  static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
>  {
>  	switch (op) {
> -	__GEN_TLBI_OP_CASE(vae1is);
> +	__GEN_TLBI_OP_ASID_CASE(vae1is);
>  	__GEN_TLBI_OP_CASE(vae2is);
> -	__GEN_TLBI_OP_CASE(vale1is);
> +	__GEN_TLBI_OP_ASID_CASE(vale1is);
>  	__GEN_TLBI_OP_CASE(vale2is);
>  	__GEN_TLBI_OP_CASE(vaale1is);
>  	__GEN_TLBI_OP_CASE(ipas2e1);
> @@ -134,7 +142,8 @@ static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
>  		BUILD_BUG();
>  	}
>  }
> -#undef __GEN_TLBI_OP_CASE
> +#undef __GEN_TLBI_OP_ASID_CASE
> +#undef ___GEN_TLBI_OP_CASE
>  
>  #define __tlbi_level(op, addr, level) do {				\
>  	u64 arg = addr;							\
> @@ -150,11 +159,6 @@ static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
>  	__tlbi_level_op(op, arg);					\
>  } while(0)
>  
> -#define __tlbi_user_level(op, arg, level) do {				\
> -	if (arm64_kernel_unmapped_at_el0())				\
> -		__tlbi_level(op, (arg | USER_ASID_FLAG), level);	\
> -} while (0)
> -
>  /*
>   * This macro creates a properly formatted VA operand for the TLB RANGE. The
>   * value bit assignments are:
> @@ -418,22 +422,28 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>   *    operations can only span an even number of pages. We save this for last to
>   *    ensure 64KB start alignment is maintained for the LPA2 case.
>   */
> -#define __GEN_TLBI_OP_CASE(op)						\
> +#define ___GEN_TLBI_OP_CASE(op)						\
>  	case op:							\
> -		__tlbi(r ## op, arg);					\
> +		__tlbi(r ## op, arg)
> +
> +#define __GEN_TLBI_OP_ASID_CASE(op)					\
> +	___GEN_TLBI_OP_CASE(op);					\
> +		__tlbi_user(r ## op, arg);				\
>  		break
>  
>  static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)
>  {
>  	switch (op) {
> -	__GEN_TLBI_OP_CASE(vae1is);
> -	__GEN_TLBI_OP_CASE(vale1is);
> +	__GEN_TLBI_OP_ASID_CASE(vae1is);
> +	__GEN_TLBI_OP_ASID_CASE(vale1is);
>  	__GEN_TLBI_OP_CASE(vaale1is);
>  	__GEN_TLBI_OP_CASE(ipas2e1is);

Bug? This 2-underscore version is still defined from the level case above, so
this is no longer issuing a range-based tlbi? (i.e. you're no longer prepending
the "r" here).

>  	default:
>  		BUILD_BUG();
>  	}
>  }
> +#undef __GEN_TLBI_OP_ASID_CASE
> +#undef ___GEN_TLBI_OP_CASE
>  #undef __GEN_TLBI_OP_CASE
>  
>  #define __flush_tlb_range_op(op, start, pages, stride,			\
> @@ -452,8 +462,6 @@ do {									\
>  		    (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) {	\
>  			addr = __TLBI_VADDR(__flush_start, asid);	\
>  			__tlbi_level(op, addr, tlb_level);		\
> -			if (tlbi_user)					\
> -				__tlbi_user_level(op, addr, tlb_level);	\
>  			__flush_start += stride;			\
>  			__flush_pages -= stride >> PAGE_SHIFT;		\
>  			continue;					\
> @@ -464,8 +472,6 @@ do {									\
>  			addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
>  						scale, num, tlb_level);	\
>  			__tlbi_range(op, addr);				\
> -			if (tlbi_user)					\
> -				__tlbi_user(r##op, addr);		\
>  			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
>  			__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
>  		}							\
> @@ -584,6 +590,7 @@ static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *b
>  {
>  	__flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, true, 3);
>  }
> -#endif
>  
> +#undef __tlbi_user
> +#endif
>  #endif




* Re: [PATCH 05/10] arm64: mm: Re-implement the __tlbi_level macro in C
  2025-07-11 16:17 ` [PATCH 05/10] arm64: mm: Re-implement the __tlbi_level macro in C Will Deacon
@ 2025-07-14  9:02   ` Ryan Roberts
  0 siblings, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  9:02 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 11/07/2025 17:17, Will Deacon wrote:
> __tlbi_level() is just a simple macro around __tlbi_level_op(), so merge
> the two into a single C function. Drop the redundant comparison of
> 'u32 level' against 0 and tidy up the code a little while we're at it.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/tlbflush.h | 25 ++++++++++---------------
>  1 file changed, 10 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 728b00f3e1f4..ddd77e92b268 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -128,8 +128,17 @@ enum tlbi_op {
>  	___GEN_TLBI_OP_CASE(op);					\
>  		break
>  
> -static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
> +static __always_inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 level)

level is passed into all the higher-level functions as an int. That's why the
"level >= 0" comparison was previously there. Of course, no users should be
calling this with a negative value, so I guess I was trying to guard against
seeing a valid level of -1 in future with 6-level/4K D128 pgtables.

Given that everything else uses a signed int for level, perhaps we should be
consistent here too?

>  {
> +	u64 arg = addr;
> +
> +	if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) && level <= 3) {
> +		u64 ttl = level | (get_trans_granule() << 2);
> +
> +		arg &= ~TLBI_TTL_MASK;
> +		arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);
> +	}
> +
>  	switch (op) {
>  	__GEN_TLBI_OP_ASID_CASE(vae1is);
>  	__GEN_TLBI_OP_CASE(vae2is);
> @@ -145,20 +154,6 @@ static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
>  #undef __GEN_TLBI_OP_ASID_CASE
>  #undef ___GEN_TLBI_OP_CASE
>  
> -#define __tlbi_level(op, addr, level) do {				\
> -	u64 arg = addr;							\
> -									\
> -	if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) &&	\
> -	    level >= 0 && level <= 3) {					\
> -		u64 ttl = level & 3;					\
> -		ttl |= get_trans_granule() << 2;			\
> -		arg &= ~TLBI_TTL_MASK;					\
> -		arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);			\
> -	}								\
> -									\
> -	__tlbi_level_op(op, arg);					\
> -} while(0)
> -
>  /*
>   * This macro creates a properly formatted VA operand for the TLB RANGE. The
>   * value bit assignments are:




* Re: [PATCH 06/10] arm64: mm: Simplify __TLBI_RANGE_NUM() macro
  2025-07-11 16:17 ` [PATCH 06/10] arm64: mm: Simplify __TLBI_RANGE_NUM() macro Will Deacon
@ 2025-07-14  9:06   ` Ryan Roberts
  2025-07-15  5:13   ` Dev Jain
  1 sibling, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  9:06 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 11/07/2025 17:17, Will Deacon wrote:
> Since commit e2768b798a19 ("arm64/mm: Modify range-based tlbi to
> decrement scale"), we don't need to clamp the 'pages' argument to fit
> the range for the specified 'scale' as we know that the upper bits will
> have been processed in a prior iteration.
> 
> Drop the clamping and simplify the __TLBI_RANGE_NUM() macro.
> 
> Signed-off-by: Will Deacon <will@kernel.org>

Seems reasonable:

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

> ---
>  arch/arm64/include/asm/tlbflush.h | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index ddd77e92b268..a8d21e52ef3a 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -205,11 +205,7 @@ static __always_inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 le
>   * range.
>   */
>  #define __TLBI_RANGE_NUM(pages, scale)					\
> -	({								\
> -		int __pages = min((pages),				\
> -				  __TLBI_RANGE_PAGES(31, (scale)));	\
> -		(__pages >> (5 * (scale) + 1)) - 1;			\
> -	})
> +	(((pages) >> (5 * (scale) + 1)) - 1)
>  
>  /*
>   *	TLB Invalidation



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 08/10] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range()
  2025-07-11 16:17 ` [PATCH 08/10] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() Will Deacon
@ 2025-07-14  9:17   ` Ryan Roberts
  0 siblings, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  9:17 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 11/07/2025 17:17, Will Deacon wrote:
> The __TLBI_VADDR_RANGE() macro is only used in one place and isn't
> something that's generally useful outside of the low-level range
> invalidation gubbins.
> 
> Inline __TLBI_VADDR_RANGE() into the __tlbi_range() function so that the
> macro can be removed entirely.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/tlbflush.h | 32 +++++++++++++------------------
>  1 file changed, 13 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 434b9fdb340a..8618a85d5cd3 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -185,19 +185,6 @@ static inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 level)
>  #define TLBIR_TTL_MASK		GENMASK_ULL(38, 37)
>  #define TLBIR_BADDR_MASK	GENMASK_ULL(36,  0)
>  
> -#define __TLBI_VADDR_RANGE(baddr, asid, scale, num, ttl)		\
> -	({								\
> -		unsigned long __ta = 0;					\
> -		unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0;	\
> -		__ta |= FIELD_PREP(TLBIR_BADDR_MASK, baddr);		\
> -		__ta |= FIELD_PREP(TLBIR_TTL_MASK, __ttl);		\
> -		__ta |= FIELD_PREP(TLBIR_NUM_MASK, num);		\
> -		__ta |= FIELD_PREP(TLBIR_SCALE_MASK, scale);		\
> -		__ta |= FIELD_PREP(TLBIR_TG_MASK, get_trans_granule());	\
> -		__ta |= FIELD_PREP(TLBIR_ASID_MASK, asid);		\
> -		__ta;							\
> -	})
> -
>  /* These macros are used by the TLBI RANGE feature. */
>  #define __TLBI_RANGE_PAGES(num, scale)	\
>  	((unsigned long)((num) + 1) << (5 * (scale) + 1))
> @@ -426,8 +413,19 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>  		__tlbi_user(r ## op, arg);				\
>  		break
>  
> -static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)
> +static __always_inline void __tlbi_range(const enum tlbi_op op, u64 addr,
> +					 u16 asid, int scale, int num,
> +					 u32 level, bool lpa2)

Same comment about signedness of level; I think it would be marginally less
confusing to consistently consider level as signed, and it will help us when we
get to D128 pgtables.

>  {
> +	u64 arg = 0;
> +
> +	arg |= FIELD_PREP(TLBIR_BADDR_MASK, addr >> (lpa2 ? 16 : PAGE_SHIFT));
> +	arg |= FIELD_PREP(TLBIR_TTL_MASK, level > 3 ? 0 : level);
> +	arg |= FIELD_PREP(TLBIR_NUM_MASK, num);
> +	arg |= FIELD_PREP(TLBIR_SCALE_MASK, scale);
> +	arg |= FIELD_PREP(TLBIR_TG_MASK, get_trans_granule());
> +	arg |= FIELD_PREP(TLBIR_ASID_MASK, asid);
> +
>  	switch (op) {
>  	__GEN_TLBI_OP_ASID_CASE(vae1is);
>  	__GEN_TLBI_OP_ASID_CASE(vale1is);
> @@ -448,8 +446,6 @@ do {									\
>  	typeof(pages) __flush_pages = pages;				\
>  	int num = 0;							\
>  	int scale = 3;							\
> -	int shift = lpa2 ? 16 : PAGE_SHIFT;				\
> -	unsigned long addr;						\
>  									\
>  	while (__flush_pages > 0) {					\
>  		if (!system_supports_tlb_range() ||			\
> @@ -463,9 +459,7 @@ do {									\
>  									\
>  		num = __TLBI_RANGE_NUM(__flush_pages, scale);		\
>  		if (num >= 0) {						\
> -			addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
> -						scale, num, tlb_level);	\
> -			__tlbi_range(op, addr);				\
> +			__tlbi_range(op, __flush_start, asid, scale, num, tlb_level, lpa2); \
>  			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
>  			__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
>  		}							\



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 09/10] arm64: mm: Simplify __flush_tlb_range_limit_excess()
  2025-07-11 16:17 ` [PATCH 09/10] arm64: mm: Simplify __flush_tlb_range_limit_excess() Will Deacon
@ 2025-07-14  9:30   ` Ryan Roberts
  2025-07-15  5:38     ` Dev Jain
  0 siblings, 1 reply; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  9:30 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 11/07/2025 17:17, Will Deacon wrote:
> __flush_tlb_range_limit_excess() is unnecessarily complicated:
> 
>   - It takes a 'start', 'end' and 'pages' argument, whereas it only
>     needs 'pages' (which the caller has computed from the other two
>     arguments!).
> 
>   - It erroneously compares 'pages' with MAX_TLBI_RANGE_PAGES when
>     the system doesn't support range-based invalidation but the range to
>     be invalidated would result in fewer than MAX_DVM_OPS invalidations.
> 
> Simplify the function so that it no longer takes the 'start' and 'end'
> arguments and only considers the MAX_TLBI_RANGE_PAGES threshold on
> systems that implement range-based invalidation.
> 
> Signed-off-by: Will Deacon <will@kernel.org>

Does this warrant a Fixes: tag?

> ---
>  arch/arm64/include/asm/tlbflush.h | 20 ++++++--------------
>  1 file changed, 6 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 8618a85d5cd3..2541863721af 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -470,21 +470,13 @@ do {									\
>  #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
>  	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled());
>  
> -static inline bool __flush_tlb_range_limit_excess(unsigned long start,
> -		unsigned long end, unsigned long pages, unsigned long stride)
> +static inline bool __flush_tlb_range_limit_excess(unsigned long pages,
> +						  unsigned long stride)
>  {
> -	/*
> -	 * When the system does not support TLB range based flush
> -	 * operation, (MAX_DVM_OPS - 1) pages can be handled. But
> -	 * with TLB range based operation, MAX_TLBI_RANGE_PAGES
> -	 * pages can be handled.
> -	 */
> -	if ((!system_supports_tlb_range() &&
> -	     (end - start) >= (MAX_DVM_OPS * stride)) ||
> -	    pages > MAX_TLBI_RANGE_PAGES)
> +	if (system_supports_tlb_range() && pages > MAX_TLBI_RANGE_PAGES)
>  		return true;
>  
> -	return false;
> +	return pages >= (MAX_DVM_OPS * stride) >> PAGE_SHIFT;
>  }

I'm still not sure I totally get this... Aren't these really two separate
concepts? MAX_TLBI_RANGE_PAGES is the max amount of VA that can be handled by a
single tlbi-by-range (and, due to the implementation, the largest range that can
be handled by the loop in __flush_tlb_range_op()), whereas MAX_DVM_OPS is the max
number of tlbi instructions you want to issue with the PTL held? Perhaps it is
better to split these out: for the range case, calculate the number of ops you
actually need and compare that with MAX_DVM_OPS?
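
Something like the following, perhaps (a rough sketch of that split; the
range-case op count is illustrative only, since the real bound comes from the
scale loop in __flush_tlb_range_op()):

static inline bool __flush_tlb_range_limit_excess(unsigned long pages,
						  unsigned long stride)
{
	unsigned long nr_ops;

	if (system_supports_tlb_range()) {
		/* Range ops can't cover more than MAX_TLBI_RANGE_PAGES. */
		if (pages > MAX_TLBI_RANGE_PAGES)
			return true;
		/* At most one range op per scale (3..0) once pages fits. */
		nr_ops = 4;
	} else {
		nr_ops = pages / (stride >> PAGE_SHIFT);
	}

	return nr_ops >= MAX_DVM_OPS;
}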


>  
>  static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
> @@ -498,7 +490,7 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
>  	end = round_up(end, stride);
>  	pages = (end - start) >> PAGE_SHIFT;
>  
> -	if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
> +	if (__flush_tlb_range_limit_excess(pages, stride)) {
>  		flush_tlb_mm(mm);
>  		return;
>  	}
> @@ -547,7 +539,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
>  	end = round_up(end, stride);
>  	pages = (end - start) >> PAGE_SHIFT;
>  
> -	if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
> +	if (__flush_tlb_range_limit_excess(pages, stride)) {
>  		flush_tlb_all();
>  		return;
>  	}



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C
  2025-07-11 16:17 ` [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C Will Deacon
  2025-07-11 18:16   ` Linus Torvalds
@ 2025-07-14  9:44   ` Ryan Roberts
  1 sibling, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  9:44 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 11/07/2025 17:17, Will Deacon wrote:
> The __flush_tlb_range_op() macro is horrible and has been a previous

Amen to that!

> source of bugs thanks to multiple expansions of its arguments (see
> commit f7edb07ad7c6 ("Fix mmu notifiers for range-based invalidates")).
> 
> Rewrite the thing in C.

This looks much better! Do we know it's definitely valuable to have all these
functions inline, though? They have grown a lot over the years. I wonder how much
code-size cost they add versus the performance they actually save.

Perhaps it's worth considering whether at least these two should move to a C
file? (See the sketch after this list.)

__flush_tlb_range_nosync
flush_tlb_kernel_range
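
To make that concrete, the out-of-line arrangement could be shaped roughly like
this (the new .c file and its name are purely hypothetical; only
flush_tlb_kernel_range() is shown since its signature appears in the quoted
diff):

/* arch/arm64/include/asm/tlbflush.h: keep only the declaration */
void flush_tlb_kernel_range(unsigned long start, unsigned long end);

/* arch/arm64/mm/tlbflush.c (hypothetical new file) */
void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	/* ... body moved verbatim from the header ... */
}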


FYI, I've got a patch that uses a local tlbi when we can prove that only the
local CPU has seen the old pgtable entries we are trying to flush. These changes
to use enum tlbi_op make that patch quite a bit neater. I'll post it as an RFC at
some point, as I expect it will need quite a bit of discussion.

Thanks,
Ryan


> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

> ---
>  arch/arm64/include/asm/tlbflush.h | 63 +++++++++++++++++--------------
>  1 file changed, 34 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 2541863721af..ee69efdc12ab 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -376,12 +376,12 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>  /*
>   * __flush_tlb_range_op - Perform TLBI operation upon a range
>   *
> - * @op:	TLBI instruction that operates on a range (has 'r' prefix)
> + * @op:		TLBI instruction that operates on a range
>   * @start:	The start address of the range
>   * @pages:	Range as the number of pages from 'start'
>   * @stride:	Flush granularity
>   * @asid:	The ASID of the task (0 for IPA instructions)
> - * @tlb_level:	Translation Table level hint, if known
> + * @level:	Translation Table level hint, if known
>   * @lpa2:	If 'true', the lpa2 scheme is used as set out below
>   *
>   * When the CPU does not support TLB range operations, flush the TLB
> @@ -439,33 +439,38 @@ static __always_inline void __tlbi_range(const enum tlbi_op op, u64 addr,
>  #undef ___GEN_TLBI_OP_CASE
>  #undef __GEN_TLBI_OP_CASE
>  
> -#define __flush_tlb_range_op(op, start, pages, stride,			\
> -				asid, tlb_level, lpa2)			\
> -do {									\
> -	typeof(start) __flush_start = start;				\
> -	typeof(pages) __flush_pages = pages;				\
> -	int num = 0;							\
> -	int scale = 3;							\
> -									\
> -	while (__flush_pages > 0) {					\
> -		if (!system_supports_tlb_range() ||			\
> -		    __flush_pages == 1 ||				\
> -		    (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) {	\
> -			__tlbi_level_asid(op, __flush_start, tlb_level, asid);	\
> -			__flush_start += stride;			\
> -			__flush_pages -= stride >> PAGE_SHIFT;		\
> -			continue;					\
> -		}							\
> -									\
> -		num = __TLBI_RANGE_NUM(__flush_pages, scale);		\
> -		if (num >= 0) {						\
> -			__tlbi_range(op, __flush_start, asid, scale, num, tlb_level, lpa2); \
> -			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> -			__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
> -		}							\
> -		scale--;						\
> -	}								\
> -} while (0)
> +static __always_inline void __flush_tlb_range_op(const enum tlbi_op op,
> +						 u64 start, size_t pages,
> +						 u64 stride, u16 asid,
> +						 u32 level, bool lpa2)
> +{
> +	u64 addr = start, end = start + pages * PAGE_SIZE;
> +	int scale = 3;
> +
> +	while (addr != end) {
> +		int num;
> +
> +		pages = (end - addr) >> PAGE_SHIFT;
> +
> +		if (!system_supports_tlb_range() || pages == 1)
> +			goto invalidate_one;
> +
> +		if (lpa2 && !IS_ALIGNED(addr, SZ_64K))
> +			goto invalidate_one;
> +
> +		num = __TLBI_RANGE_NUM(pages, scale);
> +		if (num >= 0) {
> +			__tlbi_range(op, addr, asid, scale, num, level, lpa2);
> +			addr += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT;
> +		}
> +
> +		scale--;
> +		continue;
> +invalidate_one:
> +		__tlbi_level_asid(op, addr, level, asid);
> +		addr += stride;
> +	}
> +}
>  
>  #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
>  	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled());



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 03/10] arm64: mm: Implicitly invalidate user ASID based on TLBI operation
  2025-07-14  8:44   ` Ryan Roberts
@ 2025-07-14  9:46     ` Ryan Roberts
  0 siblings, 0 replies; 24+ messages in thread
From: Ryan Roberts @ 2025-07-14  9:46 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier

On 14/07/2025 09:44, Ryan Roberts wrote:
> On 11/07/2025 17:17, Will Deacon wrote:
>> When kpti is enabled, separate ASIDs are used for userspace and
>> kernelspace, requiring ASID-qualified TLB invalidation by virtual
>> address to invalidate both of them.
>>
>> Push the logic for invalidating the two ASIDs down into the low-level
>> __tlbi_level_op() function based on the TLBI operation and remove the
>> burden from the caller to handle the kpti-specific behaviour.
>>
>> Signed-off-by: Will Deacon <will@kernel.org>
>> ---
>>  arch/arm64/include/asm/tlbflush.h | 45 ++++++++++++++++++-------------
>>  1 file changed, 26 insertions(+), 19 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
>> index 4408aeebf4d5..08e509f37b28 100644
>> --- a/arch/arm64/include/asm/tlbflush.h
>> +++ b/arch/arm64/include/asm/tlbflush.h
>> @@ -115,17 +115,25 @@ enum tlbi_op {
>>  
>>  #define TLBI_TTL_UNKNOWN	INT_MAX
>>  
>> -#define __GEN_TLBI_OP_CASE(op)						\
>> +#define ___GEN_TLBI_OP_CASE(op)						\
>>  	case op:							\
>> -		__tlbi(op, arg);					\
>> +		__tlbi(op, arg)
>> +
>> +#define __GEN_TLBI_OP_ASID_CASE(op)					\
>> +	___GEN_TLBI_OP_CASE(op);					\
>> +		__tlbi_user(op, arg);					\
>> +		break
>> +
>> +#define __GEN_TLBI_OP_CASE(op)						\
>> +	___GEN_TLBI_OP_CASE(op);					\
>>  		break
>>  
>>  static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
>>  {
>>  	switch (op) {
>> -	__GEN_TLBI_OP_CASE(vae1is);
>> +	__GEN_TLBI_OP_ASID_CASE(vae1is);
>>  	__GEN_TLBI_OP_CASE(vae2is);
>> -	__GEN_TLBI_OP_CASE(vale1is);
>> +	__GEN_TLBI_OP_ASID_CASE(vale1is);
>>  	__GEN_TLBI_OP_CASE(vale2is);
>>  	__GEN_TLBI_OP_CASE(vaale1is);
>>  	__GEN_TLBI_OP_CASE(ipas2e1);
>> @@ -134,7 +142,8 @@ static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
>>  		BUILD_BUG();
>>  	}
>>  }
>> -#undef __GEN_TLBI_OP_CASE
>> +#undef __GEN_TLBI_OP_ASID_CASE
>> +#undef ___GEN_TLBI_OP_CASE
>>  
>>  #define __tlbi_level(op, addr, level) do {				\
>>  	u64 arg = addr;							\
>> @@ -150,11 +159,6 @@ static __always_inline void __tlbi_level_op(const enum tlbi_op op, u64 arg)
>>  	__tlbi_level_op(op, arg);					\
>>  } while(0)
>>  
>> -#define __tlbi_user_level(op, arg, level) do {				\
>> -	if (arm64_kernel_unmapped_at_el0())				\
>> -		__tlbi_level(op, (arg | USER_ASID_FLAG), level);	\
>> -} while (0)
>> -
>>  /*
>>   * This macro creates a properly formatted VA operand for the TLB RANGE. The
>>   * value bit assignments are:
>> @@ -418,22 +422,28 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>>   *    operations can only span an even number of pages. We save this for last to
>>   *    ensure 64KB start alignment is maintained for the LPA2 case.
>>   */
>> -#define __GEN_TLBI_OP_CASE(op)						\
>> +#define ___GEN_TLBI_OP_CASE(op)						\
>>  	case op:							\
>> -		__tlbi(r ## op, arg);					\
>> +		__tlbi(r ## op, arg)
>> +
>> +#define __GEN_TLBI_OP_ASID_CASE(op)					\
>> +	___GEN_TLBI_OP_CASE(op);					\
>> +		__tlbi_user(r ## op, arg);				\
>>  		break
>>  
>>  static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)
>>  {
>>  	switch (op) {
>> -	__GEN_TLBI_OP_CASE(vae1is);
>> -	__GEN_TLBI_OP_CASE(vale1is);
>> +	__GEN_TLBI_OP_ASID_CASE(vae1is);
>> +	__GEN_TLBI_OP_ASID_CASE(vale1is);
>>  	__GEN_TLBI_OP_CASE(vaale1is);
>>  	__GEN_TLBI_OP_CASE(ipas2e1is);
> 
> Bug? This two-underscore version is still defined from the by-level case above,
> so this is no longer issuing a range-based tlbi? (i.e. you're no longer
> prepending the "r" here.)

Do these __GEN_TLBI_*() macros really help that much? I think I'd prefer to see
the case statements just written out longhand. It would make things much clearer
for not many more lines and, if I'm right about that bug, it would have
prevented it.
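
For reference, the long-hand form being suggested might look something like this
(a sketch based only on the ops in the quoted hunk, with the "r" prefix spelled
out so the range/level distinction is visible at a glance):

static __always_inline void __tlbi_range(const enum tlbi_op op, u64 arg)
{
	switch (op) {
	case vae1is:
		__tlbi(rvae1is, arg);
		__tlbi_user(rvae1is, arg);
		break;
	case vale1is:
		__tlbi(rvale1is, arg);
		__tlbi_user(rvale1is, arg);
		break;
	case vaale1is:
		__tlbi(rvaale1is, arg);
		break;
	case ipas2e1is:
		__tlbi(ripas2e1is, arg);
		break;
	default:
		BUILD_BUG();
	}
}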

Thanks,
Ryan


> 
>>  	default:
>>  		BUILD_BUG();
>>  	}
>>  }
>> +#undef __GEN_TLBI_OP_ASID_CASE
>> +#undef ___GEN_TLBI_OP_CASE
>>  #undef __GEN_TLBI_OP_CASE
>>  
>>  #define __flush_tlb_range_op(op, start, pages, stride,			\
>> @@ -452,8 +462,6 @@ do {									\
>>  		    (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) {	\
>>  			addr = __TLBI_VADDR(__flush_start, asid);	\
>>  			__tlbi_level(op, addr, tlb_level);		\
>> -			if (tlbi_user)					\
>> -				__tlbi_user_level(op, addr, tlb_level);	\
>>  			__flush_start += stride;			\
>>  			__flush_pages -= stride >> PAGE_SHIFT;		\
>>  			continue;					\
>> @@ -464,8 +472,6 @@ do {									\
>>  			addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
>>  						scale, num, tlb_level);	\
>>  			__tlbi_range(op, addr);				\
>> -			if (tlbi_user)					\
>> -				__tlbi_user(r##op, addr);		\
>>  			__flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
>>  			__flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
>>  		}							\
>> @@ -584,6 +590,7 @@ static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *b
>>  {
>>  	__flush_tlb_range_nosync(mm, start, end, PAGE_SIZE, true, 3);
>>  }
>> -#endif
>>  
>> +#undef __tlbi_user
>> +#endif
>>  #endif
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 06/10] arm64: mm: Simplify __TLBI_RANGE_NUM() macro
  2025-07-11 16:17 ` [PATCH 06/10] arm64: mm: Simplify __TLBI_RANGE_NUM() macro Will Deacon
  2025-07-14  9:06   ` Ryan Roberts
@ 2025-07-15  5:13   ` Dev Jain
  1 sibling, 0 replies; 24+ messages in thread
From: Dev Jain @ 2025-07-15  5:13 UTC (permalink / raw)
  To: Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Ryan Roberts,
	Mark Rutland, Linus Torvalds, Oliver Upton, Marc Zyngier


On 11/07/25 9:47 pm, Will Deacon wrote:
> Since commit e2768b798a19 ("arm64/mm: Modify range-based tlbi to
> decrement scale"), we don't need to clamp the 'pages' argument to fit
> the range for the specified 'scale' as we know that the upper bits will
> have been processed in a prior iteration.
>
> Drop the clamping and simplify the __TLBI_RANGE_NUM() macro.
>
> Signed-off-by: Will Deacon <will@kernel.org>

Just for the sake of it:

It is easy to check that for natural numbers x and y,

x < (((x >> y) + 1) << y)

This implies x < ((x >> y) << y) + (1 << y)
=> x - ((x >> y) << y) < (1 << y)

Substituting x = pages and y = 5 * scale + 1, the
inequality means that after the first iteration the
new value of pages will be strictly less than 1 << (5 * scale + 1)
=> the current scale won't be repeated => the min is redundant.
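
As a concrete check with illustrative numbers: take pages = 100000 at scale = 3
(so y = 16). Then num = (100000 >> 16) - 1 = 0 and the first iteration knocks
off (num + 1) << 16 = 65536 pages, leaving 34464 < 1 << 16. At scale = 2
(y = 11), num = (34464 >> 11) - 1 = 15 <= 31, so the old clamp to
__TLBI_RANGE_PAGES(31, scale) has nothing to do; at scale = 3 itself, the caller
has already capped pages at MAX_TLBI_RANGE_PAGES via
__flush_tlb_range_limit_excess().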

Reviewed-by: Dev Jain <dev.jain@arm.com>

> ---
>   arch/arm64/include/asm/tlbflush.h | 6 +-----
>   1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index ddd77e92b268..a8d21e52ef3a 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -205,11 +205,7 @@ static __always_inline void __tlbi_level(const enum tlbi_op op, u64 addr, u32 le
>    * range.
>    */
>   #define __TLBI_RANGE_NUM(pages, scale)					\
> -	({								\
> -		int __pages = min((pages),				\
> -				  __TLBI_RANGE_PAGES(31, (scale)));	\
> -		(__pages >> (5 * (scale) + 1)) - 1;			\
> -	})
> +	(((pages) >> (5 * (scale) + 1)) - 1)
>   
>   /*
>    *	TLB Invalidation


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 09/10] arm64: mm: Simplify __flush_tlb_range_limit_excess()
  2025-07-14  9:30   ` Ryan Roberts
@ 2025-07-15  5:38     ` Dev Jain
  0 siblings, 0 replies; 24+ messages in thread
From: Dev Jain @ 2025-07-15  5:38 UTC (permalink / raw)
  To: Ryan Roberts, Will Deacon, linux-arm-kernel
  Cc: linux-kernel, Ard Biesheuvel, Catalin Marinas, Mark Rutland,
	Linus Torvalds, Oliver Upton, Marc Zyngier


On 14/07/25 3:00 pm, Ryan Roberts wrote:
> On 11/07/2025 17:17, Will Deacon wrote:
>> __flush_tlb_range_limit_excess() is unnecessarily complicated:
>>
>>    - It takes a 'start', 'end' and 'pages' argument, whereas it only
>>      needs 'pages' (which the caller has computed from the other two
>>      arguments!).
>>
>>    - It erroneously compares 'pages' with MAX_TLBI_RANGE_PAGES when
>>      the system doesn't support range-based invalidation but the range to
>>      be invalidated would result in fewer than MAX_DVM_OPS invalidations.
>>
>> Simplify the function so that it no longer takes the 'start' and 'end'
>> arguments and only considers the MAX_TLBI_RANGE_PAGES threshold on
>> systems that implement range-based invalidation.
>>
>> Signed-off-by: Will Deacon <will@kernel.org>
> Does this warrant a Fixes: tag?
>
>> ---
>>   arch/arm64/include/asm/tlbflush.h | 20 ++++++--------------
>>   1 file changed, 6 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
>> index 8618a85d5cd3..2541863721af 100644
>> --- a/arch/arm64/include/asm/tlbflush.h
>> +++ b/arch/arm64/include/asm/tlbflush.h
>> @@ -470,21 +470,13 @@ do {									\
>>   #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
>>   	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, kvm_lpa2_is_enabled());
>>   
>> -static inline bool __flush_tlb_range_limit_excess(unsigned long start,
>> -		unsigned long end, unsigned long pages, unsigned long stride)
>> +static inline bool __flush_tlb_range_limit_excess(unsigned long pages,
>> +						  unsigned long stride)
>>   {
>> -	/*
>> -	 * When the system does not support TLB range based flush
>> -	 * operation, (MAX_DVM_OPS - 1) pages can be handled. But
>> -	 * with TLB range based operation, MAX_TLBI_RANGE_PAGES
>> -	 * pages can be handled.
>> -	 */
>> -	if ((!system_supports_tlb_range() &&
>> -	     (end - start) >= (MAX_DVM_OPS * stride)) ||
>> -	    pages > MAX_TLBI_RANGE_PAGES)
>> +	if (system_supports_tlb_range() && pages > MAX_TLBI_RANGE_PAGES)
>>   		return true;
>>   
>> -	return false;
>> +	return pages >= (MAX_DVM_OPS * stride) >> PAGE_SHIFT;
>>   }
> I'm still not sure I totally get this... Aren't these really 2 separate
> concepts? MAX_TLBI_RANGE_PAGES is the max amount of VA that can be handled by a
> single tlbi-by-range (and due to implementation, the largest range that can be
> handled by the loop in __flush_tlb_range_op()). Whereas MAX_DVM_OPS is the max
> number of tlbi instrcutions you want to issue with the PTL held? Perhaps it is
> better to split these out; For the range case, calculate the number of ops you
> actually need and compare with MAX_DVM_OPS?

If system_supports_tlb_range() returns true and pages <= MAX_TLBI_RANGE_PAGES,
then the number of tlbi range operations issued is guaranteed to be bounded by
4: we start from scale = 3, and by patch 6 of this series (and by the loop
itself, for that matter) the scale is guaranteed to be decremented on each
iteration, so in the worst case all four scales are used and at most 4
operations are issued, so we are safe.

So, if my reasoning is correct, I think what we need is something like:

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index aa9efee17277..53591caf3793 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -427,21 +427,19 @@ do {									\
  #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
  	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());
  
-static inline bool __flush_tlb_range_limit_excess(unsigned long start,
-		unsigned long end, unsigned long pages, unsigned long stride)
+static inline bool __flush_tlb_range_limit_excess(unsigned long pages,
+						  unsigned long stride)
  {
  	/*
-	 * When the system does not support TLB range based flush
-	 * operation, (MAX_DVM_OPS - 1) pages can be handled. But
-	 * with TLB range based operation, MAX_TLBI_RANGE_PAGES
-	 * pages can be handled.
+	 * If a TLBI range op has pages under MAX_TLBI_RANGE_PAGES, it
+	 * is guaranteed that the loop in __flush_tlb_range_op shall
+	 * terminate in at most 4 iterations, so we do not need to
+	 * check with MAX_DVM_OPS in this case.
  	 */
-	if ((!system_supports_tlb_range() &&
-	     (end - start) >= (MAX_DVM_OPS * stride)) ||
-	    pages > MAX_TLBI_RANGE_PAGES)
-		return true;
+	if (system_supports_tlb_range())
+		return pages > MAX_TLBI_RANGE_PAGES;
  
-	return false;
+	return pages >= (MAX_DVM_OPS * stride) >> PAGE_SHIFT;
  }
  
  static inline void __flush_tlb_range_nosync(struct mm_struct *mm,


where the comment can be worded better.


>
>
>>   
>>   static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
>> @@ -498,7 +490,7 @@ static inline void __flush_tlb_range_nosync(struct mm_struct *mm,
>>   	end = round_up(end, stride);
>>   	pages = (end - start) >> PAGE_SHIFT;
>>   
>> -	if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
>> +	if (__flush_tlb_range_limit_excess(pages, stride)) {
>>   		flush_tlb_mm(mm);
>>   		return;
>>   	}
>> @@ -547,7 +539,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
>>   	end = round_up(end, stride);
>>   	pages = (end - start) >> PAGE_SHIFT;
>>   
>> -	if (__flush_tlb_range_limit_excess(start, end, pages, stride)) {
>> +	if (__flush_tlb_range_limit_excess(pages, stride)) {
>>   		flush_tlb_all();
>>   		return;
>>   	}
>


^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2025-07-15  5:44 UTC | newest]

Thread overview: 24+ messages
2025-07-11 16:17 [PATCH 00/10] arm64: Replace TLB invalidation preprocessor macros with C functions Will Deacon
2025-07-11 16:17 ` [PATCH 01/10] arm64: mm: Introduce a C wrapper for by-level TLB invalidation helpers Will Deacon
2025-07-14  8:38   ` Ryan Roberts
2025-07-11 16:17 ` [PATCH 02/10] arm64: mm: Introduce a C wrapper for by-range " Will Deacon
2025-07-14  8:26   ` Ryan Roberts
2025-07-11 16:17 ` [PATCH 03/10] arm64: mm: Implicitly invalidate user ASID based on TLBI operation Will Deacon
2025-07-14  8:44   ` Ryan Roberts
2025-07-14  9:46     ` Ryan Roberts
2025-07-11 16:17 ` [PATCH 04/10] arm64: mm: Remove unused 'tlbi_user' argument from __flush_tlb_range_op() Will Deacon
2025-07-11 16:17 ` [PATCH 05/10] arm64: mm: Re-implement the __tlbi_level macro in C Will Deacon
2025-07-14  9:02   ` Ryan Roberts
2025-07-11 16:17 ` [PATCH 06/10] arm64: mm: Simplify __TLBI_RANGE_NUM() macro Will Deacon
2025-07-14  9:06   ` Ryan Roberts
2025-07-15  5:13   ` Dev Jain
2025-07-11 16:17 ` [PATCH 07/10] arm64: mm: Push __TLBI_VADDR() into __tlbi_level() Will Deacon
2025-07-11 16:17 ` [PATCH 08/10] arm64: mm: Inline __TLBI_VADDR_RANGE() into __tlbi_range() Will Deacon
2025-07-14  9:17   ` Ryan Roberts
2025-07-11 16:17 ` [PATCH 09/10] arm64: mm: Simplify __flush_tlb_range_limit_excess() Will Deacon
2025-07-14  9:30   ` Ryan Roberts
2025-07-15  5:38     ` Dev Jain
2025-07-11 16:17 ` [PATCH 10/10] arm64: mm: Re-implement the __flush_tlb_range_op macro in C Will Deacon
2025-07-11 18:16   ` Linus Torvalds
2025-07-13 13:35     ` Will Deacon
2025-07-14  9:44   ` Ryan Roberts
