* [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
@ 2023-10-27 11:56 Ryan Roberts
2023-10-27 11:56 ` [RFC PATCH v1 1/3] arm64/mm: Modify range-based tlbi to decrement scale Ryan Roberts
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Ryan Roberts @ 2023-10-27 11:56 UTC (permalink / raw)
To: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
Marc Zyngier, Oliver Upton, Mark Rutland, Anshuman Khandual,
Kees Cook, Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
Hi All,
As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
changes to properly support LPA2 before Ard's changes get merged. So far those
changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
the TLBI changes to make them independent. The idea is that this series would go
in first, then Ard's and the rest of my series can race each other and it doesn't
really matter who wins.
I've attempted to address all of Marc's feedback against the versions of these
patches posted at [2], including adding benchmark data (see patch 1). Although
if people are still nervous that this could regress non-lpa2 performance in some
cases, I could rework so that there are lpa2 and non-lpa2 variants of
__flush_tlb_range_op(), and the correct version is chosen at the higher level
(based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
It turns out that we won't be able to key LPA2 usage off the same static key for
both the kernel and kvm usage because the kernel usage additionally depends on
CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
(lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
defines and implements lpa2_is_enabled() in his series, so there will be a minor
conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
static key for kvm in my series. Marc, would you be happy with this approach?
Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
I'll re-post on 6.7-rc1.
[1] https://lore.kernel.org/linux-arm-kernel/5651bb31-9ef6-4dfc-b146-64606279bbf7@arm.com/
[2] https://lore.kernel.org/kvmarm/20231009185008.3803879-1-ryan.roberts@arm.com/
Thanks,
Ryan
Ryan Roberts (3):
arm64/mm: Modify range-based tlbi to decrement scale
arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
arm64/mm: Update tlb invalidation routines for FEAT_LPA2
arch/arm64/include/asm/kvm_mmu.h | 3 +
arch/arm64/include/asm/pgtable-prot.h | 2 +
arch/arm64/include/asm/tlb.h | 15 ++--
arch/arm64/include/asm/tlbflush.h | 100 ++++++++++++++++----------
4 files changed, 78 insertions(+), 42 deletions(-)
--
2.25.1
* [RFC PATCH v1 1/3] arm64/mm: Modify range-based tlbi to decrement scale
2023-10-27 11:56 [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2 Ryan Roberts
@ 2023-10-27 11:56 ` Ryan Roberts
2023-10-27 11:56 ` [RFC PATCH v1 2/3] arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs Ryan Roberts
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: Ryan Roberts @ 2023-10-27 11:56 UTC (permalink / raw)
To: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
Marc Zyngier, Oliver Upton, Mark Rutland, Anshuman Khandual,
Kees Cook, Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
In preparation for adding support for LPA2 to the tlb invalidation
routines, modify the algorithm used by range-based tlbi to start at the
highest 'scale' and decrement instead of starting at the lowest 'scale'
and incrementing. This new approach makes it possible to maintain 64K
alignment as we work through the range, until the last op (at scale=0).
This is required when LPA2 is enabled (that support will be added in a
subsequent commit).
This change is separated into its own patch because it will also impact
non-LPA2 systems, and I want to make it easy to bisect in case it leads
to a performance regression (see below for benchmarks that suggest this
should not be a problem).
The original commit (d1d3aa98 "arm64: tlb: Use the TLBI RANGE feature in
arm64") stated this as the reason for _incrementing_ scale:
However, in most scenarios, the pages = 1 when flush_tlb_range() is
called. Start from scale = 3 or other proper value (such as scale
=ilog2(pages)), will incur extra overhead. So increase 'scale' from 0
to maximum.
But pages=1 is already special cased by the non-range invalidation path,
which will take care of it the first time through the loop (both in the
original commit and in my change), so I don't think switching to
decrement scale should have any extra performance impact after all.
Indeed benchmarking kernel compilation, a TLBI-heavy workload, suggests
that this new approach actually _improves_ performance slightly (using a
virtual machine on Apple M2):
The table shows the time to execute the kernel compilation workload with
8 jobs, relative to the baseline without this patch (a more negative
number means a bigger speedup). Repeated 9 times across 3 system reboots:
| counter | mean | stdev |
|:----------|-----------:|----------:|
| real-time | -0.6% | 0.0% |
| kern-time | -1.6% | 0.5% |
| user-time | -0.4% | 0.1% |
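To make the decomposition concrete, here is an illustrative standalone sketch
(not part of the patch; decompose() and range_pages() are hypothetical
stand-ins mirroring the __TLBI_RANGE_NUM()/__TLBI_RANGE_PAGES() macros, and it
assumes range ops are supported and the page count is below the kernel's
MAX_TLBI_RANGE_PAGES cap):

#include <stdio.h>

/*
 * Illustrative stand-in for __TLBI_RANGE_PAGES(): pages covered by one
 * range op with the given num/scale.
 */
static unsigned long range_pages(long num, int scale)
{
	return (unsigned long)(num + 1) << (5 * scale + 1);
}

/* Print the ops the decrementing-scale loop would issue for 'pages' pages. */
static void decompose(unsigned long pages)
{
	int scale = 3;

	while (pages > 0) {
		if (pages == 1 || scale < 0) {
			/* single page (or leftover): non-range op */
			printf("non-range op: 1 page\n");
			pages--;
			continue;
		}

		long num = (long)(pages >> (5 * scale + 1)) - 1;

		if (num >= 0) {
			printf("range op: scale=%d num=%ld (%lu pages)\n",
			       scale, num, range_pages(num, scale));
			pages -= range_pages(num, scale);
		}
		scale--;
	}
}

For example, decompose(509) prints a scale=1 op covering 448 pages, a scale=0
op covering 60 pages, and a final non-range op for the last page; with 4KB
pages, every range op at scale >= 1 covers a multiple of 64KB, which is the
property the LPA2 patch later in this series relies on.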
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index b149cf9f91bc..e8153c16fcdf 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -351,14 +351,14 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
* entries one by one at the granularity of 'stride'. If the TLB
* range ops are supported, then:
*
- * 1. If 'pages' is odd, flush the first page through non-range
- * operations;
+ * 1. The minimum range granularity is decided by 'scale', so multiple range
+ * TLBI operations may be required. Start from scale = 3, flush the largest
+ * possible number of pages ((num+1)*2^(5*scale+1)) that fit into the
+ * requested range, then decrement scale and continue until one or zero pages
+ * are left.
*
- * 2. For remaining pages: the minimum range granularity is decided
- * by 'scale', so multiple range TLBI operations may be required.
- * Start from scale = 0, flush the corresponding number of pages
- * ((num+1)*2^(5*scale+1) starting from 'addr'), then increase it
- * until no pages left.
+ * 2. If there is 1 page remaining, flush it through non-range operations. Range
+ * operations can only span an even number of pages.
*
* Note that certain ranges can be represented by either num = 31 and
* scale or num = 0 and scale + 1. The loop below favours the latter
@@ -368,12 +368,12 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
asid, tlb_level, tlbi_user) \
do { \
int num = 0; \
- int scale = 0; \
+ int scale = 3; \
unsigned long addr; \
\
while (pages > 0) { \
if (!system_supports_tlb_range() || \
- pages % 2 == 1) { \
+ pages == 1) { \
addr = __TLBI_VADDR(start, asid); \
__tlbi_level(op, addr, tlb_level); \
if (tlbi_user) \
@@ -393,7 +393,7 @@ do { \
start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
pages -= __TLBI_RANGE_PAGES(num, scale); \
} \
- scale++; \
+ scale--; \
} \
} while (0)
--
2.25.1
* [RFC PATCH v1 2/3] arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
2023-10-27 11:56 [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2 Ryan Roberts
2023-10-27 11:56 ` [RFC PATCH v1 1/3] arm64/mm: Modify range-based tlbi to decrement scale Ryan Roberts
@ 2023-10-27 11:56 ` Ryan Roberts
2023-10-27 11:56 ` [RFC PATCH v1 3/3] arm64/mm: Update tlb invalidation routines for FEAT_LPA2 Ryan Roberts
2023-11-13 11:55 ` [RFC PATCH v1 0/3] " Ryan Roberts
3 siblings, 0 replies; 14+ messages in thread
From: Ryan Roberts @ 2023-10-27 11:56 UTC (permalink / raw)
To: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
Marc Zyngier, Oliver Upton, Mark Rutland, Anshuman Khandual,
Kees Cook, Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
Add stub functions which initially always return false. These provide
the hooks that we need to update the range-based TLBI routines, whose
operands are encoded differently depending on whether lpa2 is enabled or
not.
The kernel and kvm will enable the use of lpa2 asynchronously in future,
and part of that enablement will involve fleshing out their respective
hooks to advertise when lpa2 is in use.
Since the kernel's decision to use lpa2 relies on more than just whether
the HW supports the feature, it can't just use the same static key as
kvm. This is another reason to use separate functions. lpa2_is_enabled()
is already implemented as part of Ard's kernel lpa2 series. Since kvm
will make its decision solely based on HW support, kvm_lpa2_is_enabled()
will be defined as system_supports_lpa2() once kvm starts using lpa2.
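For illustration, the eventual non-stub KVM definition described above would
look roughly like this (not part of this patch; it lands in the later KVM
LPA2 series):

/* Hypothetical eventual form once KVM starts using lpa2: */
#define kvm_lpa2_is_enabled()	system_supports_lpa2()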
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/kvm_mmu.h | 3 +++
arch/arm64/include/asm/pgtable-prot.h | 2 ++
2 files changed, 5 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 96a80e8f6226..57d5c2866174 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -314,5 +314,8 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
{
return container_of(mmu->arch, struct kvm, arch);
}
+
+#define kvm_lpa2_is_enabled() false
+
#endif /* __ASSEMBLY__ */
#endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index eed814b00a38..b4b2b8623769 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -71,6 +71,8 @@ extern bool arm64_use_ng_mappings;
#define PTE_MAYBE_NG (arm64_use_ng_mappings ? PTE_NG : 0)
#define PMD_MAYBE_NG (arm64_use_ng_mappings ? PMD_SECT_NG : 0)
+#define lpa2_is_enabled() false
+
/*
* If we have userspace only BTI we don't want to mark kernel pages
* guarded even if the system does support BTI.
--
2.25.1
* [RFC PATCH v1 3/3] arm64/mm: Update tlb invalidation routines for FEAT_LPA2
2023-10-27 11:56 [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2 Ryan Roberts
2023-10-27 11:56 ` [RFC PATCH v1 1/3] arm64/mm: Modify range-based tlbi to decrement scale Ryan Roberts
2023-10-27 11:56 ` [RFC PATCH v1 2/3] arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs Ryan Roberts
@ 2023-10-27 11:56 ` Ryan Roberts
2023-11-13 11:55 ` [RFC PATCH v1 0/3] " Ryan Roberts
3 siblings, 0 replies; 14+ messages in thread
From: Ryan Roberts @ 2023-10-27 11:56 UTC (permalink / raw)
To: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
Marc Zyngier, Oliver Upton, Mark Rutland, Anshuman Khandual,
Kees Cook, Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
FEAT_LPA2 impacts tlb invalidation in 2 ways: firstly, the TTL field in
the non-range tlbi instructions can now validly take a 0 value as a
level hint for the 4KB granule (this is due to the extra level of
translation) - previously TTL=0b0100 meant no hint and was treated as
0b0000. Secondly, the BADDR field of the range-based tlbi instructions
is specified in 64KB units when LPA2 is in use (TCR.DS=1), whereas it is
in page units otherwise. Changes are required for tlbi to continue to
operate correctly when LPA2 is in use.
Solve the first problem by always adding the level hint if the level is
between [0, 3] (previously anything other than 0 was hinted, which
breaks in the new level -1 case from kvm). When running on non-LPA2 HW,
0 is still safe to hint as the HW will fall back to non-hinted. While we
are at it, we replace the notion of 0 being the non-hinted sentinel with
a macro, TLBI_TTL_UNKNOWN. This means callers won't need updating
if/when translation depth increases in future.
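As a minimal standalone sketch of the new hinting rule (illustrative only, not
part of the diff below; it assumes a 4KB granule, for which
get_trans_granule() is assumed to return 1):

#include <limits.h>

#define TLBI_TTL_UNKNOWN	INT_MAX

/*
 * Illustrative only: how a level maps to a TTL hint under the new rule.
 * TLBI_TTL_UNKNOWN and the KVM level -1 both fall outside [0, 3] and so
 * produce no hint.
 */
static unsigned long ttl_hint(int level)
{
	unsigned long granule = 1;	/* assumed: TG encoding for a 4KB granule */

	if (level < 0 || level > 3)
		return 0;		/* no hint; HW performs a non-hinted invalidation */

	return (granule << 2) | (level & 3);
}

With this, level 0 now produces TTL=0b0100 (a valid hint on LPA2 hardware, and
treated as 'no hint' on pre-LPA2 hardware), while level -1 and
TLBI_TTL_UNKNOWN produce no hint at all.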
The second issue is more complex: When LPA2 is in use, use the non-range
tlbi instructions to forward align to a 64KB boundary first, then use
range-based tlbi from there on, until we have either invalidated all
pages or we have a single page remaining. If the latter, that is done
with non-range tlbi. We determine whether LPA2 is in use based on
lpa2_is_enabled() (for kernel calls) or kvm_lpa2_is_enabled() (for kvm
calls).
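To make that ordering concrete, here is an illustrative standalone sketch (not
the patch itself; it assumes a 4KB granule and a total page count below the
kernel's MAX_TLBI_RANGE_PAGES cap, and simply prints each step):

#include <stdio.h>

#define SZ_64K		0x10000UL
#define PAGE_SZ		0x1000UL	/* assumed 4KB granule */

/* Hypothetical sketch of the LPA2 ordering described above. */
static void flush_range_lpa2(unsigned long start, unsigned long pages)
{
	/* 1. Non-range ops until the start address is 64KB aligned. */
	while (pages > 0 && (start & (SZ_64K - 1))) {
		printf("non-range tlbi at %#lx\n", start);
		start += PAGE_SZ;
		pages--;
	}

	/*
	 * 2. Range ops from scale 3 downwards (as in patch 1), leaving 0 or 1
	 *    page. With LPA2, BADDR for each op would be start >> 16 rather
	 *    than the page number.
	 */
	for (int scale = 3; scale >= 0 && pages > 1; scale--) {
		long num = (long)(pages >> (5 * scale + 1)) - 1;

		if (num < 0)
			continue;
		printf("range tlbi at %#lx: scale=%d num=%ld\n", start, scale, num);
		start += ((unsigned long)(num + 1) << (5 * scale + 1)) * PAGE_SZ;
		pages -= (unsigned long)(num + 1) << (5 * scale + 1);
	}

	/* 3. A single remaining page is flushed with a non-range op. */
	if (pages == 1)
		printf("non-range tlbi at %#lx\n", start);
}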
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm64/include/asm/tlb.h | 15 ++++--
arch/arm64/include/asm/tlbflush.h | 90 ++++++++++++++++++++-----------
2 files changed, 68 insertions(+), 37 deletions(-)
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index 2c29239d05c3..396ba9b4872c 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -22,15 +22,15 @@ static void tlb_flush(struct mmu_gather *tlb);
#include <asm-generic/tlb.h>
/*
- * get the tlbi levels in arm64. Default value is 0 if more than one
- * of cleared_* is set or neither is set.
- * Arm64 doesn't support p4ds now.
+ * get the tlbi levels in arm64. Default value is TLBI_TTL_UNKNOWN if more than
+ * one of cleared_* is set or neither is set - this elides the level hinting to
+ * the hardware.
*/
static inline int tlb_get_level(struct mmu_gather *tlb)
{
/* The TTL field is only valid for the leaf entry. */
if (tlb->freed_tables)
- return 0;
+ return TLBI_TTL_UNKNOWN;
if (tlb->cleared_ptes && !(tlb->cleared_pmds ||
tlb->cleared_puds ||
@@ -47,7 +47,12 @@ static inline int tlb_get_level(struct mmu_gather *tlb)
tlb->cleared_p4ds))
return 1;
- return 0;
+ if (tlb->cleared_p4ds && !(tlb->cleared_ptes ||
+ tlb->cleared_pmds ||
+ tlb->cleared_puds))
+ return 0;
+
+ return TLBI_TTL_UNKNOWN;
}
static inline void tlb_flush(struct mmu_gather *tlb)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index e8153c16fcdf..cbc9ad4db32d 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -94,19 +94,22 @@ static inline unsigned long get_trans_granule(void)
* When ARMv8.4-TTL exists, TLBI operations take an additional hint for
* the level at which the invalidation must take place. If the level is
* wrong, no invalidation may take place. In the case where the level
- * cannot be easily determined, a 0 value for the level parameter will
- * perform a non-hinted invalidation.
+ * cannot be easily determined, the value TLBI_TTL_UNKNOWN will perform
+ * a non-hinted invalidation. Any provided level outside the hint range
+ * will also cause fall-back to non-hinted invalidation.
*
* For Stage-2 invalidation, use the level values provided to that effect
* in asm/stage2_pgtable.h.
*/
#define TLBI_TTL_MASK GENMASK_ULL(47, 44)
+#define TLBI_TTL_UNKNOWN INT_MAX
+
#define __tlbi_level(op, addr, level) do { \
u64 arg = addr; \
\
if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) && \
- level) { \
+ level >= 0 && level <= 3) { \
u64 ttl = level & 3; \
ttl |= get_trans_granule() << 2; \
arg &= ~TLBI_TTL_MASK; \
@@ -122,28 +125,34 @@ static inline unsigned long get_trans_granule(void)
} while (0)
/*
- * This macro creates a properly formatted VA operand for the TLB RANGE.
- * The value bit assignments are:
+ * This macro creates a properly formatted VA operand for the TLB RANGE. The
+ * value bit assignments are:
*
* +----------+------+-------+-------+-------+----------------------+
* | ASID | TG | SCALE | NUM | TTL | BADDR |
* +-----------------+-------+-------+-------+----------------------+
* |63 48|47 46|45 44|43 39|38 37|36 0|
*
- * The address range is determined by below formula:
- * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE)
+ * The address range is determined by below formula: [BADDR, BADDR + (NUM + 1) *
+ * 2^(5*SCALE + 1) * PAGESIZE)
+ *
+ * Note that the first argument, baddr, is pre-shifted; If LPA2 is in use, BADDR
+ * holds addr[52:16]. Else BADDR holds page number. See for example ARM DDI
+ * 0487J.a section C5.5.60 "TLBI VAE1IS, TLBI VAE1ISNXS, TLB Invalidate by VA,
+ * EL1, Inner Shareable".
*
*/
-#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl) \
- ({ \
- unsigned long __ta = (addr) >> PAGE_SHIFT; \
- __ta &= GENMASK_ULL(36, 0); \
- __ta |= (unsigned long)(ttl) << 37; \
- __ta |= (unsigned long)(num) << 39; \
- __ta |= (unsigned long)(scale) << 44; \
- __ta |= get_trans_granule() << 46; \
- __ta |= (unsigned long)(asid) << 48; \
- __ta; \
+#define __TLBI_VADDR_RANGE(baddr, asid, scale, num, ttl) \
+ ({ \
+ unsigned long __ta = (baddr); \
+ unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0; \
+ __ta &= GENMASK_ULL(36, 0); \
+ __ta |= __ttl << 37; \
+ __ta |= (unsigned long)(num) << 39; \
+ __ta |= (unsigned long)(scale) << 44; \
+ __ta |= get_trans_granule() << 46; \
+ __ta |= (unsigned long)(asid) << 48; \
+ __ta; \
})
/* These macros are used by the TLBI RANGE feature. */
@@ -216,12 +225,16 @@ static inline unsigned long get_trans_granule(void)
* CPUs, ensuring that any walk-cache entries associated with the
* translation are also invalidated.
*
- * __flush_tlb_range(vma, start, end, stride, last_level)
+ * __flush_tlb_range(vma, start, end, stride, last_level, tlb_level)
* Invalidate the virtual-address range '[start, end)' on all
* CPUs for the user address space corresponding to 'vma->mm'.
* The invalidation operations are issued at a granularity
* determined by 'stride' and only affect any walk-cache entries
- * if 'last_level' is equal to false.
+ * if 'last_level' is equal to false. tlb_level is the level at
+ * which the invalidation must take place. If the level is wrong,
+ * no invalidation may take place. In the case where the level
+ * cannot be easily determined, the value TLBI_TTL_UNKNOWN will
+ * perform a non-hinted invalidation.
*
*
* Finally, take a look at asm/tlb.h to see how tlb_flush() is implemented
@@ -346,34 +359,44 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
* @tlb_level: Translation Table level hint, if known
* @tlbi_user: If 'true', call an additional __tlbi_user()
* (typically for user ASIDs). 'flase' for IPA instructions
+ * @lpa2: If 'true', the lpa2 scheme is used as set out below
*
* When the CPU does not support TLB range operations, flush the TLB
* entries one by one at the granularity of 'stride'. If the TLB
* range ops are supported, then:
*
- * 1. The minimum range granularity is decided by 'scale', so multiple range
+ * 1. If FEAT_LPA2 is in use, the start address of a range operation must be
+ * 64KB aligned, so flush pages one by one until the alignment is reached
+ * using the non-range operations. This step is skipped if LPA2 is not in
+ * use.
+ *
+ * 2. The minimum range granularity is decided by 'scale', so multiple range
* TLBI operations may be required. Start from scale = 3, flush the largest
* possible number of pages ((num+1)*2^(5*scale+1)) that fit into the
* requested range, then decrement scale and continue until one or zero pages
- * are left.
+ * are left. We must start from highest scale to ensure 64KB start alignment
+ * is maintained in the LPA2 case.
*
- * 2. If there is 1 page remaining, flush it through non-range operations. Range
- * operations can only span an even number of pages.
+ * 3. If there is 1 page remaining, flush it through non-range operations. Range
+ * operations can only span an even number of pages. We save this for last to
+ * ensure 64KB start alignment is maintained for the LPA2 case.
*
* Note that certain ranges can be represented by either num = 31 and
* scale or num = 0 and scale + 1. The loop below favours the latter
* since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
*/
#define __flush_tlb_range_op(op, start, pages, stride, \
- asid, tlb_level, tlbi_user) \
+ asid, tlb_level, tlbi_user, lpa2) \
do { \
int num = 0; \
int scale = 3; \
+ int shift = lpa2 ? 16 : PAGE_SHIFT; \
unsigned long addr; \
\
while (pages > 0) { \
if (!system_supports_tlb_range() || \
- pages == 1) { \
+ pages == 1 || \
+ (lpa2 && start != ALIGN(start, SZ_64K))) { \
addr = __TLBI_VADDR(start, asid); \
__tlbi_level(op, addr, tlb_level); \
if (tlbi_user) \
@@ -385,8 +408,8 @@ do { \
\
num = __TLBI_RANGE_NUM(pages, scale); \
if (num >= 0) { \
- addr = __TLBI_VADDR_RANGE(start, asid, scale, \
- num, tlb_level); \
+ addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
+ scale, num, tlb_level); \
__tlbi(r##op, addr); \
if (tlbi_user) \
__tlbi_user(r##op, addr); \
@@ -398,7 +421,7 @@ do { \
} while (0)
#define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
- __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false)
+ __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());
static inline void __flush_tlb_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end,
@@ -428,9 +451,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
asid = ASID(vma->vm_mm);
if (last_level)
- __flush_tlb_range_op(vale1is, start, pages, stride, asid, tlb_level, true);
+ __flush_tlb_range_op(vale1is, start, pages, stride, asid,
+ tlb_level, true, lpa2_is_enabled());
else
- __flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true);
+ __flush_tlb_range_op(vae1is, start, pages, stride, asid,
+ tlb_level, true, lpa2_is_enabled());
dsb(ish);
mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
@@ -442,9 +467,10 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
/*
* We cannot use leaf-only invalidation here, since we may be invalidating
* table entries as part of collapsing hugepages or moving page tables.
- * Set the tlb_level to 0 because we can not get enough information here.
+ * Set the tlb_level to TLBI_TTL_UNKNOWN because we can not get enough
+ * information here.
*/
- __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+ __flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
}
static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
--
2.25.1
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-10-27 11:56 [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2 Ryan Roberts
` (2 preceding siblings ...)
2023-10-27 11:56 ` [RFC PATCH v1 3/3] arm64/mm: Update tlb invalidation routines for FEAT_LPA2 Ryan Roberts
@ 2023-11-13 11:55 ` Ryan Roberts
2023-11-13 12:20 ` Marc Zyngier
2023-11-13 13:27 ` Catalin Marinas
3 siblings, 2 replies; 14+ messages in thread
From: Ryan Roberts @ 2023-11-13 11:55 UTC (permalink / raw)
To: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
Marc Zyngier, Oliver Upton, Mark Rutland, Anshuman Khandual,
Kees Cook, Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu
Cc: linux-arm-kernel, kvmarm
On 27/10/2023 12:56, Ryan Roberts wrote:
> Hi All,
>
> As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
> changes to properly support LPA2 before Ard's changes get merged. So far those
> changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
> the TLBI changes to make them independent. The idea is that this series would go
> in first, then Ard's and the rest of my series can race eachother and it doesn't
> really matter who wins.
>
> I've attempted to address all of Marc's feedback against the versions of these
> patches posted at [2], including adding benchmark data (see patch 1). Although
> if people are still nervous that this could regress non-lpa2 performance in some
> cases, I could rework so that there are lpa2 and non-lpa2 variants of
> __flush_tlb_range_op(), and the correct version is chosen at the higher level
> (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
>
> It turns out that we won't be able to key LPA2 usage off the same static key for
> both the kernel and kvm usage because the kernel usage additionally depends on
> CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
> (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
> defines and implements lpa2_is_enabled() in his series, so there will be a minor
> conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
> static key for kvm in my series. Marc, would you be happy with this approach?
>
> Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
> I'll re-post on 6.7-rc1.
Marc, All,
A polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
series on top of v6.7-rc1 in the next day or 2. By default, these 3 patches will
be the first 3 of the series. But if you have an issue with the approach it
would be good to work out an alternative plan to avoid wasting effort preparing
the series.
Thanks,
Ryan
>
> [1] https://lore.kernel.org/linux-arm-kernel/5651bb31-9ef6-4dfc-b146-64606279bbf7@arm.com/
> [2] https://lore.kernel.org/kvmarm/20231009185008.3803879-1-ryan.roberts@arm.com/
>
> Thanks,
> Ryan
>
> Ryan Roberts (3):
> arm64/mm: Modify range-based tlbi to decrement scale
> arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
> arm64/mm: Update tlb invalidation routines for FEAT_LPA2
>
> arch/arm64/include/asm/kvm_mmu.h | 3 +
> arch/arm64/include/asm/pgtable-prot.h | 2 +
> arch/arm64/include/asm/tlb.h | 15 ++--
> arch/arm64/include/asm/tlbflush.h | 100 ++++++++++++++++----------
> 4 files changed, 78 insertions(+), 42 deletions(-)
>
> --
> 2.25.1
>
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-13 11:55 ` [RFC PATCH v1 0/3] " Ryan Roberts
@ 2023-11-13 12:20 ` Marc Zyngier
2023-11-13 12:29 ` Ard Biesheuvel
2023-11-13 13:27 ` Catalin Marinas
1 sibling, 1 reply; 14+ messages in thread
From: Marc Zyngier @ 2023-11-13 12:20 UTC (permalink / raw)
To: Ryan Roberts
Cc: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu,
linux-arm-kernel, kvmarm
On Mon, 13 Nov 2023 11:55:01 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 27/10/2023 12:56, Ryan Roberts wrote:
> > Hi All,
> >
> > As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
> > changes to properly support LPA2 before Ard's changes get merged. So far those
> > changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
> > the TLBI changes to make them independent. The idea is that this series would go
> > in first, then Ard's and the rest of my series can race eachother and it doesn't
> > really matter who wins.
> >
> > I've attempted to address all of Marc's feedback against the versions of these
> > patches posted at [2], including adding benchmark data (see patch 1). Although
> > if people are still nervous that this could regress non-lpa2 performance in some
> > cases, I could rework so that there are lpa2 and non-lpa2 variants of
> > __flush_tlb_range_op(), and the correct version is chosen at the higher level
> > (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
> >
> > It turns out that we won't be able to key LPA2 usage off the same static key for
> > both the kernel and kvm usage because the kernel usage additionally depends on
> > CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
> > (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
> > defines and implements lpa2_is_enabled() in his series, so there will be a minor
> > conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
> > static key for kvm in my series. Marc, would you be happy with this approach?
> >
> > Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
> > I'll re-post on 6.7-rc1.
>
> Marc, All,
>
> I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
> series on top of v6.7-rc1 in the next day or 2. By default, these 3 patches will
> be the first 3 of the series. But if you have an issue with the approach it
> would be good to work out an alternative plan to avoid wasting effort preparing
> the series.
No specific concern on the approach. Having a static-key for the KVM
side is good, and the uplift on the range stuff seems compelling.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-13 12:20 ` Marc Zyngier
@ 2023-11-13 12:29 ` Ard Biesheuvel
2023-11-13 12:42 ` Ryan Roberts
0 siblings, 1 reply; 14+ messages in thread
From: Ard Biesheuvel @ 2023-11-13 12:29 UTC (permalink / raw)
To: Marc Zyngier
Cc: Ryan Roberts, Ard Biesheuvel, Will Deacon, Catalin Marinas,
Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu,
linux-arm-kernel, kvmarm
On Mon, 13 Nov 2023 at 22:20, Marc Zyngier <maz@kernel.org> wrote:
>
> On Mon, 13 Nov 2023 11:55:01 +0000,
> Ryan Roberts <ryan.roberts@arm.com> wrote:
> >
> > On 27/10/2023 12:56, Ryan Roberts wrote:
> > > Hi All,
> > >
> > > As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
> > > changes to properly support LPA2 before Ard's changes get merged. So far those
> > > changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
> > > the TLBI changes to make them independent. The idea is that this series would go
> > > in first, then Ard's and the rest of my series can race eachother and it doesn't
> > > really matter who wins.
> > >
> > > I've attempted to address all of Marc's feedback against the versions of these
> > > patches posted at [2], including adding benchmark data (see patch 1). Although
> > > if people are still nervous that this could regress non-lpa2 performance in some
> > > cases, I could rework so that there are lpa2 and non-lpa2 variants of
> > > __flush_tlb_range_op(), and the correct version is chosen at the higher level
> > > (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
> > >
> > > It turns out that we won't be able to key LPA2 usage off the same static key for
> > > both the kernel and kvm usage because the kernel usage additionally depends on
> > > CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
> > > (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
> > > defines and implements lpa2_is_enabled() in his series, so there will be a minor
> > > conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
> > > static key for kvm in my series. Marc, would you be happy with this approach?
> > >
> > > Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
> > > I'll re-post on 6.7-rc1.
> >
> > Marc, All,
> >
> > I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
> > series on top of v6.7-rc1 in the next day or 2. By default, these 3 patches will
> > be the first 3 of the series. But if you have an issue with the approach it
> > would be good to work out an alternative plan to avoid wasting effort preparing
> > the series.
>
> No specific concern on the approach. Having a static-key for the KVM
> side is good, and the uplift on the range stuff seems compelling.
>
OK, I'll put these at the start of my ++v as well.
Thanks,
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-13 12:29 ` Ard Biesheuvel
@ 2023-11-13 12:42 ` Ryan Roberts
0 siblings, 0 replies; 14+ messages in thread
From: Ryan Roberts @ 2023-11-13 12:42 UTC (permalink / raw)
To: Ard Biesheuvel, Marc Zyngier
Cc: Ard Biesheuvel, Will Deacon, Catalin Marinas, Oliver Upton,
Mark Rutland, Anshuman Khandual, Kees Cook, Joey Gouly,
Suzuki K Poulose, James Morse, Zenghui Yu, linux-arm-kernel,
kvmarm
On 13/11/2023 12:29, Ard Biesheuvel wrote:
> On Mon, 13 Nov 2023 at 22:20, Marc Zyngier <maz@kernel.org> wrote:
>>
>> On Mon, 13 Nov 2023 11:55:01 +0000,
>> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>
>>> On 27/10/2023 12:56, Ryan Roberts wrote:
>>>> Hi All,
>>>>
>>>> As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
>>>> changes to properly support LPA2 before Ard's changes get merged. So far those
>>>> changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
>>>> the TLBI changes to make them independent. The idea is that this series would go
>>>> in first, then Ard's and the rest of my series can race eachother and it doesn't
>>>> really matter who wins.
>>>>
>>>> I've attempted to address all of Marc's feedback against the versions of these
>>>> patches posted at [2], including adding benchmark data (see patch 1). Although
>>>> if people are still nervous that this could regress non-lpa2 performance in some
>>>> cases, I could rework so that there are lpa2 and non-lpa2 variants of
>>>> __flush_tlb_range_op(), and the correct version is chosen at the higher level
>>>> (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
>>>>
>>>> It turns out that we won't be able to key LPA2 usage off the same static key for
>>>> both the kernel and kvm usage because the kernel usage additionally depends on
>>>> CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
>>>> (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
>>>> defines and implements lpa2_is_enabled() in his series, so there will be a minor
>>>> conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
>>>> static key for kvm in my series. Marc, would you be happy with this approach?
>>>>
>>>> Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
>>>> I'll re-post on 6.7-rc1.
>>>
>>> Marc, All,
>>>
>>> I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
>>> series on top of v6.7-rc1 in the next day or 2. By default, these 3 patches will
>>> be the first 3 of the series. But if you have an issue with the approach it
>>> would be good to work out an alternative plan to avoid wasting effort preparing
>>> the series.
>>
>> No specific concern on the approach. Having a static-key for the KVM
>> side is good, and the uplift on the range stuff seems compelling.
>>
>
> OK, I'll put these at the start of my ++v as well.
Great - thanks both!
>
> Thanks,
>
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-13 11:55 ` [RFC PATCH v1 0/3] " Ryan Roberts
2023-11-13 12:20 ` Marc Zyngier
@ 2023-11-13 13:27 ` Catalin Marinas
2023-11-13 16:44 ` Ryan Roberts
2023-11-15 22:33 ` Ard Biesheuvel
1 sibling, 2 replies; 14+ messages in thread
From: Catalin Marinas @ 2023-11-13 13:27 UTC (permalink / raw)
To: Ryan Roberts
Cc: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Marc Zyngier,
Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu,
linux-arm-kernel, kvmarm
On Mon, Nov 13, 2023 at 11:55:01AM +0000, Ryan Roberts wrote:
> On 27/10/2023 12:56, Ryan Roberts wrote:
> > As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
> > changes to properly support LPA2 before Ard's changes get merged. So far those
> > changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
> > the TLBI changes to make them independent. The idea is that this series would go
> > in first, then Ard's and the rest of my series can race eachother and it doesn't
> > really matter who wins.
> >
> > I've attempted to address all of Marc's feedback against the versions of these
> > patches posted at [2], including adding benchmark data (see patch 1). Although
> > if people are still nervous that this could regress non-lpa2 performance in some
> > cases, I could rework so that there are lpa2 and non-lpa2 variants of
> > __flush_tlb_range_op(), and the correct version is chosen at the higher level
> > (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
> >
> > It turns out that we won't be able to key LPA2 usage off the same static key for
> > both the kernel and kvm usage because the kernel usage additionally depends on
> > CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
> > (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
> > defines and implements lpa2_is_enabled() in his series, so there will be a minor
> > conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
> > static key for kvm in my series. Marc, would you be happy with this approach?
> >
> > Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
> > I'll re-post on 6.7-rc1.
[...]
> I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
> series on top of v6.7-rc1 in the next day or 2.
I suspect that's what most maintainers wait for ;). Usually patches
posted just before or during the merging window get ignored, unless they
are urgent fixes.
> By default, these 3 patches will
> be the first 3 of the series. But if you have an issue with the approach it
> would be good to work out an alternative plan to avoid wasting effort preparing
> the series.
If these patches are needed for Ard's series (I think they are), they
could be posted together. But first we need to split Ard's series into
at least two: reworking the memory map together with moving (some of)
the init code to C and the actual LPA2 stage 1 support. We might even
throw LPA2 stage 2 on top if we feel brave ;). We can queue them all
together but having separate series gives us an option to drop bits if
needed.
--
Catalin
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-13 13:27 ` Catalin Marinas
@ 2023-11-13 16:44 ` Ryan Roberts
2023-11-13 17:49 ` Catalin Marinas
2023-11-15 22:33 ` Ard Biesheuvel
1 sibling, 1 reply; 14+ messages in thread
From: Ryan Roberts @ 2023-11-13 16:44 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Marc Zyngier,
Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu,
linux-arm-kernel, kvmarm
On 13/11/2023 13:27, Catalin Marinas wrote:
> On Mon, Nov 13, 2023 at 11:55:01AM +0000, Ryan Roberts wrote:
>> On 27/10/2023 12:56, Ryan Roberts wrote:
>>> As raised yesterday against Ard's LPA2 series [1], we need to address the TLBI
>>> changes to properly support LPA2 before Ard's changes get merged. So far those
>>> changes have been part of my KVM LPA2 series [2]. So this is an attempt to split
>>> the TLBI changes to make them independent. The idea is that this series would go
>>> in first, then Ard's and the rest of my series can race eachother and it doesn't
>>> really matter who wins.
>>>
>>> I've attempted to address all of Marc's feedback against the versions of these
>>> patches posted at [2], including adding benchmark data (see patch 1). Although
>>> if people are still nervous that this could regress non-lpa2 performance in some
>>> cases, I could rework so that there are lpa2 and non-lpa2 variants of
>>> __flush_tlb_range_op(), and the correct version is chosen at the higher level
>>> (based on lpa2_is_enabled() / kvm_lpa2_is_enabled()).
>>>
>>> It turns out that we won't be able to key LPA2 usage off the same static key for
>>> both the kernel and kvm usage because the kernel usage additionally depends on
>>> CONFIG_ARM64_LPA2 being enabled. So I've introduced 2 stub functions
>>> (lpa2_is_enabled() and kvm_lpa2_is_enabled()) to advertise it. Ard already
>>> defines and implements lpa2_is_enabled() in his series, so there will be a minor
>>> conflict to resolve there. I plan to define kvm_lpa2_is_enabled() to be the
>>> static key for kvm in my series. Marc, would you be happy with this approach?
>>>
>>> Anyway, I wanted to put this out there as an RFC. If we are happy with it, then
>>> I'll re-post on 6.7-rc1.
> [...]
>> I polite bump; I never heard back on this. I'm planning to post my LPA2/KVM
>> series on top of v6.7-rc1 in the next day or 2.
>
> I suspect that's what most maintainers wait for ;). Usually patches
> posted just before or during the merging window get ignored, unless they
> are urgent fixes.
>
>> By default, these 3 patches will
>> be the first 3 of the series. But if you have an issue with the approach it
>> would be good to work out an alternative plan to avoid wasting effort preparing
>> the series.
>
> If these patches are needed for Ard's series (I think they do), they
> could be posted together. But first we need to split Ard's series into
> at least two: reworking the memory map together with moving (some of)
> the init code to C and the actual LPA2 stage 1 support. We might even
> through LPA2 stage 2 on top if we feel brave ;). We can queue them all
> together but having separate series gives us an option to drop bits if
> needed.
Sorry, I'm not completely sure what you are suggesting I do here. In the context
of the lpa2/kvm series, Marc previously said he would merge it for v6.8 if I
posted against v6.7-rc1 (which is what I'm gearing up for now).
My series and Ard's are independent except that they both depend on the 3
patches in this RFC. Ard has just suggested he would prefix his series with
these 3 patches and I would continue to do the same, then they can be handled as
2 independent series (or more if Ard is splitting his) - the first 3 patches
will just desolve to nothing for the series that goes in second.
Does that work for you, are are you suggesting I should re-post this as an
independent series?
Thanks,
Ryan
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-13 16:44 ` Ryan Roberts
@ 2023-11-13 17:49 ` Catalin Marinas
2023-11-14 11:25 ` Ryan Roberts
0 siblings, 1 reply; 14+ messages in thread
From: Catalin Marinas @ 2023-11-13 17:49 UTC (permalink / raw)
To: Ryan Roberts
Cc: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Marc Zyngier,
Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu,
linux-arm-kernel, kvmarm
On Mon, Nov 13, 2023 at 04:44:58PM +0000, Ryan Roberts wrote:
> On 13/11/2023 13:27, Catalin Marinas wrote:
> > If these patches are needed for Ard's series (I think they do), they
> > could be posted together. But first we need to split Ard's series into
> > at least two: reworking the memory map together with moving (some of)
> > the init code to C and the actual LPA2 stage 1 support. We might even
> > through LPA2 stage 2 on top if we feel brave ;). We can queue them all
> > together but having separate series gives us an option to drop bits if
> > needed.
>
> Sorry I'm not completely sure what you are suggesting I do here. In the context
> of the lpa2/kvm series, Marc previously said he would merge it for v6.8 if I
> posted against v6.7-rc1 (which is what I'm gearing up for now).
If we merge your series independently of Ard's, that's fine, these three
patches can go together as a prefix.
> My series and Ard's are independent except that they both depend on the 3
> patches in this RFC. Ard has just suggested he would prefix his series with
> these 3 patches and I would continue to do the same, then they can be handled as
> 2 independent series (or more if Ard is splitting his) - the first 3 patches
> will just desolve to nothing for the series that goes in second.
>
> Does that work for you, are are you suggesting I should re-post this as an
> independent series?
No need to. Just post your LPA2 patches with these three as a prefix.
Ard can include them as well and if it gets to merging both, we'll sort
out a small stable branch for these three patches.
--
Catalin
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-13 17:49 ` Catalin Marinas
@ 2023-11-14 11:25 ` Ryan Roberts
0 siblings, 0 replies; 14+ messages in thread
From: Ryan Roberts @ 2023-11-14 11:25 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Marc Zyngier,
Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu,
linux-arm-kernel, kvmarm
On 13/11/2023 17:49, Catalin Marinas wrote:
> On Mon, Nov 13, 2023 at 04:44:58PM +0000, Ryan Roberts wrote:
>> On 13/11/2023 13:27, Catalin Marinas wrote:
>>> If these patches are needed for Ard's series (I think they do), they
>>> could be posted together. But first we need to split Ard's series into
>>> at least two: reworking the memory map together with moving (some of)
>>> the init code to C and the actual LPA2 stage 1 support. We might even
>>> through LPA2 stage 2 on top if we feel brave ;). We can queue them all
>>> together but having separate series gives us an option to drop bits if
>>> needed.
>>
>> Sorry I'm not completely sure what you are suggesting I do here. In the context
>> of the lpa2/kvm series, Marc previously said he would merge it for v6.8 if I
>> posted against v6.7-rc1 (which is what I'm gearing up for now).
>
> If we merge your series independently of Ard's, that's fine, these three
> patches can go together as a prefix.
>
>> My series and Ard's are independent except that they both depend on the 3
>> patches in this RFC. Ard has just suggested he would prefix his series with
>> these 3 patches and I would continue to do the same, then they can be handled as
>> 2 independent series (or more if Ard is splitting his) - the first 3 patches
>> will just desolve to nothing for the series that goes in second.
>>
>> Does that work for you, are are you suggesting I should re-post this as an
>> independent series?
>
> No need to. Just post your LPA2 patches with these three as a prefix.
> Ard can include them as well and if it gets to merging both, we'll sort
> out a small stable branch for these three patches.
>
Great - thanks for the clarification!
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-13 13:27 ` Catalin Marinas
2023-11-13 16:44 ` Ryan Roberts
@ 2023-11-15 22:33 ` Ard Biesheuvel
2023-11-23 17:04 ` Catalin Marinas
1 sibling, 1 reply; 14+ messages in thread
From: Ard Biesheuvel @ 2023-11-15 22:33 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ryan Roberts, Ard Biesheuvel, Will Deacon, Marc Zyngier,
Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu,
linux-arm-kernel, kvmarm
On Mon, 13 Nov 2023 at 23:27, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> But first we need to split Ard's series into
> at least two: reworking the memory map together with moving (some of)
> the init code to C and the actual LPA2 stage 1 support. We might even
> through LPA2 stage 2 on top if we feel brave ;). We can queue them all
> together but having separate series gives us an option to drop bits if
> needed.
>
I have rebased the changes onto v6.7-rc1 and squashed or added some of
the followup fixes:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm64-lpa2-v5
I am happy to carve this up any way you like. A natural split would be
to include everything up to and including
arm64: mmu: Make __cpu_replace_ttbr1() out of line
and queue the rest later, but perhaps you prefer to start with a smaller subset?
* Re: [RFC PATCH v1 0/3] Update tlb invalidation routines for FEAT_LPA2
2023-11-15 22:33 ` Ard Biesheuvel
@ 2023-11-23 17:04 ` Catalin Marinas
0 siblings, 0 replies; 14+ messages in thread
From: Catalin Marinas @ 2023-11-23 17:04 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Ryan Roberts, Ard Biesheuvel, Will Deacon, Marc Zyngier,
Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu,
linux-arm-kernel, kvmarm
On Thu, Nov 16, 2023 at 08:33:40AM +1000, Ard Biesheuvel wrote:
> On Mon, 13 Nov 2023 at 23:27, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >
> > But first we need to split Ard's series into
> > at least two: reworking the memory map together with moving (some of)
> > the init code to C and the actual LPA2 stage 1 support. We might even
> > through LPA2 stage 2 on top if we feel brave ;). We can queue them all
> > together but having separate series gives us an option to drop bits if
> > needed.
> >
>
> I have rebased the changes onto v6.7-rc1 and squashed or added some of
> the followup fixes:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm64-lpa2-v5
>
> I am happy to carve this up any way you like. A natural split would be
> to include everything up to and including
>
> arm64: mmu: Make __cpu_replace_ttbr1() out of line
>
> and queue the rest later, but perhaps you prefer to start with a smaller subset?
This looks fine, just split this whole lot in two, the first one to
__cpu_replace_ttbr1().
--
Catalin