* [PATCH v5 01/12] arm64/mm: Modify range-based tlbi to decrement scale
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 02/12] arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs Ryan Roberts
` (11 subsequent siblings)
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
In preparation for adding support for LPA2 to the tlb invalidation
routines, modify the algorithm used by range-based tlbi to start at the
highest 'scale' and decrement instead of starting at the lowest 'scale'
and incrementing. This new approach makes it possible to maintain 64K
alignment as we work through the range, until the last op (at scale=0).
This is required when LPA2 is enabled. (This part will be added in a
subsequent commit).
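To illustrate, here is a rough user-space model of the covering loop (a
hedged sketch only: the macro and function names are invented for this
example, the printf calls stand in for the real TLBI instructions, and
'pages' is assumed to be within the architected maximum for range ops):
  #include <stdio.h>
  /* A range op covers (num + 1) * 2^(5*scale + 1) pages, num in [0, 30]. */
  #define RANGE_PAGES(num, scale)  (((num) + 1) << (5 * (scale) + 1))
  #define RANGE_NUM(pages, scale)  ((int)(((pages) >> (5 * (scale) + 1)) & 0x1f) - 1)
  static void model_flush(unsigned long pages)
  {
          int scale = 3;  /* start high and decrement */
          while (pages > 0) {
                  if (pages == 1) {  /* final page: non-range op */
                          printf("tlbi (single page)\n");
                          pages--;
                          continue;
                  }
                  int num = RANGE_NUM(pages, scale);
                  if (num >= 0) {
                          /* ops at scale 3..1 cover multiples of 64 pages, so
                           * a 64KB-aligned start stays 64KB-aligned until the
                           * final scale=0 op */
                          printf("tlbi range: scale=%d num=%d (%d pages)\n",
                                 scale, num, RANGE_PAGES(num, scale));
                          pages -= RANGE_PAGES(num, scale);
                  }
                  scale--;
          }
  }
  int main(void)
  {
          model_flush(1031);  /* ~4MB plus a few extra pages at 4K granule */
          return 0;
  }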
This change is separated into its own patch because it will also impact
non-LPA2 systems, and I want to make it easy to bisect in case it leads
to performance regression (see below for benchmarks that suggest this
should not be a problem).
The original commit (d1d3aa98 "arm64: tlb: Use the TLBI RANGE feature in
arm64") stated this as the reason for _incrementing_ scale:
However, in most scenarios, the pages = 1 when flush_tlb_range() is
called. Start from scale = 3 or other proper value (such as scale
=ilog2(pages)), will incur extra overhead. So increase 'scale' from 0
to maximum.
But pages=1 is already special cased by the non-range invalidation path,
which will take care of it the first time through the loop (both in the
original commit and in my change), so I don't think switching to
decrement scale should have any extra performance impact after all.
Indeed benchmarking kernel compilation, a TLBI-heavy workload, suggests
that this new approach actually _improves_ performance slightly (using a
virtual machine on Apple M2):
Table shows time to execute kernel compilation workload with 8 jobs,
relative to baseline without this patch (more negative number is
bigger speedup). Repeated 9 times across 3 system reboots:
| counter | mean | stdev |
|:----------|-----------:|----------:|
| real-time | -0.6% | 0.0% |
| kern-time | -1.6% | 0.5% |
| user-time | -0.4% | 0.1% |
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index bb2c2833a987..36acdb3d16a5 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -350,14 +350,14 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
* entries one by one at the granularity of 'stride'. If the TLB
* range ops are supported, then:
*
- * 1. If 'pages' is odd, flush the first page through non-range
- * operations;
+ * 1. The minimum range granularity is decided by 'scale', so multiple range
+ * TLBI operations may be required. Start from scale = 3, flush the largest
+ * possible number of pages ((num+1)*2^(5*scale+1)) that fit into the
+ * requested range, then decrement scale and continue until one or zero pages
+ * are left.
*
- * 2. For remaining pages: the minimum range granularity is decided
- * by 'scale', so multiple range TLBI operations may be required.
- * Start from scale = 0, flush the corresponding number of pages
- * ((num+1)*2^(5*scale+1) starting from 'addr'), then increase it
- * until no pages left.
+ * 2. If there is 1 page remaining, flush it through non-range operations. Range
+ * operations can only span an even number of pages.
*
* Note that certain ranges can be represented by either num = 31 and
* scale or num = 0 and scale + 1. The loop below favours the latter
@@ -367,12 +367,12 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
asid, tlb_level, tlbi_user) \
do { \
int num = 0; \
- int scale = 0; \
+ int scale = 3; \
unsigned long addr; \
\
while (pages > 0) { \
if (!system_supports_tlb_range() || \
- pages % 2 == 1) { \
+ pages == 1) { \
addr = __TLBI_VADDR(start, asid); \
__tlbi_level(op, addr, tlb_level); \
if (tlbi_user) \
@@ -392,7 +392,7 @@ do { \
start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
pages -= __TLBI_RANGE_PAGES(num, scale); \
} \
- scale++; \
+ scale--; \
} \
} while (0)
--
2.25.1
* [PATCH v5 02/12] arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 01/12] arm64/mm: Modify range-based tlbi to decrement scale Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 03/12] arm64/mm: Update tlb invalidation routines for FEAT_LPA2 Ryan Roberts
` (10 subsequent siblings)
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
Add stub functions which initially always return false. These provide
the hooks that we need in order to update the range-based TLBI routines,
whose operands are encoded differently depending on whether lpa2 is
enabled or not.
The kernel and kvm will enable the use of lpa2 asynchronously in future,
and part of that enablement will involve fleshing out their respective
hooks to advertise when each is using lpa2.
Since the kernel's decision to use lpa2 relies on more than just whether
the HW supports the feature, it can't just use the same static key as
kvm. This is another reason to use separate functions. lpa2_is_enabled()
is already implemented as part of Ard's kernel lpa2 series. Since kvm
will make its decision solely based on HW support, kvm_lpa2_is_enabled()
will be defined as system_supports_lpa2() once kvm starts using lpa2.
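For orientation, a brief sketch (not code added by this patch) of how
the hooks are consumed and later fleshed out by this series:
  /* patch 3 threads the hooks into the range-based TLBI macros: */
  __flush_tlb_range_op(vale1is, start, pages, stride, asid,
                       tlb_level, true, lpa2_is_enabled());
  /* patch 7 flips the kvm hook once kvm actually starts using lpa2: */
  #define kvm_lpa2_is_enabled()   system_supports_lpa2()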
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/kvm_mmu.h | 3 +++
arch/arm64/include/asm/pgtable-prot.h | 2 ++
2 files changed, 5 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 49e0d4b36bd0..31e8d7faed65 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -339,5 +339,8 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
{
return container_of(mmu->arch, struct kvm, arch);
}
+
+#define kvm_lpa2_is_enabled() false
+
#endif /* __ASSEMBLY__ */
#endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index e9624f6326dd..483dbfa39c4c 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -71,6 +71,8 @@ extern bool arm64_use_ng_mappings;
#define PTE_MAYBE_NG (arm64_use_ng_mappings ? PTE_NG : 0)
#define PMD_MAYBE_NG (arm64_use_ng_mappings ? PMD_SECT_NG : 0)
+#define lpa2_is_enabled() false
+
/*
* If we have userspace only BTI we don't want to mark kernel pages
* guarded even if the system does support BTI.
--
2.25.1
* [PATCH v5 03/12] arm64/mm: Update tlb invalidation routines for FEAT_LPA2
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 01/12] arm64/mm: Modify range-based tlbi to decrement scale Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 02/12] arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 04/12] arm64: Add ARM64_HAS_LPA2 CPU capability Ryan Roberts
` (9 subsequent siblings)
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
FEAT_LPA2 impacts tlb invalidation in 2 ways. Firstly, the TTL field in
the non-range tlbi instructions can now validly take a 0 value as a
level hint for the 4KB granule (this is due to the extra level of
translation) - previously TTL=0b0100 meant no hint and was treated as
0b0000. Secondly, the BADDR field of the range-based tlbi instructions
is specified in 64KB units when LPA2 is in use (TCR.DS=1), whereas it is
in page units otherwise. Changes are required for tlbi to continue to
operate correctly when LPA2 is in use.
Solve the first problem by always adding the level hint if the level is
between [0, 3] (previously anything other than 0 was hinted, which
breaks in the new level -1 case from kvm). When running on non-LPA2 HW,
0 is still safe to hint as the HW will fall back to non-hinted. While we
are at it, we replace the notion of 0 being the non-hinted sentinel with
a macro, TLBI_TTL_UNKNOWN. This means callers won't need updating
if/when translation depth increases in future.
The second issue is more complex: When LPA2 is in use, use the non-range
tlbi instructions to forward align to a 64KB boundary first, then use
range-based tlbi from there on, until we have either invalidated all
pages or we have a single page remaining. If the latter, that is done
with non-range tlbi. We determine whether LPA2 is in use based on
lpa2_is_enabled() (for kernel calls) or kvm_lpa2_is_enabled() (for kvm
calls).
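A minimal sketch of the two adjustments (illustrative helpers only; the
names below are invented for this example, PAGE_SHIFT/SZ_64K are the
usual kernel definitions, and the real logic lives in the tlbflush.h
macros changed below):
  /* BADDR is in 64KB units when LPA2 is in use (TCR.DS=1), else page units. */
  static inline unsigned long tlbi_range_baddr(unsigned long start, bool lpa2)
  {
          return start >> (lpa2 ? 16 : PAGE_SHIFT);
  }
  /* Use non-range ops until 'start' is 64KB aligned (LPA2 only), and for the
   * final single page of any range. */
  static inline bool need_single_page_op(unsigned long start,
                                         unsigned long pages, bool lpa2)
  {
          return pages == 1 || (lpa2 && (start & (SZ_64K - 1)));
  }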
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm64/include/asm/tlb.h | 15 ++++--
arch/arm64/include/asm/tlbflush.h | 90 ++++++++++++++++++++-----------
2 files changed, 68 insertions(+), 37 deletions(-)
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index 846c563689a8..0150deb332af 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -22,15 +22,15 @@ static void tlb_flush(struct mmu_gather *tlb);
#include <asm-generic/tlb.h>
/*
- * get the tlbi levels in arm64. Default value is 0 if more than one
- * of cleared_* is set or neither is set.
- * Arm64 doesn't support p4ds now.
+ * get the tlbi levels in arm64. Default value is TLBI_TTL_UNKNOWN if more than
+ * one of cleared_* is set or neither is set - this elides the level hinting to
+ * the hardware.
*/
static inline int tlb_get_level(struct mmu_gather *tlb)
{
/* The TTL field is only valid for the leaf entry. */
if (tlb->freed_tables)
- return 0;
+ return TLBI_TTL_UNKNOWN;
if (tlb->cleared_ptes && !(tlb->cleared_pmds ||
tlb->cleared_puds ||
@@ -47,7 +47,12 @@ static inline int tlb_get_level(struct mmu_gather *tlb)
tlb->cleared_p4ds))
return 1;
- return 0;
+ if (tlb->cleared_p4ds && !(tlb->cleared_ptes ||
+ tlb->cleared_pmds ||
+ tlb->cleared_puds))
+ return 0;
+
+ return TLBI_TTL_UNKNOWN;
}
static inline void tlb_flush(struct mmu_gather *tlb)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 36acdb3d16a5..1deb5d789c2e 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -94,19 +94,22 @@ static inline unsigned long get_trans_granule(void)
* When ARMv8.4-TTL exists, TLBI operations take an additional hint for
* the level at which the invalidation must take place. If the level is
* wrong, no invalidation may take place. In the case where the level
- * cannot be easily determined, a 0 value for the level parameter will
- * perform a non-hinted invalidation.
+ * cannot be easily determined, the value TLBI_TTL_UNKNOWN will perform
+ * a non-hinted invalidation. Any provided level outside the hint range
+ * will also cause fall-back to non-hinted invalidation.
*
* For Stage-2 invalidation, use the level values provided to that effect
* in asm/stage2_pgtable.h.
*/
#define TLBI_TTL_MASK GENMASK_ULL(47, 44)
+#define TLBI_TTL_UNKNOWN INT_MAX
+
#define __tlbi_level(op, addr, level) do { \
u64 arg = addr; \
\
if (alternative_has_cap_unlikely(ARM64_HAS_ARMv8_4_TTL) && \
- level) { \
+ level >= 0 && level <= 3) { \
u64 ttl = level & 3; \
ttl |= get_trans_granule() << 2; \
arg &= ~TLBI_TTL_MASK; \
@@ -122,28 +125,34 @@ static inline unsigned long get_trans_granule(void)
} while (0)
/*
- * This macro creates a properly formatted VA operand for the TLB RANGE.
- * The value bit assignments are:
+ * This macro creates a properly formatted VA operand for the TLB RANGE. The
+ * value bit assignments are:
*
* +----------+------+-------+-------+-------+----------------------+
* | ASID | TG | SCALE | NUM | TTL | BADDR |
* +-----------------+-------+-------+-------+----------------------+
* |63 48|47 46|45 44|43 39|38 37|36 0|
*
- * The address range is determined by below formula:
- * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE)
+ * The address range is determined by below formula: [BADDR, BADDR + (NUM + 1) *
+ * 2^(5*SCALE + 1) * PAGESIZE)
+ *
+ * Note that the first argument, baddr, is pre-shifted; If LPA2 is in use, BADDR
+ * holds addr[52:16]. Else BADDR holds page number. See for example ARM DDI
+ * 0487J.a section C5.5.60 "TLBI VAE1IS, TLBI VAE1ISNXS, TLB Invalidate by VA,
+ * EL1, Inner Shareable".
*
*/
-#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl) \
- ({ \
- unsigned long __ta = (addr) >> PAGE_SHIFT; \
- __ta &= GENMASK_ULL(36, 0); \
- __ta |= (unsigned long)(ttl) << 37; \
- __ta |= (unsigned long)(num) << 39; \
- __ta |= (unsigned long)(scale) << 44; \
- __ta |= get_trans_granule() << 46; \
- __ta |= (unsigned long)(asid) << 48; \
- __ta; \
+#define __TLBI_VADDR_RANGE(baddr, asid, scale, num, ttl) \
+ ({ \
+ unsigned long __ta = (baddr); \
+ unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0; \
+ __ta &= GENMASK_ULL(36, 0); \
+ __ta |= __ttl << 37; \
+ __ta |= (unsigned long)(num) << 39; \
+ __ta |= (unsigned long)(scale) << 44; \
+ __ta |= get_trans_granule() << 46; \
+ __ta |= (unsigned long)(asid) << 48; \
+ __ta; \
})
/* These macros are used by the TLBI RANGE feature. */
@@ -216,12 +225,16 @@ static inline unsigned long get_trans_granule(void)
* CPUs, ensuring that any walk-cache entries associated with the
* translation are also invalidated.
*
- * __flush_tlb_range(vma, start, end, stride, last_level)
+ * __flush_tlb_range(vma, start, end, stride, last_level, tlb_level)
* Invalidate the virtual-address range '[start, end)' on all
* CPUs for the user address space corresponding to 'vma->mm'.
* The invalidation operations are issued at a granularity
* determined by 'stride' and only affect any walk-cache entries
- * if 'last_level' is equal to false.
+ * if 'last_level' is equal to false. tlb_level is the level at
+ * which the invalidation must take place. If the level is wrong,
+ * no invalidation may take place. In the case where the level
+ * cannot be easily determined, the value TLBI_TTL_UNKNOWN will
+ * perform a non-hinted invalidation.
*
*
* Finally, take a look at asm/tlb.h to see how tlb_flush() is implemented
@@ -345,34 +358,44 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
* @tlb_level: Translation Table level hint, if known
* @tlbi_user: If 'true', call an additional __tlbi_user()
* (typically for user ASIDs). 'flase' for IPA instructions
+ * @lpa2: If 'true', the lpa2 scheme is used as set out below
*
* When the CPU does not support TLB range operations, flush the TLB
* entries one by one at the granularity of 'stride'. If the TLB
* range ops are supported, then:
*
- * 1. The minimum range granularity is decided by 'scale', so multiple range
+ * 1. If FEAT_LPA2 is in use, the start address of a range operation must be
+ * 64KB aligned, so flush pages one by one until the alignment is reached
+ * using the non-range operations. This step is skipped if LPA2 is not in
+ * use.
+ *
+ * 2. The minimum range granularity is decided by 'scale', so multiple range
* TLBI operations may be required. Start from scale = 3, flush the largest
* possible number of pages ((num+1)*2^(5*scale+1)) that fit into the
* requested range, then decrement scale and continue until one or zero pages
- * are left.
+ * are left. We must start from highest scale to ensure 64KB start alignment
+ * is maintained in the LPA2 case.
*
- * 2. If there is 1 page remaining, flush it through non-range operations. Range
- * operations can only span an even number of pages.
+ * 3. If there is 1 page remaining, flush it through non-range operations. Range
+ * operations can only span an even number of pages. We save this for last to
+ * ensure 64KB start alignment is maintained for the LPA2 case.
*
* Note that certain ranges can be represented by either num = 31 and
* scale or num = 0 and scale + 1. The loop below favours the latter
* since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
*/
#define __flush_tlb_range_op(op, start, pages, stride, \
- asid, tlb_level, tlbi_user) \
+ asid, tlb_level, tlbi_user, lpa2) \
do { \
int num = 0; \
int scale = 3; \
+ int shift = lpa2 ? 16 : PAGE_SHIFT; \
unsigned long addr; \
\
while (pages > 0) { \
if (!system_supports_tlb_range() || \
- pages == 1) { \
+ pages == 1 || \
+ (lpa2 && start != ALIGN(start, SZ_64K))) { \
addr = __TLBI_VADDR(start, asid); \
__tlbi_level(op, addr, tlb_level); \
if (tlbi_user) \
@@ -384,8 +407,8 @@ do { \
\
num = __TLBI_RANGE_NUM(pages, scale); \
if (num >= 0) { \
- addr = __TLBI_VADDR_RANGE(start, asid, scale, \
- num, tlb_level); \
+ addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
+ scale, num, tlb_level); \
__tlbi(r##op, addr); \
if (tlbi_user) \
__tlbi_user(r##op, addr); \
@@ -397,7 +420,7 @@ do { \
} while (0)
#define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
- __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false)
+ __flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());
static inline void __flush_tlb_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end,
@@ -427,9 +450,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
asid = ASID(vma->vm_mm);
if (last_level)
- __flush_tlb_range_op(vale1is, start, pages, stride, asid, tlb_level, true);
+ __flush_tlb_range_op(vale1is, start, pages, stride, asid,
+ tlb_level, true, lpa2_is_enabled());
else
- __flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true);
+ __flush_tlb_range_op(vae1is, start, pages, stride, asid,
+ tlb_level, true, lpa2_is_enabled());
dsb(ish);
mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
@@ -441,9 +466,10 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
/*
* We cannot use leaf-only invalidation here, since we may be invalidating
* table entries as part of collapsing hugepages or moving page tables.
- * Set the tlb_level to 0 because we can not get enough information here.
+ * Set the tlb_level to TLBI_TTL_UNKNOWN because we can not get enough
+ * information here.
*/
- __flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+ __flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
}
static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
--
2.25.1
* [PATCH v5 04/12] arm64: Add ARM64_HAS_LPA2 CPU capability
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (2 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 03/12] arm64/mm: Update tlb invalidation routines for FEAT_LPA2 Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-22 15:14 ` Marc Zyngier
2023-11-16 14:29 ` [PATCH v5 05/12] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Ryan Roberts
` (8 subsequent siblings)
12 siblings, 1 reply; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
Expose FEAT_LPA2 as a capability so that we can take advantage of
alternatives patching in the hypervisor.
Although FEAT_LPA2 presence is advertised separately for stage1 and
stage2, the expectation is that in practice both stages will either
support or not support it. Therefore, we combine both into a single
capability, allowing us to simplify the implementation. KVM requires
support in both stages in order to use LPA2 since the same library is
used for hyp stage 1 and guest stage 2 pgtables.
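For example, later patches in this series consume the capability like
this (sketch condensed from patch 7):
  if (system_supports_lpa2())
          vtcr |= VTCR_EL2_DS;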
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/cpufeature.h | 5 ++++
arch/arm64/kernel/cpufeature.c | 39 +++++++++++++++++++++++++++++
arch/arm64/tools/cpucaps | 1 +
3 files changed, 45 insertions(+)
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index f6d416fe49b0..acf109581ac0 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -819,6 +819,11 @@ static inline bool system_supports_tlb_range(void)
return alternative_has_cap_unlikely(ARM64_HAS_TLB_RANGE);
}
+static inline bool system_supports_lpa2(void)
+{
+ return cpus_have_final_cap(ARM64_HAS_LPA2);
+}
+
int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0e7d0c2bab36..38dfdaff8176 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1768,6 +1768,39 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
return !meltdown_safe;
}
+#if defined(CONFIG_ARM64_4K_PAGES) || defined(CONFIG_ARM64_16K_PAGES)
+static bool has_lpa2_at_stage1(u64 mmfr0)
+{
+ unsigned int tgran;
+
+ tgran = cpuid_feature_extract_unsigned_field(mmfr0,
+ ID_AA64MMFR0_EL1_TGRAN_SHIFT);
+ return tgran == ID_AA64MMFR0_EL1_TGRAN_LPA2;
+}
+
+static bool has_lpa2_at_stage2(u64 mmfr0)
+{
+ unsigned int tgran;
+
+ tgran = cpuid_feature_extract_unsigned_field(mmfr0,
+ ID_AA64MMFR0_EL1_TGRAN_2_SHIFT);
+ return tgran == ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2;
+}
+
+static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope)
+{
+ u64 mmfr0;
+
+ mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+ return has_lpa2_at_stage1(mmfr0) && has_lpa2_at_stage2(mmfr0);
+}
+#else
+static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope)
+{
+ return false;
+}
+#endif
+
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
#define KPTI_NG_TEMP_VA (-(1UL << PMD_SHIFT))
@@ -2733,6 +2766,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.matches = has_cpuid_feature,
ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
},
+ {
+ .desc = "52-bit Virtual Addressing for KVM (LPA2)",
+ .capability = ARM64_HAS_LPA2,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = has_lpa2,
+ },
{},
};
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index b98c38288a9d..919eceb0b3da 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -37,6 +37,7 @@ HAS_GIC_PRIO_MASKING
HAS_GIC_PRIO_RELAXED_SYNC
HAS_HCX
HAS_LDAPR
+HAS_LPA2
HAS_LSE_ATOMICS
HAS_MOPS
HAS_NESTED_VIRT
--
2.25.1
* Re: [PATCH v5 04/12] arm64: Add ARM64_HAS_LPA2 CPU capability
2023-11-16 14:29 ` [PATCH v5 04/12] arm64: Add ARM64_HAS_LPA2 CPU capability Ryan Roberts
@ 2023-11-22 15:14 ` Marc Zyngier
0 siblings, 0 replies; 26+ messages in thread
From: Marc Zyngier @ 2023-11-22 15:14 UTC (permalink / raw)
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Oliver Upton, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On Thu, 16 Nov 2023 14:29:23 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> Expose FEAT_LPA2 as a capability so that we can take advantage of
> alternatives patching in the hypervisor.
>
> Although FEAT_LPA2 presence is advertised separately for stage1 and
> stage2, the expectation is that in practice both stages will either
> support or not support it. Therefore, we combine both into a single
> capability, allowing us to simplify the implementation. KVM requires
> support in both stages in order to use LPA2 since the same library is
> used for hyp stage 1 and guest stage 2 pgtables.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> arch/arm64/include/asm/cpufeature.h | 5 ++++
> arch/arm64/kernel/cpufeature.c | 39 +++++++++++++++++++++++++++++
> arch/arm64/tools/cpucaps | 1 +
> 3 files changed, 45 insertions(+)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index f6d416fe49b0..acf109581ac0 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -819,6 +819,11 @@ static inline bool system_supports_tlb_range(void)
> return alternative_has_cap_unlikely(ARM64_HAS_TLB_RANGE);
> }
>
> +static inline bool system_supports_lpa2(void)
> +{
> + return cpus_have_final_cap(ARM64_HAS_LPA2);
> +}
> +
> int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
> bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 0e7d0c2bab36..38dfdaff8176 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1768,6 +1768,39 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> return !meltdown_safe;
> }
>
> +#if defined(CONFIG_ARM64_4K_PAGES) || defined(CONFIG_ARM64_16K_PAGES)
nit: if you move patch #5 before this one, this can be gated by the
actual definition of ID_AA64MMFR0_EL1_TGRAN_{,2_}LPA2 instead of the
page size indirection. I personally would find it slightly more
readable.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* [PATCH v5 05/12] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (3 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 04/12] arm64: Add ARM64_HAS_LPA2 CPU capability Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 06/12] KVM: arm64: Add new (V)TCR_EL2 field definitions for FEAT_LPA2 Ryan Roberts
` (7 subsequent siblings)
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
From: Anshuman Khandual <anshuman.khandual@arm.com>
PAGE_SIZE support is tested against the possible minimum and maximum values
of its respective ID_AA64MMFR0.TGRAN field, depending on whether the field is
signed or unsigned. FEAT_LPA2 support additionally needs to be validated for
the 4K and 16K page sizes via feature-specific ID_AA64MMFR0.TGRAN values.
Hence, add the FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] values per the Arm
ARM (0487G.A).
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/sysreg.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 5e65f51c10d2..48181cf6cc40 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -871,10 +871,12 @@
/* id_aa64mmfr0 */
#define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN 0x0
+#define ID_AA64MMFR0_EL1_TGRAN4_LPA2 ID_AA64MMFR0_EL1_TGRAN4_52_BIT
#define ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX 0x7
#define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MIN 0x0
#define ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MAX 0x7
#define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN 0x1
+#define ID_AA64MMFR0_EL1_TGRAN16_LPA2 ID_AA64MMFR0_EL1_TGRAN16_52_BIT
#define ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX 0xf
#define ARM64_MIN_PARANGE_BITS 32
@@ -882,6 +884,7 @@
#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_DEFAULT 0x0
#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_NONE 0x1
#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MIN 0x2
+#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_LPA2 0x3
#define ID_AA64MMFR0_EL1_TGRAN_2_SUPPORTED_MAX 0x7
#ifdef CONFIG_ARM64_PA_BITS_52
@@ -892,11 +895,13 @@
#if defined(CONFIG_ARM64_4K_PAGES)
#define ID_AA64MMFR0_EL1_TGRAN_SHIFT ID_AA64MMFR0_EL1_TGRAN4_SHIFT
+#define ID_AA64MMFR0_EL1_TGRAN_LPA2 ID_AA64MMFR0_EL1_TGRAN4_52_BIT
#define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN
#define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX
#define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT ID_AA64MMFR0_EL1_TGRAN4_2_SHIFT
#elif defined(CONFIG_ARM64_16K_PAGES)
#define ID_AA64MMFR0_EL1_TGRAN_SHIFT ID_AA64MMFR0_EL1_TGRAN16_SHIFT
+#define ID_AA64MMFR0_EL1_TGRAN_LPA2 ID_AA64MMFR0_EL1_TGRAN16_52_BIT
#define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MIN ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN
#define ID_AA64MMFR0_EL1_TGRAN_SUPPORTED_MAX ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX
#define ID_AA64MMFR0_EL1_TGRAN_2_SHIFT ID_AA64MMFR0_EL1_TGRAN16_2_SHIFT
--
2.25.1
* [PATCH v5 06/12] KVM: arm64: Add new (V)TCR_EL2 field definitions for FEAT_LPA2
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (4 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 05/12] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1 Ryan Roberts
` (6 subsequent siblings)
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
As per the Arm ARM (0487I.a), the (V)TCR_EL2.DS fields control whether
52-bit input and output addresses are supported on 4K and 16K page size
configurations when FEAT_LPA2 is known to have been implemented.
Add these field definitions, which will be used by KVM when FEAT_LPA2
is enabled.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
---
arch/arm64/include/asm/kvm_arm.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index b85f46a73e21..312cbc300831 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -108,6 +108,7 @@
#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En)
/* TCR_EL2 Registers bits */
+#define TCR_EL2_DS (1UL << 32)
#define TCR_EL2_RES1 ((1U << 31) | (1 << 23))
#define TCR_EL2_TBI (1 << 20)
#define TCR_EL2_PS_SHIFT 16
@@ -122,6 +123,7 @@
TCR_EL2_ORGN0_MASK | TCR_EL2_IRGN0_MASK | TCR_EL2_T0SZ_MASK)
/* VTCR_EL2 Registers bits */
+#define VTCR_EL2_DS TCR_EL2_DS
#define VTCR_EL2_RES1 (1U << 31)
#define VTCR_EL2_HD (1 << 22)
#define VTCR_EL2_HA (1 << 21)
--
2.25.1
* [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (5 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 06/12] KVM: arm64: Add new (V)TCR_EL2 field definitions for FEAT_LPA2 Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-21 20:34 ` Oliver Upton
2023-11-16 14:29 ` [PATCH v5 08/12] KVM: arm64: Convert translation level parameter to s8 Ryan Roberts
` (5 subsequent siblings)
12 siblings, 1 reply; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
Implement a simple policy whereby if the HW supports FEAT_LPA2 for the
page size we are using, always use LPA2-style page-tables for stage 2
and hyp stage 1 (assuming an nvhe hyp), regardless of the VMM-requested
IPA size or HW-implemented PA size. When in use we can now support up to
52-bit IPA and PA sizes.
We use the previously created cpu feature to track whether LPA2 is
supported for deciding whether to use the LPA2 or classic pte format.
Note that FEAT_LPA2 brings support for bigger block mappings (512GB with
4KB, 64GB with 16KB). We explicitly don't enable these in the library
because stage2_apply_range() works on batch sizes of the largest used
block mapping, and increasing the size of the batch would lead to soft
lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit
stage2_apply_range() batch size to largest block").
With the addition of LPA2 support in the hypervisor, the PA size
supported by the HW must be capped with a runtime decision, rather than
simply using a compile-time decision based on PA_BITS. For example, on a
system that advertises 52-bit PA but does not support FEAT_LPA2, a 4KB
or 16KB kernel compiled with LPA2 support must still limit the PA size
to 48 bits.
Therefore, move the insertion of the PS field into TCR_EL2 out of
__kvm_hyp_init assembly code and instead do it in cpu_prepare_hyp_mode()
where the rest of TCR_EL2 is prepared. This allows us to figure out PS
with kvm_get_parange(), which has the appropriate logic to ensure the
above requirement. (and the PS field of VTCR_EL2 is already populated
this way).
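Condensed sketch of the resulting runtime flow (taken from the hunks
below, for illustration only):
  u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
  /* kvm_get_parange() clamps the advertised PARange to
   * kvm_get_parange_max(), which only returns the 52-bit encoding when
   * LPA2 is supported (or with 64K pages and FEAT_LPA). */
  tcr &= ~TCR_EL2_PS_MASK;
  tcr |= FIELD_PREP(TCR_EL2_PS_MASK, kvm_get_parange(mmfr0));
  if (system_supports_lpa2())
          tcr |= TCR_EL2_DS;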
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/kvm_mmu.h | 2 +-
arch/arm64/include/asm/kvm_pgtable.h | 47 +++++++++++++++++++++-------
arch/arm64/kvm/arm.c | 5 +++
arch/arm64/kvm/hyp/nvhe/hyp-init.S | 4 ---
arch/arm64/kvm/hyp/pgtable.c | 15 +++++++--
5 files changed, 54 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 31e8d7faed65..f4e4fcb35afc 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -340,7 +340,7 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
return container_of(mmu->arch, struct kvm, arch);
}
-#define kvm_lpa2_is_enabled() false
+#define kvm_lpa2_is_enabled() system_supports_lpa2()
#endif /* __ASSEMBLY__ */
#endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index d3e354bb8351..d738c47d8a77 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -25,12 +25,22 @@
#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U
#endif
+static inline u64 kvm_get_parange_max(void)
+{
+ if (system_supports_lpa2() ||
+ (IS_ENABLED(CONFIG_ARM64_PA_BITS_52) && PAGE_SHIFT == 16))
+ return ID_AA64MMFR0_EL1_PARANGE_52;
+ else
+ return ID_AA64MMFR0_EL1_PARANGE_48;
+}
+
static inline u64 kvm_get_parange(u64 mmfr0)
{
+ u64 parange_max = kvm_get_parange_max();
u64 parange = cpuid_feature_extract_unsigned_field(mmfr0,
ID_AA64MMFR0_EL1_PARANGE_SHIFT);
- if (parange > ID_AA64MMFR0_EL1_PARANGE_MAX)
- parange = ID_AA64MMFR0_EL1_PARANGE_MAX;
+ if (parange > parange_max)
+ parange = parange_max;
return parange;
}
@@ -41,6 +51,8 @@ typedef u64 kvm_pte_t;
#define KVM_PTE_ADDR_MASK GENMASK(47, PAGE_SHIFT)
#define KVM_PTE_ADDR_51_48 GENMASK(15, 12)
+#define KVM_PTE_ADDR_MASK_LPA2 GENMASK(49, PAGE_SHIFT)
+#define KVM_PTE_ADDR_51_50_LPA2 GENMASK(9, 8)
#define KVM_PHYS_INVALID (-1ULL)
@@ -51,21 +63,34 @@ static inline bool kvm_pte_valid(kvm_pte_t pte)
static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
{
- u64 pa = pte & KVM_PTE_ADDR_MASK;
-
- if (PAGE_SHIFT == 16)
- pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
+ u64 pa;
+
+ if (system_supports_lpa2()) {
+ pa = pte & KVM_PTE_ADDR_MASK_LPA2;
+ pa |= FIELD_GET(KVM_PTE_ADDR_51_50_LPA2, pte) << 50;
+ } else {
+ pa = pte & KVM_PTE_ADDR_MASK;
+ if (PAGE_SHIFT == 16)
+ pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
+ }
return pa;
}
static inline kvm_pte_t kvm_phys_to_pte(u64 pa)
{
- kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
-
- if (PAGE_SHIFT == 16) {
- pa &= GENMASK(51, 48);
- pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
+ kvm_pte_t pte;
+
+ if (system_supports_lpa2()) {
+ pte = pa & KVM_PTE_ADDR_MASK_LPA2;
+ pa &= GENMASK(51, 50);
+ pte |= FIELD_PREP(KVM_PTE_ADDR_51_50_LPA2, pa >> 50);
+ } else {
+ pte = pa & KVM_PTE_ADDR_MASK;
+ if (PAGE_SHIFT == 16) {
+ pa &= GENMASK(51, 48);
+ pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
+ }
}
return pte;
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e5f75f1f1085..082100c582e2 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1837,6 +1837,7 @@ static int kvm_init_vector_slots(void)
static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
{
struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
+ u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
unsigned long tcr;
/*
@@ -1859,6 +1860,10 @@ static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
}
tcr &= ~TCR_T0SZ_MASK;
tcr |= TCR_T0SZ(hyp_va_bits);
+ tcr &= ~TCR_EL2_PS_MASK;
+ tcr |= FIELD_PREP(TCR_EL2_PS_MASK, kvm_get_parange(mmfr0));
+ if (system_supports_lpa2())
+ tcr |= TCR_EL2_DS;
params->tcr_el2 = tcr;
params->pgd_pa = kvm_mmu_get_httbr();
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
index 1cc06e6797bd..f62a7d360285 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
@@ -122,11 +122,7 @@ alternative_if ARM64_HAS_CNP
alternative_else_nop_endif
msr ttbr0_el2, x2
- /*
- * Set the PS bits in TCR_EL2.
- */
ldr x0, [x0, #NVHE_INIT_TCR_EL2]
- tcr_compute_pa_size x0, #TCR_EL2_PS_SHIFT, x1, x2
msr tcr_el2, x0
isb
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 1966fdee740e..e0cf96bafe4a 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -79,7 +79,10 @@ static bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx)
static bool kvm_phys_is_valid(u64 phys)
{
- return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_EL1_PARANGE_MAX));
+ u64 parange_max = kvm_get_parange_max();
+ u8 shift = id_aa64mmfr0_parange_to_phys_shift(parange_max);
+
+ return phys < BIT(shift);
}
static bool kvm_block_mapping_supported(const struct kvm_pgtable_visit_ctx *ctx, u64 phys)
@@ -408,7 +411,8 @@ static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
}
attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap);
- attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
+ if (!system_supports_lpa2())
+ attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF;
attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
*ptep = attr;
@@ -654,6 +658,9 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
vtcr |= VTCR_EL2_HA;
#endif /* CONFIG_ARM64_HW_AFDBM */
+ if (system_supports_lpa2())
+ vtcr |= VTCR_EL2_DS;
+
/* Set the vmid bits */
vtcr |= (get_vmid_bits(mmfr1) == 16) ?
VTCR_EL2_VS_16BIT :
@@ -711,7 +718,9 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p
if (prot & KVM_PGTABLE_PROT_W)
attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
- attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
+ if (!system_supports_lpa2())
+ attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
+
attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
*ptep = attr;
--
2.25.1
* Re: [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
2023-11-16 14:29 ` [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1 Ryan Roberts
@ 2023-11-21 20:34 ` Oliver Upton
2023-11-22 13:41 ` Ryan Roberts
0 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2023-11-21 20:34 UTC (permalink / raw)
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
> Implement a simple policy whereby if the HW supports FEAT_LPA2 for the
> page size we are using, always use LPA2-style page-tables for stage 2
> and hyp stage 1 (assuming an nvhe hyp), regardless of the VMM-requested
> IPA size or HW-implemented PA size. When in use we can now support up to
> 52-bit IPA and PA sizes.
>
> We use the previously created cpu feature to track whether LPA2 is
> supported for deciding whether to use the LPA2 or classic pte format.
>
> Note that FEAT_LPA2 brings support for bigger block mappings (512GB with
> 4KB, 64GB with 16KB). We explicitly don't enable these in the library
> because stage2_apply_range() works on batch sizes of the largest used
> block mapping, and increasing the size of the batch would lead to soft
> lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit
> stage2_apply_range() batch size to largest block").
>
> With the addition of LPA2 support in the hypervisor, the PA size
> supported by the HW must be capped with a runtime decision, rather than
> simply using a compile-time decision based on PA_BITS. For example, on a
> system that advertises 52 bit PA but does not support FEAT_LPA2, A 4KB
> or 16KB kernel compiled with LPA2 support must still limit the PA size
> to 48 bits.
>
> Therefore, move the insertion of the PS field into TCR_EL2 out of
> __kvm_hyp_init assembly code and instead do it in cpu_prepare_hyp_mode()
> where the rest of TCR_EL2 is prepared. This allows us to figure out PS
> with kvm_get_parange(), which has the appropriate logic to ensure the
> above requirement. (and the PS field of VTCR_EL2 is already populated
> this way).
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> arch/arm64/include/asm/kvm_mmu.h | 2 +-
> arch/arm64/include/asm/kvm_pgtable.h | 47 +++++++++++++++++++++-------
> arch/arm64/kvm/arm.c | 5 +++
> arch/arm64/kvm/hyp/nvhe/hyp-init.S | 4 ---
> arch/arm64/kvm/hyp/pgtable.c | 15 +++++++--
> 5 files changed, 54 insertions(+), 19 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 31e8d7faed65..f4e4fcb35afc 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -340,7 +340,7 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
> return container_of(mmu->arch, struct kvm, arch);
> }
>
> -#define kvm_lpa2_is_enabled() false
> +#define kvm_lpa2_is_enabled() system_supports_lpa2()
Can we use this predicate consistently throughout the KVM code? Looks
like the rest of this diff is using system_supports_lpa2() directly.
--
Thanks,
Oliver
* Re: [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
2023-11-21 20:34 ` Oliver Upton
@ 2023-11-22 13:41 ` Ryan Roberts
2023-11-22 15:21 ` Marc Zyngier
0 siblings, 1 reply; 26+ messages in thread
From: Ryan Roberts @ 2023-11-22 13:41 UTC (permalink / raw)
To: Oliver Upton
Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On 21/11/2023 20:34, Oliver Upton wrote:
> On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
>> Implement a simple policy whereby if the HW supports FEAT_LPA2 for the
>> page size we are using, always use LPA2-style page-tables for stage 2
>> and hyp stage 1 (assuming an nvhe hyp), regardless of the VMM-requested
>> IPA size or HW-implemented PA size. When in use we can now support up to
>> 52-bit IPA and PA sizes.
>>
>> We use the previously created cpu feature to track whether LPA2 is
>> supported for deciding whether to use the LPA2 or classic pte format.
>>
>> Note that FEAT_LPA2 brings support for bigger block mappings (512GB with
>> 4KB, 64GB with 16KB). We explicitly don't enable these in the library
>> because stage2_apply_range() works on batch sizes of the largest used
>> block mapping, and increasing the size of the batch would lead to soft
>> lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit
>> stage2_apply_range() batch size to largest block").
>>
>> With the addition of LPA2 support in the hypervisor, the PA size
>> supported by the HW must be capped with a runtime decision, rather than
>> simply using a compile-time decision based on PA_BITS. For example, on a
>> system that advertises 52 bit PA but does not support FEAT_LPA2, A 4KB
>> or 16KB kernel compiled with LPA2 support must still limit the PA size
>> to 48 bits.
>>
>> Therefore, move the insertion of the PS field into TCR_EL2 out of
>> __kvm_hyp_init assembly code and instead do it in cpu_prepare_hyp_mode()
>> where the rest of TCR_EL2 is prepared. This allows us to figure out PS
>> with kvm_get_parange(), which has the appropriate logic to ensure the
>> above requirement. (and the PS field of VTCR_EL2 is already populated
>> this way).
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>> arch/arm64/include/asm/kvm_mmu.h | 2 +-
>> arch/arm64/include/asm/kvm_pgtable.h | 47 +++++++++++++++++++++-------
>> arch/arm64/kvm/arm.c | 5 +++
>> arch/arm64/kvm/hyp/nvhe/hyp-init.S | 4 ---
>> arch/arm64/kvm/hyp/pgtable.c | 15 +++++++--
>> 5 files changed, 54 insertions(+), 19 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 31e8d7faed65..f4e4fcb35afc 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -340,7 +340,7 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
>> return container_of(mmu->arch, struct kvm, arch);
>> }
>>
>> -#define kvm_lpa2_is_enabled() false
>> +#define kvm_lpa2_is_enabled() system_supports_lpa2()
>
> Can we use this predicate consistently throughout the KVM code? Looks
> like the rest of this diff is using system_supports_lpa2() directly.
My thinking was that system_supports_lpa2() is an input to KVM's policy to
decide if it is going to use LPA2 (currently that policy is very simple - if the
system supports it, then KVM uses it - but it doesn't have to be that way), and
kvm_lpa2_is_enabled() is how KVM exports its policy decision, so one is an input
and the other is an output.
It's a lightly held opinion though - I'll make the change if you insist? :)
* Re: [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
2023-11-22 13:41 ` Ryan Roberts
@ 2023-11-22 15:21 ` Marc Zyngier
2023-11-24 11:49 ` Ryan Roberts
0 siblings, 1 reply; 26+ messages in thread
From: Marc Zyngier @ 2023-11-22 15:21 UTC (permalink / raw)
To: Ryan Roberts
Cc: Oliver Upton, Catalin Marinas, Will Deacon, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On Wed, 22 Nov 2023 13:41:33 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 21/11/2023 20:34, Oliver Upton wrote:
> > On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
> >> Implement a simple policy whereby if the HW supports FEAT_LPA2 for the
> >> page size we are using, always use LPA2-style page-tables for stage 2
> >> and hyp stage 1 (assuming an nvhe hyp), regardless of the VMM-requested
> >> IPA size or HW-implemented PA size. When in use we can now support up to
> >> 52-bit IPA and PA sizes.
> >>
> >> We use the previously created cpu feature to track whether LPA2 is
> >> supported for deciding whether to use the LPA2 or classic pte format.
> >>
> >> Note that FEAT_LPA2 brings support for bigger block mappings (512GB with
> >> 4KB, 64GB with 16KB). We explicitly don't enable these in the library
> >> because stage2_apply_range() works on batch sizes of the largest used
> >> block mapping, and increasing the size of the batch would lead to soft
> >> lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit
> >> stage2_apply_range() batch size to largest block").
> >>
> >> With the addition of LPA2 support in the hypervisor, the PA size
> >> supported by the HW must be capped with a runtime decision, rather than
> >> simply using a compile-time decision based on PA_BITS. For example, on a
> >> system that advertises 52 bit PA but does not support FEAT_LPA2, A 4KB
> >> or 16KB kernel compiled with LPA2 support must still limit the PA size
> >> to 48 bits.
> >>
> >> Therefore, move the insertion of the PS field into TCR_EL2 out of
> >> __kvm_hyp_init assembly code and instead do it in cpu_prepare_hyp_mode()
> >> where the rest of TCR_EL2 is prepared. This allows us to figure out PS
> >> with kvm_get_parange(), which has the appropriate logic to ensure the
> >> above requirement. (and the PS field of VTCR_EL2 is already populated
> >> this way).
> >>
> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> >> ---
> >> arch/arm64/include/asm/kvm_mmu.h | 2 +-
> >> arch/arm64/include/asm/kvm_pgtable.h | 47 +++++++++++++++++++++-------
> >> arch/arm64/kvm/arm.c | 5 +++
> >> arch/arm64/kvm/hyp/nvhe/hyp-init.S | 4 ---
> >> arch/arm64/kvm/hyp/pgtable.c | 15 +++++++--
> >> 5 files changed, 54 insertions(+), 19 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> >> index 31e8d7faed65..f4e4fcb35afc 100644
> >> --- a/arch/arm64/include/asm/kvm_mmu.h
> >> +++ b/arch/arm64/include/asm/kvm_mmu.h
> >> @@ -340,7 +340,7 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
> >> return container_of(mmu->arch, struct kvm, arch);
> >> }
> >>
> >> -#define kvm_lpa2_is_enabled() false
> >> +#define kvm_lpa2_is_enabled() system_supports_lpa2()
> >
> > Can we use this predicate consistently throughout the KVM code? Looks
> > like the rest of this diff is using system_supports_lpa2() directly.
>
> My thinking was that system_supports_lpa2() is an input to KVM's policy to
> decide if it is going to use LPA2 (currently that policy is very simple - if the
> system supports it, then KVM uses it - but it doesn't have to be that way), and
> kvm_lpa2_is_enabled() is how KVM exports its policy decision, so one is an input
> and the other is an output.
>
> It's a lightly held opinion though - I'll make the change if you insist? :)
<bikeshed>
I personally don't find this dichotomy very useful. It could make
sense if we used the page table walker for S1 outside of KVM, but
that's not the case at the moment.
If there is no plan for such a use case, I'd rather see a single
predicate, making the code a bit more readable.
</bikeshed>
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
2023-11-22 15:21 ` Marc Zyngier
@ 2023-11-24 11:49 ` Ryan Roberts
2023-11-27 9:32 ` Marc Zyngier
0 siblings, 1 reply; 26+ messages in thread
From: Ryan Roberts @ 2023-11-24 11:49 UTC (permalink / raw)
To: Marc Zyngier
Cc: Oliver Upton, Catalin Marinas, Will Deacon, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On 22/11/2023 15:21, Marc Zyngier wrote:
> On Wed, 22 Nov 2023 13:41:33 +0000,
> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> On 21/11/2023 20:34, Oliver Upton wrote:
>>> On Thu, Nov 16, 2023 at 02:29:26PM +0000, Ryan Roberts wrote:
>>>> Implement a simple policy whereby if the HW supports FEAT_LPA2 for the
>>>> page size we are using, always use LPA2-style page-tables for stage 2
>>>> and hyp stage 1 (assuming an nvhe hyp), regardless of the VMM-requested
>>>> IPA size or HW-implemented PA size. When in use we can now support up to
>>>> 52-bit IPA and PA sizes.
>>>>
>>>> We use the previously created cpu feature to track whether LPA2 is
>>>> supported for deciding whether to use the LPA2 or classic pte format.
>>>>
>>>> Note that FEAT_LPA2 brings support for bigger block mappings (512GB with
>>>> 4KB, 64GB with 16KB). We explicitly don't enable these in the library
>>>> because stage2_apply_range() works on batch sizes of the largest used
>>>> block mapping, and increasing the size of the batch would lead to soft
>>>> lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit
>>>> stage2_apply_range() batch size to largest block").
>>>>
>>>> With the addition of LPA2 support in the hypervisor, the PA size
>>>> supported by the HW must be capped with a runtime decision, rather than
>>>> simply using a compile-time decision based on PA_BITS. For example, on a
>>>> system that advertises 52 bit PA but does not support FEAT_LPA2, A 4KB
>>>> or 16KB kernel compiled with LPA2 support must still limit the PA size
>>>> to 48 bits.
>>>>
>>>> Therefore, move the insertion of the PS field into TCR_EL2 out of
>>>> __kvm_hyp_init assembly code and instead do it in cpu_prepare_hyp_mode()
>>>> where the rest of TCR_EL2 is prepared. This allows us to figure out PS
>>>> with kvm_get_parange(), which has the appropriate logic to ensure the
>>>> above requirement. (and the PS field of VTCR_EL2 is already populated
>>>> this way).
>>>>
>>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>>> ---
>>>> arch/arm64/include/asm/kvm_mmu.h | 2 +-
>>>> arch/arm64/include/asm/kvm_pgtable.h | 47 +++++++++++++++++++++-------
>>>> arch/arm64/kvm/arm.c | 5 +++
>>>> arch/arm64/kvm/hyp/nvhe/hyp-init.S | 4 ---
>>>> arch/arm64/kvm/hyp/pgtable.c | 15 +++++++--
>>>> 5 files changed, 54 insertions(+), 19 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>>>> index 31e8d7faed65..f4e4fcb35afc 100644
>>>> --- a/arch/arm64/include/asm/kvm_mmu.h
>>>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>>>> @@ -340,7 +340,7 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
>>>> return container_of(mmu->arch, struct kvm, arch);
>>>> }
>>>>
>>>> -#define kvm_lpa2_is_enabled() false
>>>> +#define kvm_lpa2_is_enabled() system_supports_lpa2()
>>>
>>> Can we use this predicate consistently throughout the KVM code? Looks
>>> like the rest of this diff is using system_supports_lpa2() directly.
>>
>> My thinking was that system_supports_lpa2() is an input to KVM's policy to
>> decide if it is going to use LPA2 (currently that policy is very simple - if the
>> system supports it, then KVM uses it - but it doesn't have to be that way), and
>> kvm_lpa2_is_enabled() is how KVM exports its policy decision, so one is an input
>> and the other is an output.
>>
>> It's a lightly held opinion though - I'll make the change if you insist? :)
>
> <bikeshed>
> I personally don't find this dichotomy very useful. It could make
> sense if we used the page table walker for S1 outside of KVM, but
> that's not the case at the moment.
>
> If there is no plan for such a use case, I'd rather see a single
> predicate, making the code a bit more readable.
> </bikeshed>
OK fair enough. I've made this change for the next rev.
>
> M.
>
* Re: [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
2023-11-24 11:49 ` Ryan Roberts
@ 2023-11-27 9:32 ` Marc Zyngier
2023-11-27 9:43 ` Ryan Roberts
0 siblings, 1 reply; 26+ messages in thread
From: Marc Zyngier @ 2023-11-27 9:32 UTC (permalink / raw)
To: Ryan Roberts
Cc: Oliver Upton, Catalin Marinas, Will Deacon, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On Fri, 24 Nov 2023 11:49:57 +0000,
Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> OK fair enough. I've made this change for the next rev.
Any chance you could post this new revision shortly? It looks ready to
me, and I would really like this to simmer in -next for a while.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
2023-11-27 9:32 ` Marc Zyngier
@ 2023-11-27 9:43 ` Ryan Roberts
0 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-27 9:43 UTC (permalink / raw)
To: Marc Zyngier
Cc: Oliver Upton, Catalin Marinas, Will Deacon, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On 27/11/2023 09:32, Marc Zyngier wrote:
> On Fri, 24 Nov 2023 11:49:57 +0000,
> Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> OK fair enough. I've made this change for the next rev.
>
> Any chance you could post this new revision shortly? It looks ready to
> me, and I would really like this to simmer in -next for a while.
Yes; I was just rerunning the kvm selftests over the weekend. No new issues
there. But I want to rerun the boot tests too, which I will do this morning.
Assuming that's still good, I'll post later today.
>
> Thanks,
>
> M.
>
* [PATCH v5 08/12] KVM: arm64: Convert translation level parameter to s8
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (6 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 07/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1 Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 09/12] KVM: arm64: Support up to 5 levels of translation in kvm_pgtable Ryan Roberts
` (4 subsequent siblings)
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
With the introduction of FEAT_LPA2, the Arm ARM adds a new level of
translation, level -1, so levels can now be in the range [-1;3]. Level 3
is always the last level, and the first level is determined by the
number of VA bits in use.
Convert level variables to use a signed type in preparation for
supporting this new level -1.
Since the last level is always anchored at 3, and the first level varies
to suit the number of VA/IPA bits, take the opportunity to replace
KVM_PGTABLE_MAX_LEVELS with the 2 macros KVM_PGTABLE_FIRST_LEVEL and
KVM_PGTABLE_LAST_LEVEL. This removes the assumption from the code that
levels run from 0 to KVM_PGTABLE_MAX_LEVELS - 1, which will soon no
longer be true.
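
For readers following the conversion, a minimal sketch (not part of the
patch) of the loop idiom after the change. The typedef is spelled out so
the snippet stands alone, and the macro values match the ones introduced
below (the first level only moves to -1 later in the series):

typedef signed char s8;			/* the kernel's s8 */

#define KVM_PGTABLE_FIRST_LEVEL		0	/* becomes -1 with FEAT_LPA2 */
#define KVM_PGTABLE_LAST_LEVEL		3

static void visit_all_levels(void)
{
	s8 level;	/* an unsigned level would wrap at -1 and break range checks */

	for (level = KVM_PGTABLE_FIRST_LEVEL;
	     level <= KVM_PGTABLE_LAST_LEVEL; level++) {
		/* walk the tables at 'level'; replaces the old
		 * "for (level = 0; level < KVM_PGTABLE_MAX_LEVELS; level++)"
		 */
	}
}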
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/kvm_emulate.h | 2 +-
arch/arm64/include/asm/kvm_pgtable.h | 31 +++++++------
arch/arm64/include/asm/kvm_pkvm.h | 5 +-
arch/arm64/kvm/hyp/nvhe/mem_protect.c | 6 +--
arch/arm64/kvm/hyp/nvhe/mm.c | 4 +-
arch/arm64/kvm/hyp/nvhe/setup.c | 2 +-
arch/arm64/kvm/hyp/pgtable.c | 66 +++++++++++++++------------
arch/arm64/kvm/mmu.c | 16 ++++---
8 files changed, 71 insertions(+), 61 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 78a550537b67..13fd9dbf2d1d 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -409,7 +409,7 @@ static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vc
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_TYPE;
}
-static __always_inline u8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu)
+static __always_inline s8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu)
{
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL;
}
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index d738c47d8a77..89eb648c40f3 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -11,7 +11,8 @@
#include <linux/kvm_host.h>
#include <linux/types.h>
-#define KVM_PGTABLE_MAX_LEVELS 4U
+#define KVM_PGTABLE_FIRST_LEVEL 0
+#define KVM_PGTABLE_LAST_LEVEL 3
/*
* The largest supported block sizes for KVM (no 52-bit PA support):
@@ -20,9 +21,9 @@
* - 64K (level 2): 512MB
*/
#ifdef CONFIG_ARM64_4K_PAGES
-#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1U
+#define KVM_PGTABLE_MIN_BLOCK_LEVEL 1
#else
-#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2U
+#define KVM_PGTABLE_MIN_BLOCK_LEVEL 2
#endif
static inline u64 kvm_get_parange_max(void)
@@ -101,28 +102,28 @@ static inline kvm_pfn_t kvm_pte_to_pfn(kvm_pte_t pte)
return __phys_to_pfn(kvm_pte_to_phys(pte));
}
-static inline u64 kvm_granule_shift(u32 level)
+static inline u64 kvm_granule_shift(s8 level)
{
- /* Assumes KVM_PGTABLE_MAX_LEVELS is 4 */
+ /* Assumes KVM_PGTABLE_LAST_LEVEL is 3 */
return ARM64_HW_PGTABLE_LEVEL_SHIFT(level);
}
-static inline u64 kvm_granule_size(u32 level)
+static inline u64 kvm_granule_size(s8 level)
{
return BIT(kvm_granule_shift(level));
}
-static inline bool kvm_level_supports_block_mapping(u32 level)
+static inline bool kvm_level_supports_block_mapping(s8 level)
{
return level >= KVM_PGTABLE_MIN_BLOCK_LEVEL;
}
static inline u32 kvm_supported_block_sizes(void)
{
- u32 level = KVM_PGTABLE_MIN_BLOCK_LEVEL;
+ s8 level = KVM_PGTABLE_MIN_BLOCK_LEVEL;
u32 r = 0;
- for (; level < KVM_PGTABLE_MAX_LEVELS; level++)
+ for (; level <= KVM_PGTABLE_LAST_LEVEL; level++)
r |= BIT(kvm_granule_shift(level));
return r;
@@ -167,7 +168,7 @@ struct kvm_pgtable_mm_ops {
void* (*zalloc_page)(void *arg);
void* (*zalloc_pages_exact)(size_t size);
void (*free_pages_exact)(void *addr, size_t size);
- void (*free_unlinked_table)(void *addr, u32 level);
+ void (*free_unlinked_table)(void *addr, s8 level);
void (*get_page)(void *addr);
void (*put_page)(void *addr);
int (*page_count)(void *addr);
@@ -263,7 +264,7 @@ struct kvm_pgtable_visit_ctx {
u64 start;
u64 addr;
u64 end;
- u32 level;
+ s8 level;
enum kvm_pgtable_walk_flags flags;
};
@@ -366,7 +367,7 @@ static inline bool kvm_pgtable_walk_lock_held(void)
*/
struct kvm_pgtable {
u32 ia_bits;
- u32 start_level;
+ s8 start_level;
kvm_pteref_t pgd;
struct kvm_pgtable_mm_ops *mm_ops;
@@ -500,7 +501,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
* The page-table is assumed to be unreachable by any hardware walkers prior to
* freeing and therefore no TLB invalidation is performed.
*/
-void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level);
+void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, s8 level);
/**
* kvm_pgtable_stage2_create_unlinked() - Create an unlinked stage-2 paging structure.
@@ -524,7 +525,7 @@ void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *p
* an ERR_PTR(error) on failure.
*/
kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt,
- u64 phys, u32 level,
+ u64 phys, s8 level,
enum kvm_pgtable_prot prot,
void *mc, bool force_pte);
@@ -750,7 +751,7 @@ int kvm_pgtable_walk(struct kvm_pgtable *pgt, u64 addr, u64 size,
* Return: 0 on success, negative error code on failure.
*/
int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr,
- kvm_pte_t *ptep, u32 *level);
+ kvm_pte_t *ptep, s8 *level);
/**
* kvm_pgtable_stage2_pte_prot() - Retrieve the protection attributes of a
diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index e46250a02017..ad9cfb5c1ff4 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -56,10 +56,11 @@ static inline unsigned long hyp_vm_table_pages(void)
static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages)
{
- unsigned long total = 0, i;
+ unsigned long total = 0;
+ int i;
/* Provision the worst case scenario */
- for (i = 0; i < KVM_PGTABLE_MAX_LEVELS; i++) {
+ for (i = KVM_PGTABLE_FIRST_LEVEL; i <= KVM_PGTABLE_LAST_LEVEL; i++) {
nr_pages = DIV_ROUND_UP(nr_pages, PTRS_PER_PTE);
total += nr_pages;
}
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 8d0a5834e883..861c76021a25 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -91,7 +91,7 @@ static void host_s2_put_page(void *addr)
hyp_put_page(&host_s2_pool, addr);
}
-static void host_s2_free_unlinked_table(void *addr, u32 level)
+static void host_s2_free_unlinked_table(void *addr, s8 level)
{
kvm_pgtable_stage2_free_unlinked(&host_mmu.mm_ops, addr, level);
}
@@ -443,7 +443,7 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range)
{
struct kvm_mem_range cur;
kvm_pte_t pte;
- u32 level;
+ s8 level;
int ret;
hyp_assert_lock_held(&host_mmu.lock);
@@ -462,7 +462,7 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range)
cur.start = ALIGN_DOWN(addr, granule);
cur.end = cur.start + granule;
level++;
- } while ((level < KVM_PGTABLE_MAX_LEVELS) &&
+ } while ((level <= KVM_PGTABLE_LAST_LEVEL) &&
!(kvm_level_supports_block_mapping(level) &&
range_included(&cur, range)));
diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index 65a7a186d7b2..b01a3d1078a8 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -260,7 +260,7 @@ static void fixmap_clear_slot(struct hyp_fixmap_slot *slot)
* https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/T/#mf10dfbaf1eaef9274c581b81c53758918c1d0f03
*/
dsb(ishst);
- __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), (KVM_PGTABLE_MAX_LEVELS - 1));
+ __tlbi_level(vale2is, __TLBI_VADDR(addr, 0), KVM_PGTABLE_LAST_LEVEL);
dsb(ish);
isb();
}
@@ -275,7 +275,7 @@ static int __create_fixmap_slot_cb(const struct kvm_pgtable_visit_ctx *ctx,
{
struct hyp_fixmap_slot *slot = per_cpu_ptr(&fixmap_slots, (u64)ctx->arg);
- if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1)
+ if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_LAST_LEVEL)
return -EINVAL;
slot->addr = ctx->addr;
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 0d5e0a89ddce..bc58d1b515af 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -181,7 +181,7 @@ static int fix_host_ownership_walker(const struct kvm_pgtable_visit_ctx *ctx,
if (!kvm_pte_valid(ctx->old))
return 0;
- if (ctx->level != (KVM_PGTABLE_MAX_LEVELS - 1))
+ if (ctx->level != KVM_PGTABLE_LAST_LEVEL)
return -EINVAL;
phys = kvm_pte_to_phys(ctx->old);
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index e0cf96bafe4a..8f42c6f80c9c 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -101,7 +101,7 @@ static bool kvm_block_mapping_supported(const struct kvm_pgtable_visit_ctx *ctx,
return IS_ALIGNED(ctx->addr, granule);
}
-static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, u32 level)
+static u32 kvm_pgtable_idx(struct kvm_pgtable_walk_data *data, s8 level)
{
u64 shift = kvm_granule_shift(level);
u64 mask = BIT(PAGE_SHIFT - 3) - 1;
@@ -117,7 +117,7 @@ static u32 kvm_pgd_page_idx(struct kvm_pgtable *pgt, u64 addr)
return (addr & mask) >> shift;
}
-static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level)
+static u32 kvm_pgd_pages(u32 ia_bits, s8 start_level)
{
struct kvm_pgtable pgt = {
.ia_bits = ia_bits,
@@ -127,9 +127,9 @@ static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level)
return kvm_pgd_page_idx(&pgt, -1ULL) + 1;
}
-static bool kvm_pte_table(kvm_pte_t pte, u32 level)
+static bool kvm_pte_table(kvm_pte_t pte, s8 level)
{
- if (level == KVM_PGTABLE_MAX_LEVELS - 1)
+ if (level == KVM_PGTABLE_LAST_LEVEL)
return false;
if (!kvm_pte_valid(pte))
@@ -157,11 +157,11 @@ static kvm_pte_t kvm_init_table_pte(kvm_pte_t *childp, struct kvm_pgtable_mm_ops
return pte;
}
-static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, u32 level)
+static kvm_pte_t kvm_init_valid_leaf_pte(u64 pa, kvm_pte_t attr, s8 level)
{
kvm_pte_t pte = kvm_phys_to_pte(pa);
- u64 type = (level == KVM_PGTABLE_MAX_LEVELS - 1) ? KVM_PTE_TYPE_PAGE :
- KVM_PTE_TYPE_BLOCK;
+ u64 type = (level == KVM_PGTABLE_LAST_LEVEL) ? KVM_PTE_TYPE_PAGE :
+ KVM_PTE_TYPE_BLOCK;
pte |= attr & (KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI);
pte |= FIELD_PREP(KVM_PTE_TYPE, type);
@@ -206,11 +206,11 @@ static bool kvm_pgtable_walk_continue(const struct kvm_pgtable_walker *walker,
}
static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data,
- struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level);
+ struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, s8 level);
static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
struct kvm_pgtable_mm_ops *mm_ops,
- kvm_pteref_t pteref, u32 level)
+ kvm_pteref_t pteref, s8 level)
{
enum kvm_pgtable_walk_flags flags = data->walker->flags;
kvm_pte_t *ptep = kvm_dereference_pteref(data->walker, pteref);
@@ -275,12 +275,13 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
}
static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data,
- struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, u32 level)
+ struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, s8 level)
{
u32 idx;
int ret = 0;
- if (WARN_ON_ONCE(level >= KVM_PGTABLE_MAX_LEVELS))
+ if (WARN_ON_ONCE(level < KVM_PGTABLE_FIRST_LEVEL ||
+ level > KVM_PGTABLE_LAST_LEVEL))
return -EINVAL;
for (idx = kvm_pgtable_idx(data, level); idx < PTRS_PER_PTE; ++idx) {
@@ -343,7 +344,7 @@ int kvm_pgtable_walk(struct kvm_pgtable *pgt, u64 addr, u64 size,
struct leaf_walk_data {
kvm_pte_t pte;
- u32 level;
+ s8 level;
};
static int leaf_walker(const struct kvm_pgtable_visit_ctx *ctx,
@@ -358,7 +359,7 @@ static int leaf_walker(const struct kvm_pgtable_visit_ctx *ctx,
}
int kvm_pgtable_get_leaf(struct kvm_pgtable *pgt, u64 addr,
- kvm_pte_t *ptep, u32 *level)
+ kvm_pte_t *ptep, s8 *level)
{
struct leaf_walk_data data;
struct kvm_pgtable_walker walker = {
@@ -471,7 +472,7 @@ static int hyp_map_walker(const struct kvm_pgtable_visit_ctx *ctx,
if (hyp_map_walker_try_leaf(ctx, data))
return 0;
- if (WARN_ON(ctx->level == KVM_PGTABLE_MAX_LEVELS - 1))
+ if (WARN_ON(ctx->level == KVM_PGTABLE_LAST_LEVEL))
return -EINVAL;
childp = (kvm_pte_t *)mm_ops->zalloc_page(NULL);
@@ -567,14 +568,19 @@ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
int kvm_pgtable_hyp_init(struct kvm_pgtable *pgt, u32 va_bits,
struct kvm_pgtable_mm_ops *mm_ops)
{
- u64 levels = ARM64_HW_PGTABLE_LEVELS(va_bits);
+ s8 start_level = KVM_PGTABLE_LAST_LEVEL + 1 -
+ ARM64_HW_PGTABLE_LEVELS(va_bits);
+
+ if (start_level < KVM_PGTABLE_FIRST_LEVEL ||
+ start_level > KVM_PGTABLE_LAST_LEVEL)
+ return -EINVAL;
pgt->pgd = (kvm_pteref_t)mm_ops->zalloc_page(NULL);
if (!pgt->pgd)
return -ENOMEM;
pgt->ia_bits = va_bits;
- pgt->start_level = KVM_PGTABLE_MAX_LEVELS - levels;
+ pgt->start_level = start_level;
pgt->mm_ops = mm_ops;
pgt->mmu = NULL;
pgt->force_pte_cb = NULL;
@@ -628,7 +634,7 @@ struct stage2_map_data {
u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
{
u64 vtcr = VTCR_EL2_FLAGS;
- u8 lvls;
+ s8 lvls;
vtcr |= kvm_get_parange(mmfr0) << VTCR_EL2_PS_SHIFT;
vtcr |= VTCR_EL2_T0SZ(phys_shift);
@@ -911,7 +917,7 @@ static bool stage2_leaf_mapping_allowed(const struct kvm_pgtable_visit_ctx *ctx,
{
u64 phys = stage2_map_walker_phys_addr(ctx, data);
- if (data->force_pte && (ctx->level < (KVM_PGTABLE_MAX_LEVELS - 1)))
+ if (data->force_pte && ctx->level < KVM_PGTABLE_LAST_LEVEL)
return false;
return kvm_block_mapping_supported(ctx, phys);
@@ -990,7 +996,7 @@ static int stage2_map_walk_leaf(const struct kvm_pgtable_visit_ctx *ctx,
if (ret != -E2BIG)
return ret;
- if (WARN_ON(ctx->level == KVM_PGTABLE_MAX_LEVELS - 1))
+ if (WARN_ON(ctx->level == KVM_PGTABLE_LAST_LEVEL))
return -EINVAL;
if (!data->memcache)
@@ -1160,7 +1166,7 @@ struct stage2_attr_data {
kvm_pte_t attr_set;
kvm_pte_t attr_clr;
kvm_pte_t pte;
- u32 level;
+ s8 level;
};
static int stage2_attr_walker(const struct kvm_pgtable_visit_ctx *ctx,
@@ -1203,7 +1209,7 @@ static int stage2_attr_walker(const struct kvm_pgtable_visit_ctx *ctx,
static int stage2_update_leaf_attrs(struct kvm_pgtable *pgt, u64 addr,
u64 size, kvm_pte_t attr_set,
kvm_pte_t attr_clr, kvm_pte_t *orig_pte,
- u32 *level, enum kvm_pgtable_walk_flags flags)
+ s8 *level, enum kvm_pgtable_walk_flags flags)
{
int ret;
kvm_pte_t attr_mask = KVM_PTE_LEAF_ATTR_LO | KVM_PTE_LEAF_ATTR_HI;
@@ -1305,7 +1311,7 @@ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr,
enum kvm_pgtable_prot prot)
{
int ret;
- u32 level;
+ s8 level;
kvm_pte_t set = 0, clr = 0;
if (prot & KVM_PTE_LEAF_ATTR_HI_SW)
@@ -1358,7 +1364,7 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size)
}
kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt,
- u64 phys, u32 level,
+ u64 phys, s8 level,
enum kvm_pgtable_prot prot,
void *mc, bool force_pte)
{
@@ -1416,7 +1422,7 @@ kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt,
* fully populated tree up to the PTE entries. Note that @level is
* interpreted as in "level @level entry".
*/
-static int stage2_block_get_nr_page_tables(u32 level)
+static int stage2_block_get_nr_page_tables(s8 level)
{
switch (level) {
case 1:
@@ -1427,7 +1433,7 @@ static int stage2_block_get_nr_page_tables(u32 level)
return 0;
default:
WARN_ON_ONCE(level < KVM_PGTABLE_MIN_BLOCK_LEVEL ||
- level >= KVM_PGTABLE_MAX_LEVELS);
+ level > KVM_PGTABLE_LAST_LEVEL);
return -EINVAL;
};
}
@@ -1440,13 +1446,13 @@ static int stage2_split_walker(const struct kvm_pgtable_visit_ctx *ctx,
struct kvm_s2_mmu *mmu;
kvm_pte_t pte = ctx->old, new, *childp;
enum kvm_pgtable_prot prot;
- u32 level = ctx->level;
+ s8 level = ctx->level;
bool force_pte;
int nr_pages;
u64 phys;
/* No huge-pages exist at the last level */
- if (level == KVM_PGTABLE_MAX_LEVELS - 1)
+ if (level == KVM_PGTABLE_LAST_LEVEL)
return 0;
/* We only split valid block mappings */
@@ -1523,7 +1529,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu,
u64 vtcr = mmu->vtcr;
u32 ia_bits = VTCR_EL2_IPA(vtcr);
u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
- u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
+ s8 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
pgd_sz = kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE;
pgt->pgd = (kvm_pteref_t)mm_ops->zalloc_pages_exact(pgd_sz);
@@ -1546,7 +1552,7 @@ size_t kvm_pgtable_stage2_pgd_size(u64 vtcr)
{
u32 ia_bits = VTCR_EL2_IPA(vtcr);
u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
- u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
+ s8 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
return kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE;
}
@@ -1582,7 +1588,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
pgt->pgd = NULL;
}
-void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level)
+void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, s8 level)
{
kvm_pteref_t ptep = (kvm_pteref_t)pgtable;
struct kvm_pgtable_walker walker = {
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d87c8fcc4c24..986a2e6fb900 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -223,12 +223,12 @@ static void stage2_free_unlinked_table_rcu_cb(struct rcu_head *head)
{
struct page *page = container_of(head, struct page, rcu_head);
void *pgtable = page_to_virt(page);
- u32 level = page_private(page);
+ s8 level = page_private(page);
kvm_pgtable_stage2_free_unlinked(&kvm_s2_mm_ops, pgtable, level);
}
-static void stage2_free_unlinked_table(void *addr, u32 level)
+static void stage2_free_unlinked_table(void *addr, s8 level)
{
struct page *page = virt_to_page(addr);
@@ -804,13 +804,13 @@ static int get_user_mapping_size(struct kvm *kvm, u64 addr)
struct kvm_pgtable pgt = {
.pgd = (kvm_pteref_t)kvm->mm->pgd,
.ia_bits = vabits_actual,
- .start_level = (KVM_PGTABLE_MAX_LEVELS -
- CONFIG_PGTABLE_LEVELS),
+ .start_level = (KVM_PGTABLE_LAST_LEVEL -
+ CONFIG_PGTABLE_LEVELS + 1),
.mm_ops = &kvm_user_mm_ops,
};
unsigned long flags;
kvm_pte_t pte = 0; /* Keep GCC quiet... */
- u32 level = ~0;
+ s8 level = S8_MAX;
int ret;
/*
@@ -829,7 +829,9 @@ static int get_user_mapping_size(struct kvm *kvm, u64 addr)
* Not seeing an error, but not updating level? Something went
* deeply wrong...
*/
- if (WARN_ON(level >= KVM_PGTABLE_MAX_LEVELS))
+ if (WARN_ON(level > KVM_PGTABLE_LAST_LEVEL))
+ return -EFAULT;
+ if (WARN_ON(level < KVM_PGTABLE_FIRST_LEVEL))
return -EFAULT;
/* Oops, the userspace PTs are gone... Replay the fault */
@@ -1388,7 +1390,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
gfn_t gfn;
kvm_pfn_t pfn;
bool logging_active = memslot_is_logging(memslot);
- unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu);
+ s8 fault_level = kvm_vcpu_trap_get_fault_level(vcpu);
long vma_pagesize, fault_granule;
enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
struct kvm_pgtable *pgt;
--
2.25.1
* [PATCH v5 09/12] KVM: arm64: Support up to 5 levels of translation in kvm_pgtable
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (7 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 08/12] KVM: arm64: Convert translation level parameter to s8 Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 10/12] KVM: arm64: Allow guests with >48-bit IPA size on FEAT_LPA2 systems Ryan Roberts
` (3 subsequent siblings)
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
FEAT_LPA2 increases the maximum levels of translation from 4 to 5 for
the 4KB page case, when IA is >48 bits. While we can still use 4 levels
for stage2 translation in this case (due to stage2 allowing concatenated
page tables for first level lookup), the same kvm_pgtable library is
used for the hyp stage1 page tables and stage1 does not support
concatenation.
Therefore, modify the library to support up to 5 levels. Previous
patches already laid the groundwork for this by refactoring code to work
in terms of KVM_PGTABLE_FIRST_LEVEL and KVM_PGTABLE_LAST_LEVEL. So we
just need to change these macros.
The hardware sometimes encodes the new level differently from the
others: One such place is when reading the level from the FSC field in
the ESR_EL2 register. We never expect to see the lowest level (-1) here
since the stage 2 page tables always use concatenated tables for first
level lookup and therefore only use 4 levels of lookup. So we get away
with just adding a comment to explain why we are not being careful about
decoding level -1.
For stage2, VTCR_EL2.SL2 is introduced to encode the new start level.
However, since we always use concatenated page tables for first level
lookup at stage2 (and therefore will never need the new extra level),
we never touch this new field.
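
As a rough worked example (not part of the patch), the hyp stage1 start
level for 4K pages falls out of the same arithmetic the library already
uses; ARM64_HW_PGTABLE_LEVELS() is quoted below as I read it, purely for
illustration:

#define PAGE_SHIFT			12	/* 4K pages */
#define ARM64_HW_PGTABLE_LEVELS(va)	(((va) - 4) / (PAGE_SHIFT - 3))
#define KVM_PGTABLE_LAST_LEVEL		3

static int hyp_s1_start_level(unsigned int va_bits)
{
	/* va_bits = 48: (48 - 4) / 9 = 4 levels -> start level  0
	 * va_bits = 52: (52 - 4) / 9 = 5 levels -> start level -1 (needs LPA2)
	 */
	return KVM_PGTABLE_LAST_LEVEL + 1 - ARM64_HW_PGTABLE_LEVELS(va_bits);
}

Stage2 avoids the extra level because up to 16 concatenated tables at the
initial lookup level cover those extra IPA bits, keeping the walk at 4
levels even for a 52-bit IPA.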
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/kvm_emulate.h | 10 ++++++++++
arch/arm64/include/asm/kvm_pgtable.h | 2 +-
arch/arm64/kvm/hyp/pgtable.c | 9 +++++++++
3 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 13fd9dbf2d1d..d4f1e9cdd554 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -411,6 +411,16 @@ static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vc
static __always_inline s8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu)
{
+ /*
+ * Note: With the introduction of FEAT_LPA2 an extra level of
+ * translation (level -1) is added. This level (obviously) doesn't
+ * follow the previous convention of encoding the 4 levels in the 2 LSBs
+ * of the FSC so this function breaks if the fault is for level -1.
+ *
+ * However, stage2 tables always use concatenated tables for first level
+ * lookup and therefore it is guaranteed that the level will be between
+ * 0 and 3, and this function continues to work.
+ */
return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL;
}
diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 89eb648c40f3..f036ee861675 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -11,7 +11,7 @@
#include <linux/kvm_host.h>
#include <linux/types.h>
-#define KVM_PGTABLE_FIRST_LEVEL 0
+#define KVM_PGTABLE_FIRST_LEVEL -1
#define KVM_PGTABLE_LAST_LEVEL 3
/*
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 8f42c6f80c9c..cccea87556e0 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -645,6 +645,15 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
lvls = stage2_pgtable_levels(phys_shift);
if (lvls < 2)
lvls = 2;
+
+ /*
+ * When LPA2 is enabled, the HW supports an extra level of translation
+ * (for 5 in total) when using 4K pages. It also introduces VTCR_EL2.SL2
+ * as an addition to SL0 to enable encoding this extra start level.
+ * However, since we always use concatenated page tables for the first level
+ * lookup, we will never need this extra level and therefore do not need
+ * to touch SL2.
+ */
vtcr |= VTCR_EL2_LVLS_TO_SL0(lvls);
#ifdef CONFIG_ARM64_HW_AFDBM
--
2.25.1
* [PATCH v5 10/12] KVM: arm64: Allow guests with >48-bit IPA size on FEAT_LPA2 systems
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (8 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 09/12] KVM: arm64: Support up to 5 levels of translation in kvm_pgtable Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-16 14:29 ` [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size Ryan Roberts
` (2 subsequent siblings)
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
With all the page-table infrastructure in place, we can finally increase
the maximum permissible IPA size to 52 bits on 4KB and 16KB page systems
that have FEAT_LPA2.
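
For context only (not part of the patch): the cap works because
ID_AA64MMFR0_EL1.PARange is an encoded value rather than a bit count. The
decode below is my reading of the Arm ARM and is included purely as an
illustration of why clamping to the 48-bit encoding was previously
necessary for 4K/16K:

/* Illustrative decode of ID_AA64MMFR0_EL1.PARange to a PA size in bits. */
static unsigned int parange_to_pa_bits(unsigned int parange)
{
	static const unsigned int pa_bits[] = { 32, 36, 40, 42, 44, 48, 52 };

	/* Encodings above 6 were reserved at the time; treat them as 52 here. */
	return parange <= 6 ? pa_bits[parange] : 52;
}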
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/kvm/reset.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 5bb4de162cab..a7356f251473 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -280,12 +280,11 @@ int __init kvm_set_ipa_limit(void)
parange = cpuid_feature_extract_unsigned_field(mmfr0,
ID_AA64MMFR0_EL1_PARANGE_SHIFT);
/*
- * IPA size beyond 48 bits could not be supported
- * on either 4K or 16K page size. Hence let's cap
- * it to 48 bits, in case it's reported as larger
- * on the system.
+ * IPA size beyond 48 bits for 4K and 16K page size is only supported
+ * when LPA2 is available. So if we have LPA2, enable it, else cap to 48
+ * bits, in case it's reported as larger on the system.
*/
- if (PAGE_SIZE != SZ_64K)
+ if (!system_supports_lpa2() && PAGE_SIZE != SZ_64K)
parange = min(parange, (unsigned int)ID_AA64MMFR0_EL1_PARANGE_48);
/*
--
2.25.1
* [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (9 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 10/12] KVM: arm64: Allow guests with >48-bit IPA size on FEAT_LPA2 systems Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-21 23:27 ` Oliver Upton
2023-11-21 23:34 ` Oliver Upton
2023-11-16 14:29 ` [PATCH v5 12/12] KVM: selftests: arm64: Support P52V48 4K and 16K guest_modes Ryan Roberts
2023-11-21 23:38 ` [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Oliver Upton
12 siblings, 2 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
We are about to add 52-bit PA guest modes for 4K and 16K pages when the
system supports LPA2. In preparation, beef up the logic that parses mmfr0
to also tell us what the maximum supported PA size is for each page
size. Max PA size = 0 implies the page size is not supported at all.
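
To make the new semantics concrete, a few worked examples of what the
helper added below returns; the host values are hypothetical and the
TGRAN encodings are the ones passed to the helper in the diff:

/*
 * max_ipa_for_page_size(vm_ipa, gran, not_sup_val, ipa52_min_val)
 *
 * 4K,  host IPA 52, TGRAN4  = 1 (LPA2)    : (52, 1,   0xf, 1) -> 52
 * 16K, host IPA 52, TGRAN16 = 1 (no LPA2) : (52, 1,   0x0, 2) -> 48
 * 64K, host IPA 40, TGRAN64 = 0           : (40, 0,   0xf, 0) -> 40
 * 4K,  TGRAN4 = 0xf (granule unsupported) : (52, 0xf, 0xf, 1) -> 0
 */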
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
.../selftests/kvm/include/aarch64/processor.h | 4 +-
.../selftests/kvm/lib/aarch64/processor.c | 27 ++++++++++---
tools/testing/selftests/kvm/lib/guest_modes.c | 40 ++++++++-----------
3 files changed, 41 insertions(+), 30 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
index c42d683102c7..cf20e44e86f2 100644
--- a/tools/testing/selftests/kvm/include/aarch64/processor.h
+++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
@@ -119,8 +119,8 @@ enum {
/* Access flag update enable/disable */
#define TCR_EL1_HA (1ULL << 39)
-void aarch64_get_supported_page_sizes(uint32_t ipa,
- bool *ps4k, bool *ps16k, bool *ps64k);
+void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k,
+ uint32_t *ipa16k, uint32_t *ipa64k);
void vm_init_descriptor_tables(struct kvm_vm *vm);
void vcpu_init_descriptor_tables(struct kvm_vcpu *vcpu);
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 6fe12e985ba5..917cfeddb6b4 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -492,12 +492,24 @@ uint32_t guest_get_vcpuid(void)
return read_sysreg(tpidr_el1);
}
-void aarch64_get_supported_page_sizes(uint32_t ipa,
- bool *ps4k, bool *ps16k, bool *ps64k)
+static uint32_t max_ipa_for_page_size(uint32_t vm_ipa, uint32_t gran,
+ uint32_t not_sup_val, uint32_t ipa52_min_val)
+{
+ if (gran == not_sup_val)
+ return 0;
+ else if (gran >= ipa52_min_val && vm_ipa >= 52)
+ return 52;
+ else
+ return min(vm_ipa, 48U);
+}
+
+void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k,
+ uint32_t *ipa16k, uint32_t *ipa64k)
{
struct kvm_vcpu_init preferred_init;
int kvm_fd, vm_fd, vcpu_fd, err;
uint64_t val;
+ uint32_t gran;
struct kvm_one_reg reg = {
.id = KVM_ARM64_SYS_REG(SYS_ID_AA64MMFR0_EL1),
.addr = (uint64_t)&val,
@@ -518,9 +530,14 @@ void aarch64_get_supported_page_sizes(uint32_t ipa,
err = ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
TEST_ASSERT(err == 0, KVM_IOCTL_ERROR(KVM_GET_ONE_REG, vcpu_fd));
- *ps4k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val) != 0xf;
- *ps64k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val) == 0;
- *ps16k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val) != 0;
+ gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val);
+ *ipa4k = max_ipa_for_page_size(ipa, gran, 0xf, 1);
+
+ gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val);
+ *ipa64k = max_ipa_for_page_size(ipa, gran, 0xf, 0);
+
+ gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val);
+ *ipa16k = max_ipa_for_page_size(ipa, gran, 0, 2);
close(vcpu_fd);
close(vm_fd);
diff --git a/tools/testing/selftests/kvm/lib/guest_modes.c b/tools/testing/selftests/kvm/lib/guest_modes.c
index 1df3ce4b16fd..c64c5cf49942 100644
--- a/tools/testing/selftests/kvm/lib/guest_modes.c
+++ b/tools/testing/selftests/kvm/lib/guest_modes.c
@@ -18,33 +18,27 @@ void guest_modes_append_default(void)
#else
{
unsigned int limit = kvm_check_cap(KVM_CAP_ARM_VM_IPA_SIZE);
- bool ps4k, ps16k, ps64k;
+ uint32_t ipa4k, ipa16k, ipa64k;
int i;
- aarch64_get_supported_page_sizes(limit, &ps4k, &ps16k, &ps64k);
+ aarch64_get_supported_page_sizes(limit, &ipa4k, &ipa16k, &ipa64k);
- vm_mode_default = NUM_VM_MODES;
+ guest_mode_append(VM_MODE_P52V48_64K, ipa64k >= 52, ipa64k >= 52);
- if (limit >= 52)
- guest_mode_append(VM_MODE_P52V48_64K, ps64k, ps64k);
- if (limit >= 48) {
- guest_mode_append(VM_MODE_P48V48_4K, ps4k, ps4k);
- guest_mode_append(VM_MODE_P48V48_16K, ps16k, ps16k);
- guest_mode_append(VM_MODE_P48V48_64K, ps64k, ps64k);
- }
- if (limit >= 40) {
- guest_mode_append(VM_MODE_P40V48_4K, ps4k, ps4k);
- guest_mode_append(VM_MODE_P40V48_16K, ps16k, ps16k);
- guest_mode_append(VM_MODE_P40V48_64K, ps64k, ps64k);
- if (ps4k)
- vm_mode_default = VM_MODE_P40V48_4K;
- }
- if (limit >= 36) {
- guest_mode_append(VM_MODE_P36V48_4K, ps4k, ps4k);
- guest_mode_append(VM_MODE_P36V48_16K, ps16k, ps16k);
- guest_mode_append(VM_MODE_P36V48_64K, ps64k, ps64k);
- guest_mode_append(VM_MODE_P36V47_16K, ps16k, ps16k);
- }
+ guest_mode_append(VM_MODE_P48V48_4K, ipa4k >= 48, ipa4k >= 48);
+ guest_mode_append(VM_MODE_P48V48_16K, ipa16k >= 48, ipa16k >= 48);
+ guest_mode_append(VM_MODE_P48V48_64K, ipa64k >= 48, ipa64k >= 48);
+
+ guest_mode_append(VM_MODE_P40V48_4K, ipa4k >= 40, ipa4k >= 40);
+ guest_mode_append(VM_MODE_P40V48_16K, ipa16k >= 40, ipa16k >= 40);
+ guest_mode_append(VM_MODE_P40V48_64K, ipa64k >= 40, ipa64k >= 40);
+
+ guest_mode_append(VM_MODE_P36V48_4K, ipa4k >= 36, ipa4k >= 36);
+ guest_mode_append(VM_MODE_P36V48_16K, ipa16k >= 36, ipa16k >= 36);
+ guest_mode_append(VM_MODE_P36V48_64K, ipa64k >= 36, ipa64k >= 36);
+ guest_mode_append(VM_MODE_P36V47_16K, ipa16k >= 36, ipa16k >= 36);
+
+ vm_mode_default = ipa4k >= 40 ? VM_MODE_P40V48_4K : NUM_VM_MODES;
/*
* Pick the first supported IPA size if the default
--
2.25.1
* Re: [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size
2023-11-16 14:29 ` [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size Ryan Roberts
@ 2023-11-21 23:27 ` Oliver Upton
2023-11-22 13:47 ` Ryan Roberts
2023-11-21 23:34 ` Oliver Upton
1 sibling, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2023-11-21 23:27 UTC (permalink / raw)
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On Thu, Nov 16, 2023 at 02:29:30PM +0000, Ryan Roberts wrote:
> We are about to add 52 bit PA guest modes for 4K and 16K pages when the
> system supports LPA2. In preparation beef up the logic that parses mmfr0
> to also tell us what the maximum supported PA size is for each page
> size. Max PA size = 0 implies the page size is not supported at all.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
> .../selftests/kvm/include/aarch64/processor.h | 4 +-
> .../selftests/kvm/lib/aarch64/processor.c | 27 ++++++++++---
> tools/testing/selftests/kvm/lib/guest_modes.c | 40 ++++++++-----------
> 3 files changed, 41 insertions(+), 30 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
> index c42d683102c7..cf20e44e86f2 100644
> --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
> +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
> @@ -119,8 +119,8 @@ enum {
> /* Access flag update enable/disable */
> #define TCR_EL1_HA (1ULL << 39)
>
> -void aarch64_get_supported_page_sizes(uint32_t ipa,
> - bool *ps4k, bool *ps16k, bool *ps64k);
> +void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k,
> + uint32_t *ipa16k, uint32_t *ipa64k);
>
> void vm_init_descriptor_tables(struct kvm_vm *vm);
> void vcpu_init_descriptor_tables(struct kvm_vcpu *vcpu);
> diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> index 6fe12e985ba5..917cfeddb6b4 100644
> --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
> +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
> @@ -492,12 +492,24 @@ uint32_t guest_get_vcpuid(void)
> return read_sysreg(tpidr_el1);
> }
>
> -void aarch64_get_supported_page_sizes(uint32_t ipa,
> - bool *ps4k, bool *ps16k, bool *ps64k)
> +static uint32_t max_ipa_for_page_size(uint32_t vm_ipa, uint32_t gran,
> + uint32_t not_sup_val, uint32_t ipa52_min_val)
> +{
> + if (gran == not_sup_val)
> + return 0;
> + else if (gran >= ipa52_min_val && vm_ipa >= 52)
> + return 52;
> + else
> + return min(vm_ipa, 48U);
> +}
> +
> +void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k,
> + uint32_t *ipa16k, uint32_t *ipa64k)
> {
> struct kvm_vcpu_init preferred_init;
> int kvm_fd, vm_fd, vcpu_fd, err;
> uint64_t val;
> + uint32_t gran;
> struct kvm_one_reg reg = {
> .id = KVM_ARM64_SYS_REG(SYS_ID_AA64MMFR0_EL1),
> .addr = (uint64_t)&val,
> @@ -518,9 +530,14 @@ void aarch64_get_supported_page_sizes(uint32_t ipa,
> err = ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
> TEST_ASSERT(err == 0, KVM_IOCTL_ERROR(KVM_GET_ONE_REG, vcpu_fd));
>
> - *ps4k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val) != 0xf;
> - *ps64k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val) == 0;
> - *ps16k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val) != 0;
> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val);
> + *ipa4k = max_ipa_for_page_size(ipa, gran, 0xf, 1);
> +
> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val);
> + *ipa64k = max_ipa_for_page_size(ipa, gran, 0xf, 0);
> +
> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val);
> + *ipa16k = max_ipa_for_page_size(ipa, gran, 0, 2);
>
> close(vcpu_fd);
> close(vm_fd);
> diff --git a/tools/testing/selftests/kvm/lib/guest_modes.c b/tools/testing/selftests/kvm/lib/guest_modes.c
> index 1df3ce4b16fd..c64c5cf49942 100644
> --- a/tools/testing/selftests/kvm/lib/guest_modes.c
> +++ b/tools/testing/selftests/kvm/lib/guest_modes.c
> @@ -18,33 +18,27 @@ void guest_modes_append_default(void)
> #else
> {
> unsigned int limit = kvm_check_cap(KVM_CAP_ARM_VM_IPA_SIZE);
> - bool ps4k, ps16k, ps64k;
> + uint32_t ipa4k, ipa16k, ipa64k;
> int i;
>
> - aarch64_get_supported_page_sizes(limit, &ps4k, &ps16k, &ps64k);
> + aarch64_get_supported_page_sizes(limit, &ipa4k, &ipa16k, &ipa64k);
>
> - vm_mode_default = NUM_VM_MODES;
> + guest_mode_append(VM_MODE_P52V48_64K, ipa64k >= 52, ipa64k >= 52);
Can we just change guest_mode_append() to take a single bool argument and
initialize both ::supported and ::enabled to its value?
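
Something along these lines, perhaps (sketch only; this assumes the
existing two-argument macro shape in guest_modes.h and is untested):

#define guest_mode_append(mode, supported) ({				\
	guest_modes[mode] = (struct guest_mode){ (supported), (supported) }; \
})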
--
Thanks,
Oliver
* Re: [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size
2023-11-21 23:27 ` Oliver Upton
@ 2023-11-22 13:47 ` Ryan Roberts
0 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-22 13:47 UTC (permalink / raw)
To: Oliver Upton
Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On 21/11/2023 23:27, Oliver Upton wrote:
> On Thu, Nov 16, 2023 at 02:29:30PM +0000, Ryan Roberts wrote:
>> We are about to add 52 bit PA guest modes for 4K and 16K pages when the
>> system supports LPA2. In preparation beef up the logic that parses mmfr0
>> to also tell us what the maximum supported PA size is for each page
>> size. Max PA size = 0 implies the page size is not supported at all.
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>> .../selftests/kvm/include/aarch64/processor.h | 4 +-
>> .../selftests/kvm/lib/aarch64/processor.c | 27 ++++++++++---
>> tools/testing/selftests/kvm/lib/guest_modes.c | 40 ++++++++-----------
>> 3 files changed, 41 insertions(+), 30 deletions(-)
>>
>> diff --git a/tools/testing/selftests/kvm/include/aarch64/processor.h b/tools/testing/selftests/kvm/include/aarch64/processor.h
>> index c42d683102c7..cf20e44e86f2 100644
>> --- a/tools/testing/selftests/kvm/include/aarch64/processor.h
>> +++ b/tools/testing/selftests/kvm/include/aarch64/processor.h
>> @@ -119,8 +119,8 @@ enum {
>> /* Access flag update enable/disable */
>> #define TCR_EL1_HA (1ULL << 39)
>>
>> -void aarch64_get_supported_page_sizes(uint32_t ipa,
>> - bool *ps4k, bool *ps16k, bool *ps64k);
>> +void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k,
>> + uint32_t *ipa16k, uint32_t *ipa64k);
>>
>> void vm_init_descriptor_tables(struct kvm_vm *vm);
>> void vcpu_init_descriptor_tables(struct kvm_vcpu *vcpu);
>> diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
>> index 6fe12e985ba5..917cfeddb6b4 100644
>> --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
>> +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
>> @@ -492,12 +492,24 @@ uint32_t guest_get_vcpuid(void)
>> return read_sysreg(tpidr_el1);
>> }
>>
>> -void aarch64_get_supported_page_sizes(uint32_t ipa,
>> - bool *ps4k, bool *ps16k, bool *ps64k)
>> +static uint32_t max_ipa_for_page_size(uint32_t vm_ipa, uint32_t gran,
>> + uint32_t not_sup_val, uint32_t ipa52_min_val)
>> +{
>> + if (gran == not_sup_val)
>> + return 0;
>> + else if (gran >= ipa52_min_val && vm_ipa >= 52)
>> + return 52;
>> + else
>> + return min(vm_ipa, 48U);
>> +}
>> +
>> +void aarch64_get_supported_page_sizes(uint32_t ipa, uint32_t *ipa4k,
>> + uint32_t *ipa16k, uint32_t *ipa64k)
>> {
>> struct kvm_vcpu_init preferred_init;
>> int kvm_fd, vm_fd, vcpu_fd, err;
>> uint64_t val;
>> + uint32_t gran;
>> struct kvm_one_reg reg = {
>> .id = KVM_ARM64_SYS_REG(SYS_ID_AA64MMFR0_EL1),
>> .addr = (uint64_t)&val,
>> @@ -518,9 +530,14 @@ void aarch64_get_supported_page_sizes(uint32_t ipa,
>> err = ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
>> TEST_ASSERT(err == 0, KVM_IOCTL_ERROR(KVM_GET_ONE_REG, vcpu_fd));
>>
>> - *ps4k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val) != 0xf;
>> - *ps64k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val) == 0;
>> - *ps16k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val) != 0;
>> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val);
>> + *ipa4k = max_ipa_for_page_size(ipa, gran, 0xf, 1);
>> +
>> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val);
>> + *ipa64k = max_ipa_for_page_size(ipa, gran, 0xf, 0);
>> +
>> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val);
>> + *ipa16k = max_ipa_for_page_size(ipa, gran, 0, 2);
>>
>> close(vcpu_fd);
>> close(vm_fd);
>> diff --git a/tools/testing/selftests/kvm/lib/guest_modes.c b/tools/testing/selftests/kvm/lib/guest_modes.c
>> index 1df3ce4b16fd..c64c5cf49942 100644
>> --- a/tools/testing/selftests/kvm/lib/guest_modes.c
>> +++ b/tools/testing/selftests/kvm/lib/guest_modes.c
>> @@ -18,33 +18,27 @@ void guest_modes_append_default(void)
>> #else
>> {
>> unsigned int limit = kvm_check_cap(KVM_CAP_ARM_VM_IPA_SIZE);
>> - bool ps4k, ps16k, ps64k;
>> + uint32_t ipa4k, ipa16k, ipa64k;
>> int i;
>>
>> - aarch64_get_supported_page_sizes(limit, &ps4k, &ps16k, &ps64k);
>> + aarch64_get_supported_page_sizes(limit, &ipa4k, &ipa16k, &ipa64k);
>>
>> - vm_mode_default = NUM_VM_MODES;
>> + guest_mode_append(VM_MODE_P52V48_64K, ipa64k >= 52, ipa64k >= 52);
>
> Can we just change guest_mode_append() to take a single bool argument and
> initialize both ::supported and ::enabled to its value?
Works for me; I'll fix it up and re-post next week.
>
* Re: [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size
2023-11-16 14:29 ` [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size Ryan Roberts
2023-11-21 23:27 ` Oliver Upton
@ 2023-11-21 23:34 ` Oliver Upton
2023-11-22 13:47 ` Ryan Roberts
1 sibling, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2023-11-21 23:34 UTC (permalink / raw)
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On Thu, Nov 16, 2023 at 02:29:30PM +0000, Ryan Roberts wrote:
[...]
> @@ -518,9 +530,14 @@ void aarch64_get_supported_page_sizes(uint32_t ipa,
> err = ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
> TEST_ASSERT(err == 0, KVM_IOCTL_ERROR(KVM_GET_ONE_REG, vcpu_fd));
>
> - *ps4k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val) != 0xf;
> - *ps64k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val) == 0;
> - *ps16k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val) != 0;
> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val);
> + *ipa4k = max_ipa_for_page_size(ipa, gran, 0xf, 1);
> +
> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val);
> + *ipa64k = max_ipa_for_page_size(ipa, gran, 0xf, 0);
> +
> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val);
> + *ipa16k = max_ipa_for_page_size(ipa, gran, 0, 2);
Oh, also: we have the generated system register definitions available in
KVM selftests at this point. It'd be a good idea to move away from
'magic' values and use the enumerations instead.
--
Thanks,
Oliver
* Re: [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size
2023-11-21 23:34 ` Oliver Upton
@ 2023-11-22 13:47 ` Ryan Roberts
0 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-22 13:47 UTC (permalink / raw)
To: Oliver Upton
Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On 21/11/2023 23:34, Oliver Upton wrote:
> On Thu, Nov 16, 2023 at 02:29:30PM +0000, Ryan Roberts wrote:
>
> [...]
>
>> @@ -518,9 +530,14 @@ void aarch64_get_supported_page_sizes(uint32_t ipa,
>> err = ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
>> TEST_ASSERT(err == 0, KVM_IOCTL_ERROR(KVM_GET_ONE_REG, vcpu_fd));
>>
>> - *ps4k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val) != 0xf;
>> - *ps64k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val) == 0;
>> - *ps16k = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val) != 0;
>> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN4), val);
>> + *ipa4k = max_ipa_for_page_size(ipa, gran, 0xf, 1);
>> +
>> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN64), val);
>> + *ipa64k = max_ipa_for_page_size(ipa, gran, 0xf, 0);
>> +
>> + gran = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_TGRAN16), val);
>> + *ipa16k = max_ipa_for_page_size(ipa, gran, 0, 2);
>
> Oh, also: we have the generated system register definitions available in
> KVM selftests at this point. It'd be a good idea to move away from
> 'magic' values and use the enumerations instead.
>
OK will do!
* [PATCH v5 12/12] KVM: selftests: arm64: Support P52V48 4K and 16K guest_modes
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (10 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 11/12] KVM: selftests: arm64: Determine max ipa size per-page size Ryan Roberts
@ 2023-11-16 14:29 ` Ryan Roberts
2023-11-21 23:38 ` [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Oliver Upton
12 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-16 14:29 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Marc Zyngier, Oliver Upton,
Suzuki K Poulose, James Morse, Zenghui Yu, Ard Biesheuvel,
Anshuman Khandual
Cc: Ryan Roberts, linux-arm-kernel, kvmarm
Add support for VM_MODE_P52V48_4K and VM_MODE_P52V48_16K guest modes by
using the FEAT_LPA2 pte format for stage1, when FEAT_LPA2 is available.
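
For orientation (not part of the patch), a minimal sketch of where the
high output-address bits land in the two descriptor formats handled by
addr_pte()/pte_addr() below. GENMASK()/FIELD_GET() are the kernel helpers
already used in the diff, and the bit positions reflect the FEAT_LPA2
descriptor layout as I understand it:

/*
 * Classic 64K + FEAT_LPA: OA[51:48] live in descriptor bits [15:12].
 * FEAT_LPA2 4K/16K:       OA[51:50] live in descriptor bits [9:8], which
 *                         repurposes the SH[1:0] field; shareability then
 *                         comes from TCR, hence TCR_EL1.DS must be set.
 */
static uint64_t lpa2_pte(uint64_t pa, uint64_t attrs, unsigned int page_shift)
{
	uint64_t pte = pa & GENMASK(49, page_shift);	/* OA[49:page_shift] */

	pte |= FIELD_GET(GENMASK(51, 50), pa) << 8;	/* OA[51:50] -> [9:8] */
	pte |= attrs & ~GENMASK(9, 8);			/* SH bits are not attrs here */
	return pte;
}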
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
.../selftests/kvm/include/kvm_util_base.h | 1 +
.../selftests/kvm/lib/aarch64/processor.c | 39 ++++++++++++++-----
tools/testing/selftests/kvm/lib/guest_modes.c | 2 +
tools/testing/selftests/kvm/lib/kvm_util.c | 3 ++
4 files changed, 36 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index a18db6a7b3cf..406500fb6e28 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -171,6 +171,7 @@ static inline struct userspace_mem_region *vm_get_mem_region(struct kvm_vm *vm,
enum vm_guest_mode {
VM_MODE_P52V48_4K,
+ VM_MODE_P52V48_16K,
VM_MODE_P52V48_64K,
VM_MODE_P48V48_4K,
VM_MODE_P48V48_16K,
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 917cfeddb6b4..77dc83d83de3 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -12,6 +12,7 @@
#include "kvm_util.h"
#include "processor.h"
#include <linux/bitfield.h>
+#include <linux/sizes.h>
#define DEFAULT_ARM64_GUEST_STACK_VADDR_MIN 0xac0000
@@ -58,13 +59,25 @@ static uint64_t pte_index(struct kvm_vm *vm, vm_vaddr_t gva)
return (gva >> vm->page_shift) & mask;
}
+static inline bool use_lpa2_pte_format(struct kvm_vm *vm)
+{
+ return (vm->page_size == SZ_4K || vm->page_size == SZ_16K) &&
+ (vm->pa_bits > 48 || vm->va_bits > 48);
+}
+
static uint64_t addr_pte(struct kvm_vm *vm, uint64_t pa, uint64_t attrs)
{
uint64_t pte;
- pte = pa & GENMASK(47, vm->page_shift);
- if (vm->page_shift == 16)
- pte |= FIELD_GET(GENMASK(51, 48), pa) << 12;
+ if (use_lpa2_pte_format(vm)) {
+ pte = pa & GENMASK(49, vm->page_shift);
+ pte |= FIELD_GET(GENMASK(51, 50), pa) << 8;
+ attrs &= ~GENMASK(9, 8);
+ } else {
+ pte = pa & GENMASK(47, vm->page_shift);
+ if (vm->page_shift == 16)
+ pte |= FIELD_GET(GENMASK(51, 48), pa) << 12;
+ }
pte |= attrs;
return pte;
@@ -74,9 +87,14 @@ static uint64_t pte_addr(struct kvm_vm *vm, uint64_t pte)
{
uint64_t pa;
- pa = pte & GENMASK(47, vm->page_shift);
- if (vm->page_shift == 16)
- pa |= FIELD_GET(GENMASK(15, 12), pte) << 48;
+ if (use_lpa2_pte_format(vm)) {
+ pa = pte & GENMASK(49, vm->page_shift);
+ pa |= FIELD_GET(GENMASK(9, 8), pte) << 50;
+ } else {
+ pa = pte & GENMASK(47, vm->page_shift);
+ if (vm->page_shift == 16)
+ pa |= FIELD_GET(GENMASK(15, 12), pte) << 48;
+ }
return pa;
}
@@ -266,9 +284,6 @@ void aarch64_vcpu_setup(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init)
/* Configure base granule size */
switch (vm->mode) {
- case VM_MODE_P52V48_4K:
- TEST_FAIL("AArch64 does not support 4K sized pages "
- "with 52-bit physical address ranges");
case VM_MODE_PXXV48_4K:
TEST_FAIL("AArch64 does not support 4K sized pages "
"with ANY-bit physical address ranges");
@@ -278,12 +293,14 @@ void aarch64_vcpu_setup(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init)
case VM_MODE_P36V48_64K:
tcr_el1 |= 1ul << 14; /* TG0 = 64KB */
break;
+ case VM_MODE_P52V48_16K:
case VM_MODE_P48V48_16K:
case VM_MODE_P40V48_16K:
case VM_MODE_P36V48_16K:
case VM_MODE_P36V47_16K:
tcr_el1 |= 2ul << 14; /* TG0 = 16KB */
break;
+ case VM_MODE_P52V48_4K:
case VM_MODE_P48V48_4K:
case VM_MODE_P40V48_4K:
case VM_MODE_P36V48_4K:
@@ -297,6 +314,8 @@ void aarch64_vcpu_setup(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init)
/* Configure output size */
switch (vm->mode) {
+ case VM_MODE_P52V48_4K:
+ case VM_MODE_P52V48_16K:
case VM_MODE_P52V48_64K:
tcr_el1 |= 6ul << 32; /* IPS = 52 bits */
ttbr0_el1 |= FIELD_GET(GENMASK(51, 48), vm->pgd) << 2;
@@ -325,6 +344,8 @@ void aarch64_vcpu_setup(struct kvm_vcpu *vcpu, struct kvm_vcpu_init *init)
/* TCR_EL1 |= IRGN0:WBWA | ORGN0:WBWA | SH0:Inner-Shareable */;
tcr_el1 |= (1 << 8) | (1 << 10) | (3 << 12);
tcr_el1 |= (64 - vm->va_bits) /* T0SZ */;
+ if (use_lpa2_pte_format(vm))
+ tcr_el1 |= (1ul << 59) /* DS */;
vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_SCTLR_EL1), sctlr_el1);
vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_TCR_EL1), tcr_el1);
diff --git a/tools/testing/selftests/kvm/lib/guest_modes.c b/tools/testing/selftests/kvm/lib/guest_modes.c
index c64c5cf49942..6634afc22137 100644
--- a/tools/testing/selftests/kvm/lib/guest_modes.c
+++ b/tools/testing/selftests/kvm/lib/guest_modes.c
@@ -23,6 +23,8 @@ void guest_modes_append_default(void)
aarch64_get_supported_page_sizes(limit, &ipa4k, &ipa16k, &ipa64k);
+ guest_mode_append(VM_MODE_P52V48_4K, ipa4k >= 52, ipa4k >= 52);
+ guest_mode_append(VM_MODE_P52V48_16K, ipa16k >= 52, ipa16k >= 52);
guest_mode_append(VM_MODE_P52V48_64K, ipa64k >= 52, ipa64k >= 52);
guest_mode_append(VM_MODE_P48V48_4K, ipa4k >= 48, ipa4k >= 48);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 7a8af1821f5d..aeba7a23105c 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -148,6 +148,7 @@ const char *vm_guest_mode_string(uint32_t i)
{
static const char * const strings[] = {
[VM_MODE_P52V48_4K] = "PA-bits:52, VA-bits:48, 4K pages",
+ [VM_MODE_P52V48_16K] = "PA-bits:52, VA-bits:48, 16K pages",
[VM_MODE_P52V48_64K] = "PA-bits:52, VA-bits:48, 64K pages",
[VM_MODE_P48V48_4K] = "PA-bits:48, VA-bits:48, 4K pages",
[VM_MODE_P48V48_16K] = "PA-bits:48, VA-bits:48, 16K pages",
@@ -173,6 +174,7 @@ const char *vm_guest_mode_string(uint32_t i)
const struct vm_guest_mode_params vm_guest_mode_params[] = {
[VM_MODE_P52V48_4K] = { 52, 48, 0x1000, 12 },
+ [VM_MODE_P52V48_16K] = { 52, 48, 0x4000, 14 },
[VM_MODE_P52V48_64K] = { 52, 48, 0x10000, 16 },
[VM_MODE_P48V48_4K] = { 48, 48, 0x1000, 12 },
[VM_MODE_P48V48_16K] = { 48, 48, 0x4000, 14 },
@@ -251,6 +253,7 @@ struct kvm_vm *____vm_create(enum vm_guest_mode mode)
case VM_MODE_P36V48_64K:
vm->pgtable_levels = 3;
break;
+ case VM_MODE_P52V48_16K:
case VM_MODE_P48V48_16K:
case VM_MODE_P40V48_16K:
case VM_MODE_P36V48_16K:
--
2.25.1
* Re: [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2
2023-11-16 14:29 [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Ryan Roberts
` (11 preceding siblings ...)
2023-11-16 14:29 ` [PATCH v5 12/12] KVM: selftests: arm64: Support P52V48 4K and 16K guest_modes Ryan Roberts
@ 2023-11-21 23:38 ` Oliver Upton
2023-11-22 13:37 ` Ryan Roberts
12 siblings, 1 reply; 26+ messages in thread
From: Oliver Upton @ 2023-11-21 23:38 UTC (permalink / raw)
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
Hi Ryan,
On Thu, Nov 16, 2023 at 02:29:19PM +0000, Ryan Roberts wrote:
> Additionally, when running page_fault_test on FVP against v6.7-rc1 and
> newer, I'm seeing RCU stalls. I'm confident that this is not an issue
> introduced by this series because it reproduces without my patches.
Oh, that's interesting. Can you gather any additional details, like what
test case in page_fault_test trips the RCU stalls?
> Anshuman Khandual (1):
> arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
>
> Ryan Roberts (11):
> arm64/mm: Modify range-based tlbi to decrement scale
> arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
> arm64/mm: Update tlb invalidation routines for FEAT_LPA2
> arm64: Add ARM64_HAS_LPA2 CPU capability
> KVM: arm64: Add new (V)TCR_EL2 field definitions for FEAT_LPA2
> KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
> KVM: arm64: Convert translation level parameter to s8
> KVM: arm64: Support up to 5 levels of translation in kvm_pgtable
> KVM: arm64: Allow guests with >48-bit IPA size on FEAT_LPA2 systems
> KVM: selftests: arm64: Determine max ipa size per-page size
> KVM: selftests: arm64: Support P52V48 4K and 16K guest_modes
Besides the nitpicks I left on the series, LGTM.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
--
Thanks,
Oliver
* Re: [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2
2023-11-21 23:38 ` [PATCH v5 00/12] KVM: arm64: Support FEAT_LPA2 at hyp s1 and vm s2 Oliver Upton
@ 2023-11-22 13:37 ` Ryan Roberts
0 siblings, 0 replies; 26+ messages in thread
From: Ryan Roberts @ 2023-11-22 13:37 UTC (permalink / raw)
To: Oliver Upton
Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Suzuki K Poulose,
James Morse, Zenghui Yu, Ard Biesheuvel, Anshuman Khandual,
linux-arm-kernel, kvmarm
On 21/11/2023 23:38, Oliver Upton wrote:
> Hi Ryan,
>
> On Thu, Nov 16, 2023 at 02:29:19PM +0000, Ryan Roberts wrote:
>> Additionally, when running page_fault_test on FVP against v6.7-rc1 and
>> newer, I'm seeing RCU stalls. I'm confident that this is not an issue
>> introduced by this series because it reproduces without my patches.
>
> Oh, that's interesting. Can you gather any additional details, like what
> test case in page_fault_test trips the RCU stalls?
I don't know off the top of my head. I'll switch back to this next week and do
the changes you requested and retest with some extra logging turned on.
Hopefully that will turn up some more info that I can pass along.
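For reference, the generic knobs for getting more out of RCU stall reports are
the standard kernel command-line options (nothing specific to this series; the
values below are only an example for a slow FVP run):

    rcupdate.rcu_cpu_stall_suppress=0    # make sure stall reports are emitted
    rcupdate.rcu_cpu_stall_timeout=60    # widen the detection window on slow models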
>
>> Anshuman Khandual (1):
>> arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
>>
>> Ryan Roberts (11):
>> arm64/mm: Modify range-based tlbi to decrement scale
>> arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
>> arm64/mm: Update tlb invalidation routines for FEAT_LPA2
>> arm64: Add ARM64_HAS_LPA2 CPU capability
>> KVM: arm64: Add new (V)TCR_EL2 field definitions for FEAT_LPA2
>> KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
>> KVM: arm64: Convert translation level parameter to s8
>> KVM: arm64: Support up to 5 levels of translation in kvm_pgtable
>> KVM: arm64: Allow guests with >48-bit IPA size on FEAT_LPA2 systems
>> KVM: selftests: arm64: Determine max ipa size per-page size
>> KVM: selftests: arm64: Support P52V48 4K and 16K guest_modes
>
> Besides the nitpicks I left on the series, LGTM.
>
> Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Thanks!