Date: Fri, 20 Oct 2023 10:16:10 +0100
Message-ID: <86bkctmz6t.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
    Will Deacon <will@kernel.org>,
    Oliver Upton <oliver.upton@linux.dev>,
    Suzuki K Poulose <suzuki.poulose@arm.com>,
    James Morse <james.morse@arm.com>,
    Zenghui Yu <yuzenghui@huawei.com>,
    Ard Biesheuvel <ardb@kernel.org>,
    Anshuman Khandual <anshuman.khandual@arm.com>,
    linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev
Subject: Re: [PATCH v4 06/12] KVM: arm64: Use LPA2 page-tables for stage2 and hyp stage1
In-Reply-To: <20231009185008.3803879-7-ryan.roberts@arm.com>
References: <20231009185008.3803879-1-ryan.roberts@arm.com>
    <20231009185008.3803879-7-ryan.roberts@arm.com>

On Mon, 09 Oct 2023 19:50:02 +0100,
Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> Implement a simple policy whereby if the HW supports FEAT_LPA2 for the
> page size we are using, always use LPA2-style page-tables for stage 2
> and hyp stage 1, regardless of the VMM-requested IPA size or
> HW-implemented PA size. When in use we can now support up to 52-bit IPA
> and PA sizes.

Maybe worth stating that this S1 comment only applies to the
standalone EL2 portion, and not the VHE S1 mappings.

>
> We use the previously created cpu feature to track whether LPA2 is
> supported for deciding whether to use the LPA2 or classic pte format.
>
> Note that FEAT_LPA2 brings support for bigger block mappings (512GB with
> 4KB, 64GB with 16KB). We explicitly don't enable these in the library
> because stage2_apply_range() works on batch sizes of the largest used
> block mapping, and increasing the size of the batch would lead to soft
> lockups. See commit 5994bc9e05c2 ("KVM: arm64: Limit
> stage2_apply_range() batch size to largest block").
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  arch/arm64/include/asm/kvm_pgtable.h | 47 +++++++++++++++++++++-------
>  arch/arm64/kvm/arm.c                 |  2 ++
>  arch/arm64/kvm/hyp/nvhe/tlb.c        |  3 +-
>  arch/arm64/kvm/hyp/pgtable.c         | 15 +++++++--
>  arch/arm64/kvm/hyp/vhe/tlb.c         |  3 +-
>  5 files changed, 54 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index d3e354bb8351..b240158e1218 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -25,12 +25,22 @@
>  #define KVM_PGTABLE_MIN_BLOCK_LEVEL	2U
>  #endif
>
> +static inline u64 kvm_get_parange_max(void)
> +{
> +	if (system_supports_lpa2() ||
> +	    (IS_ENABLED(CONFIG_ARM64_PA_BITS_52) && PAGE_SIZE == SZ_64K))

nit: the rest of the code uses PAGE_SHIFT instead of PAGE_SIZE. Not a
big deal, but being consistent might help the reader.
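[For reference, the PAGE_SHIFT-based spelling being suggested would look
something like the sketch below; 64KB pages correspond to PAGE_SHIFT == 16,
which is also the constant the kvm_pte_to_phys() hunk further down tests
against.]

	static inline u64 kvm_get_parange_max(void)
	{
		/*
		 * 52-bit output addresses are reachable either via FEAT_LPA2
		 * (any page size), or via the original FEAT_LPA scheme, which
		 * only exists for 64KB pages (PAGE_SHIFT == 16).
		 */
		if (system_supports_lpa2() ||
		    (IS_ENABLED(CONFIG_ARM64_PA_BITS_52) && PAGE_SHIFT == 16))
			return ID_AA64MMFR0_EL1_PARANGE_52;
		else
			return ID_AA64MMFR0_EL1_PARANGE_48;
	}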
> +		return ID_AA64MMFR0_EL1_PARANGE_52;
> +	else
> +		return ID_AA64MMFR0_EL1_PARANGE_48;
> +}
> +
>  static inline u64 kvm_get_parange(u64 mmfr0)
>  {
> +	u64 parange_max = kvm_get_parange_max();
>  	u64 parange = cpuid_feature_extract_unsigned_field(mmfr0,
>  				ID_AA64MMFR0_EL1_PARANGE_SHIFT);
> -	if (parange > ID_AA64MMFR0_EL1_PARANGE_MAX)
> -		parange = ID_AA64MMFR0_EL1_PARANGE_MAX;
> +	if (parange > parange_max)
> +		parange = parange_max;
>
>  	return parange;
>  }
> @@ -41,6 +51,8 @@ typedef u64 kvm_pte_t;
>
>  #define KVM_PTE_ADDR_MASK		GENMASK(47, PAGE_SHIFT)
>  #define KVM_PTE_ADDR_51_48		GENMASK(15, 12)
> +#define KVM_PTE_ADDR_MASK_LPA2		GENMASK(49, PAGE_SHIFT)
> +#define KVM_PTE_ADDR_51_50_LPA2	GENMASK(9, 8)
>
>  #define KVM_PHYS_INVALID		(-1ULL)
>
> @@ -51,21 +63,34 @@ static inline bool kvm_pte_valid(kvm_pte_t pte)
>
>  static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
>  {
> -	u64 pa = pte & KVM_PTE_ADDR_MASK;
> -
> -	if (PAGE_SHIFT == 16)
> -		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
> +	u64 pa;
> +
> +	if (system_supports_lpa2()) {
> +		pa = pte & KVM_PTE_ADDR_MASK_LPA2;
> +		pa |= FIELD_GET(KVM_PTE_ADDR_51_50_LPA2, pte) << 50;
> +	} else {
> +		pa = pte & KVM_PTE_ADDR_MASK;
> +		if (PAGE_SHIFT == 16)
> +			pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
> +	}
>
>  	return pa;
>  }
>
>  static inline kvm_pte_t kvm_phys_to_pte(u64 pa)
>  {
> -	kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
> -
> -	if (PAGE_SHIFT == 16) {
> -		pa &= GENMASK(51, 48);
> -		pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
> +	kvm_pte_t pte;
> +
> +	if (system_supports_lpa2()) {
> +		pte = pa & KVM_PTE_ADDR_MASK_LPA2;
> +		pa &= GENMASK(51, 50);
> +		pte |= FIELD_PREP(KVM_PTE_ADDR_51_50_LPA2, pa >> 50);
> +	} else {
> +		pte = pa & KVM_PTE_ADDR_MASK;
> +		if (PAGE_SHIFT == 16) {
> +			pa &= GENMASK(51, 48);
> +			pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
> +		}
>  	}
>
>  	return pte;
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 4866b3f7b4ea..73cc67c2a8a7 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1747,6 +1747,8 @@ static void __init cpu_prepare_hyp_mode(int cpu, u32 hyp_va_bits)
>  	}
>  	tcr &= ~TCR_T0SZ_MASK;
>  	tcr |= TCR_T0SZ(hyp_va_bits);
> +	if (system_supports_lpa2())
> +		tcr |= TCR_EL2_DS;
>  	params->tcr_el2 = tcr;
>
>  	params->pgd_pa = kvm_mmu_get_httbr();
> diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
> index d42b72f78a9b..c3cd16c6f95f 100644
> --- a/arch/arm64/kvm/hyp/nvhe/tlb.c
> +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
> @@ -198,7 +198,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
>  	/* Switch to requested VMID */
>  	__tlb_switch_to_guest(mmu, &cxt, false);
>
> -	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0, false);
> +	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0,
> +				system_supports_lpa2());

At this stage, I'd fully expect the flag to have been subsumed into
the helper...
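[Concretely, that would mean the helper querying the feature itself rather
than every caller passing it in. A sketch only; the exact parameter lists of
__flush_s2_tlb_range_op()/__flush_tlb_range_op() at this point in the series
may differ:]

	/*
	 * Let the stage-2 helper pick the TLBI range encoding itself, so
	 * callers no longer need to know anything about LPA2.
	 */
	#define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level)	\
		__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level,	\
				     false, system_supports_lpa2())

	/* ...turning both the nVHE and VHE call sites into simply: */
	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);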
>
>  	dsb(ish);
>  	__tlbi(vmalle1is);
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index f155b8c9e98c..062eb7bcdb8a 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -79,7 +79,10 @@ static bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx)
>
>  static bool kvm_phys_is_valid(u64 phys)
>  {
> -	return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_EL1_PARANGE_MAX));
> +	u64 parange_max = kvm_get_parange_max();
> +	u8 shift = id_aa64mmfr0_parange_to_phys_shift(parange_max);
> +
> +	return phys < BIT(shift);
>  }
>
>  static bool kvm_block_mapping_supported(const struct kvm_pgtable_visit_ctx *ctx, u64 phys)
> @@ -408,7 +411,8 @@ static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
>  	}
>
>  	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap);
> -	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
> +	if (!system_supports_lpa2())
> +		attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
>  	attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF;
>  	attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
>  	*ptep = attr;
> @@ -654,6 +658,9 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
>  		vtcr |= VTCR_EL2_HA;
>  #endif /* CONFIG_ARM64_HW_AFDBM */
>
> +	if (system_supports_lpa2())
> +		vtcr |= VTCR_EL2_DS;
> +
>  	/* Set the vmid bits */
>  	vtcr |= (get_vmid_bits(mmfr1) == 16) ?
>  		VTCR_EL2_VS_16BIT :
> @@ -711,7 +718,9 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p
>  	if (prot & KVM_PGTABLE_PROT_W)
>  		attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
>
> -	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
> +	if (!system_supports_lpa2())
> +		attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
> +
>  	attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
>  	attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
>  	*ptep = attr;
> diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
> index 6041c6c78984..40cea2482a76 100644
> --- a/arch/arm64/kvm/hyp/vhe/tlb.c
> +++ b/arch/arm64/kvm/hyp/vhe/tlb.c
> @@ -161,7 +161,8 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
>  	/* Switch to requested VMID */
>  	__tlb_switch_to_guest(mmu, &cxt);
>
> -	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0, false);
> +	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0,
> +				system_supports_lpa2());
>
>  	dsb(ish);
>  	__tlbi(vmalle1is);

One thing I don't see here is how you update the tcr_compute_pa_size
macro that is used on the initial nVHE setup, which is inconsistent
with the kvm_get_parange_max() helper.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
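[For context on that last point: tcr_compute_pa_size lives in
arch/arm64/include/asm/assembler.h and clamps the PARange field read from
ID_AA64MMFR0_EL1 against a compile-time constant. From memory, and not
verified against the exact baseline of this series, it looks roughly like
this:]

	.macro	tcr_compute_pa_size, tcr, pos, tmp0, tmp1
	mrs	\tmp0, ID_AA64MMFR0_EL1
	// Narrow PARange to fit the PS field in TCR_ELx
	ubfx	\tmp0, \tmp0, #ID_AA64MMFR0_EL1_PARANGE_SHIFT, #3
	mov	\tmp1, #ID_AA64MMFR0_EL1_PARANGE_MAX
	cmp	\tmp0, \tmp1
	csel	\tmp0, \tmp1, \tmp0, hi
	bfi	\tcr, \tmp0, \pos, #3
	.endm

[Because ID_AA64MMFR0_EL1_PARANGE_MAX is fixed at build time, the nVHE boot
path would keep clamping to 48 bits on non-64KB-page kernels even when
kvm_get_parange_max() reports 52 at runtime, which appears to be the
inconsistency being pointed out.]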