Subject: Re: [PATCH v3 05/21] KVM: arm64: Add support for creating kernel-agnostic stage-2 page tables
From: Gavin Shan
To: Will Deacon, kvmarm@lists.cs.columbia.edu
Cc: Suzuki Poulose, Marc Zyngier, Quentin Perret, James Morse, Catalin Marinas, kernel-team@android.com, linux-arm-kernel@lists.infradead.org
Date: Wed, 2 Sep 2020 16:40:03 +1000
In-Reply-To: <20200825093953.26493-6-will@kernel.org>
References: <20200825093953.26493-1-will@kernel.org> <20200825093953.26493-6-will@kernel.org>

Hi Will,

On 8/25/20 7:39 PM, Will Deacon wrote:
> Introduce alloc() and free() functions to the generic page-table code
> for guest stage-2 page-tables and plumb these into the existing KVM
> page-table allocator. Subsequent patches will convert other operations
> within the KVM allocator over to the generic code.
>
> Cc: Marc Zyngier
> Cc: Quentin Perret
> Signed-off-by: Will Deacon
> ---
>  arch/arm64/include/asm/kvm_host.h    |  1 +
>  arch/arm64/include/asm/kvm_pgtable.h | 18 +++++++++
>  arch/arm64/kvm/hyp/pgtable.c         | 51 ++++++++++++++++++++++++++
>  arch/arm64/kvm/mmu.c                 | 55 +++++++++++++++-------------
>  4 files changed, 99 insertions(+), 26 deletions(-)
>

With the one question below resolved:

Reviewed-by: Gavin Shan

> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e52c927aade5..0b7c702b2151 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -81,6 +81,7 @@ struct kvm_s2_mmu {
>  	 */
>  	pgd_t *pgd;
>  	phys_addr_t pgd_phys;
> +	struct kvm_pgtable *pgt;
>
>  	/* The last vcpu id that ran on each physical CPU */
>  	int __percpu *last_vcpu_ran;
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 2af84ab78cb8..3389f978d573 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -116,6 +116,24 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt);
>  int kvm_pgtable_hyp_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
>  			enum kvm_pgtable_prot prot);
>
> +/**
> + * kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table.
> + * @pgt: Uninitialised page-table structure to initialise.
> + * @kvm: KVM structure representing the guest virtual machine.
> + *
> + * Return: 0 on success, negative error code on failure.
> + */
> +int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm);
> +
> +/**
> + * kvm_pgtable_stage2_destroy() - Destroy an unused guest stage-2 page-table.
> + * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init().
> + *
> + * The page-table is assumed to be unreachable by any hardware walkers prior
> + * to freeing and therefore no TLB invalidation is performed.
> + */
> +void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
> +
>  /**
>   * kvm_pgtable_walk() - Walk a page-table.
>   * @pgt: Page-table structure initialised by kvm_pgtable_*_init().
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index d75166823ad9..b8550ccaef4d 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -419,3 +419,54 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
>  	free_page((unsigned long)pgt->pgd);
>  	pgt->pgd = NULL;
>  }
> +
> +int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm)
> +{
> +	size_t pgd_sz;
> +	u64 vtcr = kvm->arch.vtcr;
> +	u32 ia_bits = VTCR_EL2_IPA(vtcr);
> +	u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr);
> +	u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0;
> +
> +	pgd_sz = kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE;
> +	pgt->pgd = alloc_pages_exact(pgd_sz, GFP_KERNEL | __GFP_ZERO);
> +	if (!pgt->pgd)
> +		return -ENOMEM;
> +
> +	pgt->ia_bits = ia_bits;
> +	pgt->start_level = start_level;
> +	pgt->mmu = &kvm->arch.mmu;
> +	return 0;
> +}
> +
> +static int stage2_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
> +			      enum kvm_pgtable_walk_flags flag,
> +			      void * const arg)
> +{
> +	kvm_pte_t pte = *ptep;
> +
> +	if (!kvm_pte_valid(pte))
> +		return 0;
> +
> +	put_page(virt_to_page(ptep));
> +
> +	if (kvm_pte_table(pte, level))
> +		free_page((unsigned long)kvm_pte_follow(pte));
> +
> +	return 0;
> +}
> +
> +void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
> +{
> +	size_t pgd_sz;
> +	struct kvm_pgtable_walker walker = {
> +		.cb = stage2_free_walker,
> +		.flags = KVM_PGTABLE_WALK_LEAF |
> +			 KVM_PGTABLE_WALK_TABLE_POST,
> +	};
> +
> +	WARN_ON(kvm_pgtable_walk(pgt, 0, BIT(pgt->ia_bits), &walker));
> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level) * PAGE_SIZE;
> +	free_pages_exact(pgt->pgd, pgd_sz);
> +	pgt->pgd = NULL;
> +}
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index fabd72b0c8a4..4607e9ca60a2 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -668,47 +668,49 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
>   * @kvm: The pointer to the KVM structure
>   * @mmu: The pointer to the s2 MMU structure
>   *
> - * Allocates only the stage-2 HW PGD level table(s) of size defined by
> - * stage2_pgd_size(mmu->kvm).
> - *
> + * Allocates only the stage-2 HW PGD level table(s).
>   * Note we don't need locking here as this is only called when the VM is
>   * created, which can only be done once.
>   */
>  int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
>  {
> -	phys_addr_t pgd_phys;
> -	pgd_t *pgd;
> -	int cpu;
> +	int cpu, err;
> +	struct kvm_pgtable *pgt;
>
> -	if (mmu->pgd != NULL) {
> +	if (mmu->pgt != NULL) {
>  		kvm_err("kvm_arch already initialized?\n");
>  		return -EINVAL;
>  	}
>
> -	/* Allocate the HW PGD, making sure that each page gets its own refcount */
> -	pgd = alloc_pages_exact(stage2_pgd_size(kvm), GFP_KERNEL | __GFP_ZERO);
> -	if (!pgd)
> +	pgt = kzalloc(sizeof(*pgt), GFP_KERNEL);
> +	if (!pgt)
>  		return -ENOMEM;
>
> -	pgd_phys = virt_to_phys(pgd);
> -	if (WARN_ON(pgd_phys & ~kvm_vttbr_baddr_mask(kvm)))
> -		return -EINVAL;
> +	err = kvm_pgtable_stage2_init(pgt, kvm);
> +	if (err)
> +		goto out_free_pgtable;
>
>  	mmu->last_vcpu_ran = alloc_percpu(typeof(*mmu->last_vcpu_ran));
>  	if (!mmu->last_vcpu_ran) {
> -		free_pages_exact(pgd, stage2_pgd_size(kvm));
> -		return -ENOMEM;
> +		err = -ENOMEM;
> +		goto out_destroy_pgtable;
>  	}
>
>  	for_each_possible_cpu(cpu)
>  		*per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1;
>
>  	mmu->kvm = kvm;
> -	mmu->pgd = pgd;
> -	mmu->pgd_phys = pgd_phys;
> +	mmu->pgt = pgt;
> +	mmu->pgd_phys = __pa(pgt->pgd);
> +	mmu->pgd = (void *)pgt->pgd;
>  	mmu->vmid.vmid_gen = 0;
> -
>  	return 0;
> +
> +out_destroy_pgtable:
> +	kvm_pgtable_stage2_destroy(pgt);
> +out_free_pgtable:
> +	kfree(pgt);
> +	return err;
>  }
>

kvm_pgtable_stage2_destroy() might not be needed here because the stage-2
page table is still empty at this point. However, this case should rarely
be hit. If I'm correct, all we need to do here is free the PGDs.

>  static void stage2_unmap_memslot(struct kvm *kvm,
> @@ -781,20 +783,21 @@ void stage2_unmap_vm(struct kvm *kvm)
>  void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>  {
>  	struct kvm *kvm = mmu->kvm;
> -	void *pgd = NULL;
> +	struct kvm_pgtable *pgt = NULL;
>
>  	spin_lock(&kvm->mmu_lock);
> -	if (mmu->pgd) {
> -		unmap_stage2_range(mmu, 0, kvm_phys_size(kvm));
> -		pgd = READ_ONCE(mmu->pgd);
> +	pgt = mmu->pgt;
> +	if (pgt) {
>  		mmu->pgd = NULL;
> +		mmu->pgd_phys = 0;
> +		mmu->pgt = NULL;
> +		free_percpu(mmu->last_vcpu_ran);
>  	}
>  	spin_unlock(&kvm->mmu_lock);
>
> -	/* Free the HW pgd, one page at a time */
> -	if (pgd) {
> -		free_pages_exact(pgd, stage2_pgd_size(kvm));
> -		free_percpu(mmu->last_vcpu_ran);
> +	if (pgt) {
> +		kvm_pgtable_stage2_destroy(pgt);
> +		kfree(pgt);
>  	}
>  }
>

Thanks,
Gavin
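
For reference, the point about the error path in kvm_init_stage2_mmu() can
be sketched in userspace C. This is only a toy model under stated
assumptions: a single-level table, and every name in it (toy_pgtable,
toy_init, toy_count_valid, toy_free_pgd, toy_destroy) is hypothetical and
not a kernel API. It shows that walking a freshly zeroed PGD finds no valid
entries, so a full destroy ends up doing the same work as freeing the PGD
alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy single-level table; all names here are hypothetical, not kernel APIs. */
#define TOY_PTRS_PER_PGD 8
#define TOY_PTE_VALID    UINT64_C(1)

struct toy_pgtable {
	uint64_t *pgd;
};

/* Allocate a zeroed PGD, standing in for alloc_pages_exact(..., __GFP_ZERO). */
static int toy_init(struct toy_pgtable *pgt)
{
	pgt->pgd = calloc(TOY_PTRS_PER_PGD, sizeof(*pgt->pgd));
	return pgt->pgd ? 0 : -1;
}

/* Count the entries a free-walker would act on; invalid entries are skipped. */
static size_t toy_count_valid(const struct toy_pgtable *pgt)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_PTRS_PER_PGD; i++)
		if (pgt->pgd[i] & TOY_PTE_VALID)
			n++;
	return n;
}

/* Free only the PGD, which is all an empty table needs. */
static void toy_free_pgd(struct toy_pgtable *pgt)
{
	free(pgt->pgd);
	pgt->pgd = NULL;
}

/* Full teardown: walk first, then free the PGD. Returns the entries visited. */
static size_t toy_destroy(struct toy_pgtable *pgt)
{
	size_t visited = toy_count_valid(pgt);

	toy_free_pgd(pgt);
	return visited;
}
```

Calling toy_destroy() right after toy_init() visits zero valid entries, so
in this model the walk is harmless but redundant. Whether the extra walk is
worth avoiding in the real error path is a judgment call.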