From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v3 17/21] KVM: arm64: Convert user_mem_abort() to generic page-table API
To: Will Deacon, kvmarm@lists.cs.columbia.edu
Cc: Suzuki Poulose, Marc Zyngier, Quentin Perret, James Morse, Catalin Marinas, kernel-team@android.com, linux-arm-kernel@lists.infradead.org
References: <20200825093953.26493-1-will@kernel.org> <20200825093953.26493-18-will@kernel.org>
In-Reply-To: <20200825093953.26493-18-will@kernel.org>
From: Gavin Shan
Reply-To: Gavin Shan
Message-ID: <1dd58be7-75d8-1b7f-0eb2-d93b57bdcecd@redhat.com>
Date: Thu, 3 Sep 2020 16:05:12 +1000

Hi Will,

On 8/25/20 7:39 PM, Will Deacon wrote:
> Convert user_mem_abort() to call kvm_pgtable_stage2_relax_perms() when
> handling a stage-2 permission fault and kvm_pgtable_stage2_map() when
> handling a stage-2 translation fault, rather than walking the page-table
> manually.
>
> Cc: Marc Zyngier
> Cc: Quentin Perret
> Signed-off-by: Will Deacon
> ---
>   arch/arm64/kvm/mmu.c | 112 +++++++++++++------------------------------
>   1 file changed, 34 insertions(+), 78 deletions(-)
>

It looks good to me.
Since this changes the stage-2 page table management mechanism completely,
I will test this series with various configurations on different machines
and report the results when the testing is finished.

Reviewed-by: Gavin Shan

> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index d4b0716a6ab4..cfbf32cae3a5 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1491,7 +1491,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  {
>  	int ret;
>  	bool write_fault, writable, force_pte = false;
> -	bool exec_fault, needs_exec;
> +	bool exec_fault;
> +	bool device = false;
>  	unsigned long mmu_seq;
>  	gfn_t gfn = fault_ipa >> PAGE_SHIFT;
>  	struct kvm *kvm = vcpu->kvm;
> @@ -1499,10 +1500,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	struct vm_area_struct *vma;
>  	short vma_shift;
>  	kvm_pfn_t pfn;
> -	pgprot_t mem_type = PAGE_S2;
>  	bool logging_active = memslot_is_logging(memslot);
> -	unsigned long vma_pagesize, flags = 0;
> -	struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;
> +	unsigned long vma_pagesize;
> +	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
> +	struct kvm_pgtable *pgt;
>
>  	write_fault = kvm_is_write_fault(vcpu);
>  	exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
> @@ -1535,22 +1536,16 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		vma_pagesize = PAGE_SIZE;
>  	}
>
> -	/*
> -	 * The stage2 has a minimum of 2 level table (For arm64 see
> -	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
> -	 * use PMD_SIZE huge mappings (even when the PMD is folded into PGD).
> -	 * As for PUD huge maps, we must make sure that we have at least
> -	 * 3 levels, i.e, PMD is not folded.
> -	 */
> -	if (vma_pagesize == PMD_SIZE ||
> -	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
> +	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
>  		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>  	mmap_read_unlock(current->mm);
>
> -	/* We need minimum second+third level pages */
> -	ret = kvm_mmu_topup_memory_cache(memcache, kvm_mmu_cache_min_pages(kvm));
> -	if (ret)
> -		return ret;
> +	if (fault_status != FSC_PERM) {
> +		ret = kvm_mmu_topup_memory_cache(memcache,
> +						 kvm_mmu_cache_min_pages(kvm));
> +		if (ret)
> +			return ret;
> +	}
>
>  	mmu_seq = vcpu->kvm->mmu_notifier_seq;
>  	/*
> @@ -1573,28 +1568,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		return -EFAULT;
>
>  	if (kvm_is_device_pfn(pfn)) {
> -		mem_type = PAGE_S2_DEVICE;
> -		flags |= KVM_S2PTE_FLAG_IS_IOMAP;
> -	} else if (logging_active) {
> -		/*
> -		 * Faults on pages in a memslot with logging enabled
> -		 * should not be mapped with huge pages (it introduces churn
> -		 * and performance degradation), so force a pte mapping.
> -		 */
> -		flags |= KVM_S2_FLAG_LOGGING_ACTIVE;
> -
> +		device = true;
> +	} else if (logging_active && !write_fault) {
>  		/*
>  		 * Only actually map the page as writable if this was a write
>  		 * fault.
>  		 */
> -		if (!write_fault)
> -			writable = false;
> +		writable = false;
>  	}
>
> -	if (exec_fault && is_iomap(flags))
> +	if (exec_fault && device)
>  		return -ENOEXEC;
>
>  	spin_lock(&kvm->mmu_lock);
> +	pgt = vcpu->arch.hw_mmu->pgt;
>  	if (mmu_notifier_retry(kvm, mmu_seq))
>  		goto out_unlock;
>
> @@ -1605,62 +1592,31 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	if (vma_pagesize == PAGE_SIZE && !force_pte)
>  		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
>  							   &pfn, &fault_ipa);
> -	if (writable)
> +	if (writable) {
> +		prot |= KVM_PGTABLE_PROT_W;
>  		kvm_set_pfn_dirty(pfn);
> +		mark_page_dirty(kvm, gfn);
> +	}
>
> -	if (fault_status != FSC_PERM && !is_iomap(flags))
> +	if (fault_status != FSC_PERM && !device)
>  		clean_dcache_guest_page(pfn, vma_pagesize);
>
> -	if (exec_fault)
> +	if (exec_fault) {
> +		prot |= KVM_PGTABLE_PROT_X;
>  		invalidate_icache_guest_page(pfn, vma_pagesize);
> +	}
>
> -	/*
> -	 * If we took an execution fault we have made the
> -	 * icache/dcache coherent above and should now let the s2
> -	 * mapping be executable.
> -	 *
> -	 * Write faults (!exec_fault && FSC_PERM) are orthogonal to
> -	 * execute permissions, and we preserve whatever we have.
> -	 */
> -	needs_exec = exec_fault ||
> -		(fault_status == FSC_PERM &&
> -		 stage2_is_exec(mmu, fault_ipa, vma_pagesize));
> -
> -	if (vma_pagesize == PUD_SIZE) {
> -		pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
> -
> -		new_pud = kvm_pud_mkhuge(new_pud);
> -		if (writable)
> -			new_pud = kvm_s2pud_mkwrite(new_pud);
> -
> -		if (needs_exec)
> -			new_pud = kvm_s2pud_mkexec(new_pud);
> -
> -		ret = stage2_set_pud_huge(mmu, memcache, fault_ipa, &new_pud);
> -	} else if (vma_pagesize == PMD_SIZE) {
> -		pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
> -
> -		new_pmd = kvm_pmd_mkhuge(new_pmd);
> -
> -		if (writable)
> -			new_pmd = kvm_s2pmd_mkwrite(new_pmd);
> -
> -		if (needs_exec)
> -			new_pmd = kvm_s2pmd_mkexec(new_pmd);
> +	if (device)
> +		prot |= KVM_PGTABLE_PROT_DEVICE;
> +	else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
> +		prot |= KVM_PGTABLE_PROT_X;
>
> -		ret = stage2_set_pmd_huge(mmu, memcache, fault_ipa, &new_pmd);
> +	if (fault_status == FSC_PERM) {
> +		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
>  	} else {
> -		pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
> -
> -		if (writable) {
> -			new_pte = kvm_s2pte_mkwrite(new_pte);
> -			mark_page_dirty(kvm, gfn);
> -		}
> -
> -		if (needs_exec)
> -			new_pte = kvm_s2pte_mkexec(new_pte);
> -
> -		ret = stage2_set_pte(mmu, memcache, fault_ipa, &new_pte, flags);
> +		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,
> +					     __pfn_to_phys(pfn), prot,
> +					     memcache);
>  	}
>
>  out_unlock:
>

Thanks,
Gavin
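
For readers following the thread, the permission handling introduced by the quoted hunks can be modelled outside the kernel. What follows is only a userspace sketch under stated assumptions: every sketch_* name is hypothetical, the bit values are illustrative and are not the kernel's enum kvm_pgtable_prot encoding, and has_cache_dic merely stands in for cpus_have_const_cap(ARM64_HAS_CACHE_DIC). It mirrors the order of the checks in the patch: reject execution from device memory, start from read permission, add W/X/DEVICE as needed, then pick relax_perms or map depending on the fault type.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for enum kvm_pgtable_prot; values are illustrative. */
enum sketch_prot {
	SKETCH_PROT_R      = 1u << 0,
	SKETCH_PROT_W      = 1u << 1,
	SKETCH_PROT_X      = 1u << 2,
	SKETCH_PROT_DEVICE = 1u << 3,
};

/* Which page-table call the patched user_mem_abort() would end up making. */
enum sketch_action {
	SKETCH_RELAX_PERMS,	/* models kvm_pgtable_stage2_relax_perms() */
	SKETCH_MAP,		/* models kvm_pgtable_stage2_map() */
};

/* Inputs the patch derives from the fault and the backing pfn. */
struct sketch_fault {
	bool perm_fault;	/* fault_status == FSC_PERM */
	bool writable;		/* after the logging_active && !write_fault fixup */
	bool exec_fault;
	bool device;		/* kvm_is_device_pfn(pfn) */
	bool has_cache_dic;	/* assumption: models ARM64_HAS_CACHE_DIC */
};

/*
 * Compute the protection bits in the same order as the patch.
 * Returns 0 on success, or -ENOEXEC for an exec fault on device memory.
 */
static int sketch_fault_prot(const struct sketch_fault *f, unsigned int *prot)
{
	assert(f != NULL && prot != NULL);

	if (f->exec_fault && f->device)
		return -ENOEXEC;

	*prot = SKETCH_PROT_R;
	if (f->writable)
		*prot |= SKETCH_PROT_W;
	if (f->exec_fault)
		*prot |= SKETCH_PROT_X;

	if (f->device)
		*prot |= SKETCH_PROT_DEVICE;
	else if (f->has_cache_dic)
		*prot |= SKETCH_PROT_X;

	return 0;
}

/* Permission faults relax an existing mapping; all other faults install one. */
static enum sketch_action sketch_fault_action(const struct sketch_fault *f)
{
	assert(f != NULL);
	return f->perm_fault ? SKETCH_RELAX_PERMS : SKETCH_MAP;
}
```

In this model, a write translation fault on normal memory without DIC yields R|W and selects the map path, while an exec permission fault yields R|X and selects the relax path. The memory cache top-up that the patch now skips for permission faults is deliberately left out of the sketch.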