From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4519ECD3424 for ; Fri, 1 May 2026 14:18:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=JvOnRbZdSG/7wBrtRm+8n+Do2dRXUbjUXwKRNK243Gs=; b=FZOjMvyt/VSQFXXxw2ucAAnt9z JvIM4Rk3BADou3FIUXuVuuSxPIy1emYEZfOP0jLFfA5fT3Mx76ptXsGCzyTEKWZQCuSXXdJDA1FwO E/CX31gN1lH17eMQ7oav2WV+jF9sTOMKyzhuvZetdCrKF16PS4rJR3igbT04ie4kduISdazLdZR2T 0Nb8G8TATHKGpcpfDsOM0dtE2YYnFK1YrC4O2EDtkBoMFFZQ/Wylb78EY+dN98ZAG0ysDZbPgnKLp ShBSOwwoH4IgO9V+idJj/VkNcKLW2AYK3awFvtOLaDVd6H2zHvtoR9rxL98QtH9zqOsVsozdrw82B M8ntl13A==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98.2 #2 (Red Hat Linux)) id 1wIohL-00000007EjO-23Bx; Fri, 01 May 2026 14:18:31 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.98.2 #2 (Red Hat Linux)) id 1wIohI-00000007Eij-3g8f for linux-arm-kernel@lists.infradead.org; Fri, 01 May 2026 14:18:30 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5AB672C1C; Fri, 1 May 2026 07:18:20 -0700 (PDT) Received: from raptor (usa-sjc-mx-foss1.foss.arm.com [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id BF50F3F7B4; Fri, 1 May 2026 07:18:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=arm.com; s=foss; t=1777645105; bh=n4QKpHTlKIbKxoVyb8jomzE98FcpdhxCxXybOaeTddY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=YhClDYmWmp6O5MaEN5QwnO5JBVi0bDYByAXkKn7fiDuCzxz0ob/7DxW/wZ/xIYLCA gkfwAy+hNWKVA8+B1KyQrqGix+V7kPyXloob1Ckrt+kuUQUF9cqYRo97BQpFH4JqxS BESkBXabNRE0Q6XJjUxIEpPESEi0NcjJgVMHTsOU= Date: Fri, 1 May 2026 15:18:16 +0100 From: Alexandru Elisei To: Fuad Tabba Cc: maz@kernel.org, oupton@kernel.org, joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, mark.rutland@arm.com Subject: Re: [PATCH] KVM: arm64: Handle permission faults with guest_memfd Message-ID: References: <20260430132351.280766-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20260501_071828_991421_4BF2DE79 X-CRM114-Status: GOOD ( 37.78 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Hi Fuad, On Thu, Apr 30, 2026 at 03:48:21PM +0100, Fuad Tabba wrote: > Hi Alexandru, > > On Thu, 30 Apr 2026 at 14:24, Alexandru Elisei wrote: > > > > gmem_abort() calls kvm_pgtable_stage2_map() to make changes to stage 2. It > > does this for both relaxing permissions on an existing mapping and to > > install a missing mapping. > > > > kvm_pgtable_stage2_map() doesn't make changes to stage 2 if there is an > > existing, valid entry and the new entry modifies only the permissions. > > This is checked in: > > > > kvm_pgtable_stage2_map() > > stage2_map_walk_leaf() > > stage2_map_walker_try_leaf() > > stage2_pte_needs_update() > > > > and if only the permissions differ, kvm_pgtable_stage2_map() returns > > -EAGAIN and KVM returns to the guest to replay the instruction. The > > assumption is that a concurrent fault on a different VCPU already mapped > > the faulting IPA, and replaying the instruction will either succeed, or > > cause a permission fault, which should be handled with > > kvm_pgtable_stage2_relax_perms(). > > > > gmem_abort(), on a read or write fault on a system without DIC (instruction > > cache invalidation required for data to instruction coherence), installs a > > valid entry with read and write permissions, but without executable > > permissions. On an execution fault on the same page, gmem_abort() attempts > > to relax the permissions to allow execution, but calls > > kvm_pgtable_stage2_map() to change the existing, valid, entry. > > kvm_pgtable_stage2_map() returns -EAGAIN and KVM resumes execution from the > > faulting instruction, which leads to an infinite loop of permission faults > > on the same instruction. > > > > Allow the guest to make progress by using kvm_pgtable_stage2_relax_perms() > > to relax permissions. > > > > Fixes: a7b57e099592 ("KVM: arm64: Handle guest_memfd-backed guest page faults") > > Signed-off-by: Alexandru Elisei > > --- > > > > Lightly tested on an Orion O6 board, without pkvm, and without nested > > virtualisation. > > > > Doesn't apply cleanly on top of a7b57e099592, I can send a patch for that if > > needed. > > > > arch/arm64/kvm/mmu.c | 24 +++++++++++++++++------- > > 1 file changed, 17 insertions(+), 7 deletions(-) > > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > > index d089c107d9b7..dff58de7703b 100644 > > --- a/arch/arm64/kvm/mmu.c > > +++ b/arch/arm64/kvm/mmu.c > > @@ -1576,6 +1576,7 @@ struct kvm_s2_fault_desc { > > static int gmem_abort(const struct kvm_s2_fault_desc *s2fd) > > { > > bool write_fault, exec_fault; > > + bool perm_fault = kvm_vcpu_trap_is_permission_fault(s2fd->vcpu); > > enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_SHARED; > > enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R; > > struct kvm_pgtable *pgt = s2fd->vcpu->arch.hw_mmu->pgt; > > @@ -1587,10 +1588,12 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd) > > gfn_t gfn; > > int ret; > > > > - memcache = get_mmu_memcache(s2fd->vcpu); > > - ret = topup_mmu_memcache(s2fd->vcpu, memcache); > > - if (ret) > > - return ret; > > + if (!perm_fault) { > > + memcache = get_mmu_memcache(s2fd->vcpu); > > + ret = topup_mmu_memcache(s2fd->vcpu, memcache); > > + if (ret) > > + return ret; > > + } > > memcache is now uninitialized when perm_fault, and only read in the > !perm_fault branch below. It's safe today, but initializing it to NULL > would silence any defensive analyzer and keep things robust against > future churn. Sure, I'll initialise memcache to NULL. > > > > > if (s2fd->nested) > > gfn = kvm_s2_trans_output(s2fd->nested) >> PAGE_SHIFT; > > @@ -1631,9 +1634,16 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd) > > goto out_unlock; > > } > > > > - ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE, > > - __pfn_to_phys(pfn), prot, > > - memcache, flags); > > + if (perm_fault) { > > + /* Preserve the software bits from the existing table entry. */ > > + prot &= ~KVM_NV_GUEST_MAP_SZ; > > Can you please phrase it the same way as the existing comment in > user_mem_abort, as that phrasing more precisely describes the > rationale? i.e., Of course. > > /* > * Drop the SW bits in favour of those stored in the > * PTE, which will be preserved. > */ > > > + ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms(pgt, s2fd->fault_ipa, > > + prot, flags)); > > nit: args wrapped inside KVM_PGT_FN() is inconsistent with the rest of > the file (and with the else-branch immediately below). Every other > call site uses name-only inside the macro. Both expand the same way, > but it's better to stick to the convention. This was a typo on my part, thanks for catching it. > > > + } else { > > + ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE, > > + __pfn_to_phys(pfn), prot, > > + memcache, flags); > > + } > > > > out_unlock: > > kvm_release_faultin_page(kvm, page, !!ret, prot & KVM_PGTABLE_PROT_W); > > With these fixed: > > Reviewed-by: Fuad Tabba Thanks! Alex