Date: Fri, 11 Jul 2025 17:37:56 +0100
Message-ID: <865xfyadjv.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Fuad Tabba <tabba@google.com>
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
	kvmarm@lists.linux.dev, pbonzini@redhat.com, chenhuacai@kernel.org,
	mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com,
	palmer@dabbelt.com, aou@eecs.berkeley.edu, seanjc@google.com,
	viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org,
	akpm@linux-foundation.org, xiaoyao.li@intel.com, yilun.xu@intel.com,
	chao.p.peng@linux.intel.com, jarkko@kernel.org, amoorthy@google.com,
	dmatlack@google.com, isaku.yamahata@intel.com, mic@digikod.net,
	vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
	mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
	wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
	kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
	steven.price@arm.com, quic_eberman@quicinc.com,
	quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
	quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
	quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
	catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
	oliver.upton@linux.dev, will@kernel.org, qperret@google.com,
	keirf@google.com, roypat@amazon.co.uk, shuah@kernel.org,
	hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
	jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
	jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com,
	ira.weiny@intel.com
Subject: Re: [PATCH v13 16/20] KVM: arm64: Handle guest_memfd-backed guest page faults
In-Reply-To: <20250709105946.4009897-17-tabba@google.com>
References: <20250709105946.4009897-1-tabba@google.com>
	<20250709105946.4009897-17-tabba@google.com>

On Wed, 09 Jul 2025 11:59:42 +0100, Fuad Tabba wrote:
>
> Add arm64 architecture support for handling guest page faults on memory
> slots backed by guest_memfd.
>
> This change introduces a new function, gmem_abort(), which encapsulates
> the fault handling logic specific to guest_memfd-backed memory.
> The kvm_handle_guest_abort() entry point is updated to dispatch to
> gmem_abort() when a fault occurs on a guest_memfd-backed memory slot (as
> determined by kvm_slot_has_gmem()).
>
> Until guest_memfd gains support for huge pages, the fault granule for
> these memory regions is restricted to PAGE_SIZE.
>
> Reviewed-by: Gavin Shan
> Reviewed-by: James Houghton
> Signed-off-by: Fuad Tabba
> ---
>  arch/arm64/kvm/mmu.c | 82 ++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 79 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 58662e0ef13e..71f8b53683e7 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1512,6 +1512,78 @@ static void adjust_nested_fault_perms(struct kvm_s2_trans *nested,
>  	*prot |= kvm_encode_nested_level(nested);
>  }
>
> +#define KVM_PGTABLE_WALK_MEMABORT_FLAGS (KVM_PGTABLE_WALK_HANDLE_FAULT | KVM_PGTABLE_WALK_SHARED)
> +
> +static int gmem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> +		      struct kvm_s2_trans *nested,
> +		      struct kvm_memory_slot *memslot, bool is_perm)
> +{
> +	bool write_fault, exec_fault, writable;
> +	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_MEMABORT_FLAGS;
> +	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
> +	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
> +	struct page *page;
> +	struct kvm *kvm = vcpu->kvm;
> +	void *memcache;
> +	kvm_pfn_t pfn;
> +	gfn_t gfn;
> +	int ret;
> +
> +	ret = prepare_mmu_memcache(vcpu, true, &memcache);
> +	if (ret)
> +		return ret;
> +
> +	if (nested)
> +		gfn = kvm_s2_trans_output(nested) >> PAGE_SHIFT;
> +	else
> +		gfn = fault_ipa >> PAGE_SHIFT;
> +
> +	write_fault = kvm_is_write_fault(vcpu);
> +	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
> +
> +	if (write_fault && exec_fault) {
> +		kvm_err("Simultaneous write and execution fault\n");
> +		return -EFAULT;
> +	}

I don't think we need to cargo-cult this stuff.
This cannot happen architecturally (data and instruction aborts are
two different exceptions, so you can't have both at the same time),
and is only there because we were young and foolish when we wrote
this crap. Now that we (the royal We) are only foolish, we can save a
few bits by dropping it. Or turn it into a VM_BUG_ON() if you really
want to keep it.

> +
> +	if (is_perm && !write_fault && !exec_fault) {
> +		kvm_err("Unexpected L2 read permission error\n");
> +		return -EFAULT;
> +	}

Again, this is copying something that was always a bit crap:

- it's not an "error", it's a permission fault
- it's not "L2", it's "stage-2"

But this should equally be turned into an assertion, ideally in a
single spot. See below for the usual untested hack.

Thanks,

	M.

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index b92ce4d9b4e01..c79dc8fd45d5a 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1540,16 +1540,7 @@ static int gmem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 
 	write_fault = kvm_is_write_fault(vcpu);
 	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
-
-	if (write_fault && exec_fault) {
-		kvm_err("Simultaneous write and execution fault\n");
-		return -EFAULT;
-	}
-
-	if (is_perm && !write_fault && !exec_fault) {
-		kvm_err("Unexpected L2 read permission error\n");
-		return -EFAULT;
-	}
+	VM_BUG_ON(write_fault && exec_fault);
 
 	ret = kvm_gmem_get_pfn(kvm, memslot, gfn, &pfn, &page, NULL);
 	if (ret) {
@@ -1616,11 +1607,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
 	VM_BUG_ON(write_fault && exec_fault);
 
-	if (fault_is_perm && !write_fault && !exec_fault) {
-		kvm_err("Unexpected L2 read permission error\n");
-		return -EFAULT;
-	}
-
 	/*
 	 * Permission faults just need to update the existing leaf entry,
 	 * and so normally don't require allocations from the memcache. The
@@ -2035,6 +2021,9 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		goto out_unlock;
 	}
 
+	VM_BUG_ON(kvm_vcpu_trap_is_permission_fault(vcpu) &&
+		  !write_fault && !kvm_vcpu_trap_is_exec_fault(vcpu));
+
 	if (kvm_slot_has_gmem(memslot))
 		ret = gmem_abort(vcpu, fault_ipa, nested, memslot,
 				 esr_fsc_is_permission_fault(esr));

-- 
Without deviation from the norm, progress is not possible.
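
For readers who want to poke at the two conditions the hack above
asserts against, here is a rough userspace sketch. It is not kernel
code: every helper name is hypothetical, the field encodings (EC in
ESR bits [31:26], WnR in bit 6, permission-fault status codes 0x0c to
0x0f) are taken from the Arm architecture's ESR_ELx layout as I
understand it, and the write-fault check deliberately ignores the
stage-1 page-table-walk case that the real kvm_is_write_fault()
handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed ESR_ELx encodings; names are local to this sketch. */
#define SK_EC_SHIFT	26
#define SK_EC_MASK	0x3fu
#define SK_EC_IABT_LOW	0x20u		/* instruction abort from a lower EL */
#define SK_EC_DABT_LOW	0x24u		/* data abort from a lower EL */
#define SK_WNR		(1u << 6)	/* data aborts only: write, not read */
#define SK_FSC_TYPE	0x3cu		/* fault status code, level masked off */
#define SK_FSC_PERM	0x0cu		/* permission fault, any level */

static inline uint32_t sk_ec(uint32_t esr)
{
	return (esr >> SK_EC_SHIFT) & SK_EC_MASK;
}

/* An exec fault is an instruction abort: a different exception class. */
static inline bool sk_is_exec_fault(uint32_t esr)
{
	return sk_ec(esr) == SK_EC_IABT_LOW;
}

/* Simplified: a write fault is a data abort with WnR set. */
static inline bool sk_is_write_fault(uint32_t esr)
{
	return sk_ec(esr) == SK_EC_DABT_LOW && (esr & SK_WNR);
}

static inline bool sk_is_permission_fault(uint32_t esr)
{
	return (esr & SK_FSC_TYPE) == SK_FSC_PERM;
}

/*
 * Hypothetical stand-in for the two VM_BUG_ON() conditions: returns
 * false for a syndrome that either assertion would trip on.
 */
static inline bool sk_abort_is_sane(uint32_t esr)
{
	bool write_fault = sk_is_write_fault(esr);
	bool exec_fault = sk_is_exec_fault(esr);

	/* One exception class per syndrome, so never both. */
	if (write_fault && exec_fault)
		return false;

	/* A permission fault that is neither a write nor an exec. */
	if (sk_is_permission_fault(esr) && !write_fault && !exec_fault)
		return false;

	return true;
}
```

Because the write and exec predicates both key off the single EC
field, the first condition cannot be true for any syndrome value in
this model, which is the architectural point made above. The second
condition is only reachable with a data abort that reports a
permission fault with WnR clear.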