Date: Wed, 24 Jun 2026 09:41:26 +0100
Message-ID: <86mrwkqqix.wl-maz@kernel.org>
From: Marc Zyngier
To: Oliver Upton
Cc: sashiko-reviews@lists.linux.dev, kvmarm@lists.linux.dev
Subject: Re: [PATCH] KVM: arm64: nv: Check RW permissions for insn abort due to S1PTW
References: <20260623211310.1529760-1-oupton@kernel.org> <20260623213225.A89CF1F000E9@smtp.kernel.org>

On Wed, 24 Jun 2026 01:07:50 +0100,
Oliver Upton wrote:
>
> On Tue, Jun 23, 2026 at 09:32:25PM +0000, sashiko-bot@kernel.org wrote:
> > > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> > > index 94df26de6990c..713fc44f362d9 100644
> > > --- a/arch/arm64/kvm/nested.c
> > > +++ b/arch/arm64/kvm/nested.c
> > [ ... ]
> > > @@ -918,7 +918,7 @@ int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, struct kvm_s2_trans *trans)
> > >  	if (!kvm_vcpu_trap_is_permission_fault(vcpu))
> > >  		return 0;
> > >
> > > -	if (kvm_vcpu_trap_is_iabt(vcpu)) {
> > > +	if (kvm_vcpu_trap_is_iabt(vcpu) && !kvm_vcpu_abt_iss1tw(vcpu)) {
> > >  		if (vcpu_mode_priv(vcpu))
> > >  			forward_fault = !kvm_s2_trans_exec_el1(vcpu->kvm, trans);
> > >  		else
> >
> > [Severity: High]
> > This isn't a bug introduced by this patch, but does this change expose
> > instruction aborts to an infinite loop if the L1 guest maps nested stage-2
> > memory as Write-Only?
> >
> > By falling through to the else block in kvm_s2_handle_perm_fault(), S1PTW
> > instruction aborts are now evaluated for read/write permissions.
> >
> > However, if the L1 guest mapped the nested stage-2 page table entry as
> > Write-Only, the hardware will still fault on the missing read permission
> > during the stage-1 translation table walk.
> >
> > Looking at kvm_is_write_fault() in arch/arm64/include/asm/kvm_emulate.h:
> >
> > static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
> > {
> > 	if (kvm_vcpu_abt_iss1tw(vcpu)) {
> > 		...
> > 		return kvm_vcpu_trap_is_permission_fault(vcpu);
> > 	}
> > 	...
> > }
> >
> > It unconditionally treats all permission faults on an S1PTW as write faults,
> > assuming they are Access Flag or Dirty Bit updates.
> >
> > Back in the else block of kvm_s2_handle_perm_fault():
> >
> > 		bool write_fault = kvm_is_write_fault(vcpu);
> >
> > 		forward_fault = ((write_fault && !trans->writable) ||
> > 				 (!write_fault && !trans->readable));
> >
> > Because write_fault is true, it only evaluates !trans->writable. Since the
> > mapping is Write-Only, this check passes and KVM doesn't forward the fault.
> > Instead, KVM resumes the guest, which will immediately fault again on the
> > read, causing an infinite loop.
>
> Well this isn't exactly malice, just a guest choosing to shoot itself in
> the foot.
> Having said that, it's pretty clear that the assumption around S1PTW
> permission faults no longer holds in kvm_is_write_fault().
>
> I think the best we can do is key write faults off of TCR_ELx.HA and
> evaluate potentially both permissions for S1PTW. Needs to be split up
> into a couple patches:
>
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 5bf3d7e1d92c..d5c61e0027c8 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -479,21 +479,12 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
>
>  static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
>  {
> -	if (kvm_vcpu_abt_iss1tw(vcpu)) {
> -		/*
> -		 * Only a permission fault on a S1PTW should be
> -		 * considered as a write. Otherwise, page tables baked
> -		 * in a read-only memslot will result in an exception
> -		 * being delivered in the guest.
> -		 *
> -		 * The drawback is that we end-up faulting twice if the
> -		 * guest is using any of HW AF/DB: a translation fault
> -		 * to map the page containing the PT (read only at
> -		 * first), then a permission fault to allow the flags
> -		 * to be set.
> -		 */
> -		return kvm_vcpu_trap_is_permission_fault(vcpu);
> -	}
> +	/*
> +	 * The architecture sucks; assume that the S1PTW fetched for write if
> +	 * HA is enabled at stage-1.
> +	 */
> +	if (kvm_vcpu_abt_iss1tw(vcpu))
> +		return effective_tcr_ha(vcpu);

OK, so you're trading the implicit state machine (translation fault ->
RO, permission fault -> RW) for a direct TCR.HA lookup, because the
only reason you'd get a S1PTW fault for write is if you were updating
the descriptor.

This is cute, and I wish I had thought of that, as it saves us a round
trip. But why can't that happen with HD? Surely this results in a
similar fault, and I think we should also evaluate that bit.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
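
For reference, the Write-Only case from the report can be walked through with a small standalone C model. This is only a sketch under stated assumptions, not KVM code: struct s2_perm, struct fault_model, hw_update and every function name below are hypothetical stand-ins. hw_update represents whichever stage-1 control ends up being consulted (TCR_ELx.HA in the proposed diff; whether HD should feed into it is the open question above), and forward_fault_s1ptw() is just one possible reading of "evaluate potentially both permissions".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the permission bits of struct kvm_s2_trans. */
struct s2_perm {
	bool readable;
	bool writable;
};

/* Hypothetical fault snapshot; none of these fields are KVM definitions. */
struct fault_model {
	bool s1ptw;	/* abort taken on a stage-1 page table walk */
	bool hw_update;	/* stage-1 hardware descriptor updates enabled */
	bool wnr;	/* write-not-read, for faults that are not S1PTW */
};

/* Current rule: every S1PTW permission fault counts as a write. */
static inline bool write_fault_old(const struct fault_model *f)
{
	return f->s1ptw ? true : f->wnr;
}

/* Proposed rule: an S1PTW is a write only if the walker may update descriptors. */
static inline bool write_fault_new(const struct fault_model *f)
{
	return f->s1ptw ? f->hw_update : f->wnr;
}

/* Same expression as the quoted else block of kvm_s2_handle_perm_fault(). */
static inline bool forward_fault(bool write_fault, const struct s2_perm *p)
{
	return (write_fault && !p->writable) || (!write_fault && !p->readable);
}

/*
 * Assumption: one reading of "evaluate both permissions" for S1PTW, where
 * the walk always reads and additionally writes when updates are enabled.
 */
static inline bool forward_fault_s1ptw(bool write_fault, const struct s2_perm *p)
{
	return !p->readable || (write_fault && !p->writable);
}

/* Walks through the Write-Only stage-2 mapping case from the report. */
static inline void model_selfcheck(void)
{
	const struct s2_perm write_only = { .readable = false, .writable = true };
	const struct fault_model upd_off = { .s1ptw = true, .hw_update = false };
	const struct fault_model upd_on = { .s1ptw = true, .hw_update = true };

	/* Old rule: treated as a write, mapping is writable, fault is not forwarded. */
	assert(!forward_fault(write_fault_old(&upd_off), &write_only));
	/* New rule, updates off: treated as a read, so the fault is forwarded. */
	assert(forward_fault(write_fault_new(&upd_off), &write_only));
	/* New rule, updates on: only the combined check catches the missing read. */
	assert(!forward_fault(write_fault_new(&upd_on), &write_only));
	assert(forward_fault_s1ptw(write_fault_new(&upd_on), &write_only));
}
```

Calling model_selfcheck() from a main() exercises the three cases. In this model the write-fault rule alone does not cover a Write-Only mapping when updates are enabled, which is consistent with the suggestion to evaluate both permissions for S1PTW.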