From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 567AA3E317D; Wed, 10 Jun 2026 12:42:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781095358; cv=none; b=k38E+1ZWmNIwYdlNb24easBZVIdAB0DZP34taV6Fg8GyTxuJCxrnay+o+s0uWo7n+duARQaHzUPdYM9Iw4y9FcPsGeiubIrYl3npSPMGiFqPN8o3ZQ1QtV70qjM1ELw9T3d38bIEKJCAO5RdGlmEdhGPxVO9o8KXcfBmKB7zp9k= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781095358; c=relaxed/simple; bh=uxE+uoQzufSHOCgFsUDlLpYa+J6KbaAT0nelksXiQNE=; h=Date:Message-ID:From:To:Cc:Subject:In-Reply-To:References: MIME-Version:Content-Type; b=GEC1OUK4BfGOmfwf5seOohnXFe4CTnZ8ZO/mFloH41ii8ZvxveRdtzPwZzBxn07ppEJ3v8wu59xw6ewzGZxXgWjo+u181836WvsRLYxGpaRU8yPJmdxo/eSK2pc3kdKUGnXhKOp4WXLeVTDAibWLBzcfsjRBMgOhC5UEccde4Tg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=S1D+ySUs; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="S1D+ySUs" Received: by smtp.kernel.org (Postfix) with ESMTPSA id EB4341F00893; Wed, 10 Jun 2026 12:42:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1781095357; bh=uH9NVykxHBYBVo9QQtLJo4jbBCxi07bApR3SMXwAEw8=; h=Date:From:To:Cc:Subject:In-Reply-To:References; b=S1D+ySUsVHGWm0bjUINYcPmCjUUV3FiWRMonPTYpZGCZYIRB7M4JpypasqAsMH6eW fsDfxoN8dKaHM6jkgZq9ywEZkC85N/RsPrVDok071kcigbhEG0bKmafHbvmLuJ3IfL Y8lwctH7EvCN6GINBFccNyOAhWwO357OsGXDGg4mZLPG+P1haM2KADZ1+xiCw47SI/ D7Hc0oBuJUIKxd5Gzf1IAkDL9QW0yliksw6jrqNTOUZY+9xHWRU/Qca2s0s0j4wim0 yQo4EpBTi4kDcMf7qPmrA3qiW6n8bB4m7JR3GvlfMujf/Hhj8EAKXm1xFLyuYnTx2R oaPe32sWjILcQ== Received: from sofa.misterjones.org ([185.219.108.64] helo=goblin-girl.misterjones.org) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wXIGQ-0000000BM3u-2eph; Wed, 10 Jun 2026 12:42:34 +0000 Date: Wed, 10 Jun 2026 13:42:34 +0100 Message-ID: <867bo6tvlx.wl-maz@kernel.org> From: Marc Zyngier To: Oliver Upton Cc: kvmarm@lists.linux.dev, Joey Gouly , Suzuki K Poulose , Zenghui Yu , Wei-Lin Chang , stable@vger.kernel.org, Sashiko Subject: Re: [PATCH RESEND v2 3/5] KVM: arm64: nv: Re-translate VNCR before injecting abort In-Reply-To: <20260609185514.746507-4-oupton@kernel.org> References: <20260609185514.746507-1-oupton@kernel.org> <20260609185514.746507-4-oupton@kernel.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/30.1 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO) Precedence: bulk X-Mailing-List: kvmarm@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: oupton@kernel.org, kvmarm@lists.linux.dev, joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, weilin.chang@arm.com, stable@vger.kernel.org, sashiko-bot@kernel.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false On Tue, 09 Jun 2026 19:55:12 +0100, Oliver Upton wrote: > > KVM faults in the VNCR page with FOLL_WRITE whenver the guest aborts for > a write, similar to how a regular stage-2 mapping is handled. It is > entirely possible that the guest reads from the VNCR before writing to > it, in which case the PFN could only be read-only. > > Invalidate the VNCR TLB and re-fetch the translation upon taking a VNCR > abort, allowing the host mapping to be faulted in for write the second > time around. Interestingly enough, this also satisfies the ordering > requirements of FEAT_ETS2/3 between descriptor updates and MMU faults. > > Cc: stable@vger.kernel.org > Fixes: 2a359e072596 ("KVM: arm64: nv: Handle mapping of VNCR_EL2 at EL2") > Reported-by: Sashiko > Signed-off-by: Oliver Upton > --- > arch/arm64/kvm/nested.c | 115 +++++++++++++++------------------------- > 1 file changed, 44 insertions(+), 71 deletions(-) > > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c > index ebd7ccfeee99..d5c4b57123a9 100644 > --- a/arch/arm64/kvm/nested.c > +++ b/arch/arm64/kvm/nested.c > @@ -1460,92 +1460,65 @@ static void handle_vncr_perm(struct kvm_vcpu *vcpu) > kvm_inject_nested_sync(vcpu, esr); > } > > -static bool kvm_vncr_tlb_lookup(struct kvm_vcpu *vcpu) > -{ > - struct vncr_tlb *vt = vcpu->arch.vncr_tlb; > - > - lockdep_assert_held_read(&vcpu->kvm->mmu_lock); > - > - if (!vt->valid) > - return false; > - > - if (read_vncr_el2(vcpu) != vt->gva) > - return false; > - > - if (vt->wr.nG) > - return get_asid_by_regime(vcpu, TR_EL20) == vt->wr.asid; > - > - return true; > -} > - > int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu) > { > struct vncr_tlb *vt = vcpu->arch.vncr_tlb; > u64 esr = kvm_vcpu_get_esr(vcpu); > + bool is_gmem, perm; > + int ret; > > WARN_ON_ONCE(!(esr & ESR_ELx_VNCR)); > > if (kvm_vcpu_abt_issea(vcpu)) > return kvm_handle_guest_sea(vcpu); > > - if (esr_fsc_is_permission_fault(esr)) { > - handle_vncr_perm(vcpu); > - } else if (esr_fsc_is_translation_fault(esr)) { > - bool valid, is_gmem = false; > - int ret; > - > - scoped_guard(read_lock, &vcpu->kvm->mmu_lock) > - valid = kvm_vncr_tlb_lookup(vcpu); > - > - if (!valid) > - ret = kvm_translate_vncr(vcpu, &is_gmem); > - else > - ret = -EPERM; > + if (!esr_fsc_is_translation_fault(esr) && !esr_fsc_is_permission_fault(esr)) { > + WARN_ONCE(1, "Unhandled VNCR abort, ESR=%llx\n", esr); > + return 1; > + } > > - switch (ret) { > - case -EAGAIN: > - /* Let's try again... */ > - break; > - case -ENOMEM: > - /* > - * For guest_memfd, this indicates that it failed to > - * create a folio to back the memory. Inform userspace. > - */ > - if (is_gmem) > - return 0; > - /* Otherwise, let's try again... */ > - break; > - case -EFAULT: > - case -EIO: > - case -EHWPOISON: > - if (is_gmem) > - return 0; > - fallthrough; > - case -EINVAL: > - case -ENOENT: > - case -EACCES: > - /* > - * Translation failed, inject the corresponding > - * exception back to EL2. > - */ > - BUG_ON(!vt->wr.failed); > + ret = kvm_translate_vncr(vcpu, &is_gmem); > + switch (ret) { > + case -EAGAIN: > + /* Let's try again... */ > + break; > + case -ENOMEM: > + /* > + * For guest_memfd, this indicates that it failed to > + * create a folio to back the memory. Inform userspace. > + */ > + if (is_gmem) > + return 0; > + /* Otherwise, let's try again... */ > + break; > + case -EFAULT: > + case -EIO: > + case -EHWPOISON: > + if (is_gmem) > + return 0; > + fallthrough; > + case -EINVAL: > + case -ENOENT: > + case -EACCES: > + /* > + * Translation failed, inject the corresponding > + * exception back to EL2. > + */ > + BUG_ON(!vt->wr.failed); Maybe we can lose this BUG_ON(). We rely on it being correct anyway. > > - esr &= ~ESR_ELx_FSC; > - esr |= FIELD_PREP(ESR_ELx_FSC, vt->wr.fst); > + esr &= ~ESR_ELx_FSC; > + esr |= FIELD_PREP(ESR_ELx_FSC, vt->wr.fst); > > - kvm_inject_nested_sync(vcpu, esr); > - break; > - case -EPERM: > - /* Hack to deal with POE until we get kernel support */ > - handle_vncr_perm(vcpu); > - break; > - case 0: > - break; > - } > - } else { > - WARN_ONCE(1, "Unhandled VNCR abort, ESR=%llx\n", esr); > + kvm_inject_nested_sync(vcpu, esr); > + break; > + case 0: > + break; > } > > + perm = kvm_is_write_fault(vcpu) ? vt->wr.pw && vt->hpa_writable : vt->wr.pr; > + if (!perm) > + handle_vncr_perm(vcpu); > + Isn't there a problem here, where anything that doesn't perform an early return ends up evaluating the permissions, even if the translation has failed? My hunch is that you'd want something like this: diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 736fdd6e99cdc..0cefc73d97199 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -1460,13 +1460,12 @@ int kvm_handle_vncr_abort(struct kvm_vcpu *vcpu) kvm_inject_nested_sync(vcpu, esr); break; case 0: + perm = kvm_is_write_fault(vcpu) ? vt->wr.pw && vt->hpa_writable : vt->wr.pr; + if (!perm) + handle_vncr_perm(vcpu); break; } - perm = kvm_is_write_fault(vcpu) ? vt->wr.pw && vt->hpa_writable : vt->wr.pr; - if (!perm) - handle_vncr_perm(vcpu); - return 1; } Thoughts? M. -- Without deviation from the norm, progress is not possible.