From: Oliver Upton <oupton@kernel.org>
To: Jiaqi Yan <jiaqiyan@google.com>
Cc: Jose Marinho <jose.marinho@arm.com>,
maz@kernel.org, oliver.upton@linux.dev, duenwen@google.com,
rananta@google.com, jthoughton@google.com, vsethi@nvidia.com,
jgg@nvidia.com, joey.gouly@arm.com, suzuki.poulose@arm.com,
yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org,
pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org,
kvm@vger.kernel.org, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v4 1/3] KVM: arm64: VM exit to userspace to handle SEA
Date: Tue, 11 Nov 2025 01:53:01 -0800
Message-ID: <aRMHfS1-K4E4UCbc@kernel.org>
In-Reply-To: <CACw3F51_0A8CuCgzcvoA3Db=Wxo8mm5XZw5in+nTKrst+NCcqw@mail.gmail.com>

Hi Jiaqi,

On Mon, Nov 03, 2025 at 12:45:50PM -0800, Jiaqi Yan wrote:
> On Mon, Nov 3, 2025 at 10:17 AM Jose Marinho <jose.marinho@arm.com> wrote:
> >
> > Thank you for these patches.
>
> Thanks for your comments, Jose!
>
> >
> > On 10/13/2025 7:59 PM, Jiaqi Yan wrote:
> > > When APEI fails to handle a stage-2 synchronous external abort (SEA),
> > > today KVM injects an asynchronous SError to the VCPU then resumes it,
> > > which usually results in unpleasant guest kernel panic.
> > >
> > > One major situation of guest SEA is when vCPU consumes recoverable
> > > uncorrected memory error (UER). Although SError and guest kernel panic
> > > effectively stops the propagation of corrupted memory, guest may
> > > re-use the corrupted memory if auto-rebooted; in worse case, guest
> > > boot may run into poisoned memory. So there is room to recover from
> > > an UER in a more graceful manner.
> > >
> > > Alternatively KVM can redirect the synchronous SEA event to VMM to
> > > - Reduce blast radius if possible. VMM can inject a SEA to VCPU via
> > > KVM's existing KVM_SET_VCPU_EVENTS API. If the memory poison
> > > consumption or fault is not from guest kernel, blast radius can be
> > > limited to the triggering thread in guest userspace, so VM can
> > > keep running.
> > > - Allow VMM to protect from future memory poison consumption by
> > > unmapping the page from stage-2, or to interrupt guest of the
> > > poisoned page so guest kernel can unmap it from stage-1 page table.
> > > - Allow VMM to track SEA events that VM customers care about, to restart
> > > VM when certain number of distinct poison events have happened,
> > > to provide observability to customers in log management UI.
> > >
> > > Introduce an userspace-visible feature to enable VMM handle SEA:
> > > - KVM_CAP_ARM_SEA_TO_USER. As the alternative fallback behavior
> > > when host APEI fails to claim a SEA, userspace can opt in this new
> > > capability to let KVM exit to userspace during SEA if it is not
> > > owned by host.
> > > - KVM_EXIT_ARM_SEA. A new exit reason is introduced for this.
> > > KVM fills kvm_run.arm_sea with as much as possible information about
> > > the SEA, enabling VMM to emulate SEA to guest by itself.
> > > - Sanitized ESR_EL2. The general rule is to keep only the bits
> > > useful for userspace and relevant to guest memory.
> > > - Flags indicating if faulting guest physical address is valid.
> > > - Faulting guest physical and virtual addresses if valid.
> > >
> > > Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> > > Co-developed-by: Oliver Upton <oliver.upton@linux.dev>
> > > Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
> > > ---
> > > arch/arm64/include/asm/kvm_host.h | 2 +
> > > arch/arm64/kvm/arm.c | 5 +++
> > > arch/arm64/kvm/mmu.c | 68 ++++++++++++++++++++++++++++++-
> > > include/uapi/linux/kvm.h | 10 +++++
> > > 4 files changed, 84 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index b763293281c88..e2c65b14e60c4 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -350,6 +350,8 @@ struct kvm_arch {
> > > #define KVM_ARCH_FLAG_GUEST_HAS_SVE 9
> > > /* MIDR_EL1, REVIDR_EL1, and AIDR_EL1 are writable from userspace */
> > > #define KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS 10
> > > + /* Unhandled SEAs are taken to userspace */
> > > +#define KVM_ARCH_FLAG_EXIT_SEA 11
> > > unsigned long flags;
> > >
> > > /* VM-wide vCPU feature set */
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index f21d1b7f20f8e..888600df79c40 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -132,6 +132,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> > > }
> > > mutex_unlock(&kvm->lock);
> > > break;
> > > + case KVM_CAP_ARM_SEA_TO_USER:
> > > + r = 0;
> > > + set_bit(KVM_ARCH_FLAG_EXIT_SEA, &kvm->arch.flags);
> > > + break;
> > > default:
> > > break;
> > > }
> > > @@ -327,6 +331,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> > > case KVM_CAP_IRQFD_RESAMPLE:
> > > case KVM_CAP_COUNTER_OFFSET:
> > > case KVM_CAP_ARM_WRITABLE_IMP_ID_REGS:
> > > + case KVM_CAP_ARM_SEA_TO_USER:
> > > r = 1;
> > > break;
> > > case KVM_CAP_SET_GUEST_DEBUG2:
> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > > index 7cc964af8d305..09210b6ab3907 100644
> > > --- a/arch/arm64/kvm/mmu.c
> > > +++ b/arch/arm64/kvm/mmu.c
> > > @@ -1899,8 +1899,48 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
> > > read_unlock(&vcpu->kvm->mmu_lock);
> > > }
> > >
> > > +/*
> > > + * Returns true if the SEA should be handled locally within KVM if the abort
> > > + * is caused by a kernel memory allocation (e.g. stage-2 table memory).
> > > + */
> > > +static bool host_owns_sea(struct kvm_vcpu *vcpu, u64 esr)
> > > +{
> > > + /*
> > > + * Without FEAT_RAS HCR_EL2.TEA is RES0, meaning any external abort
> > > + * taken from a guest EL to EL2 is due to a host-imposed access (e.g.
> > > + * stage-2 PTW).
> > > + */
> > > + if (!cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
> > > + return true;
> > > +
> > > + /* KVM owns the VNCR when the vCPU isn't in a nested context. */
> > > + if (is_hyp_ctxt(vcpu) && (esr & ESR_ELx_VNCR))
> > Is this check valid only for a "Data Abort"?
>
> Yes, the VNCR bit is specific to a Data Abort (provided we can only
> reach host_owns_sea if kvm_vcpu_abt_issea).
> I don't think we need to explicitly exclude the check here for
> Instruction Abort.

You can take an external abort on an instruction fetch, in which case
bit 13 of the ISS (VNCR bit for data abort) is RES0. So this does need
to check for a data abort.
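To illustrate the point, here is a minimal standalone sketch (not the
actual fix) of gating the VNCR test on the exception class. The constants
are redefined locally with the values I believe arch/arm64/include/asm/esr.h
uses; esr_is_data_abort() and esr_is_vncr_abort() are hypothetical helper
names made up for this example. In the kernel, an existing accessor such
as kvm_vcpu_trap_is_iabt() would likely be the better fit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed to match arch/arm64/include/asm/esr.h; redefined here only so
 * the sketch builds outside the kernel tree.
 */
#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_MASK		(UINT64_C(0x3F) << ESR_ELx_EC_SHIFT)
#define ESR_ELx_EC_IABT_LOW	UINT64_C(0x20)	/* Instruction Abort, lower EL */
#define ESR_ELx_EC_DABT_LOW	UINT64_C(0x24)	/* Data Abort, lower EL */
#define ESR_ELx_VNCR		(UINT64_C(1) << 13)	/* ISS[13], data aborts only */

/* Hypothetical helper: true if the exception class is a lower-EL Data Abort. */
static inline bool esr_is_data_abort(uint64_t esr)
{
	return ((esr & ESR_ELx_EC_MASK) >> ESR_ELx_EC_SHIFT) == ESR_ELx_EC_DABT_LOW;
}

/*
 * Hypothetical helper: only interpret ISS[13] as VNCR for a Data Abort.
 * For an Instruction Abort the bit is RES0 and must not be trusted.
 */
static inline bool esr_is_vncr_abort(uint64_t esr)
{
	return esr_is_data_abort(esr) && (esr & ESR_ELx_VNCR);
}
```

With something like this, the condition in host_owns_sea() would only
treat the abort as a VNCR access when the syndrome actually describes a
data abort.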
Thanks,
Oliver