linux-arm-kernel.lists.infradead.org archive mirror
From: Oliver Upton <oliver.upton@linux.dev>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Jiaqi Yan <jiaqiyan@google.com>, Jason Gunthorpe <jgg@nvidia.com>,
	maz@kernel.org, duenwen@google.com, rananta@google.com,
	jthoughton@google.com, vsethi@nvidia.com, joey.gouly@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com,
	corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org,
	kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v4 0/3] VMM can handle guest SEA via KVM_EXIT_ARM_SEA
Date: Thu, 13 Nov 2025 10:21:34 -0800
Message-ID: <aRYhrmLz__AbnCFN@linux.dev>
In-Reply-To: <wuuvrqxezybzdnijarlom4wvxlfgzgjoakwt7ixittz2jb4mal@ngjvq2rrt2ps>

On Thu, Nov 13, 2025 at 02:54:33PM +0100, Mauro Carvalho Chehab wrote:
> Hi,
> 
> On Mon, Nov 10, 2025 at 09:41:33AM -0800, Jiaqi Yan wrote:
> > On Mon, Oct 20, 2025 at 7:46 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Mon, Oct 13, 2025 at 06:59:00PM +0000, Jiaqi Yan wrote:
> > > > Problem
> > > > =======
> > > >
> > > > When host APEI is unable to claim a synchronous external abort (SEA)
> > > > during guest abort, today KVM directly injects an asynchronous SError
> > > > into the VCPU then resumes it. The injected SError usually results in
> > > > an unpleasant guest kernel panic.
> > > >
> > > > One major cause of guest SEA is a VCPU consuming a recoverable
> > > > uncorrected memory error (UER), which is not uncommon in modern
> > > > datacenter servers with large amounts of physical memory. Although an
> > > > SError and guest panic are sufficient to stop the propagation of corrupted
> > > > memory, there is room to recover from a UER in a more graceful manner.
> > > >
> > > > Proposed Solution
> > > > =================
> > > >
> > > > The idea is that we can replay the SEA to the faulting VCPU. If the memory
> > > > error consumption or the fault that causes the SEA is not from the guest
> > > > kernel, the blast radius can be limited to the poison-consuming guest
> > > > process, while the VM can keep running.
> 
> I like the idea of having a "guest-first"/"host-first" approach for APEI,
> letting userspace (likely rasdaemon) decide whether to handle hardware
> errors at the guest or at the host. Yet, it sounds wrong to have a flag
> called KVM_EXIT_ARM_SEA, as:
> 
>     1. This is not exclusive to ARM;
>     2. There are other notification mechanisms that can raise APEI
>        errors. For instance, QEMU code defines:
> 
>     ACPI_GHES_NOTIFY_POLLED = 0,
>     ACPI_GHES_NOTIFY_EXTERNAL = 1,
>     ACPI_GHES_NOTIFY_LOCAL = 2,
>     ACPI_GHES_NOTIFY_SCI = 3,
>     ACPI_GHES_NOTIFY_NMI = 4,
>     ACPI_GHES_NOTIFY_CMCI = 5,
>     ACPI_GHES_NOTIFY_MCE = 6,
>     ACPI_GHES_NOTIFY_GPIO = 7,
>     ACPI_GHES_NOTIFY_SEA = 8,
>     ACPI_GHES_NOTIFY_SEI = 9,
>     ACPI_GHES_NOTIFY_GSIV = 10,
>     ACPI_GHES_NOTIFY_SDEI = 11,
>     ACPI_GHES_NOTIFY_RESERVED = 12
> 
>  - even on arm, QEMU currently implements two mechanisms (SEA and GPIO);
>  - once we implement the same feature on Intel, it will likely use
>    NMI, MCE and/or SCI.
> 
> So, IMO, the best would be to use a more generic name like
> KVM_EXIT_APEI or KVM_EXIT_GHES - or maybe even name it the way it really
> is meant: KVM_EXIT_ACPI_GUEST_FIRST.

This is not the sort of thing that I'd like to see dressed up as an
arch-generic interface.

What Jiaqi is dealing with is the very sorry state of RAS on arm64,
giving userspace the opportunity to decide how an SEA is handled when a
platform's firmware couldn't be bothered to do so. The SEA is an
architecture-specific event so we provide the hardware context to
the VMM to sort things out.
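
As a minimal sketch of what "hardware context" could mean in practice,
assume the VMM is handed the raw ESR value for the abort. The struct and
function names below are hypothetical and are not the UAPI from the
series; only the ESR_ELx field positions and encodings come from the
architecture. Userspace could then classify the abort like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural ESR_ELx layout: EC is bits [31:26], FSC is bits [5:0]. */
#define ESR_EC_SHIFT     26
#define ESR_EC_MASK      0x3fULL
#define ESR_EC_IABT_LOW  0x20ULL /* instruction abort from a lower EL */
#define ESR_EC_DABT_LOW  0x24ULL /* data abort from a lower EL */
#define ESR_FSC_MASK     0x3fULL
#define ESR_FSC_EXTABT   0x10ULL /* sync external abort, not on a table walk */

/* Hypothetical container for what the VMM receives (an assumption). */
struct sea_context {
	uint64_t esr;
};

/*
 * Return true if the syndrome describes a synchronous external abort
 * taken from the guest. The table-walk and parity/ECC variants are
 * deliberately left out to keep the sketch short.
 */
static bool sea_is_sync_external_abort(const struct sea_context *ctx)
{
	uint64_t ec = (ctx->esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
	uint64_t fsc = ctx->esr & ESR_FSC_MASK;

	if (ec != ESR_EC_IABT_LOW && ec != ESR_EC_DABT_LOW)
		return false;

	return fsc == ESR_FSC_EXTABT;
}
```

A VMM policy built on such a check might replay the abort into the guest
when it matches and fall back to a more drastic action otherwise; that
policy choice is outside the scope of the sketch.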

If the APEI driver actually registers to handle the SEA then it will
continue to handle the SEA before ever involving the VMM. I'm not
aware of any system that does this. If you're lucky you'll take an
*asynchronous* vector afterwards to process a CPER and still have to deal
with a 'bare' SEA.

And of course, none of this even matters for the several billion
DT-based hosts out in the wild.

Thanks,
Oliver

