From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: linux-kernel@vger.kernel.org, qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [PATCH v9 11/12] target/arm: add an experimental mpidr arm cpu property object
Date: Sun, 1 Sep 2024 08:40:04 +0200 [thread overview]
Message-ID: <20240901084004.166a3c96@sal.lan> (raw)
In-Reply-To: <CAFEAcA-wD6U+onh3y4Y-LDTFuYoeWbGShkRPx7emi1ZPfKJP0w@mail.gmail.com>
Em Fri, 30 Aug 2024 17:27:27 +0100
Peter Maydell <peter.maydell@linaro.org> escreveu:
> On Mon, 26 Aug 2024 at 04:12, Mauro Carvalho Chehab
> <mchehab+huawei@kernel.org> wrote:
> >
> > Em Sun, 25 Aug 2024 12:34:14 +0100
> > Peter Maydell <peter.maydell@linaro.org> escreveu:
> >
> > > On Sun, 25 Aug 2024 at 04:46, Mauro Carvalho Chehab
> > > <mchehab+huawei@kernel.org> wrote:
> > > >
> > > > Accurately injecting an ARM Processor error ACPI/APEI GHES
> > > > error record requires the value of the ARM Multiprocessor
> > > > Affinity Register (mpidr).
> > > >
> > > > While ARM implements it, this is currently not visible.
> > > >
> > > > Add a field at CPU storing it, and place it at arm_cpu_properties
> > > > as experimental, thus allowing it to be queried via QMP using
> > > > qom-get function.
> > >
> > > > static Property arm_cpu_properties[] = {
> > > > DEFINE_PROP_UINT64("midr", ARMCPU, midr, 0),
> > > > + DEFINE_PROP_UINT64("x-mpidr", ARMCPU, mpidr, 0),
> > > > DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
> > > > mp_affinity, ARM64_AFFINITY_INVALID),
> > > > DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
> > >
> > > Why do we need this?
> >
> > The ACPI HEST tables, in particular when using GHESv2 provide
> > several kinds of errors. Among them, we have ARM Processor Error,
> > as defined at UEFI 2.10 spec (and earlier versions), the Common
> > Platform Error Record (CPER) is defined as:
> >
> > https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html?highlight=ghes#arm-processor-error-section
> >
> > There are two fields that are part of the CPER record. One of them is
> > mandatory (MIDR); the other one is optional, but needed to decode another
> > field.
> >
> > So, basically those errors need them.
>
> OK, but why do scripts outside of QEMU need the information,
> as opposed to telling QEMU "hey, generate an error" and
> QEMU knowing the format to use? Do we have any other
> QMP APIs where something external provides raw ACPI
> data like this?
This was discussed during the review of this patch series.
See, the ACPI Platform Error Interfaces (APEI) code currently in QEMU
implements limited support for ACPI HEST - Hardware Error Source Table [1].
[1] https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#acpi-error-source
HEST consists of, currently, 9 error types (plus 3 obsoleted ones). Among
them, there is support for generic errors via GHES and GHESv2 types.
While not officially obsoleted, GHES is superseded by GHESv2.
GHESv2 (and GHES) has a section type field to identify which error type it
is [2]. Currently, there are +10 defined UUIDs for the section type.
[2] https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#section-descriptor
The current code on ghes.c implements GHESv2 support for a single
type (memory error), received from the host OS via SIGBUS.
Testing such code and injecting such error is not easy, as the host OS needs
to send a SIGBUS to the guest, this reflecting an error at the main OS.
Such code also has several limitations.
-
At the first three versions of this patch set, the code was just doing
like what you said: it was adding an error injection for a HEST GHESv2
ARM Processor Error. So the error record (CPER) were produced in QEMU using
some optional parameters passed via QMP to change fields when needed.
With such approach, QEMU could use directly the value from MIDR and MPIDR.
The main disadvantage is that, to make full support of HEST, a lot
of code will be needed to add support for every GHESv2 type and for
every GHESv2 section type. So, the feedback we had were to re-implement
it into a generic way.
The generic CPER error inject approach (since v4 of this series), has
soma advantages:
- it is easy to do fuzz testing, as the entire CPER is built via a python
script;
- no need to modify QEMU to support other GHESv2 types of record and
to support other types of processors;
- GHESv2 fields can also be dynamically generated;
- It shouldn't be hard to change the code to support other types of
HEST table (currently, only GHESv2 is supported).
The disadvantage is that queries are needed to pick configuration and
register values from the current emulation to do error injection. For
ARM Processor Error, it means that MPIDR and MIDR, are needed. Other
processors and other error types will also require to query other data
from QEMU, either using already-existing QMP code or by adding new ones.
Yet, the amount of code for such queries seem to be smaller than the
amount of code to be added for every single GHESv2/HEST type.
-
Worth saying that QEMU may still require internal HEST/GHES errors to be
able to reflect at the guests hardware problems detected at the host OS.
So, for instance, if a host OS memory is poisoned due to hardware errors,
QEMU and guests need to know, in order to kill processes affected
by a bad memory.
Regards,
Mauro
next prev parent reply other threads:[~2024-09-01 6:41 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-25 3:45 [PATCH v9 00/12] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
2024-08-25 3:45 ` [PATCH v9 01/12] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
2024-09-11 13:51 ` Igor Mammedov
2024-09-13 5:44 ` Mauro Carvalho Chehab
2024-09-13 13:25 ` Igor Mammedov
2024-09-14 5:33 ` Mauro Carvalho Chehab
2024-09-16 11:05 ` Igor Mammedov
2024-10-01 8:54 ` Mauro Carvalho Chehab
2024-10-03 11:48 ` Igor Mammedov
2024-08-25 3:45 ` [PATCH v9 02/12] acpi/ghes: rework the logic to handle HEST source ID Mauro Carvalho Chehab
2024-09-11 15:01 ` Igor Mammedov
2024-09-13 15:14 ` Mauro Carvalho Chehab
2024-08-25 3:45 ` [PATCH v9 03/12] acpi/ghes: rename etc/hardware_error file macros Mauro Carvalho Chehab
2024-09-13 13:27 ` Igor Mammedov
2024-08-25 3:45 ` [PATCH v9 04/12] acpi/ghes: better name GHES memory error function Mauro Carvalho Chehab
2024-09-13 13:31 ` Igor Mammedov
2024-08-25 3:46 ` [PATCH v9 05/12] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
2024-08-25 3:46 ` [PATCH v9 06/12] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
2024-08-25 3:46 ` [PATCH v9 07/12] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
2024-08-25 3:46 ` [PATCH v9 08/12] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
2024-08-27 11:11 ` Markus Armbruster
2024-08-25 3:46 ` [PATCH v9 09/12] docs: acpi_hest_ghes: fix documentation for CPER size Mauro Carvalho Chehab
2024-08-25 3:46 ` [PATCH v9 10/12] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
2024-08-25 3:46 ` [PATCH v9 11/12] target/arm: add an experimental mpidr arm cpu property object Mauro Carvalho Chehab
2024-08-25 11:34 ` Peter Maydell
2024-08-26 1:53 ` Mauro Carvalho Chehab
2024-08-30 16:27 ` Peter Maydell
2024-09-01 6:40 ` Mauro Carvalho Chehab [this message]
2024-08-25 3:46 ` [PATCH v9 12/12] scripts/arm_processor_error.py: retrieve mpidr if not filled Mauro Carvalho Chehab
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240901084004.166a3c96@sal.lan \
--to=mchehab+huawei@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).