From: Igor Mammedov <imammedo@redhat.com>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: "Michael S . Tsirkin" <mst@redhat.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Shiju Jose <shiju.jose@huawei.com>,
qemu-arm@nongnu.org, qemu-devel@nongnu.org,
Ani Sinha <anisinha@redhat.com>,
Dongjiu Geng <gengdongjiu1@gmail.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 03/14] acpi/ghes: Use HEST table offsets when preparing GHES records
Date: Thu, 27 Feb 2025 10:22:55 +0100 [thread overview]
Message-ID: <20250227102255.6843705e@imammedo.users.ipa.redhat.com> (raw)
In-Reply-To: <20250226171406.19c2de6b@sal.lan>
On Wed, 26 Feb 2025 17:14:06 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Em Tue, 25 Feb 2025 10:43:27 +0100
> Igor Mammedov <imammedo@redhat.com> escreveu:
>
> > On Fri, 21 Feb 2025 07:02:21 +0100
> > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> >
> > > Em Mon, 3 Feb 2025 15:34:23 +0100
> > > Igor Mammedov <imammedo@redhat.com> escreveu:
> > >
> > > > On Fri, 31 Jan 2025 18:42:44 +0100
> > > > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > > >
> > > > > There are two pointers that are needed during error injection:
> > > > >
> > > > > 1. The start address of the CPER block to be stored;
> > > > > 2. The address of the ack.
> > > > >
> > > > > It is preferable to calculate them from the HEST table. This allows
> > > > > checking the source ID, the size of the table and the type of the
> > > > > HEST error block structures.
> > > > >
> > > > > Yet, keep the old code, as this is needed for migration purposes.
> > > > >
> > > > > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > > > > ---
> > > > > hw/acpi/ghes.c | 132 ++++++++++++++++++++++++++++++++++++-----
> > > > > include/hw/acpi/ghes.h | 1 +
> > > > > 2 files changed, 119 insertions(+), 14 deletions(-)
> > > > >
> > > > > diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> > > > > index 27478f2d5674..8f284fd191a6 100644
> > > > > --- a/hw/acpi/ghes.c
> > > > > +++ b/hw/acpi/ghes.c
> > > > > @@ -41,6 +41,12 @@
> > > > > /* Address offset in Generic Address Structure(GAS) */
> > > > > #define GAS_ADDR_OFFSET 4
> > > > >
> > > > > +/*
> > > > > + * ACPI spec 1.0b
> > > > > + * 5.2.3 System Description Table Header
> > > > > + */
> > > > > +#define ACPI_DESC_HEADER_OFFSET 36
> > > > > +
> > > > > /*
> > > > > * The total size of Generic Error Data Entry
> > > > > * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
> > > > > @@ -61,6 +67,25 @@
> > > > > */
> > > > > #define ACPI_GHES_GESB_SIZE 20
> > > > >
> > > > > +/*
> > > > > + * Offsets with regards to the start of the HEST table stored at
> > > > > + * ags->hest_addr_le,
> > > >
> > > > If I read this literary, then offsets above are not what
> > > > declared later in this patch.
> > > > I'd really drop this comment altogether as it's confusing,
> > > > and rather get variables/macro naming right
> > > >
> > > > > according with the memory layout map at
> > > > > + * docs/specs/acpi_hest_ghes.rst.
> > > > > + */
> > > >
> > > > what we need is update to above doc, describing new and old ways.
> > > > a separate patch.
> > >
> > > I can't see anything that should be changed at
> > > docs/specs/acpi_hest_ghes.rst, as this series doesn't change the
> > > firmware layout: we're still using two firmware tables:
> > >
> > > - etc/acpi/tables, with HEST on it;
> > > - etc/hardware_errors, with:
> > > - error block addresses;
> > > - read_ack registers;
> > > - CPER records.
> > >
> > > The only changes that this series introduce are related to how
> > > the error generation logic navigates between HEST and hw_errors
> > > firmware. This is not described at acpi_hest_ghes.rst, and both
> > > ways follow ACPI specs to the letter.
> > >
> > > The only difference is that the code which populates the CPER
> > > record and the error/read offsets doesn't require to know how
> > > the HEST table generation placed offsets, as it will basically
> > > reproduce what OSPM firmware does when handling HEST events.
> >
> > section 8 describes old way to get to address to record old CPER,
> > so it needs to amended to also describe a new approach and say
> > which way is used for which version.
> >
> > possibly section 11 might need some messaging as well.
>
> Ok, I'll modify it and place at the end of the series. Please
> see below if the new text is ok for you.
>
> ---
>
> [PATCH] docs/specs/acpi_hest_ghes.rst: update it to reflect some changes
s/^^^/docs: hest: add new "etc/acpi_table_hest_addr" and update workflow/
>
> While the HEST layout didn't change, there are some internal
> changes related to how offsets are calculated and how memory error
> events are triggered.
>
> Update specs to reflect such changes.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>
> diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
> index c3e9f8d9a702..f22d2eefdec7 100644
> --- a/docs/specs/acpi_hest_ghes.rst
> +++ b/docs/specs/acpi_hest_ghes.rst
> @@ -89,12 +89,21 @@ Design Details
> addresses in the "error_block_address" fields with a pointer to the
> respective "Error Status Data Block" in the "etc/hardware_errors" blob.
>
> -(8) QEMU defines a third and write-only fw_cfg blob which is called
> - "etc/hardware_errors_addr". Through that blob, the firmware can send back
> - the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
> - blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
> - for the firmware. The firmware will write back the start address of
> - "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
> +(8) QEMU defines a third and write-only fw_cfg blob to store the location
> + where the error block offsets, read ack registers and CPER records are
> + stored.
> +
> + Up to QEMU 9.2, the location was at "etc/hardware_errors_addr", and
> + contains an offset for the beginning of "etc/hardware_errors".
> +
> + Newer versions place the location at "etc/acpi_table_hest_addr",
> + pointing to the beginning of the HEST table.
> +
> + Through that such offsets, the firmware can send back the guest-side
^^^^^^^^^^^^^^^^^^^^^^^^^ can't parse that, suggest to just drop the phrase
> + allocation addresses to QEMU. They contain a 8-byte entry. QEMU generates
> + a single WRITE_POINTER command for the firmware. The firmware will write
> + back the start address of either "etc/hardware_errors" or HEST table at
^^^^ drop this?
> + the correspoinding address firmware.
>
> (9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
> "Error Status Data Block", guest memory, and then injects platform specific
> @@ -105,8 +114,6 @@ Design Details
> kernel, on receiving notification, guest APEI driver could read the CPER error
> and take appropriate action.
>
> -(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
> - find out "Error Status Data Block" entry corresponding to error source. So supported
> - source_id values should be assigned here and not be changed afterwards to make sure
> - that guest will write error into expected "Error Status Data Block" even if guest was
> - migrated to a newer QEMU.
> +(11) kvm_arch_on_sigbus_vcpu() report RAS errors via a SEA notifications,
> + when a SIGBUS event is triggered.
> The logic to convert a SEA notification
> + into a source ID is defined inside ghes.c source file.
that's cheating and not documentation by any means
>
>
>
next prev parent reply other threads:[~2025-02-27 9:23 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-31 17:42 [PATCH v3 00/14] Change ghes to use HEST-based offsets and add support for error inject Mauro Carvalho Chehab
2025-01-31 17:42 ` [PATCH v3 01/14] acpi/ghes: Prepare to support multiple sources on ghes Mauro Carvalho Chehab
2025-01-31 17:42 ` [PATCH v3 02/14] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
2025-02-03 13:41 ` Igor Mammedov
2025-01-31 17:42 ` [PATCH v3 03/14] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
2025-02-03 10:42 ` Jonathan Cameron via
2025-02-03 14:34 ` Igor Mammedov
2025-02-21 6:02 ` Mauro Carvalho Chehab
2025-02-25 9:43 ` Igor Mammedov
2025-02-26 16:14 ` Mauro Carvalho Chehab
2025-02-27 9:22 ` Igor Mammedov [this message]
2025-02-27 10:11 ` Mauro Carvalho Chehab
2025-01-31 17:42 ` [PATCH v3 04/14] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
2025-01-31 17:42 ` [PATCH v3 05/14] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
2025-02-03 14:56 ` Igor Mammedov
2025-01-31 17:42 ` [PATCH v3 06/14] acpi/ghes: only set hw_error_le or hest_addr_le Mauro Carvalho Chehab
2025-02-03 10:48 ` Jonathan Cameron via
2025-01-31 17:42 ` [PATCH v3 07/14] acpi/ghes: add a notifier to notify when error data is ready Mauro Carvalho Chehab
2025-01-31 17:42 ` [PATCH v3 08/14] acpi/ghes: Cleanup the code which gets ghes ged state Mauro Carvalho Chehab
2025-02-03 10:51 ` Jonathan Cameron via
2025-01-31 17:42 ` [PATCH v3 09/14] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
2025-01-31 17:42 ` [PATCH v3 10/14] tests/acpi: virt: allow acpi table changes for a new table: HEST Mauro Carvalho Chehab
2025-01-31 17:42 ` [PATCH v3 11/14] arm/virt: Wire up a GED error device for ACPI / GHES Mauro Carvalho Chehab
2025-01-31 17:42 ` [PATCH v3 12/14] tests/acpi: virt: add a HEST table to aarch64 virt and update DSDT Mauro Carvalho Chehab
2025-02-03 10:53 ` Jonathan Cameron via
2025-01-31 17:42 ` [PATCH v3 13/14] qapi/acpi-hest: add an interface to do generic CPER error injection Mauro Carvalho Chehab
2025-02-05 8:12 ` Markus Armbruster
2025-01-31 17:42 ` [PATCH v3 14/14] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
2025-02-03 10:56 ` Jonathan Cameron via
2025-02-05 8:16 ` Markus Armbruster
2025-02-21 4:57 ` Mauro Carvalho Chehab
2025-02-21 5:50 ` Markus Armbruster
2025-02-03 11:09 ` [PATCH v3 00/14] Change ghes to use HEST-based offsets and add support for " Jonathan Cameron via
2025-02-03 15:22 ` Igor Mammedov
2025-02-21 6:38 ` Mauro Carvalho Chehab
2025-02-21 10:21 ` Jonathan Cameron via
2025-02-21 12:23 ` Mauro Carvalho Chehab
2025-02-21 15:05 ` Mauro Carvalho Chehab
2025-02-25 10:01 ` Igor Mammedov
2025-02-26 9:56 ` Mauro Carvalho Chehab
2025-02-26 11:23 ` Mauro Carvalho Chehab
2025-02-26 11:31 ` Mauro Carvalho Chehab
2025-02-26 12:29 ` Igor Mammedov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250227102255.6843705e@imammedo.users.ipa.redhat.com \
--to=imammedo@redhat.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=anisinha@redhat.com \
--cc=gengdongjiu1@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mchehab+huawei@kernel.org \
--cc=mst@redhat.com \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=shiju.jose@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).