From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Igor Mammedov <imammedo@redhat.com>,
Shiju Jose <shiju.jose@huawei.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Ani Sinha <anisinha@redhat.com>,
Dongjiu Geng <gengdongjiu1@gmail.com>,
<linux-kernel@vger.kernel.org>, <qemu-arm@nongnu.org>,
<qemu-devel@nongnu.org>
Subject: Re: [PATCH 4/6] acpi/ghes: Use HEST table offsets when preparing GHES records
Date: Mon, 25 Nov 2024 11:31:32 +0000 [thread overview]
Message-ID: <20241125113132.000069e1@huawei.com> (raw)
In-Reply-To: <20241122113714.04450f6b@foz.lan>
> >
> > >
> > > Yet, keep the old code, as this is needed for migration purposes.
> > >
> > > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > > ---
> > > hw/acpi/ghes.c | 98 ++++++++++++++++++++++++++++++++++++++++++++------
> > > 1 file changed, 88 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> > > index c93bbaf1994a..9ee25efe8abf 100644
> > > --- a/hw/acpi/ghes.c
> > > +++ b/hw/acpi/ghes.c
> > > @@ -61,6 +61,23 @@
> > > */
> > > #define ACPI_GHES_GESB_SIZE 20
> > >
> > > +/*
> > > + * Offsets with regards to the start of the HEST table stored at
> > > + * ags->hest_addr_le, according with the memory layout map at
> > > + * docs/specs/acpi_hest_ghes.rst.
> > > + */
> > > +
> > /*
> > * ACPI 6.2:
> >
> > to be consistent with local style.
>
> Actually, the local style inside this file was preserved. See, before
> this series we have:
Ah. That's me being lazy and unclear :(
What I meant was
/*
* ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
So open the comment with a blank line.
>
> $ git grep "ACPI " hw/acpi/ghes.c
> hw/acpi/ghes.c: * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
> hw/acpi/ghes.c: * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
> hw/acpi/ghes.c: * ACPI 4.0: 17.3.2.7 Hardware Error Notification
> hw/acpi/ghes.c: * ACPI 6.1: 18.3.2.7.1 Generic Error Data
> hw/acpi/ghes.c: * ACPI 6.1: 18.3.2.7.1 Generic Error Data
> hw/acpi/ghes.c: /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
> hw/acpi/ghes.c: * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
> hw/acpi/ghes.c: * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
>
> >
> >
> >
> > > + return;
> > > + }
> > > +
> > > + cpu_physical_memory_read(hest_addr, &num_sources, sizeof(num_sources));
> > > + num_sources = le32_to_cpu(num_sources);
> > > +
> > > + err_source_struct = hest_addr + sizeof(num_sources);
> > > +
> > > + /*
> > > + * Currently, HEST Error source navigates only for GHESv2 tables
> > > + */
> >
> > Feels like duplication of the comment below where the type check is.
> > Maybe drop this one?
>
> If I recall correctly [1], Igor asked to place such comment, on one of
> the past versions of this code.
>
> IMO, placing it clearly there helps to identify what needs to change when
> support for non-GHES tables is needed.
>
> [1] this is the 12th version of this code - my memory is starting to fail
> to remind were exactly each change was requested.
Me too! :)
> >
> > > + addr += sizeof(type);
> > > + cpu_physical_memory_read(addr, &src_id, sizeof(src_id));
> > > +
> > > + if (src_id == source_id) {
> > > + break;
> > > + }
> > > +
> > > + err_source_struct += HEST_GHES_V2_TABLE_SIZE;
> > > + }
> > > + if (i == num_sources) {
> > > + error_setg(errp, "HEST: Source %d not found.", source_id);
> > > + return;
> > > + }
> > > +
> > > + /* Navigate though table address pointers */
> > > + hest_err_block_addr = err_source_struct + GHES_ERR_ST_ADDR_OFFSET;
> > > + hest_read_ack_addr = err_source_struct + GHES_ACK_OFFSET;
> >
> > > +
> > > + cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
> > > + sizeof(error_block_addr));
> > So this points to a registers
> > > +
> > > + cpu_physical_memory_read(error_block_addr, cper_addr,
> > > + sizeof(*cper_addr));
> > and this reads the register. I'm not sure the spec defines the
> > contents of that register to be constant. Maybe we should avoid
> > reading the register here and do it instead at read of the record?
> > I 'think' you could in theory use different storage for the CPER
> > depending on other unhandled errors or whatever else meant you didn't
> > want a fixed location.
> >
> > Or maybe just add a comment to say that the location of CPER storage
> > is fixed.
>
> Sorry, but I missed your point. The actual offset of the error block
> address is defined when fw_cfg callback is called. When this happens,
> this function is called:
>
> void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> GArray *hardware_error)
> {
> /* Create a read-only fw_cfg file for GHES */
> fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
> hardware_error->len);
>
> /* Create a read-write fw_cfg file for Address */
> fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
> NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
>
> fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
> NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
>
> ags->present = true;
> }
My question was intended to be a little different but was based on a misread on my part.
I'd failed to figure out this function is called on each addition of an error
not just once.
Code is fine as is.
Jonathan
>
> The other calls for fw_cfg functions ensure the offset of the memory read
> inside the hardware_error firmware and this never changes, as such offsets
> are defined when the hardware_firmware is built at build_ghes_error_table()
> function. This will only change if such function is called again, which
> would, in turn, make acpi_ghes_add_fw_cfg() be called again.
>
> In any case, no matter if build_ghes_error_table()/acpi_ghes_add_fw_cfg()
> is called only once or multiple times [1], at the time
> get_ghes_source_offsets() is called, such offsets are stable.
>
> [1] On some tests I did adding printks, the GHES build logic and the callbacks
> are called twice - both before loading OSPM.
>
> If migration is used, I suspect that it will be called again during
> migration but before starting OSPM. Again, when get_ghes_source_offsets()
> is called, the offsets are fixed.
>
> Thanks,
> Mauro
WARNING: multiple messages have this Message-ID (diff)
From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Igor Mammedov <imammedo@redhat.com>,
Shiju Jose <shiju.jose@huawei.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Ani Sinha <anisinha@redhat.com>,
Dongjiu Geng <gengdongjiu1@gmail.com>,
<linux-kernel@vger.kernel.org>, <qemu-arm@nongnu.org>,
<qemu-devel@nongnu.org>
Subject: Re: [PATCH 4/6] acpi/ghes: Use HEST table offsets when preparing GHES records
Date: Mon, 25 Nov 2024 11:31:32 +0000 [thread overview]
Message-ID: <20241125113132.000069e1@huawei.com> (raw)
In-Reply-To: <20241122113714.04450f6b@foz.lan>
> >
> > >
> > > Yet, keep the old code, as this is needed for migration purposes.
> > >
> > > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > > ---
> > > hw/acpi/ghes.c | 98 ++++++++++++++++++++++++++++++++++++++++++++------
> > > 1 file changed, 88 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> > > index c93bbaf1994a..9ee25efe8abf 100644
> > > --- a/hw/acpi/ghes.c
> > > +++ b/hw/acpi/ghes.c
> > > @@ -61,6 +61,23 @@
> > > */
> > > #define ACPI_GHES_GESB_SIZE 20
> > >
> > > +/*
> > > + * Offsets with regards to the start of the HEST table stored at
> > > + * ags->hest_addr_le, according with the memory layout map at
> > > + * docs/specs/acpi_hest_ghes.rst.
> > > + */
> > > +
> > /*
> > * ACPI 6.2:
> >
> > to be consistent with local style.
>
> Actually, the local style inside this file was preserved. See, before
> this series we have:
Ah. That's me being lazy and unclear :(
What I meant was
/*
* ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
So open the comment with a blank line.
>
> $ git grep "ACPI " hw/acpi/ghes.c
> hw/acpi/ghes.c: * ACPI 6.1/6.2: 18.3.2.7.1 Generic Error Data,
> hw/acpi/ghes.c: * ACPI 6.2: 18.3.2.7.1 Generic Error Data,
> hw/acpi/ghes.c: * ACPI 4.0: 17.3.2.7 Hardware Error Notification
> hw/acpi/ghes.c: * ACPI 6.1: 18.3.2.7.1 Generic Error Data
> hw/acpi/ghes.c: * ACPI 6.1: 18.3.2.7.1 Generic Error Data
> hw/acpi/ghes.c: /* invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
> hw/acpi/ghes.c: * ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
> hw/acpi/ghes.c: * ACPI 6.1: 18.3.2.8 Generic Hardware Error Source
>
> >
> >
> >
> > > + return;
> > > + }
> > > +
> > > + cpu_physical_memory_read(hest_addr, &num_sources, sizeof(num_sources));
> > > + num_sources = le32_to_cpu(num_sources);
> > > +
> > > + err_source_struct = hest_addr + sizeof(num_sources);
> > > +
> > > + /*
> > > + * Currently, HEST Error source navigates only for GHESv2 tables
> > > + */
> >
> > Feels like duplication of the comment below where the type check is.
> > Maybe drop this one?
>
> If I recall correctly [1], Igor asked to place such comment, on one of
> the past versions of this code.
>
> IMO, placing it clearly there helps to identify what needs to change when
> support for non-GHES tables is needed.
>
> [1] this is the 12th version of this code - my memory is starting to fail
> to remind were exactly each change was requested.
Me too! :)
> >
> > > + addr += sizeof(type);
> > > + cpu_physical_memory_read(addr, &src_id, sizeof(src_id));
> > > +
> > > + if (src_id == source_id) {
> > > + break;
> > > + }
> > > +
> > > + err_source_struct += HEST_GHES_V2_TABLE_SIZE;
> > > + }
> > > + if (i == num_sources) {
> > > + error_setg(errp, "HEST: Source %d not found.", source_id);
> > > + return;
> > > + }
> > > +
> > > + /* Navigate though table address pointers */
> > > + hest_err_block_addr = err_source_struct + GHES_ERR_ST_ADDR_OFFSET;
> > > + hest_read_ack_addr = err_source_struct + GHES_ACK_OFFSET;
> >
> > > +
> > > + cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
> > > + sizeof(error_block_addr));
> > So this points to a registers
> > > +
> > > + cpu_physical_memory_read(error_block_addr, cper_addr,
> > > + sizeof(*cper_addr));
> > and this reads the register. I'm not sure the spec defines the
> > contents of that register to be constant. Maybe we should avoid
> > reading the register here and do it instead at read of the record?
> > I 'think' you could in theory use different storage for the CPER
> > depending on other unhandled errors or whatever else meant you didn't
> > want a fixed location.
> >
> > Or maybe just add a comment to say that the location of CPER storage
> > is fixed.
>
> Sorry, but I missed your point. The actual offset of the error block
> address is defined when fw_cfg callback is called. When this happens,
> this function is called:
>
> void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> GArray *hardware_error)
> {
> /* Create a read-only fw_cfg file for GHES */
> fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
> hardware_error->len);
>
> /* Create a read-write fw_cfg file for Address */
> fw_cfg_add_file_callback(s, ACPI_HW_ERROR_ADDR_FW_CFG_FILE, NULL, NULL,
> NULL, &(ags->hw_error_le), sizeof(ags->hw_error_le), false);
>
> fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
> NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
>
> ags->present = true;
> }
My question was intended to be a little different but was based on a misread on my part.
I'd failed to figure out this function is called on each addition of an error
not just once.
Code is fine as is.
Jonathan
>
> The other calls for fw_cfg functions ensure the offset of the memory read
> inside the hardware_error firmware and this never changes, as such offsets
> are defined when the hardware_firmware is built at build_ghes_error_table()
> function. This will only change if such function is called again, which
> would, in turn, make acpi_ghes_add_fw_cfg() be called again.
>
> In any case, no matter if build_ghes_error_table()/acpi_ghes_add_fw_cfg()
> is called only once or multiple times [1], at the time
> get_ghes_source_offsets() is called, such offsets are stable.
>
> [1] On some tests I did adding printks, the GHES build logic and the callbacks
> are called twice - both before loading OSPM.
>
> If migration is used, I suspect that it will be called again during
> migration but before starting OSPM. Again, when get_ghes_source_offsets()
> is called, the offsets are fixed.
>
> Thanks,
> Mauro
next prev parent reply other threads:[~2024-11-25 11:31 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-13 8:36 [PATCH 0/6] Change ghes driver to use HEST-based offsets Mauro Carvalho Chehab
2024-11-13 8:36 ` [PATCH 1/6] acpi/ghes: Prepare to support multiple sources on ghes Mauro Carvalho Chehab
2024-11-20 14:29 ` Jonathan Cameron
2024-11-20 14:29 ` Jonathan Cameron via
2024-11-13 8:36 ` [PATCH 2/6] acpi/ghes: add a firmware file with HEST address Mauro Carvalho Chehab
2024-11-20 14:32 ` Jonathan Cameron via
2024-11-20 14:32 ` Jonathan Cameron
2024-11-20 14:32 ` Jonathan Cameron via
2024-11-13 8:37 ` [PATCH 3/6] acpi/ghes: rename the function which gets hw error offsets Mauro Carvalho Chehab
2024-11-20 14:33 ` Jonathan Cameron
2024-11-20 14:33 ` Jonathan Cameron via
2024-11-22 9:32 ` Mauro Carvalho Chehab
2024-11-13 8:37 ` [PATCH 4/6] acpi/ghes: Use HEST table offsets when preparing GHES records Mauro Carvalho Chehab
2024-11-20 14:59 ` Jonathan Cameron via
2024-11-20 14:59 ` Jonathan Cameron
2024-11-20 14:59 ` Jonathan Cameron via
2024-11-22 10:37 ` Mauro Carvalho Chehab
2024-11-25 11:31 ` Jonathan Cameron [this message]
2024-11-25 11:31 ` Jonathan Cameron via
2024-11-13 8:37 ` [PATCH 5/6] acpi/generic_event_device: Update GHES migration to cover hest addr Mauro Carvalho Chehab
2024-11-20 15:01 ` Jonathan Cameron via
2024-11-20 15:01 ` Jonathan Cameron
2024-11-22 10:40 ` Mauro Carvalho Chehab
2024-11-13 8:37 ` [PATCH 6/6] acpi/generic_event_device: add logic to detect if HEST addr is available Mauro Carvalho Chehab
2024-11-20 15:09 ` Jonathan Cameron via
2024-11-20 15:09 ` Jonathan Cameron
2024-11-20 15:09 ` Jonathan Cameron via
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241125113132.000069e1@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=anisinha@redhat.com \
--cc=gengdongjiu1@gmail.com \
--cc=imammedo@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mchehab+huawei@kernel.org \
--cc=mst@redhat.com \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=shiju.jose@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.