From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Shiju Jose <shiju.jose@huawei.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Ani Sinha <anisinha@redhat.com>,
Dongjiu Geng <gengdongjiu1@gmail.com>,
linux-kernel@vger.kernel.org, qemu-arm@nongnu.org,
qemu-devel@nongnu.org
Subject: Re: [PATCH v4 13/15] acpi/ghes: move offset calculus to a separate function
Date: Mon, 9 Dec 2024 10:27:50 +0100
Message-ID: <20241209102750.03bc6ec5@foz.lan>
In-Reply-To: <20241204102413.31c8d76d@imammedo.users.ipa.redhat.com>

On Wed, 4 Dec 2024 10:24:13 +0100
Igor Mammedov <imammedo@redhat.com> wrote:

> On Wed, 4 Dec 2024 09:56:35 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > On Wed, 4 Dec 2024 08:54:40 +0100
> > Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > > On Tue, 3 Dec 2024 14:47:30 +0100
> > > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > >
> > > > On Tue, 3 Dec 2024 12:51:43 +0100
> > > > Igor Mammedov <imammedo@redhat.com> wrote:
> > > >
> > > > > On Fri, 22 Nov 2024 10:11:30 +0100
> > > > > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > > > >
...
> > > > > > +static void get_hw_error_offsets(uint64_t ghes_addr,
> > > > > > + uint64_t *cper_addr,
> > > > > > + uint64_t *read_ack_register_addr)
> > > > > > +{
> > > > >
> > > > >
> > > > > > + if (!ghes_addr) {
> > > > > > + return;
> > > > > > + }
> > > > >
> > > > > why do we need this check?
> > > >
> > > > It is a safeguard to avoid crashes and out-of-bounds accesses. If the
> > > > fw_cfg callback doesn't fill it properly, this will be zero.
> > >
> > > It shouldn't happen, but yeah, it is the firmware's job to write back
> > > the address, and that might fail for whatever reason (a bug, for example).
> > >
> >
> > The main reason I added it is that, after the second series, it could
> > also happen if there's something wrong with the backward compat logic.
> >
> > So, both here and after switching to HEST-based offsets, I opted
> > to explicitly test.
> >
> > > Perhaps push this up the stack, so we don't have to deal
> > > with scattered checks in the ghes code.
> > >
> > > kvm_arch_on_sigbus_vcpu() looks like a good candidate for the check,
> > > plus a warn_once if that ever happens.
> > > It already calls acpi_ghes_present(), which resolves the GED device,
> > > and then later we duplicate this job in ghes_record_cper_errors().
> > >
> > > So maybe rename acpi_ghes_present() to something like
> > > AcpiGhesState *acpi_ghes_get_state() and call it instead, and then
> > > move the ghes_addr check/warn_once there.
> > > This way the rest of the ghes code won't have to handle practically
> > > impossible error conditions that make the reader wonder why they might
> > > happen.
> >
> > I'll look into it.

I wrote the cleanup patch; see it enclosed below. I'll place it at the end
of the second series.

> > Yet, if that's OK with you, I would prefer dealing with this
> > once we have a bigger picture, e.g. once we merge those three series:
> >
> > - cleanup series (this one);
> > - HEST offset (I'll be sending a new version today);
> OK, let's revisit this point after this series, since at that point
> we should have a clean picture of how the new code works.

Thanks,
Mauro

[PATCH] acpi/ghes: Cleanup the code which gets ghes ged state

Move the check logic into a common function and simplify the
code which checks if GHES is enabled and was properly set up.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index 7cec1812dad9..fbabf955155a 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -16,7 +16,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
     return -1;
 }
 
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
 {
-    return false;
+    return NULL;
 }
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index a9c5315c1936..17aada9ee352 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -420,10 +420,6 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
                                  uint64_t *cper_addr,
                                  uint64_t *read_ack_register_addr)
 {
-    if (!ghes_addr) {
-        return;
-    }
-
     /*
      * non-HEST version supports only one source, so no need to change
      * the start offset based on the source ID. Also, we can't validate
@@ -451,10 +447,6 @@ static void get_ghes_source_offsets(uint16_t source_id, uint64_t hest_addr,
     uint64_t err_source_struct, error_block_addr;
     uint32_t num_sources, i;
 
-    if (!hest_addr) {
-        return;
-    }
-
     cpu_physical_memory_read(hest_addr, &num_sources, sizeof(num_sources));
     num_sources = le32_to_cpu(num_sources);
 
@@ -513,7 +505,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
                              uint16_t source_id, Error **errp)
 {
     uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
-    AcpiGedState *acpi_ged_state;
     AcpiGhesState *ags;
 
     if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
@@ -521,13 +512,10 @@ void ghes_record_cper_errors(const void *cper, size_t len,
         return;
     }
 
-    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
-                                                       NULL));
-    if (!acpi_ged_state) {
-        error_setg(errp, "Can't find ACPI_GED object");
+    ags = acpi_ghes_get_state();
+    if (!ags) {
         return;
     }
-    ags = &acpi_ged_state->ghes_state;
 
     if (!ags->hest_lookup) {
         get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
@@ -537,11 +525,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
                                 &cper_addr, &read_ack_register_addr, errp);
     }
 
-    if (!cper_addr) {
-        error_setg(errp, "can not find Generic Error Status Block");
-        return;
-    }
-
     cpu_physical_memory_read(read_ack_register_addr,
                              &read_ack_register, sizeof(read_ack_register));
 
@@ -606,7 +589,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
     return 0;
 }
 
-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
 {
     AcpiGedState *acpi_ged_state;
     AcpiGhesState *ags;
@@ -615,8 +598,14 @@ bool acpi_ghes_present(void)
                                                        NULL));
 
     if (!acpi_ged_state) {
-        return false;
+        return NULL;
     }
     ags = &acpi_ged_state->ghes_state;
-    return ags->present;
+    if (!ags->present) {
+        return NULL;
+    }
+    if (!ags->hw_error_le && !ags->hest_addr_le) {
+        return NULL;
+    }
+    return ags;
 }
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 2e8405edfe27..64fe2b5bea65 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -91,10 +91,11 @@ void ghes_record_cper_errors(const void *cper, size_t len,
                              uint16_t source_id, Error **errp);
 
 /**
- * acpi_ghes_present: Report whether ACPI GHES table is present
+ * acpi_ghes_get_state: Get a pointer to the ACPI GHES state
  *
- * Returns: true if the system has an ACPI GHES table and it is
- * safe to call acpi_ghes_memory_errors() to record a memory error.
+ * Returns: a pointer to the GHES state if the system has an ACPI GHES table,
+ * it is enabled and it is safe to call acpi_ghes_memory_errors() to record
+ * a memory error. Returns NULL otherwise.
  */
-bool acpi_ghes_present(void);
+AcpiGhesState *acpi_ghes_get_state(void);
 #endif
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index b4260467f8b9..7802c32fb7e0 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2369,7 +2369,7 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
 
     assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
 
-    if (acpi_ghes_present() && addr) {
+    if (acpi_ghes_get_state() && addr) {
         ram_addr = qemu_ram_addr_from_host(addr);
         if (ram_addr != RAM_ADDR_INVALID &&
             kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
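
For anyone who wants to try the idea outside the QEMU tree, the centralised
check discussed in this thread can be modelled roughly as below. This is only
a sketch under stated assumptions, not the QEMU code: GhesStateModel,
ghes_get_state_model() and warn_count are hypothetical names, the struct
keeps only the three fields that the patch tests, and the warn-once part
follows the suggestion made in review (the patch itself does not warn).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for AcpiGhesState: only the fields the patch tests. */
typedef struct {
    bool present;           /* GHES was enabled for this machine */
    uint64_t hw_error_le;   /* address written back by firmware (non-HEST) */
    uint64_t hest_addr_le;  /* address written back by firmware (HEST) */
} GhesStateModel;

/* Counts emitted warnings, so the warn-once behaviour can be observed. */
static int warn_count;

/*
 * Return the state only when it is usable, so that callers need a single
 * NULL check. Zero is zero in any byte order, so the little-endian fields
 * can be tested without conversion.
 */
static GhesStateModel *ghes_get_state_model(GhesStateModel *s)
{
    static bool warned;

    if (!s || !s->present) {
        return NULL;
    }
    if (!s->hw_error_le && !s->hest_addr_le) {
        if (!warned) {
            /* Assumption: warn a single time, as suggested in review. */
            warned = true;
            warn_count++;
            fprintf(stderr, "GHES: firmware did not write back an address\n");
        }
        return NULL;
    }
    return s;
}
```

A caller would test the returned pointer once and then use it, instead of
repeating zero-address checks in every helper.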