From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Shiju Jose <shiju.jose@huawei.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Ani Sinha <anisinha@redhat.com>,
Dongjiu Geng <gengdongjiu1@gmail.com>,
linux-kernel@vger.kernel.org, qemu-arm@nongnu.org,
qemu-devel@nongnu.org
Subject: Re: [PATCH v4 13/15] acpi/ghes: move offset calculus to a separate function
Date: Mon, 9 Dec 2024 10:27:50 +0100
Message-ID: <20241209102750.03bc6ec5@foz.lan>
In-Reply-To: <20241204102413.31c8d76d@imammedo.users.ipa.redhat.com>
On Wed, 4 Dec 2024 10:24:13 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Wed, 4 Dec 2024 09:56:35 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > On Wed, 4 Dec 2024 08:54:40 +0100
> > Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > > On Tue, 3 Dec 2024 14:47:30 +0100
> > > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > >
> > > > On Tue, 3 Dec 2024 12:51:43 +0100
> > > > Igor Mammedov <imammedo@redhat.com> wrote:
> > > >
> > > > > On Fri, 22 Nov 2024 10:11:30 +0100
> > > > > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > > > >
...
> > > > > > > +static void get_hw_error_offsets(uint64_t ghes_addr,
> > > > > > > +                                 uint64_t *cper_addr,
> > > > > > > +                                 uint64_t *read_ack_register_addr)
> > > > > > > +{
> > > > >
> > > > >
> > > > > > > +    if (!ghes_addr) {
> > > > > > > +        return;
> > > > > > > +    }
> > > > >
> > > > > why do we need this check?
> > > >
> > > > It is a safeguard to avoid crashes and out-of-bounds memory accesses.
> > > > If the fw_cfg callback doesn't fill it in properly, this will be zero.
> > >
> > > It shouldn't happen, but yes, it is the firmware's job to write back
> > > the address, and that might fail for whatever reason (a bug, for example).
> > >
> >
> > The main reason I added it is that, after the second series, it could
> > also happen if there's something wrong with the backward compat logic.
> >
> > So, both here and after switching to HEST-based offsets, I opted
> > to explicitly test.
> >
> > > Perhaps push this up the stack, so we don't have to deal
> > > with scattered checks in the ghes code.
> > >
> > > kvm_arch_on_sigbus_vcpu() looks like a good candidate for a check
> > > and a warn_once if that ever happens.
> > > It already calls acpi_ghes_present(), which resolves the GED device,
> > > and then later we duplicate this job in ghes_record_cper_errors().
> > >
> > > So maybe rename acpi_ghes_present() to something like
> > > AcpiGhesState *acpi_ghes_get_state() and call it instead, then move
> > > the ghes_addr check/warn_once there.
> > > This way the rest of the ghes code won't have to handle practically
> > > impossible error conditions that make the reader wonder why they might happen.
> >
> > I'll look into it.
Wrote the cleanup patch. See enclosed. I'll place it at the end of the
second series.
> > Yet, if that is OK with you, I would prefer dealing with this
> > once we have a bigger picture, e.g. once we merge those three series:
> >
> > - cleanup series (this one);
> > - HEST offset (I'll be sending a new version today);
> OK, let's revisit this point after this series,
> since at that point we should have a clear picture of how the new code
> works.
Thanks,
Mauro
[PATCH] acpi/ghes: Cleanup the code which gets ghes ged state

Move the check logic into a common function and simplify the
code which checks if GHES is enabled and was properly set up.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
diff --git a/hw/acpi/ghes-stub.c b/hw/acpi/ghes-stub.c
index 7cec1812dad9..fbabf955155a 100644
--- a/hw/acpi/ghes-stub.c
+++ b/hw/acpi/ghes-stub.c
@@ -16,7 +16,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
     return -1;
 }

-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
 {
-    return false;
+    return NULL;
 }
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index a9c5315c1936..17aada9ee352 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -420,10 +420,6 @@ static void get_hw_error_offsets(uint64_t ghes_addr,
                                  uint64_t *cper_addr,
                                  uint64_t *read_ack_register_addr)
 {
-    if (!ghes_addr) {
-        return;
-    }
-
     /*
      * non-HEST version supports only one source, so no need to change
      * the start offset based on the source ID. Also, we can't validate
@@ -451,10 +447,6 @@ static void get_ghes_source_offsets(uint16_t source_id, uint64_t hest_addr,
     uint64_t err_source_struct, error_block_addr;
     uint32_t num_sources, i;

-    if (!hest_addr) {
-        return;
-    }
-
     cpu_physical_memory_read(hest_addr, &num_sources, sizeof(num_sources));
     num_sources = le32_to_cpu(num_sources);

@@ -513,7 +505,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
                              uint16_t source_id, Error **errp)
 {
     uint64_t cper_addr = 0, read_ack_register_addr = 0, read_ack_register;
-    AcpiGedState *acpi_ged_state;
     AcpiGhesState *ags;

     if (len > ACPI_GHES_MAX_RAW_DATA_LENGTH) {
@@ -521,13 +512,10 @@ void ghes_record_cper_errors(const void *cper, size_t len,
         return;
     }

-    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
-                                                       NULL));
-    if (!acpi_ged_state) {
-        error_setg(errp, "Can't find ACPI_GED object");
+    ags = acpi_ghes_get_state();
+    if (!ags) {
         return;
     }
-    ags = &acpi_ged_state->ghes_state;

     if (!ags->hest_lookup) {
         get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
@@ -537,11 +525,6 @@ void ghes_record_cper_errors(const void *cper, size_t len,
                                 &cper_addr, &read_ack_register_addr, errp);
     }

-    if (!cper_addr) {
-        error_setg(errp, "can not find Generic Error Status Block");
-        return;
-    }
-
     cpu_physical_memory_read(read_ack_register_addr,
                              &read_ack_register, sizeof(read_ack_register));

@@ -606,7 +589,7 @@ int acpi_ghes_memory_errors(uint16_t source_id, uint64_t physical_address)
     return 0;
 }

-bool acpi_ghes_present(void)
+AcpiGhesState *acpi_ghes_get_state(void)
 {
     AcpiGedState *acpi_ged_state;
     AcpiGhesState *ags;
@@ -615,8 +598,14 @@ bool acpi_ghes_present(void)
                                                        NULL));

     if (!acpi_ged_state) {
-        return false;
+        return NULL;
     }
     ags = &acpi_ged_state->ghes_state;
-    return ags->present;
+    if (!ags->present) {
+        return NULL;
+    }
+    if (!ags->hw_error_le && !ags->hest_addr_le) {
+        return NULL;
+    }
+    return ags;
 }
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 2e8405edfe27..64fe2b5bea65 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -91,10 +91,11 @@ void ghes_record_cper_errors(const void *cper, size_t len,
                              uint16_t source_id, Error **errp);

 /**
- * acpi_ghes_present: Report whether ACPI GHES table is present
+ * acpi_ghes_get_state: Get a pointer for ACPI ghes state
  *
- * Returns: true if the system has an ACPI GHES table and it is
- * safe to call acpi_ghes_memory_errors() to record a memory error.
+ * Returns: a pointer to ghes state if the system has an ACPI GHES table,
+ * it is enabled and it is safe to call acpi_ghes_memory_errors() to record
+ * a memory error. Returns NULL otherwise.
  */
-bool acpi_ghes_present(void);
+AcpiGhesState *acpi_ghes_get_state(void);
 #endif
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index b4260467f8b9..7802c32fb7e0 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -2369,7 +2369,7 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)

     assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);

-    if (acpi_ghes_present() && addr) {
+    if (acpi_ghes_get_state() && addr) {
         ram_addr = qemu_ram_addr_from_host(addr);
         if (ram_addr != RAM_ADDR_INVALID &&
             kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) {
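
For readers following the thread, the pattern discussed above (one accessor
that validates the state and returns either a usable pointer or NULL, so that
callers need no scattered zero-address checks) can be sketched in isolation.
This is only an illustration under stated assumptions, not the QEMU code:
GhesState, ged_device, ghes_get_state() and record_error() are hypothetical
stand-ins, and the warn-once message follows the suggestion made in the
review; the patch itself returns NULL without warning.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for QEMU's AcpiGhesState. */
typedef struct {
    bool present;          /* GHES support was enabled for this machine */
    uint64_t hw_error_le;  /* address written back by firmware, 0 if unset */
    uint64_t hest_addr_le; /* HEST address written back by firmware, 0 if unset */
} GhesState;

/* Hypothetical stand-in for the device lookup; NULL means "no GED device". */
static GhesState *ged_device;

/*
 * Single place that decides whether the state is usable. Returns the state
 * pointer when every precondition holds, and NULL otherwise.
 */
static GhesState *ghes_get_state(void)
{
    static bool warned;
    GhesState *s = ged_device;

    if (!s || !s->present) {
        return NULL;
    }
    if (!s->hw_error_le && !s->hest_addr_le) {
        /* Assumption: warn only once, as suggested in the review. */
        if (!warned) {
            warned = true;
            fprintf(stderr, "ghes: firmware did not write back an address\n");
        }
        return NULL;
    }
    return s;
}

/* Caller side: one NULL check replaces the per-function address checks. */
static int record_error(uint64_t paddr)
{
    GhesState *s = ghes_get_state();

    if (!s) {
        return -1;
    }
    (void)paddr; /* a real implementation would build the error record here */
    return 0;
}
```

Because the accessor returns a pointer, a caller that only needs a yes/no
answer can still use it in a boolean context, which is what the
kvm_arch_on_sigbus_vcpu() hunk above does.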