From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Shiju Jose <shiju.jose@huawei.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Ani Sinha <anisinha@redhat.com>,
Dongjiu Geng <gengdongjiu1@gmail.com>,
linux-kernel@vger.kernel.org, qemu-arm@nongnu.org,
qemu-devel@nongnu.org
Subject: Re: [PATCH v4 13/15] acpi/ghes: move offset calculus to a separate function
Date: Wed, 4 Dec 2024 09:56:35 +0100
Message-ID: <20241204095635.512a44d5@foz.lan>
In-Reply-To: <20241204085440.4640a476@imammedo.users.ipa.redhat.com>
On Wed, 4 Dec 2024 08:54:40 +0100
Igor Mammedov <imammedo@redhat.com> wrote:
> On Tue, 3 Dec 2024 14:47:30 +0100
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > On Tue, 3 Dec 2024 12:51:43 +0100
> > Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > > On Fri, 22 Nov 2024 10:11:30 +0100
> > > Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> > >
> > > > Currently, CPER address location is calculated as an offset of
> > > > the hardware_errors table. It is also badly named, as the
> > > > offset actually used is the address where the CPER data starts,
> > > > and not the beginning of the error source.
> > > >
> > > > Move the logic which calculates such offset to a separate
> > > > function, in preparation for a patch that will be changing the
> > > > logic to calculate it from the HEST table.
> > > >
> > > > While here, properly name the variable which stores the cper
> > > > address.
> > > >
> > > > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > ---
> > > > hw/acpi/ghes.c | 41 ++++++++++++++++++++++++++++++++---------
> > > > 1 file changed, 32 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> > > > index 87fd3feedd2a..d99697b20164 100644
> > > > --- a/hw/acpi/ghes.c
> > > > +++ b/hw/acpi/ghes.c
> > > > @@ -364,10 +364,37 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> > > > ags->present = true;
> > > > }
> > > >
> > > > +static void get_hw_error_offsets(uint64_t ghes_addr,
> > > > + uint64_t *cper_addr,
> > > > + uint64_t *read_ack_register_addr)
> > > > +{
> > >
> > >
> > > > + if (!ghes_addr) {
> > > > + return;
> > > > + }
> > >
> > > why do we need this check?
> >
> > It is a safeguard measure to avoid crashes and OOM access. If fw_cfg
> > callback doesn't fill it properly, this will be zero.
>
> shouldn't happen, but yeah, it's the firmware's job to write back the addr,
> so it might happen for whatever reason (a bug, for example).
>
The main reason I added it is that, after the second series, it could
also happen if there's something wrong with the backward compat logic.
So, both here and after switching to HEST-based offsets, I opted
to explicitly test.
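
For illustration only, the guard discussed above can be sketched in a self-contained form. This is not the QEMU code: `sketch_get_cper_addr()` and `SKETCH_CPER_OFFSET` are hypothetical names, the offset arithmetic is a placeholder for the real lookup, and returning a bool (instead of the void function in the patch) is an assumption made so the caller can see that the guard fired.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical placeholder offset; the real layout is not modelled here. */
#define SKETCH_CPER_OFFSET 0x100

/*
 * Returns false and leaves *cper_addr untouched when ghes_addr is zero,
 * i.e. when the address was never written back. Otherwise stores a
 * derived address and returns true.
 */
static bool sketch_get_cper_addr(uint64_t ghes_addr, uint64_t *cper_addr)
{
    if (!ghes_addr) {
        return false;   /* nothing valid to derive an address from */
    }

    *cper_addr = ghes_addr + SKETCH_CPER_OFFSET;
    return true;
}
```

With this shape, a zero address never reaches the address arithmetic, so no bogus CPER address is produced.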
> Perhaps push this up to the stack, so we don't have to deal
> with scattered checks in ghes code.
>
> kvm_arch_on_sigbus_vcpu() looks like a good candidate for check
> and warn_once if that ever happens.
> It already calls acpi_ghes_present() which resolves GED device
> and then later we duplicate this job in ghes_record_cper_errors()
>
> so maybe rename acpi_ghes_present to something like AcpiGhesState* acpi_ghes_get_state()
> and call it instead. And then move ghes_addr check/warn_once there.
> This way the rest of ghes code won't have to deal with handling practically
> impossible error conditions that cause the reader to wonder why it might happen.
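
A minimal sketch of the shape suggested above, assuming a single accessor that validates the state once and warns only the first time. Everything here is a hypothetical stand-in: `SketchGhesState`, its `ghes_addr` field, `sketch_ghes_get_state()` and the warning counter are not QEMU names, and the real function would resolve the GED device rather than take the state as an argument.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the GHES state; only the field needed here. */
typedef struct SketchGhesState {
    uint64_t ghes_addr;   /* written back by firmware; 0 if never set */
} SketchGhesState;

/* Counts emitted warnings so the warn-once behaviour can be observed. */
static int sketch_warn_count;

/*
 * Returns the state only when it is usable. A missing state or a zero
 * address yields NULL; the zero-address case is reported a single time.
 */
static SketchGhesState *sketch_ghes_get_state(SketchGhesState *ags)
{
    static bool warned;

    if (!ags) {
        return NULL;
    }

    if (!ags->ghes_addr) {
        if (!warned) {
            fprintf(stderr, "GHES address was not set by firmware\n");
            warned = true;
            sketch_warn_count++;
        }
        return NULL;
    }

    return ags;
}
```

A caller would then test the returned pointer once at the top of the call chain, and the code below it could assume a non-zero address.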
I'll look into it. Yet, if it's OK with you, I would prefer dealing with this
once we have a bigger picture, e.g. once we merge these three series:
- cleanup series (this one);
- HEST offset (I'll be sending a new version today);
- error_inject.
Thanks,
Mauro