qemu-devel.nongnu.org archive mirror
From: Igor Mammedov <imammedo@redhat.com>
To: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Shiju Jose <shiju.jose@huawei.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Ani Sinha <anisinha@redhat.com>,
	Dongjiu Geng <gengdongjiu1@gmail.com>,
	linux-kernel@vger.kernel.org, qemu-arm@nongnu.org,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v8 13/13] acpi/ghes: check if the BIOS pointers for HEST are correct
Date: Mon, 19 Aug 2024 16:07:33 +0200
Message-ID: <20240819160733.464ccebf@imammedo.users.ipa.redhat.com>
In-Reply-To: <52e6058feba318d01f54da6dca427b40ea5c9435.1723793768.git.mchehab+huawei@kernel.org>

On Fri, 16 Aug 2024 09:37:45 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> The OS kernels navigate between HEST, error source struct
> and CPER by the usage of some pointers. Double-check if such
> pointers were properly initializing, ensuring that they match
> the right address for CPER.

As QEMU, we don't care what the guest wrote into those addresses
(that is not the hardware's business), even if QEMU later tramples
on the wrong guest memory as a result (it is the guest's
responsibility to do the init right).

However, this patch introduces a use for hest_addr_le, which is what
I was looking for. See notes below.

> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
>  hw/acpi/ghes.c | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index a822a5eafaa0..51e2e40e5a9c 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -85,6 +85,9 @@ enum AcpiHestSourceId {
>  #define HEST_GHES_V2_TABLE_SIZE  92
>  #define GHES_ACK_OFFSET          (64 + GAS_ADDR_OFFSET + ACPI_HEST_HEADER_SIZE)
>  
> +/* ACPI 6.2: 18.3.2.7: Generic Hardware Error Source */
> +#define GHES_ERR_ST_ADDR_OFFSET  (20 + GAS_ADDR_OFFSET + ACPI_HEST_HEADER_SIZE)
> +
>  /*
>   * Values for error_severity field
>   */
> @@ -425,7 +428,10 @@ NotifierList acpi_generic_error_notifiers =
>  void ghes_record_cper_errors(const void *cper, size_t len,
>                               enum AcpiGhesNotifyType notify, Error **errp)
>  {
> -    uint64_t cper_addr, read_ack_start_addr;
> +    uint64_t hest_read_ack_start_addr, read_ack_start_addr;
> +    uint64_t read_ack_start_addr_2, err_source_struct;
> +    uint64_t hest_err_block_addr, error_block_addr;
> +    uint64_t cper_addr, cper_addr_2;
>      enum AcpiHestSourceId source;
>      AcpiGedState *acpi_ged_state;
>      AcpiGhesState *ags;
> @@ -450,6 +456,28 @@ void ghes_record_cper_errors(const void *cper, size_t len,
>      cper_addr += ACPI_HEST_SRC_ID_COUNT * sizeof(uint64_t);
>      cper_addr += source * ACPI_GHES_MAX_RAW_DATA_LENGTH;
>  
> +    err_source_struct = le64_to_cpu(ags->hest_addr_le) +
> +                        source * HEST_GHES_V2_TABLE_SIZE;

There is no guarantee that the HEST table will contain only GHESv2 sources,
and once a source of another type is added, this calculation breaks.

We need to iterate over HEST taking that into account
and find only the GHESv2 structure with the source ID of interest.

This function (and acpi_ghes_record_errors() as well), which takes source_id
as input, should be able to look up the pointers from HEST in guest RAM.
A very crude idea could look something like this:

typedef struct hest_source_type2len {
    uint16_t type;
    int len;
} hest_source_type2len;

static const hest_source_type2len supported_hest_sources[] = {
    /* Table 18-344 Generic Hardware Error Source version 2 (GHESv2) Structure */
    { .type = 10, .len = 92 },
};

uint64_t find_error_source(uint16_t src_id)
{
    uint32_t struct_offset = hest_header_size;
    uint16_t type, id;

    do {
        uint64_t addr = le64_to_cpu(ags->hest_addr_le) + struct_offset;

        /* each source structure starts with: u16 type, then u16 source id */
        cpu_physical_memory_read(addr, &type, sizeof(type));
        cpu_physical_memory_read(addr + sizeof(type), &id, sizeof(id));
        if (src_id == le16_to_cpu(id)) {
            return addr;
        }

        struct_offset += get_len_from_supported_hest_sources(le16_to_cpu(type));
    } while (struct_offset < hest_len);

    g_assert_not_reached(); /* assert if not found */
}

uint64_t get_error_status_block_addr(uint16_t src_id)
{
    uint64_t struct_addr = find_error_source(src_id);
    uint64_t hest_err_block_addr = struct_addr + GHES_ERR_ST_ADDR_OFFSET;
    uint64_t error_block_addr, error_status_block_addr;

    /* read intermediate pointer to status block addr pointer in hw table */
    cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
                             sizeof(error_block_addr));
    /* read actual pointer to status block */
    cpu_physical_memory_read(le64_to_cpu(error_block_addr),
                             &error_status_block_addr,
                             sizeof(error_status_block_addr));
    return le64_to_cpu(error_status_block_addr);
}
 
The same applies to read_ack, minus the extra level of indirection that we
have for error_status_block_addr.
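
To make the difference in indirection concrete, here is a minimal standalone
sketch. It is not QEMU code: guest_ram, guest_read_le64() and
guest_write_le64() are hypothetical stand-ins for guest memory and
cpu_physical_memory_read(), and the GHESV2_* offset names are made up for
this example. The offsets are taken relative to the start of one GHESv2
source structure, which is an assumption of the sketch and differs from the
GHES_*_OFFSET macros in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for guest RAM (zero-initialized). */
uint8_t guest_ram[4096];

/* Read a little-endian 64-bit value from the fake guest RAM. */
uint64_t guest_read_le64(uint64_t addr)
{
    uint64_t v = 0;

    assert(addr + 8 <= sizeof(guest_ram));
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | guest_ram[addr + i];
    }
    return v;
}

/* Write a little-endian 64-bit value into the fake guest RAM. */
void guest_write_le64(uint64_t addr, uint64_t v)
{
    assert(addr + 8 <= sizeof(guest_ram));
    for (int i = 0; i < 8; i++) {
        guest_ram[addr + i] = v & 0xff;
        v >>= 8;
    }
}

/* Address field inside a Generic Address Structure (GAS). */
#define GAS_ADDR_OFFSET        4
/* Hypothetical names: GAS positions inside a GHESv2 source structure. */
#define GHESV2_ERR_STATUS_GAS  20
#define GHESV2_READ_ACK_GAS    64

/* Two reads: the GAS points at a register that holds the block address. */
uint64_t get_error_status_block_addr(uint64_t src_struct)
{
    uint64_t reg = guest_read_le64(src_struct + GHESV2_ERR_STATUS_GAS +
                                   GAS_ADDR_OFFSET);
    return guest_read_le64(reg);
}

/* One read: the GAS points directly at the read ack register. */
uint64_t get_read_ack_addr(uint64_t src_struct)
{
    return guest_read_le64(src_struct + GHESV2_READ_ACK_GAS +
                           GAS_ADDR_OFFSET);
}
```

In this sketch the caller passes in the address of the source structure it
has already located, so the lookup by source ID and the pointer chasing stay
separate steps.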

This way we can easily map a source ID to its error status block
and find the needed addresses using pointer info from guest RAM,
without fragile pointer math and assumptions that might go wrong
when new error sources are added, regardless of the order in which
they are added.
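
As a rough illustration of why walking the table is more robust than fixed
stride math, the standalone sketch below builds a toy HEST containing one
GHES (type 9, 64 bytes) source followed by one GHESv2 (type 10, 92 bytes)
source, then looks up the GHESv2 entry by source ID. Everything here is an
assumption made for the example: guest_ram and the guest_*_le16() helpers
stand in for guest memory accesses, build_demo_hest() and the HEST_* names
are hypothetical, and returning 0 on "not found" is a choice of this sketch
rather than something proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for guest RAM (zero-initialized). */
uint8_t guest_ram[4096];

uint16_t guest_read_le16(uint64_t addr)
{
    assert(addr + 2 <= sizeof(guest_ram));
    return (uint16_t)(guest_ram[addr] | (guest_ram[addr + 1] << 8));
}

void guest_write_le16(uint64_t addr, uint16_t v)
{
    assert(addr + 2 <= sizeof(guest_ram));
    guest_ram[addr] = v & 0xff;
    guest_ram[addr + 1] = v >> 8;
}

#define HEST_HEADER_SIZE   40  /* ACPI header + error source count */
#define HEST_TYPE_GHES     9
#define HEST_GHES_LEN      64
#define HEST_TYPE_GHES_V2  10
#define HEST_GHES_V2_LEN   92

typedef struct {
    uint16_t type;
    uint32_t len;
} HestSourceLen;

static const HestSourceLen supported_hest_sources[] = {
    { HEST_TYPE_GHES,    HEST_GHES_LEN },
    { HEST_TYPE_GHES_V2, HEST_GHES_V2_LEN },
};

/* Length of a source structure of the given type, or 0 if unknown. */
uint32_t hest_source_len(uint16_t type)
{
    size_t n = sizeof(supported_hest_sources) /
               sizeof(supported_hest_sources[0]);

    for (size_t i = 0; i < n; i++) {
        if (supported_hest_sources[i].type == type) {
            return supported_hest_sources[i].len;
        }
    }
    return 0;
}

/* Walk HEST; return the GHESv2 structure with src_id, or 0 if not found. */
uint64_t find_error_source(uint64_t hest_addr, uint32_t hest_len,
                           uint16_t src_id)
{
    uint32_t off = HEST_HEADER_SIZE;

    while (off + 4 <= hest_len) {
        uint64_t addr = hest_addr + off;
        uint16_t type = guest_read_le16(addr);      /* bytes 0..1 */
        uint16_t id = guest_read_le16(addr + 2);    /* bytes 2..3 */
        uint32_t len = hest_source_len(type);

        if (len == 0) {
            return 0;   /* unknown type: cannot skip over it safely */
        }
        if (type == HEST_TYPE_GHES_V2 && id == src_id) {
            return addr;
        }
        off += len;
    }
    return 0;
}

/* Toy table: GHES with id 0, then GHESv2 with id 1. Returns table length. */
uint32_t build_demo_hest(uint64_t hest_addr)
{
    uint32_t off = HEST_HEADER_SIZE;

    guest_write_le16(hest_addr + off, HEST_TYPE_GHES);
    guest_write_le16(hest_addr + off + 2, 0);
    off += HEST_GHES_LEN;

    guest_write_le16(hest_addr + off, HEST_TYPE_GHES_V2);
    guest_write_le16(hest_addr + off + 2, 1);
    off += HEST_GHES_V2_LEN;

    return off;
}
```

With the toy table placed at some address hest, find_error_source(hest, len, 1)
lands on hest + 40 + 64, whereas stride math of the form
"header + source * 92" would point into the middle of the neighbouring
structure.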

> +    /* Check if BIOS addr pointers were properly generated */
> +
> +    hest_err_block_addr = err_source_struct + GHES_ERR_ST_ADDR_OFFSET;
> +    hest_read_ack_start_addr = err_source_struct + GHES_ACK_OFFSET;
> +
> +    cpu_physical_memory_read(hest_err_block_addr, &error_block_addr,
> +                             sizeof(error_block_addr));
> +
> +    cpu_physical_memory_read(error_block_addr, &cper_addr_2,
> +                             sizeof(error_block_addr));
> +
> +    cpu_physical_memory_read(hest_read_ack_start_addr, &read_ack_start_addr_2,
> +			     sizeof(read_ack_start_addr_2));
> +
> +    assert(cper_addr == cper_addr_2);
> +    assert(read_ack_start_addr == read_ack_start_addr_2);
> +
> +    /* Update ACK offset to notify about a new error */
> +
>      cpu_physical_memory_read(read_ack_start_addr,
>                               &read_ack, sizeof(uint64_t));
>  



