qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v5 0/3] Generate APEI GHES table and dynamically record CPER
@ 2017-07-12  2:08 Dongjiu Geng
  2017-07-12  2:08 ` [Qemu-devel] [PATCH v5 1/3] ACPI: Add new ACPI structures and macros Dongjiu Geng
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Dongjiu Geng @ 2017-07-12  2:08 UTC (permalink / raw)
  To: lersek, mst, imammedo, famz, qemu-devel, zhaoshenglong,
	peter.maydell, qemu-arm, james.morse, zhengqiang10, huangshaoyu,
	wuquanming
  Cc: gengdongjiu

In the armv8 platform, the mainly hardware error source are ARMv8
SEA/SEI/GSIV. For the ARMv8 SEA/SEI, the KVM or host kernel will signal SIGBUS
or use other interface to notify user space, such as Qemu. After Qemu gets
the notification, it will record the CPER and inject the SEA/SEI to KVM. this
series of patches will generate APEI table when guest OS boot up, and dynamically
record CPER for the guest OS about the generic hardware errors, currently the
userspace only handle the memory section hardware errors

how to test:
1. In the guest OS, use this command to dump the APEI table: 
	"iasl -p ./HEST -d /sys/firmware/acpi/tables/HEST"
2. And find the address for the generic error status block
   according to the notification type
3. then find the CPER record through the generic error status block.

For example(notification type is SEA):

(1) root@genericarmv8:~# iasl -p ./HEST -d /sys/firmware/acpi/tables/HEST
(2) root@genericarmv8:~# cat HEST.dsl
	/*
	 * Intel ACPI Component Architecture
	 * AML/ASL+ Disassembler version 20170303 (64-bit version)
	 * Copyright (c) 2000 - 2017 Intel Corporation
	 *
	 * Disassembly of /sys/firmware/acpi/tables/HEST, Mon Dec 12 07:19:43 2016
	 *
	 * ACPI Data Table [HEST]
	 *
	 * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
	 */
    ..................................................................................
    [228h 0552   2]                Subtable Type : 0009 [Generic Hardware Error Source]
	[22Ah 0554   2]                    Source Id : 0008
	[22Ch 0556   2]            Related Source Id : FFFF
	[22Eh 0558   1]                     Reserved : 00
	[22Fh 0559   1]                      Enabled : 01
	[230h 0560   4]       Records To Preallocate : 00000001
	[234h 0564   4]      Max Sections Per Record : 00000001
	[238h 0568   4]          Max Raw Data Length : 00001000

	[23Ch 0572  12]         Error Status Address : [Generic Address Structure]
	[23Ch 0572   1]                     Space ID : 00 [SystemMemory]
	[23Dh 0573   1]                    Bit Width : 40
	[23Eh 0574   1]                   Bit Offset : 00
	[23Fh 0575   1]         Encoded Access Width : 04 [QWord Access:64]
	[240h 0576   8]                      Address : 00000000785D0040

	[248h 0584  28]                       Notify : [Hardware Error Notification Structure]
	[248h 0584   1]                  Notify Type : 08 [SEA]
	[249h 0585   1]                Notify Length : 1C
	[24Ah 0586   2]   Configuration Write Enable : 0000
	[24Ch 0588   4]                 PollInterval : 00000000
	[250h 0592   4]                       Vector : 00000000
	[254h 0596   4]      Polling Threshold Value : 00000000
	[258h 0600   4]     Polling Threshold Window : 00000000
	[25Ch 0604   4]        Error Threshold Value : 00000000
	[260h 0608   4]       Error Threshold Window : 00000000

	[264h 0612   4]    Error Status Block Length : 00001000
    .....................................................................................
(3) according to above table, the address that contains the physical address of a block
    of memory that holds the error status data for SEA notification error source is 0x00000000785D0040
(4) the address for SEA notification error source is 0x785d8058
	(qemu) xp /2x 0x00000000785D0040
	00000000785d0040: 0x785d8058 0x00000000
(5) check the content of generic error status block and generic error data entry
    (qemu) xp /100x 0x785d8058
    00000000785d8058: 0x00000001 0x00000000 0x00000000 0x00000098
    00000000785d8068: 0x00000001 0xa5bc1114 0x4ede6f64 0x833e63b8
	00000000785d8078: 0xb1837ced 0x00000000 0x00000000 0x00000050
	00000000785d8088: 0x00000000 0x00000000 0x00000000 0x00000000
	00000000785d8098: 0x00000000 0x00000000 0x00000000 0x00000000
	00000000785d80a8: 0x00000000 0x00000000 0x00000000 0x00004002
	00000000785d80b8: 0x00000000 0x00000000 0x00000000 0x00001111
	00000000785d80c8: 0x00000000 0x00000000 0x00000000 0x00000000
	00000000785d80d8: 0x00000000 0x00000000 0x00000000 0x00000000
	00000000785d80e8: 0x00000000 0x00000000 0x00000000 0x00000000
	00000000785d80f8: 0x00000000 0x00000003 0x00000000 0x00000000
	00000000785d8108: 0x00000000 0x00000000 0x00000000 0x00000000
	00000000785d8118: 0x00000000 0x00000000 0x00000000 0x00000000
	00000000785d8128: 0x00000000 0x00000000 0x00000000 0x00000000
	00000000785d8138: 0x00000000 0x00000000 0x00000000 0x00000000

Dongjiu Geng (3):
  ACPI: Add new ACPI structures and macros
  ACPI: Add APEI GHES Table Generation support
  ACPI: build and enable APEI GHES in the Makefile and configuration

 default-configs/arm-softmmu.mak |   1 +
 hw/acpi/Makefile.objs           |   1 +
 hw/acpi/aml-build.c             |   2 +
 hw/acpi/hest_ghes.c             | 219 ++++++++++++++++++++++++++++++++++++++++
 hw/arm/virt-acpi-build.c        |   6 ++
 include/hw/acpi/acpi-defs.h     | 194 +++++++++++++++++++++++++++++++++++
 include/hw/acpi/aml-build.h     |   1 +
 include/hw/acpi/hest_ghes.h     |  47 +++++++++
 include/qemu/uuid.h             |  11 ++
 9 files changed, 482 insertions(+)
 create mode 100644 hw/acpi/hest_ghes.c
 create mode 100644 include/hw/acpi/hest_ghes.h

-- 
2.10.1

^ permalink raw reply	[flat|nested] 20+ messages in thread
* Re: [Qemu-devel] [PATCH v5 2/3] ACPI: Add APEI GHES Table Generation support
@ 2017-07-15 15:29 gengdongjiu
  0 siblings, 0 replies; 20+ messages in thread
From: gengdongjiu @ 2017-07-15 15:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: lersek@redhat.com, imammedo@redhat.com, famz@redhat.com,
	qemu-devel@nongnu.org, Zhaoshenglong, peter.maydell@linaro.org,
	qemu-arm@nongnu.org, james.morse@arm.com, Zhengqiang (turing),
	Huangshaoyu, Wuquanming, Gaozhihui

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="gb2312", Size: 23147 bytes --]

Michael,
  Thanks for the review and comments. 

> -----ÓʼþÔ­¼þ-----
> ·¢¼þÈË: Michael S. Tsirkin [mailto:mst@redhat.com]
> Ö÷Ìâ: Re: [PATCH v5 2/3] ACPI: Add APEI GHES Table Generation support
> 
> On Wed, Jul 12, 2017 at 10:08:16AM +0800, Dongjiu Geng wrote:
> > This implements APEI GHES Table by passing the error CPER info to the
> > guest via a fw_cfg_blob. After a CPER info is recorded, an
> > SEA(Synchronous External Abort)/SEI(SError Interrupt) exception will
> > be injected into the guest OS.
> >
> > Below is the table layout, the max number of error soure is 11, which
> > is classified by notification type.
> >
> > etc/acpi/tables                 etc/hardware_errors
> > ================     ==========================================
> >                      +-----------+
> > +--------------+     | address   |         +-> +--------------+
> > |    HEST      +     | registers |         |   | Error Status |
> > + +------------+     | +---------+         |   | Data Block 0 |
> > | | GHES0      | --> | |address0 | --------+   | +------------+
> > | | GHES1      | --> | |address1 | ------+     | |  CPER      |
> > | | GHES2      | --> | |address2 | ----+ |     | |  CPER      |
> > | |  ....      | --> | | ....... |     | |     | |  CPER      |
> > | | GHES10     | --> | |address10| -+  | |     | |  CPER      |
> > +-+------------+     +-+---------+  |  | |     +-+------------+
> >                                     |  | |
> >                                     |  | +---> +--------------+
> >                                     |  |       | Error Status |
> >                                     |  |       | Data Block 1 |
> >                                     |  |       | +------------+
> >                                     |  |       | |  CPER      |
> >                                     |  |       | |  CPER      |
> >                                     |  |       +-+------------+
> >                                     |  |
> >                                     |  +-----> +--------------+
> >                                     |          | Error Status |
> >                                     |          | Data Block 2 |
> >                                     |          | +------------+
> >                                     |          | |  CPER      |
> >                                     |          +-+------------+
> >                                     |            ...........
> >                                     +--------> +--------------+
> >                                                | Error Status |
> >                                                | Data Block 10|
> >                                                | +------------+
> >                                                | |  CPER      |
> >                                                | |  CPER      |
> >                                                | |  CPER      |
> >                                                +-+------------+
> > Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
> > ---
> > thanks a lot Laszlo's review and comments:
> >
> > change since v4:
> > 1. fix email threading in this series is incorrect issue
> >
> > change since v3:
> > 1. remove the unnecessary include for "hw/acpi/vmgenid.h" in
> >    hw/arm/virt-acpi-build.c
> > 2. add conversion between LE and host-endian for the CPER record 3.
> > handle the case that run out of the preallocated memory for the CPER
> > record 4. change to use g_malloc0 instead of g_malloc 5. change
> > block_reqr_size name to block_rer_size 6. change QEMU coding style,
> > that is, the operator is at the end of the line.
> > 7. drop the ERROR_STATUS_ADDRESS_OFFSET and GAS_ADDRESS_OFFSET macros (from
> >    the header file as well), and use the offsetof to replace it.
> > 8. remove the init_aml_allocator() / free_aml_allocator(), calculate the
> >    needed size, and push that many bytes directly to "table_data".
> > 9. take an "OVMF header probe suppressor" into account 10.corrct HEST
> > and CPER value assigment, for example, correct the source_id
> >    for every error source, this identifier of source_id should be unique among
> >    all error sources;
> > 11. create only one WRITE_POINTER command, for the base address of "etc/hardware_errors".
> >    This should be done outside of the loop.The base addresses of the individual error
> >    status data blocks should be calculated in ghes_update_guest(), based on the error
> >    source / notification type
> > 12.correct the commit message lists error sources / notification types 0 through 10 (count=11 in total).
> > 13.correct the size calculation for GHES_DATA_ADDR_FW_CFG_FILE
> > 14.range-checked the value of "notify" before using it as an array
> > subscript
> > ---
> >  hw/acpi/aml-build.c      |   2 +
> >  hw/acpi/hest_ghes.c      | 219 +++++++++++++++++++++++++++++++++++++++++++++++
> >  hw/arm/virt-acpi-build.c |   6 ++
> >  3 files changed, 227 insertions(+)
> >  create mode 100644 hw/acpi/hest_ghes.c
> >
> > diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index
> > c6f2032..802b98d 100644
> > --- a/hw/acpi/aml-build.c
> > +++ b/hw/acpi/aml-build.c
> > @@ -1560,6 +1560,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
> >      tables->table_data = g_array_new(false, true /* clear */, 1);
> >      tables->tcpalog = g_array_new(false, true /* clear */, 1);
> >      tables->vmgenid = g_array_new(false, true /* clear */, 1);
> > +    tables->hardware_errors = g_array_new(false, true /* clear */,
> > + 1);
> >      tables->linker = bios_linker_loader_init();  }
> >
> > @@ -1570,6 +1571,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
> >      g_array_free(tables->table_data, true);
> >      g_array_free(tables->tcpalog, mfre);
> >      g_array_free(tables->vmgenid, mfre);
> > +    g_array_free(tables->hardware_errors, mfre);
> >  }
> >
> >  /* Build rsdt table */
> > diff --git a/hw/acpi/hest_ghes.c b/hw/acpi/hest_ghes.c new file mode
> > 100644 index 0000000..c9442b6
> > --- /dev/null
> > +++ b/hw/acpi/hest_ghes.c
> > @@ -0,0 +1,219 @@
> > +/*
> > + *  APEI GHES table Generation
> > + *
> > + *  Copyright (C) 2017 huawei.
> > + *
> > + *  Author: Dongjiu Geng <gengdongjiu@huawei.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qmp-commands.h"
> > +#include "hw/acpi/acpi.h"
> > +#include "hw/acpi/aml-build.h"
> > +#include "hw/acpi/hest_ghes.h"
> > +#include "hw/nvram/fw_cfg.h"
> > +#include "sysemu/sysemu.h"
> > +
> > +static int ghes_record_cper(uint64_t error_block_address,
> > +                                    uint64_t error_physical_addr) {
> > +    AcpiGenericErrorStatus block;
> > +    AcpiGenericErrorData *gdata;
> > +    UefiCperSecMemErr *mem_err;
> > +    uint64_t current_block_length;
> > +    unsigned char *buffer;
> > +    QemuUUID section_id_le = UEFI_CPER_SEC_PLATFORM_MEM;
> > +
> > +
> > +    cpu_physical_memory_read(error_block_address, &block,
> > +                                sizeof(AcpiGenericErrorStatus));
> > +
> > +    /* Get the current generic error status block length */
> > +    current_block_length = sizeof(AcpiGenericErrorStatus) +
> > +        le32_to_cpu(block.data_length);
> > +
> > +    /* If the Generic Error Status Block is NULL, update
> > +     * the block header
> > +     */
> > +    if (!block.block_status) {
> > +        block.block_status = ACPI_GEBS_UNCORRECTABLE;
> > +        block.error_severity = ACPI_CPER_SEV_FATAL;
> > +    }
> > +
> > +    block.data_length += cpu_to_le32(sizeof(AcpiGenericErrorData));
> > +    block.data_length += cpu_to_le32(sizeof(UefiCperSecMemErr));
> > +
> > +    /* check whether it runs out of the preallocated memory */
> > +    if ((le32_to_cpu(block.data_length) + sizeof(AcpiGenericErrorStatus)) >
> > +       GHES_MAX_RAW_DATA_LENGTH) {
> > +        return GHES_CPER_FAIL;
> > +    }
> > +    /* Write back the Generic Error Status Block to guest memory */
> > +    cpu_physical_memory_write(error_block_address, &block,
> > +                        sizeof(AcpiGenericErrorStatus));
> > +
> > +    /* Fill in Generic Error Data Entry */
> > +    buffer = g_malloc0(sizeof(AcpiGenericErrorData) +
> > +                       sizeof(UefiCperSecMemErr));
> > +    memset(buffer, 0, sizeof(AcpiGenericErrorData) + sizeof(UefiCperSecMemErr));
> > +    gdata = (AcpiGenericErrorData *)buffer;
> > +
> > +    qemu_uuid_bswap(&section_id_le);
> > +    memcpy(&(gdata->section_type_le), &section_id_le,
> > +                sizeof(QemuUUID));
> > +    gdata->error_data_length =
> > + cpu_to_le32(sizeof(UefiCperSecMemErr));
> > +
> > +    mem_err = (UefiCperSecMemErr *) (gdata + 1);
> > +
> > +    /* In order to simplify simulation, hard code the CPER section to memory
> > +     * section.
> > +     */
> > +
> > +    /* Hard code to Multi-bit ECC error */
> > +    mem_err->validation_bits |= cpu_to_le32(UEFI_CPER_MEM_VALID_ERROR_TYPE);
> > +    mem_err->error_type =
> > + cpu_to_le32(UEFI_CPER_MEM_ERROR_TYPE_MULTI_ECC);
> > +
> > +    /* Record the physical address at which the memory error occurred */
> > +    mem_err->validation_bits |= cpu_to_le32(UEFI_CPER_MEM_VALID_PA);
> > +    mem_err->physical_addr = cpu_to_le32(error_physical_addr);
> > +
> > +    /* Write back the Generic Error Data Entry to guest memory */
> > +    cpu_physical_memory_write(error_block_address + current_block_length,
> > +        buffer, sizeof(AcpiGenericErrorData) +
> > + sizeof(UefiCperSecMemErr));
> > +
> > +    g_free(buffer);
> > +    return GHES_CPER_OK;
> > +}
> > +
> > +void ghes_build_acpi(GArray *table_data, GArray *hardware_error,
> > +                                            BIOSLinker *linker) {
> > +    GArray *buffer;
> > +    uint32_t address_registers_offset;
> > +    AcpiHardwareErrorSourceTable *error_source_table;
> > +    AcpiGenericHardwareErrorSource *error_source;
> > +    int i;
> > +    /*
> > +     * The block_req_size stands for one address and one
> > +     * generic error status block
> > +      +---------+
> > +      | address | --------+-> +---------+
> > +      +---------+             |  CPER   |
> > +                              |  CPER   |
> > +                              |  CPER   |
> > +                              |  CPER   |
> > +                              |  ....   |
> > +                              +---------+
> > +     */
> > +    int block_req_size = sizeof(uint64_t) + GHES_MAX_RAW_DATA_LENGTH;
> > +
> > +    /* The total size for address of data structure and
> > +     * error status data block
> 
> Pls start multiple comments with /* by itself

 Ok, got it

> 
> > +     */
> > +    g_array_set_size(hardware_error, GHES_ACPI_HEST_NOTIFY_RESERVED *
> > +                                                block_req_size);
> > +
> > +    buffer = g_array_new(false, true /* clear */, 1);
> > +    address_registers_offset = table_data->len +
> > +        sizeof(AcpiHardwareErrorSourceTable) +
> > +        offsetof(AcpiGenericHardwareErrorSource, error_status_address) +
> > +        offsetof(struct AcpiGenericAddress, address);
> > +
> > +    /* Reserve space for HEST table size */
> > +    acpi_data_push(buffer, sizeof(AcpiHardwareErrorSourceTable) +
> > +                                GHES_ACPI_HEST_NOTIFY_RESERVED *
> > +
> > + sizeof(AcpiGenericHardwareErrorSource));
> > +
> > +    g_array_append_vals(table_data, buffer->data, buffer->len);
> > +    /* Allocate guest memory for the Data fw_cfg blob */
> > +    bios_linker_loader_alloc(linker, GHES_ERRORS_FW_CFG_FILE, hardware_error,
> > +                            4096, false /* page boundary, high memory
> > + */);
> 
> page boundary refers to 4096? Pls move it there, or put before bios_linker_loader_alloc.

Yes, I want to make it align to one page which is 4096 bytes.
Do you mean put before bios_linker_loader_alloc is make the array hardware_error align to 4096 instead of alloc alignment
to 4096?

> 
> > +
> > +    error_source_table = (AcpiHardwareErrorSourceTable *)(table_data->data
> > +                        + table_data->len - buffer->len);
> > +    error_source_table->error_source_count = GHES_ACPI_HEST_NOTIFY_RESERVED;
> > +    error_source = (AcpiGenericHardwareErrorSource *)
> > +        ((AcpiHardwareErrorSourceTable *)error_source_table + 1);
> 
> Concern with all these casts and pointer math is that it is barely readable.
> We can easily overflow the array and get hard to debug heap corruption errors.
> 
> What can I suggest?  Just add this to the array in steps. Then you will not need all the math.
> 
> Or again, you can just add things as you init them using build_append_int_noprefix.

Michael, thanks for your suggestion. I will do it. 

> 
> 
> > +
> > +    bios_linker_loader_write_pointer(linker, GHES_DATA_ADDR_FW_CFG_FILE,
> > +        0, sizeof(uint64_t), GHES_ERRORS_FW_CFG_FILE,
> > +        GHES_ACPI_HEST_NOTIFY_RESERVED * sizeof(uint64_t));
> 
> What does this uint64_t refer to? It's repeated everywhere.
> 
The uint64_t is the address point to the Error Status Data Block.
Now there are 11 Generic hardware error source, one Generic hardware error source have one
Error Status Data Block , every Error Status Data Block has different notification type, the notification type
Is shown below:

0 ¨C Polled
1 ¨C External Interrupt
2 ¨C Local Interrupt
3 ¨C SCI
4 ¨C NMI
5 - CMCI
6 - MCE
7 - GPIO-Signal
8 - ARMv8 SEA
9 - ARMv8 SEI
10 - External Interrupt - GSIV

So the cfg_blob etc/hardware_errors stands for 11 address(every address size is uint64_t)
and 11 Error Status Data Block(every Error Status Data Block size is 0x10000), so here
ask guest to write a pointer to the Error Status Data Block 0 into the address0.
Because the physical address is continuous, so here I only write the point to Error Status
Data Block 0 into the address0 one time, other address can be calculated. For example,
The value of address1 is the value of address0 add 0x10000, the value of address2 is the value of address0 add 0x20000

etc/hardware_errors 
================  
+------------------+ 
|    HEST  |    
+ +---------------+    
| | address0 |     
| | address1 |     
| | address2 |     
| |  ....    |   
| | adress10 |   
+-+--------------+  
| Error Status |
| Data Block 0|
| +----------------+
| Error Status |
| Data Block 1|
| +----------------+
| Error Status |
| Data Block 2|
| +----------------+
| Error Status |
| Data Block 3|
| +----------------+
|  ----------   |
|  ----------   |
| +----------------+
| Error Status |
| Data Block10|
| +----------------+                                   

> > +
> > +    for (i = 0; i < GHES_ACPI_HEST_NOTIFY_RESERVED; i++) {
> > +        error_source->type = ACPI_HEST_SOURCE_GENERIC_ERROR;
> > +        error_source->source_id = cpu_to_le16(i);
> > +        error_source->related_source_id = 0xffff;
> > +        error_source->flags = 0;
> > +        error_source->enabled = 1;
> > +        /* The number of error status block per Generic Hardware
> > + Error Source */
> 
> number of ... blocks ?

  Yes, it is. Here error_source->number_of_records is indeed the number of error status block per Generic
  Hardware Error Source. I have confirmed it. We set the block number to 1. In one blocks there are many CPER.

> 
> > +        error_source->number_of_records = 1;
> > +        error_source->max_sections_per_record = 1;
> > +        error_source->max_raw_data_length = GHES_MAX_RAW_DATA_LENGTH;
> > +        error_source->error_status_address.space_id =
> > +                                    AML_SYSTEM_MEMORY;
> > +        error_source->error_status_address.bit_width = 64;
> > +        error_source->error_status_address.bit_offset = 0;
> > +        error_source->error_status_address.access_width = 4;
> > +        error_source->notify.type = i;
> > +        error_source->notify.length = sizeof(AcpiHestNotify);
> > +
> > +        error_source->error_status_block_length =
> > + GHES_MAX_RAW_DATA_LENGTH;
> > +
> > +        bios_linker_loader_add_pointer(linker,
> > +            ACPI_BUILD_TABLE_FILE, address_registers_offset + i *
> > +            sizeof(AcpiGenericHardwareErrorSource), sizeof(uint64_t),
> > +            GHES_ERRORS_FW_CFG_FILE, i * sizeof(uint64_t));
> 
> and what's this math for?

Refer to the ACPI 2.1, 18.3.2.7 Generic Hardware Error Source In the APEI table. there are 11 Generic Hardware Error Source, every hardware
Error source has different notification type. In every Generic Hardware Error Source Structure, there is a number called Error Status Address which
specifies the location of a register that contains the physical address of a block of memory that holds the error
status data for this error source. So here patch address with a pointer to address0/address1/....../address10.
As shown below or as shown in the commit messages.

I paste the APEI table here:
https://wiki.linaro.org/LEG/Engineering/Kernel/RAS/APEITables

 etc/acpi/tables                 etc/hardware_errors
 ================     ==========================================
                      +-----------+
 +--------------+     | address   |         +-> +--------------+
 | HEST    +     | registers |         |   | Error Status |
 + +------------+     | +---------+         |   | Data Block 0 |
 | | GHES0      | --> | |address0 | --------+   | +------------+
 | | GHES1      | --> | |address1 | ------+     | |  CPER      |
 | | GHES2      | --> | |address2 | ----+ |     | |  CPER      |
 | |  ....       | -->  | | ....... |     | |     | |  CPER      |
 | | GHES10     | --> | |address10| -+  | |     | |  CPER      |
 +-+------------+     +-+---------+  |  | |     +-+------------+
                                     |  | |
                                     |  | +---> +--------------+
                                     |  |       | Error Status |
                                     |  |       | Data Block 1 |
                                     |  |       | +------------+
                                     |  |       | |  CPER      |
                                     |  |       | |  CPER      |
                                     |  |       +-+------------+
                                     |  |
                                     |  +-----> +--------------+
                                     |          | Error Status |
                                     |          | Data Block 2 |
                                     |          | +------------+
                                     |          | |  CPER      |
                                     |          +-+------------+
                                     |            ...........
                                     +--------> +--------------+
                                                | Error Status |
                                                | Data Block 10|
                                                | +------------+
                                                | |  CPER      |
                                                | |  CPER      |
                                                | |  CPER      |
                                                +-+------------+


> 
> > +
> > +        error_source++;
> > +    }
> > +
> > +    for (i = 0; i < GHES_ACPI_HEST_NOTIFY_RESERVED; i++) {
> > +        bios_linker_loader_add_pointer(linker,
> > +            GHES_ERRORS_FW_CFG_FILE, sizeof(uint64_t) * i, sizeof(uint64_t),
> > +            GHES_ERRORS_FW_CFG_FILE, GHES_ACPI_HEST_NOTIFY_RESERVED *
> > +            sizeof(uint64_t) + i * GHES_MAX_RAW_DATA_LENGTH);
> > +    }
> > +
> > +    build_header(linker, table_data,
> > +        (void *)error_source_table, "HEST", buffer->len, 1, NULL,
> > + "GHES");
> 
> Pls indent so continuation lines are ro the right of (.
 Ok, thanks.
> 
> > +
> > +    g_array_free(buffer, true);
> > +}
> > +
> > +static GhesState ges;
> > +void ghes_add_fw_cfg(FWCfgState *s, GArray *hardware_error) {
> > +
> > +    size_t request_block_size = sizeof(uint64_t) + GHES_MAX_RAW_DATA_LENGTH;
> > +    size_t size = GHES_ACPI_HEST_NOTIFY_RESERVED *
> > + request_block_size;
> > +
> > +    /* Create a read-only fw_cfg file for GHES */
> > +    fw_cfg_add_file(s, GHES_ERRORS_FW_CFG_FILE, hardware_error->data,
> > +                    size);
> > +    /* Create a read-write fw_cfg file for Address */
> > +    fw_cfg_add_file_callback(s, GHES_DATA_ADDR_FW_CFG_FILE, NULL, NULL,
> > +        &ges.ghes_addr_le, sizeof(ges.ghes_addr_le), false); }
> > +
> > +bool ghes_update_guest(uint32_t notify, uint64_t physical_address) {
> > +    uint64_t error_block_addr;
> > +
> > +    if (physical_address && notify < GHES_ACPI_HEST_NOTIFY_RESERVED) {
> > +        error_block_addr = ges.ghes_addr_le + notify * GHES_MAX_RAW_DATA_LENGTH;
> > +        error_block_addr = le32_to_cpu(error_block_addr);
> > +
> > +        /* A zero value in ghes_addr means that BIOS has not yet written
> > +         * the address
> > +         */
> > +        if (error_block_addr) {
> > +            return ghes_record_cper(error_block_addr, physical_address);
> > +        }
> > +    }
> > +
> > +    return GHES_CPER_FAIL;
> > +}
> > diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index
> > 0835e59..5c97016 100644
> > --- a/hw/arm/virt-acpi-build.c
> > +++ b/hw/arm/virt-acpi-build.c
> > @@ -45,6 +45,7 @@
> >  #include "hw/arm/virt.h"
> >  #include "sysemu/numa.h"
> >  #include "kvm_arm.h"
> > +#include "hw/acpi/hest_ghes.h"
> >
> >  #define ARM_SPI_BASE 32
> >  #define ACPI_POWER_BUTTON_DEVICE "PWRB"
> > @@ -778,6 +779,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
> >      acpi_add_table(table_offsets, tables_blob);
> >      build_spcr(tables_blob, tables->linker, vms);
> >
> > +    acpi_add_table(table_offsets, tables_blob);
> > +    ghes_build_acpi(tables_blob, tables->hardware_errors,
> > + tables->linker);
> > +
> >      if (nb_numa_nodes > 0) {
> >          acpi_add_table(table_offsets, tables_blob);
> >          build_srat(tables_blob, tables->linker, vms); @@ -890,6
> > +894,8 @@ void virt_acpi_setup(VirtMachineState *vms)
> >      fw_cfg_add_file(vms->fw_cfg, ACPI_BUILD_TPMLOG_FILE, tables.tcpalog->data,
> >                      acpi_data_len(tables.tcpalog));
> >
> > +    ghes_add_fw_cfg(vms->fw_cfg, tables.hardware_errors);
> > +
> >      build_state->rsdp_mr = acpi_add_rom_blob(build_state, tables.rsdp,
> >                                                ACPI_BUILD_RSDP_FILE,
> > 0);
> >
> > --
> > 2.10.1

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-07-17  4:43 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-07-12  2:08 [Qemu-devel] [PATCH v5 0/3] Generate APEI GHES table and dynamically record CPER Dongjiu Geng
2017-07-12  2:08 ` [Qemu-devel] [PATCH v5 1/3] ACPI: Add new ACPI structures and macros Dongjiu Geng
2017-07-13 10:33   ` Laszlo Ersek
2017-07-13 12:00     ` gengdongjiu
2017-07-13 15:33       ` Laszlo Ersek
2017-07-13 17:13         ` Michael S. Tsirkin
2017-07-14  8:25           ` gengdongjiu
2017-07-13 17:35       ` Michael S. Tsirkin
2017-07-13 17:01   ` Michael S. Tsirkin
2017-07-14  5:51     ` gengdongjiu
2017-07-13 17:11   ` Michael S. Tsirkin
2017-07-14  8:22     ` gengdongjiu
2017-07-12  2:08 ` [Qemu-devel] [PATCH v5 2/3] ACPI: Add APEI GHES Table Generation support Dongjiu Geng
2017-07-13 17:32   ` Michael S. Tsirkin
2017-07-13 19:41     ` Laszlo Ersek
2017-07-13 22:31       ` Michael S. Tsirkin
2017-07-17  4:43       ` gengdongjiu
2017-07-12  2:08 ` [Qemu-devel] [PATCH v5 3/3] ACPI: build and enable APEI GHES in the Makefile and configuration Dongjiu Geng
2017-07-13 17:36 ` [Qemu-devel] [PATCH v5 0/3] Generate APEI GHES table and dynamically record CPER Michael S. Tsirkin
  -- strict thread matches above, loose matches on Subject: below --
2017-07-15 15:29 [Qemu-devel] [PATCH v5 2/3] ACPI: Add APEI GHES Table Generation support gengdongjiu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).