From: Igor Mammedov <imammedo@redhat.com>
To: Dongjiu Geng <gengdongjiu@huawei.com>
Cc: <pbonzini@redhat.com>, <mst@redhat.com>,
	<shannon.zhaosl@gmail.com>, <peter.maydell@linaro.org>,
	<lersek@redhat.com>, <james.morse@arm.com>, <mtosatti@redhat.com>,
	<rth@twiddle.net>, <ehabkost@redhat.com>,
	<zhengxiang9@huawei.com>, <jonathan.cameron@huawei.com>,
	<xuwei5@huawei.com>, <kvm@vger.kernel.org>,
	<qemu-devel@nongnu.org>, <qemu-arm@nongnu.org>,
	<linuxarm@huawei.com>
Subject: Re: [PATCH v17 06/10] docs: APEI GHES generation and CPER record description
Date: Mon, 24 Jun 2019 13:39:08 +0200
Message-ID: <20190624133908.635ff763@redhat.com>
In-Reply-To: <1557832703-42620-7-git-send-email-gengdongjiu@huawei.com>

On Tue, 14 May 2019 04:18:19 -0700
Dongjiu Geng <gengdongjiu@huawei.com> wrote:

> Add APEI/GHES detailed design document
> 
> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
> ---
>  docs/specs/acpi_hest_ghes.txt | 97 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 97 insertions(+)
>  create mode 100644 docs/specs/acpi_hest_ghes.txt
> 
> diff --git a/docs/specs/acpi_hest_ghes.txt b/docs/specs/acpi_hest_ghes.txt
> new file mode 100644
> index 0000000..fbfc787
> --- /dev/null
> +++ b/docs/specs/acpi_hest_ghes.txt
> @@ -0,0 +1,97 @@
> +APEI tables generating and CPER record
> +=============================
> +
> +Copyright (C) 2017 HuaWei Corporation.
> +
> +Design Details:
> +-------------------
> +
> +       etc/acpi/tables                                 etc/hardware_errors
> +    ====================                      ==========================================
> ++ +--------------------------+            +-----------------------+
> +| | HEST                     |            |    address            |            +--------------+
> +| +--------------------------+            |    registers          |            | Error Status |
> +| | GHES1                    |            | +---------------------+            | Data Block 1 |
> +| +--------------------------+ +--------->| |error_block_address1 |----------->| +------------+
> +| | .................        | |          | +---------------------+            | |  CPER      |
> +| | error_status_address-----+-+ +------->| |error_block_address2 |--------+   | |  CPER      |
> +| | .................        |   |        | +---------------------+        |   | |  ....      |
> +| | read_ack_register--------+-+ |        | |    ..............   |        |   | |  CPER      |
> +| | read_ack_preserve        | | |        +-----------------------+        |   | +------------+
> +| | read_ack_write           | | | +----->| |error_block_addressN |------+ |   | Error Status |
> ++ +--------------------------+ | | |      | +---------------------+      | |   | Data Block 2 |
> +| | GHES2                    | +-+-+----->| |read_ack_register1   |      | +-->| +------------+
> ++ +--------------------------+   | |      | +---------------------+      |     | |  CPER      |
> +| | .................        |   | | +--->| |read_ack_register2   |      |     | |  CPER      |
> +| | error_status_address-----+---+ | |    | +---------------------+      |     | |  ....      |
> +| | .................        |     | |    | |  .............      |      |     | |  CPER      |
> +| | read_ack_register--------+-----+-+    | +---------------------+      |     +-+------------+
> +| | read_ack_preserve        |     |   +->| |read_ack_registerN   |      |     | |..........  |
> +| | read_ack_write           |     |   |  | +---------------------+      |     | +------------+
> ++ +--------------------------|     |   |                                 |     | Error Status |
> +| | ...............          |     |   |                                 |     | Data Block N |
> ++ +--------------------------+     |   |                                 +---->| +------------+
> +| | GHESN                    |     |   |                                       | |  CPER      |
> ++ +--------------------------+     |   |                                       | |  CPER      |
> +| | .................        |     |   |                                       | |  ....      |
> +| | error_status_address-----+-----+   |                                       | |  CPER      |
> +| | .................        |         |                                       +-+------------+
> +| | read_ack_register--------+---------+
> +| | read_ack_preserve        |
> +| | read_ack_write           |
> ++ +--------------------------+
> +
> +(1) QEMU generates the ACPI HEST table. This table goes in the current
> +    "etc/acpi/tables" fw_cfg blob. Each error source has different
> +    notification type.
> +
> +(2) A new fw_cfg blob called "etc/hardware_errors" is introduced. QEMU
> +    also need to populate this blob. The "etc/hardwre_errors" fw_cfg blob
> +    contains one address registers table and one Error Status Data Block

s/one/a/
in both cases

> +    table, all of which are pre-allocated.

drop /, all of which are pre-allocated./

> +
> +(3) The address registers table contains N Error Block Address entries
> +    and N Read Ack Address entries, the size for each entry is 8-byte.
> +    The Error Status Data Block table contains N Error Status Data Block
> +    entries, the size for each entry is 4096(0x1000) bytes. The total size
> +    for "etc/hardware_errors" fw_cfg blob is (N * 8 * 2 + N * 4096) bytes.
Where is 'N' specified?
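
For illustration, the size arithmetic in (3) can be sketched as below. This
is only a sketch: the function name is hypothetical and N = 2 is an arbitrary
example value, since the quoted text does not say where N comes from.

```python
# Sketch of the "etc/hardware_errors" size formula quoted in (3).
# The constants come from the quoted text; the names are illustrative only.
ADDRESS_ENTRY_SIZE = 8      # each address registers entry is 8 bytes
ERROR_BLOCK_SIZE = 0x1000   # each Error Status Data Block is 4096 bytes


def hardware_errors_size(n):
    """Return the blob size for n error sources: N * 8 * 2 + N * 4096."""
    # N error block address entries plus N read ack register entries.
    address_registers = n * ADDRESS_ENTRY_SIZE * 2
    # N Error Status Data Blocks.
    data_blocks = n * ERROR_BLOCK_SIZE
    return address_registers + data_blocks


print(hardware_errors_size(2))  # 2 * 16 + 2 * 4096 = 8224
```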

> +
> +(4) QEMU generates the ACPI linker/loader script for the firmware
> +
> +(4a) The HEST table is part of "etc/acpi/tables", the firmware already
> +    allocates the memory for it, because QEMU already generates an ALLOCATE
> +    linker/loader command for it
> +
> +(4b) QEMU creates another ALLOCATE command for the "etc/hardware_errors"
> +    blob. The firmware allocates memory for this blob and downloads it.
Maybe merge both points, e.g.:

    The firmware pre-allocates memory for "etc/acpi/tables" and
    "etc/hardware_errors" and copies the blobs' contents there.

> +
> +(5) QEMU generates N ADD_POINTER commands, which patch address in the
> +    "error_status_address" fields of the HEST table with a pointer to the
> +    corresponding "address registers" in the downloaded "etc/hardware_errors"
> +    blob.

s/the downloaded//
The same applies to other similar occurrences below.
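
As a rough illustration of the patching in step (5): ADD_POINTER adds the
guest address at which the pointee blob was allocated to the offset already
stored in the pointer field. The sketch below emulates that on a byte buffer.
The buffer size, field offset and base address are made-up values, and
add_pointer() is a hypothetical helper, not a QEMU or firmware function.

```python
import struct


def add_pointer(dest, pointer_offset, src_base, size=8):
    """Emulate ADD_POINTER: add the pointee blob's allocated address to the
    little-endian offset already stored in the pointer field."""
    fmt = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}[size]
    (stored_offset,) = struct.unpack_from(fmt, dest, pointer_offset)
    struct.pack_into(fmt, dest, pointer_offset, src_base + stored_offset)


# All values below are assumptions chosen for the example.
hest = bytearray(64)                  # stand-in for the HEST table bytes
FIELD_OFFSET = 16                     # made-up offset of one error_status_address field
HARDWARE_ERRORS_BASE = 0x40000000     # made-up guest address of "etc/hardware_errors"

# Before patching, the field holds the offset of error_block_address2
# inside the blob (second 8-byte entry, so offset 8).
struct.pack_into("<Q", hest, FIELD_OFFSET, 8)

add_pointer(hest, FIELD_OFFSET, HARDWARE_ERRORS_BASE)
patched = struct.unpack_from("<Q", hest, FIELD_OFFSET)[0]
print(hex(patched))  # 0x40000008
```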

> +
> +(6) QEMU generates N ADD_POINTER commands, which patch address in the
> +    "read_ack_register" fields of the HEST table with a pointer to the
> +    corresponding "address registers" in the downloaded "etc/hardware_errors" blob.
> +
> +(7) QEMU generates N ADD_POINTER commands for the firmware, which patch
> +    address in the " error_block_address" fields with a pointer to the
> +    respective "Error Status Data Block" in the downloaded "etc/hardware_errors"
> +    blob.
> +
> +(8) QEMU Defines a third and write-only fw_cfg blob which is called
> +    "etc/hardware_errors_addr". Through that blob, the firmware can send back
> +    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
> +    blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER commands
> +    for the firmware, the firmware will write back the start address of
> +    "etc/hardware_errors" blob to fw_cfg file "etc/hardware_errors_addr". 

> Then
> +    Qemu will know the Error Status Data Block for every error source. Each of
> +    Error Status Data Block has fixed size which is 4096(0x1000).

This is probably not necessary.
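
To make step (8) concrete: once the start address of "etc/hardware_errors"
has been written back, the per-source addresses follow from the layout alone.
The sketch below assumes the blob is laid out exactly as in the diagram
(N error block address entries, then N read ack registers, then N data
blocks, with no padding). The function name and example values are
hypothetical.

```python
ADDRESS_ENTRY_SIZE = 8      # bytes per address registers entry
ERROR_BLOCK_SIZE = 0x1000   # bytes per Error Status Data Block


def source_addresses(base, n, index):
    """Return guest addresses for error source `index` (0-based) of `n`,
    given the written-back start address `base` of the blob."""
    if not 0 <= index < n:
        raise IndexError("error source index out of range")
    registers_size = 2 * n * ADDRESS_ENTRY_SIZE
    return {
        # Entry that holds the pointer to this source's data block.
        "error_block_address": base + index * ADDRESS_ENTRY_SIZE,
        # Read ack registers follow the N error block address entries.
        "read_ack_register": base + (n + index) * ADDRESS_ENTRY_SIZE,
        # Data blocks follow the whole address registers table.
        "error_status_data_block": base + registers_size + index * ERROR_BLOCK_SIZE,
    }


addrs = source_addresses(base=0x80000000, n=3, index=1)
for name, value in addrs.items():
    print(name, hex(value))
```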

> +
> +(9) When QEMU gets SIGBUS from the kernel, QEMU formats the CPER right into
> +    guest memory, and then injects whatever interrupt (or assert whatever GPIO line)
> +    as a notification which is necessary for notifying the guest.
> +
> +(10) This notification (in virtual hardware) will be handled by guest kernel,
> +    guest APEI driver will read the CPER which is recorded by QEMU and do the
> +    recovery.
