qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation
@ 2024-08-01 14:47 Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 1/7] arm/virt: place power button pin number on a define Mauro Carvalho Chehab
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2024-08-01 14:47 UTC (permalink / raw)
  Cc: Jonathan Cameron, Shiju Jose, Mauro Carvalho Chehab, Ani Sinha,
	Dongjiu Geng, Paolo Bonzini, Peter Maydell, Shannon Zhao,
	qemu-arm, qemu-devel

Testing OS kernel ACPI APEI CPER support is tricky, as one depends on
having hardware with special-purpose BIOS and/or hardware.

With QEMU, it becomes a lot easier, as it can be done via QMP.

This series add support for injecting CPER records on ARM emulation.

The QEMU side changes add a QAPI able to do CPER error injection
on ARM, with a raw data parameter, making it very flexible.

A script is provided at the final patch implementing support for
ARM Processor CPER error injection according with ACPI 6.x and 
UEFI 2.9A/2.10 specs, via QMP.

Injecting such errors can be done using the provided script:

	$ ./scripts/ghes_inject.py arm 
		 {"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 9}, "package": "v9.0.0-2621-g3de6991b870a"}, "capabilities": ["oob"]}}
	{ "execute": "qmp_capabilities" } 
		 {"return": {}}
	{ "execute": "ghes-cper", "arguments": {"cper": {"notification-type": [22, 61, 158, 225, 17, 188, 228, 17, 156, 170, 194, 5, 29, 93, 70, 176], "raw-data": [0, 0, 0, 0, 1, 0, 0, 0, 72, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 32, 4, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 239, 190, 173, 222, 0, 0, 0, 0, 173, 11, 186, 171, 0, 0, 0, 0]}} }
		 {"return": {}}

Produces a simple CPER register, properly handled by the Linux
Kernel:

[ 5876.041410] {18}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 5876.041775] {18}[Hardware Error]: event severity: recoverable
[ 5876.042023] {18}[Hardware Error]:  Error 0, type: recoverable
[ 5876.042280] {18}[Hardware Error]:   section_type: ARM processor error
[ 5876.042538] {18}[Hardware Error]:   MIDR: 0x0000000000000000
[ 5876.042781] {18}[Hardware Error]:   Error info structure 0:
[ 5876.043013] {18}[Hardware Error]:   num errors: 2
[ 5876.043222] {18}[Hardware Error]:    error_type: 0x02: cache error
[ 5876.043500] {18}[Hardware Error]:    error_info: 0x0000000000000000
[ 5876.043800] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error

More complex use cases can be done, like:

	$ ./scripts/ghes_inject.py arm --mpidr 0x444 --running --affinity 1 --error-info 12345678 --vendor 0x13,123,4,5,1 --ctx-array 0,1,2,3,4,5 -t cache tlb bus vendor tlb,vendor
		 {"QMP": {"version": {"qemu": {"micro": 50, "minor": 0, "major": 9}, "package": "v9.0.0-2621-g3de6991b870a"}, "capabilities": ["oob"]}}
	{ "execute": "qmp_capabilities" } 
		 {"return": {}}
	{ "execute": "ghes-cper", "arguments": {"cper": {"notification-type": [22, 61, 158, 225, 17, 188, 228, 17, 156, 170, 194, 5, 29, 93, 70, 176], "raw-data": [7, 0, 0, 0, 5, 0, 1, 0, 13, 1, 0, 0, 1, 0, 0, 0, 68, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 32, 4, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 239, 190, 173, 222, 0, 0, 0, 0, 173, 11, 186, 171, 0, 0, 0, 0, 0, 32, 4, 0, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 239, 190, 173, 222, 0, 0, 0, 0, 173, 11, 186, 171, 0, 0, 0, 0, 0, 32, 4, 0, 8, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 239, 190, 173, 222, 0, 0, 0, 0, 173, 11, 186, 171, 0, 0, 0, 0, 0, 32, 4, 0, 16, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 239, 190, 173, 222, 0, 0, 0, 0, 173, 11, 186, 171, 0, 0, 0, 0, 0, 32, 0, 0, 20, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 239, 190, 173, 222, 0, 0, 0, 0, 173, 11, 186, 171, 0, 0, 0, 0, 0, 0, 5, 0, 56, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 19, 123, 4, 5, 1]}} }
		 {"return": {}}

964.134325] {19}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 5964.134692] {19}[Hardware Error]: event severity: recoverable
[ 5964.134942] {19}[Hardware Error]:  Error 0, type: recoverable
[ 5964.135200] {19}[Hardware Error]:   section_type: ARM processor error
[ 5964.135466] {19}[Hardware Error]:   MIDR: 0x0000000000000000
[ 5964.135700] {19}[Hardware Error]:   Multiprocessor Affinity Register (MPIDR): 0x0000000000000444
[ 5964.136025] {19}[Hardware Error]:   error affinity level: 1
[ 5964.136255] {19}[Hardware Error]:   running state: 0x1
[ 5964.136468] {19}[Hardware Error]:   Power State Coordination Interface state: 0
[ 5964.136767] {19}[Hardware Error]:   Error info structure 0:
[ 5964.137001] {19}[Hardware Error]:   num errors: 2
[ 5964.137210] {19}[Hardware Error]:    error_type: 0x02: cache error
[ 5964.137472] {19}[Hardware Error]:    error_info: 0x0000000000000000
[ 5964.137737] {19}[Hardware Error]:   Error info structure 1:
[ 5964.137976] {19}[Hardware Error]:   num errors: 2
[ 5964.138192] {19}[Hardware Error]:    error_type: 0x04: TLB error
[ 5964.138459] {19}[Hardware Error]:    error_info: 0x0000000000000000
[ 5964.138727] {19}[Hardware Error]:   Error info structure 2:
[ 5964.138967] {19}[Hardware Error]:   num errors: 2
[ 5964.139185] {19}[Hardware Error]:    error_type: 0x08: bus error
[ 5964.139451] {19}[Hardware Error]:    error_info: 0x0000000000000000
[ 5964.139751] {19}[Hardware Error]:   Error info structure 3:
[ 5964.139993] {19}[Hardware Error]:   num errors: 2
[ 5964.140210] {19}[Hardware Error]:    error_type: 0x10: micro-architectural error
[ 5964.140522] {19}[Hardware Error]:    error_info: 0x0000000000000000
[ 5964.140790] {19}[Hardware Error]:   Error info structure 4:
[ 5964.141030] {19}[Hardware Error]:   num errors: 2
[ 5964.141261] {19}[Hardware Error]:    error_type: 0x14: TLB error|micro-architectural error
[ 5964.141599] {19}[Hardware Error]:   Context info structure 0:
[ 5964.141843] {19}[Hardware Error]:    register context type: AArch64 EL1 context registers
[ 5964.142195] {19}[Hardware Error]:    00000000: 00000000 00000000 00000001 00000000
[ 5964.142534] {19}[Hardware Error]:    00000010: 00000002 00000000 00000003 00000000
[ 5964.142867] {19}[Hardware Error]:    00000020: 00000004 00000000 00000005 00000000
[ 5964.143193] {19}[Hardware Error]:    00000030: 00000000 00000000
[ 5964.143464] {19}[Hardware Error]:   Vendor specific error info has 5 bytes:
[ 5964.143750] {19}[Hardware Error]:    00000000: 13 7b 04 05 01                                   .{...
[ 5964.144164] [Firmware Warn]: GHES: Unhandled processor error type 0x02: cache error
[ 5964.144483] [Firmware Warn]: GHES: Unhandled processor error type 0x04: TLB error
[ 5964.144793] [Firmware Warn]: GHES: Unhandled processor error type 0x08: bus error
[ 5964.145099] [Firmware Warn]: GHES: Unhandled processor error type 0x10: micro-architectural error
[ 5964.145454] [Firmware Warn]: GHES: Unhandled processor error type 0x14: TLB error|micro-architectural error

---

v4:
- CPER generation moved to happen outside QEMU;
- One patch adding support for mpidr query was removed.

v3:
- patch 1 cleanups with some comment changes and adding another place where
  the poweroff GPIO define should be used. No changes on other patches (except
  due to conflict resolution).

v2:
- added a new patch using a define for GPIO power pin;
- patch 2 changed to also use a define for generic error GPIO pin;
- a couple cleanups at patch 2 removing uneeded else clauses.



Jonathan Cameron (1):
  acpi/ghes: Support GPIO error source

Mauro Carvalho Chehab (6):
  arm/virt: place power button pin number on a define
  acpi/generic_event_device: add an APEI error device
  arm/virt: Wire up GPIO error source for ACPI / GHES
  qapi/ghes-cper: add an interface to do generic CPER error injection
  acpi/ghes: add support for generic error injection via QAPI
  scripts/ghes_inject: add a script to generate GHES error inject

 MAINTAINERS                            |   8 +
 hw/acpi/Kconfig                        |   5 +
 hw/acpi/generic_event_device.c         |  17 +
 hw/acpi/ghes.c                         | 178 ++++++-
 hw/acpi/ghes_cper.c                    |  53 ++
 hw/acpi/meson.build                    |   2 +
 hw/arm/Kconfig                         |   5 +
 hw/arm/virt-acpi-build.c               |  25 +-
 hw/arm/virt.c                          |  33 +-
 include/hw/acpi/acpi_dev_interface.h   |   1 +
 include/hw/acpi/generic_event_device.h |   3 +
 include/hw/acpi/ghes.h                 |  14 +-
 include/hw/arm/virt.h                  |   5 +
 qapi/ghes-cper.json                    |  54 ++
 qapi/meson.build                       |   1 +
 qapi/qapi-schema.json                  |   1 +
 scripts/ghes_inject.py                 | 673 +++++++++++++++++++++++++
 17 files changed, 1048 insertions(+), 30 deletions(-)
 create mode 100644 hw/acpi/ghes_cper.c
 create mode 100644 qapi/ghes-cper.json
 create mode 100755 scripts/ghes_inject.py

-- 
2.45.2




^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v4 1/7] arm/virt: place power button pin number on a define
  2024-08-01 14:47 [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
@ 2024-08-01 14:47 ` Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 2/7] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2024-08-01 14:47 UTC (permalink / raw)
  Cc: Jonathan Cameron, Shiju Jose, Mauro Carvalho Chehab,
	Michael S. Tsirkin, Ani Sinha, Igor Mammedov, Peter Maydell,
	Shannon Zhao, linux-kernel, qemu-arm, qemu-devel

Having magic numbers inside the code is not a good idea, as it
is error-prone. So, instead, create a macro with the number
definition.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/arm/virt-acpi-build.c | 6 +++---
 hw/arm/virt.c            | 7 ++++---
 include/hw/arm/virt.h    | 3 +++
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index e10cad86dd73..f76fb117adff 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -154,10 +154,10 @@ static void acpi_dsdt_add_gpio(Aml *scope, const MemMapEntry *gpio_memmap,
     aml_append(dev, aml_name_decl("_CRS", crs));
 
     Aml *aei = aml_resource_template();
-    /* Pin 3 for power button */
-    const uint32_t pin_list[1] = {3};
+
+    const uint32_t pin = GPIO_PIN_POWER_BUTTON;
     aml_append(aei, aml_gpio_int(AML_CONSUMER, AML_EDGE, AML_ACTIVE_HIGH,
-                                 AML_EXCLUSIVE, AML_PULL_UP, 0, pin_list, 1,
+                                 AML_EXCLUSIVE, AML_PULL_UP, 0, &pin, 1,
                                  "GPO0", NULL, 0));
     aml_append(dev, aml_name_decl("_AEI", aei));
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 719e83e6a1e7..687fe0bb8bc9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1004,7 +1004,7 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
     if (s->acpi_dev) {
         acpi_send_event(s->acpi_dev, ACPI_POWER_DOWN_STATUS);
     } else {
-        /* use gpio Pin 3 for power button event */
+        /* use gpio Pin for power button event */
         qemu_set_irq(qdev_get_gpio_in(gpio_key_dev, 0), 1);
     }
 }
@@ -1013,7 +1013,8 @@ static void create_gpio_keys(char *fdt, DeviceState *pl061_dev,
                              uint32_t phandle)
 {
     gpio_key_dev = sysbus_create_simple("gpio-key", -1,
-                                        qdev_get_gpio_in(pl061_dev, 3));
+                                        qdev_get_gpio_in(pl061_dev,
+                                                         GPIO_PIN_POWER_BUTTON));
 
     qemu_fdt_add_subnode(fdt, "/gpio-keys");
     qemu_fdt_setprop_string(fdt, "/gpio-keys", "compatible", "gpio-keys");
@@ -1024,7 +1025,7 @@ static void create_gpio_keys(char *fdt, DeviceState *pl061_dev,
     qemu_fdt_setprop_cell(fdt, "/gpio-keys/poweroff", "linux,code",
                           KEY_POWER);
     qemu_fdt_setprop_cells(fdt, "/gpio-keys/poweroff",
-                           "gpios", phandle, 3, 0);
+                           "gpios", phandle, GPIO_PIN_POWER_BUTTON, 0);
 }
 
 #define SECURE_GPIO_POWEROFF 0
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index ab961bb6a9b8..a4d937ed45ac 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -47,6 +47,9 @@
 /* See Linux kernel arch/arm64/include/asm/pvclock-abi.h */
 #define PVTIME_SIZE_PER_CPU 64
 
+/* GPIO pins */
+#define GPIO_PIN_POWER_BUTTON  3
+
 enum {
     VIRT_FLASH,
     VIRT_MEM,
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 2/7] acpi/generic_event_device: add an APEI error device
  2024-08-01 14:47 [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 1/7] arm/virt: place power button pin number on a define Mauro Carvalho Chehab
@ 2024-08-01 14:47 ` Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 3/7] arm/virt: Wire up GPIO error source for ACPI / GHES Mauro Carvalho Chehab
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2024-08-01 14:47 UTC (permalink / raw)
  Cc: Jonathan Cameron, Shiju Jose, Mauro Carvalho Chehab,
	Michael S. Tsirkin, Ani Sinha, Igor Mammedov, linux-kernel,
	qemu-devel

Adds a Generic Event Device to handle generic hardware error
events, supporting General Purpose Event (GPE) as specified at
ACPI 6.5 specification at 18.3.2.7.2:
https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#event-notification-for-generic-error-sources
using HID PNP0C33.

The PNP0C33 device is used to report hardware errors to
the bios via ACPI APEI Generic Hardware Error Source (GHES).

Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 hw/acpi/generic_event_device.c         | 17 +++++++++++++++++
 include/hw/acpi/acpi_dev_interface.h   |  1 +
 include/hw/acpi/generic_event_device.h |  3 +++
 3 files changed, 21 insertions(+)

diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
index 15b4c3ebbf24..b9ad05e98c05 100644
--- a/hw/acpi/generic_event_device.c
+++ b/hw/acpi/generic_event_device.c
@@ -26,6 +26,7 @@ static const uint32_t ged_supported_events[] = {
     ACPI_GED_PWR_DOWN_EVT,
     ACPI_GED_NVDIMM_HOTPLUG_EVT,
     ACPI_GED_CPU_HOTPLUG_EVT,
+    ACPI_GED_ERROR_EVT
 };
 
 /*
@@ -116,6 +117,11 @@ void build_ged_aml(Aml *table, const char *name, HotplugHandler *hotplug_dev,
                            aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
                                       aml_int(0x80)));
                 break;
+            case ACPI_GED_ERROR_EVT:
+                aml_append(if_ctx,
+                           aml_notify(aml_name(ACPI_APEI_ERROR_DEVICE),
+                                      aml_int(0x80)));
+                break;
             case ACPI_GED_NVDIMM_HOTPLUG_EVT:
                 aml_append(if_ctx,
                            aml_notify(aml_name("\\_SB.NVDR"),
@@ -153,6 +159,15 @@ void acpi_dsdt_add_power_button(Aml *scope)
     aml_append(scope, dev);
 }
 
+void acpi_dsdt_add_error_device(Aml *scope)
+{
+    Aml *dev = aml_device(ACPI_APEI_ERROR_DEVICE);
+    aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C33")));
+    aml_append(dev, aml_name_decl("_UID", aml_int(0)));
+    aml_append(dev, aml_name_decl("_STA", aml_int(0xF)));
+    aml_append(scope, dev);
+}
+
 /* Memory read by the GED _EVT AML dynamic method */
 static uint64_t ged_evt_read(void *opaque, hwaddr addr, unsigned size)
 {
@@ -295,6 +310,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
         sel = ACPI_GED_MEM_HOTPLUG_EVT;
     } else if (ev & ACPI_POWER_DOWN_STATUS) {
         sel = ACPI_GED_PWR_DOWN_EVT;
+    } else if (ev & ACPI_GENERIC_ERROR) {
+        sel = ACPI_GED_ERROR_EVT;
     } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) {
         sel = ACPI_GED_NVDIMM_HOTPLUG_EVT;
     } else if (ev & ACPI_CPU_HOTPLUG_STATUS) {
diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_dev_interface.h
index 68d9d15f50aa..8294f8f0ccca 100644
--- a/include/hw/acpi/acpi_dev_interface.h
+++ b/include/hw/acpi/acpi_dev_interface.h
@@ -13,6 +13,7 @@ typedef enum {
     ACPI_NVDIMM_HOTPLUG_STATUS = 16,
     ACPI_VMGENID_CHANGE_STATUS = 32,
     ACPI_POWER_DOWN_STATUS = 64,
+    ACPI_GENERIC_ERROR = 128,
 } AcpiEventStatusBits;
 
 #define TYPE_ACPI_DEVICE_IF "acpi-device-interface"
diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/generic_event_device.h
index 40af3550b56d..b8f2f1328e0c 100644
--- a/include/hw/acpi/generic_event_device.h
+++ b/include/hw/acpi/generic_event_device.h
@@ -66,6 +66,7 @@
 #include "qom/object.h"
 
 #define ACPI_POWER_BUTTON_DEVICE "PWRB"
+#define ACPI_APEI_ERROR_DEVICE   "GEDD"
 
 #define TYPE_ACPI_GED "acpi-ged"
 OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
@@ -98,6 +99,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(AcpiGedState, ACPI_GED)
 #define ACPI_GED_PWR_DOWN_EVT      0x2
 #define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4
 #define ACPI_GED_CPU_HOTPLUG_EVT    0x8
+#define ACPI_GED_ERROR_EVT          0x10
 
 typedef struct GEDState {
     MemoryRegion evt;
@@ -120,5 +122,6 @@ struct AcpiGedState {
 void build_ged_aml(Aml *table, const char* name, HotplugHandler *hotplug_dev,
                    uint32_t ged_irq, AmlRegionSpace rs, hwaddr ged_base);
 void acpi_dsdt_add_power_button(Aml *scope);
+void acpi_dsdt_add_error_device(Aml *scope);
 
 #endif
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 3/7] arm/virt: Wire up GPIO error source for ACPI / GHES
  2024-08-01 14:47 [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 1/7] arm/virt: place power button pin number on a define Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 2/7] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
@ 2024-08-01 14:47 ` Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 4/7] acpi/ghes: Support GPIO error source Mauro Carvalho Chehab
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2024-08-01 14:47 UTC (permalink / raw)
  Cc: Jonathan Cameron, Shiju Jose, Mauro Carvalho Chehab,
	Michael S. Tsirkin, Ani Sinha, Dongjiu Geng, Igor Mammedov,
	Peter Maydell, Shannon Zhao, linux-kernel, qemu-arm, qemu-devel

Adds support to ARM virtualization to allow handling
a General Purpose Event (GPE) via GED error device.

It is aligned with Linux Kernel patch:
https://lore.kernel.org/lkml/1272350481-27951-8-git-send-email-ying.huang@intel.com/

Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 hw/acpi/ghes.c           |  3 +++
 hw/arm/virt-acpi-build.c | 21 +++++++++++++++++----
 hw/arm/virt.c            | 26 +++++++++++++++++++++++---
 include/hw/acpi/ghes.h   |  3 +++
 include/hw/arm/virt.h    |  2 ++
 5 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index e9511d9b8f71..8d0262e6c1aa 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -444,6 +444,9 @@ int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
     return ret;
 }
 
+NotifierList generic_error_notifiers =
+    NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
+
 bool acpi_ghes_present(void)
 {
     AcpiGedState *acpi_ged_state;
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f76fb117adff..cdd1c6d9f1a3 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -142,6 +142,8 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry *memmap,
 static void acpi_dsdt_add_gpio(Aml *scope, const MemMapEntry *gpio_memmap,
                                            uint32_t gpio_irq)
 {
+    uint32_t pin;
+
     Aml *dev = aml_device("GPO0");
     aml_append(dev, aml_name_decl("_HID", aml_string("ARMH0061")));
     aml_append(dev, aml_name_decl("_UID", aml_int(0)));
@@ -155,7 +157,12 @@ static void acpi_dsdt_add_gpio(Aml *scope, const MemMapEntry *gpio_memmap,
 
     Aml *aei = aml_resource_template();
 
-    const uint32_t pin = GPIO_PIN_POWER_BUTTON;
+    pin = GPIO_PIN_POWER_BUTTON;
+    aml_append(aei, aml_gpio_int(AML_CONSUMER, AML_EDGE, AML_ACTIVE_HIGH,
+                                 AML_EXCLUSIVE, AML_PULL_UP, 0, &pin, 1,
+                                 "GPO0", NULL, 0));
+
+    pin = GPIO_PIN_GENERIC_ERROR;
     aml_append(aei, aml_gpio_int(AML_CONSUMER, AML_EDGE, AML_ACTIVE_HIGH,
                                  AML_EXCLUSIVE, AML_PULL_UP, 0, &pin, 1,
                                  "GPO0", NULL, 0));
@@ -166,6 +173,12 @@ static void acpi_dsdt_add_gpio(Aml *scope, const MemMapEntry *gpio_memmap,
     aml_append(method, aml_notify(aml_name(ACPI_POWER_BUTTON_DEVICE),
                                   aml_int(0x80)));
     aml_append(dev, method);
+
+    method = aml_method("_E06", 0, AML_NOTSERIALIZED);
+    aml_append(method, aml_notify(aml_name(ACPI_APEI_ERROR_DEVICE),
+                                  aml_int(0x80)));
+    aml_append(dev, method);
+
     aml_append(scope, dev);
 }
 
@@ -841,10 +854,9 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
                       HOTPLUG_HANDLER(vms->acpi_dev),
                       irqmap[VIRT_ACPI_GED] + ARM_SPI_BASE, AML_SYSTEM_MEMORY,
                       memmap[VIRT_ACPI_GED].base);
-    } else {
-        acpi_dsdt_add_gpio(scope, &memmap[VIRT_GPIO],
-                           (irqmap[VIRT_GPIO] + ARM_SPI_BASE));
     }
+    acpi_dsdt_add_gpio(scope, &memmap[VIRT_GPIO],
+                       irqmap[VIRT_GPIO] + ARM_SPI_BASE);
 
     if (vms->acpi_dev) {
         uint32_t event = object_property_get_uint(OBJECT(vms->acpi_dev),
@@ -858,6 +870,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     }
 
     acpi_dsdt_add_power_button(scope);
+    acpi_dsdt_add_error_device(scope);
 #ifdef CONFIG_TPM
     acpi_dsdt_add_tpm(scope, vms);
 #endif
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 687fe0bb8bc9..e76cc9690788 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -73,6 +73,7 @@
 #include "standard-headers/linux/input.h"
 #include "hw/arm/smmuv3.h"
 #include "hw/acpi/acpi.h"
+#include "hw/acpi/ghes.h"
 #include "target/arm/cpu-qom.h"
 #include "target/arm/internals.h"
 #include "target/arm/multiprocessing.h"
@@ -677,7 +678,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
     DeviceState *dev;
     MachineState *ms = MACHINE(vms);
     int irq = vms->irqmap[VIRT_ACPI_GED];
-    uint32_t event = ACPI_GED_PWR_DOWN_EVT;
+    uint32_t event = ACPI_GED_PWR_DOWN_EVT | ACPI_GED_ERROR_EVT;
 
     if (ms->ram_slots) {
         event |= ACPI_GED_MEM_HOTPLUG_EVT;
@@ -1009,12 +1010,28 @@ static void virt_powerdown_req(Notifier *n, void *opaque)
     }
 }
 
+static DeviceState *gpio_error_dev;
+static void virt_generic_error_req(Notifier *n, void *opaque)
+{
+    VirtMachineState *s = container_of(n, VirtMachineState, generic_error_notifier);
+
+    if (s->acpi_dev) {
+        acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
+    } else {
+        /* use gpio Pin for power button event */
+        qemu_set_irq(qdev_get_gpio_in(gpio_error_dev, 0), 1);
+    }
+}
+
 static void create_gpio_keys(char *fdt, DeviceState *pl061_dev,
                              uint32_t phandle)
 {
     gpio_key_dev = sysbus_create_simple("gpio-key", -1,
                                         qdev_get_gpio_in(pl061_dev,
                                                          GPIO_PIN_POWER_BUTTON));
+    gpio_error_dev = sysbus_create_simple("gpio-key", -1,
+                                          qdev_get_gpio_in(pl061_dev,
+                                                           GPIO_PIN_GENERIC_ERROR));
 
     qemu_fdt_add_subnode(fdt, "/gpio-keys");
     qemu_fdt_setprop_string(fdt, "/gpio-keys", "compatible", "gpio-keys");
@@ -2385,9 +2402,8 @@ static void machvirt_init(MachineState *machine)
 
     if (has_ged && aarch64 && firmware_loaded && virt_is_acpi_enabled(vms)) {
         vms->acpi_dev = create_acpi_ged(vms);
-    } else {
-        create_gpio_devices(vms, VIRT_GPIO, sysmem);
     }
+    create_gpio_devices(vms, VIRT_GPIO, sysmem);
 
     if (vms->secure && !vmc->no_secure_gpio) {
         create_gpio_devices(vms, VIRT_SECURE_GPIO, secure_sysmem);
@@ -2397,6 +2413,10 @@ static void machvirt_init(MachineState *machine)
      vms->powerdown_notifier.notify = virt_powerdown_req;
      qemu_register_powerdown_notifier(&vms->powerdown_notifier);
 
+     vms->generic_error_notifier.notify = virt_generic_error_req;
+     notifier_list_add(&generic_error_notifiers,
+                       &vms->generic_error_notifier);
+
     /* Create mmio transports, so the user can create virtio backends
      * (which will be automatically plugged in to the transports). If
      * no backend is created the transport will just sit harmlessly idle.
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 674f6958e905..6891eafff5ab 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -23,6 +23,9 @@
 #define ACPI_GHES_H
 
 #include "hw/acpi/bios-linker-loader.h"
+#include "qemu/notify.h"
+
+extern NotifierList generic_error_notifiers;
 
 /*
  * Values for Hardware Error Notification Type field
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index a4d937ed45ac..0cbe9a05605f 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -49,6 +49,7 @@
 
 /* GPIO pins */
 #define GPIO_PIN_POWER_BUTTON  3
+#define GPIO_PIN_GENERIC_ERROR 6
 
 enum {
     VIRT_FLASH,
@@ -175,6 +176,7 @@ struct VirtMachineState {
     DeviceState *gic;
     DeviceState *acpi_dev;
     Notifier powerdown_notifier;
+    Notifier generic_error_notifier;
     PCIBus *bus;
     char *oem_id;
     char *oem_table_id;
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 4/7] acpi/ghes: Support GPIO error source
  2024-08-01 14:47 [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
                   ` (2 preceding siblings ...)
  2024-08-01 14:47 ` [PATCH v4 3/7] arm/virt: Wire up GPIO error source for ACPI / GHES Mauro Carvalho Chehab
@ 2024-08-01 14:47 ` Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 5/7] qapi/ghes-cper: add an interface to do generic CPER error injection Mauro Carvalho Chehab
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2024-08-01 14:47 UTC (permalink / raw)
  Cc: Jonathan Cameron, Shiju Jose, Michael S. Tsirkin, Ani Sinha,
	Dongjiu Geng, Igor Mammedov, linux-kernel, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Add error notification to GHES v2 using the GPIO source.

[mchehab: do some cleanups at ACPI_HEST_SRC_ID_* checks]

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 hw/acpi/ghes.c         | 16 ++++++++++------
 include/hw/acpi/ghes.h |  3 ++-
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 8d0262e6c1aa..a745dcc7be5e 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -34,8 +34,8 @@
 /* The max size in bytes for one error block */
 #define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
 
-/* Now only support ARMv8 SEA notification type error source */
-#define ACPI_GHES_ERROR_SOURCE_COUNT        1
+/* Support ARMv8 SEA notification type error source and GPIO interrupt. */
+#define ACPI_GHES_ERROR_SOURCE_COUNT        2
 
 /* Generic Hardware Error Source version 2 */
 #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
@@ -290,6 +290,9 @@ void build_ghes_error_table(GArray *hardware_errors, BIOSLinker *linker)
 static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
 {
     uint64_t address_offset;
+
+    assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
+
     /*
      * Type:
      * Generic Hardware Error Source version 2(GHESv2 - Type 10)
@@ -327,6 +330,9 @@ static void build_ghes_v2(GArray *table_data, int source_id, BIOSLinker *linker)
          */
         build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_SEA);
         break;
+    case ACPI_HEST_SRC_ID_GPIO:
+        build_ghes_hw_error_notification(table_data, ACPI_GHES_NOTIFY_GPIO);
+        break;
     default:
         error_report("Not support this error source");
         abort();
@@ -370,6 +376,7 @@ void acpi_build_hest(GArray *table_data, BIOSLinker *linker,
     /* Error Source Count */
     build_append_int_noprefix(table_data, ACPI_GHES_ERROR_SOURCE_COUNT, 4);
     build_ghes_v2(table_data, ACPI_HEST_SRC_ID_SEA, linker);
+    build_ghes_v2(table_data, ACPI_HEST_SRC_ID_GPIO, linker);
 
     acpi_table_end(linker, &table);
 }
@@ -406,10 +413,7 @@ int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
     start_addr = le64_to_cpu(ags->ghes_addr_le);
 
     if (physical_address) {
-
-        if (source_id < ACPI_HEST_SRC_ID_RESERVED) {
-            start_addr += source_id * sizeof(uint64_t);
-        }
+        start_addr += source_id * sizeof(uint64_t);
 
         cpu_physical_memory_read(start_addr, &error_block_addr,
                                  sizeof(error_block_addr));
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 6891eafff5ab..33be1eb5acf4 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -59,9 +59,10 @@ enum AcpiGhesNotifyType {
     ACPI_GHES_NOTIFY_RESERVED = 12
 };
 
+/* Those are used as table indexes when building GHES tables */
 enum {
     ACPI_HEST_SRC_ID_SEA = 0,
-    /* future ids go here */
+    ACPI_HEST_SRC_ID_GPIO,
     ACPI_HEST_SRC_ID_RESERVED,
 };
 
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 5/7] qapi/ghes-cper: add an interface to do generic CPER error injection
  2024-08-01 14:47 [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
                   ` (3 preceding siblings ...)
  2024-08-01 14:47 ` [PATCH v4 4/7] acpi/ghes: Support GPIO error source Mauro Carvalho Chehab
@ 2024-08-01 14:47 ` Mauro Carvalho Chehab
  2024-08-02  9:44   ` Markus Armbruster
  2024-08-01 14:47 ` [PATCH v4 6/7] acpi/ghes: add support for generic error injection via QAPI Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 7/7] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
  6 siblings, 1 reply; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2024-08-01 14:47 UTC (permalink / raw)
  Cc: Jonathan Cameron, Shiju Jose, Mauro Carvalho Chehab,
	Michael S. Tsirkin, Ani Sinha, Dongjiu Geng, Eric Blake,
	Igor Mammedov, Markus Armbruster, Michael Roth, Paolo Bonzini,
	Peter Maydell, linux-kernel, qemu-arm, qemu-devel

Creates a QAPI to be used for generic ACPI APEI hardware error
injection (HEST) via GHESv2.

The actual GHES code will be added at the followup patch.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 MAINTAINERS            |  7 ++++++
 hw/acpi/Kconfig        |  5 ++++
 hw/acpi/ghes_cper.c    | 53 +++++++++++++++++++++++++++++++++++++++++
 hw/acpi/meson.build    |  2 ++
 hw/arm/Kconfig         |  5 ++++
 include/hw/acpi/ghes.h |  6 +++++
 qapi/ghes-cper.json    | 54 ++++++++++++++++++++++++++++++++++++++++++
 qapi/meson.build       |  1 +
 qapi/qapi-schema.json  |  1 +
 9 files changed, 134 insertions(+)
 create mode 100644 hw/acpi/ghes_cper.c
 create mode 100644 qapi/ghes-cper.json

diff --git a/MAINTAINERS b/MAINTAINERS
index 98eddf7ae155..655edcb6688c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2075,6 +2075,13 @@ F: hw/acpi/ghes.c
 F: include/hw/acpi/ghes.h
 F: docs/specs/acpi_hest_ghes.rst
 
+ACPI/HEST/GHES/ARM processor CPER
+R: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+S: Maintained
+F: hw/arm/ghes_cper.c
+F: hw/acpi/ghes_cper_stub.c
+F: qapi/ghes-cper.json
+
 ppc4xx
 L: qemu-ppc@nongnu.org
 S: Orphan
diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index e07d3204eb36..73ffbb82c150 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -51,6 +51,11 @@ config ACPI_APEI
     bool
     depends on ACPI
 
+config GHES_CPER
+    bool
+    depends on ACPI_APEI
+    default y
+
 config ACPI_PCI
     bool
     depends on ACPI && PCI
diff --git a/hw/acpi/ghes_cper.c b/hw/acpi/ghes_cper.c
new file mode 100644
index 000000000000..0d32874a84f1
--- /dev/null
+++ b/hw/acpi/ghes_cper.c
@@ -0,0 +1,53 @@
+/*
+ * ARM Processor error injection
+ *
+ * Copyright(C) 2024 Huawei LTD.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "qapi/qapi-commands-ghes-cper.h"
+#include "hw/acpi/ghes.h"
+
+void qmp_ghes_cper(CommonPlatformErrorRecord *qmp_cper,
+                   Error **errp)
+{
+    unsigned int uuid_len = 0, i;
+    uint8List *p;
+    AcpiGhesCper cper;
+
+    for (p = qmp_cper->notification_type; p; p = p->next) {
+        uuid_len++;
+        if (uuid_len > 16) {
+            break;
+        }
+        cper.uuid[uuid_len - 1] = p->value;
+    }
+
+    if (uuid_len != 16) {
+        error_report("GHES CPER record has an invalid UUID size");
+        return;
+    }
+
+    cper.data_len = 0;
+
+    for (p = qmp_cper->raw_data; p; p = p->next) {
+        cper.data_len++;
+    }
+
+    cper.data = g_malloc(cper.data_len);
+
+    p = qmp_cper->raw_data;
+    for (i = 0; i < cper.data_len; i++, p = p->next) {
+        cper.data[i] = p->value;
+    }
+
+    /* TODO: call a function at ghes */
+
+    g_free(cper.data);
+}
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index fa5c07db9068..6cbf430eb66d 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -34,4 +34,6 @@ endif
 system_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 'aml-build-stub.c', 'ghes-stub.c', 'acpi_interface.c'))
 system_ss.add(when: 'CONFIG_ACPI_PCI_BRIDGE', if_false: files('pci-bridge-stub.c'))
 system_ss.add_all(when: 'CONFIG_ACPI', if_true: acpi_ss)
+system_ss.add(when: 'CONFIG_GHES_CPER', if_true: files('ghes_cper.c'))
+system_ss.add(when: 'CONFIG_GHES_CPER', if_false: files('ghes_cper_stub.c'))
 system_ss.add(files('acpi-qmp-cmds.c'))
diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 1ad60da7aa2d..bed6ba27d715 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -712,3 +712,8 @@ config ARMSSE
     select UNIMP
     select SSE_COUNTER
     select SSE_TIMER
+
+config GHES_CPER
+    bool
+    depends on ARM
+    default y if AARCH64
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 33be1eb5acf4..02ea3715f655 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -78,6 +78,12 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
                           GArray *hardware_errors);
 int acpi_ghes_record_errors(uint8_t notify, uint64_t error_physical_addr);
 
+typedef struct AcpiGhesCper {
+    uint8_t uuid[16];
+    char *data;
+    uint32_t data_len;
+} AcpiGhesCper;
+
 /**
  * acpi_ghes_present: Report whether ACPI GHES table is present
  *
diff --git a/qapi/ghes-cper.json b/qapi/ghes-cper.json
new file mode 100644
index 000000000000..2b0438431054
--- /dev/null
+++ b/qapi/ghes-cper.json
@@ -0,0 +1,54 @@
+# -*- Mode: Python -*-
+# vim: filetype=python
+
+##
+# = GHESv2 CPER Error Injection
+#
+# These are defined at
+# ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
+# (GHESv2 - Type 10)
+##
+
+##
+# @CommonPlatformErrorRecord:
+#
+# Common Platform Error Record - CPER - as defined at the UEFI
+# specification. See
+# https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#record-header
+# for more details.
+#
+# @notification-type: pre-assigned GUID value indicating the record
+#   association with an error event notification type, as defined
+#   at https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#record-header
+#
+# @raw-data: Contains the payload of the CPER.
+#
+# Since: 9.2
+##
+{ 'struct': 'CommonPlatformErrorRecord',
+  'data': {
+      'notification-type': ['uint8'],
+      'raw-data': ['uint8']
+  }
+}
+
+##
+# @ghes-cper:
+#
+# Inject ARM Processor error with data to be filled according with
+# ACPI 6.2 GHESv2 spec.
+#
+# @cper: a single CPER record to be sent to the guest OS.
+#
+# Features:
+#
+# @unstable: This command is experimental.
+#
+# Since: 9.2
+##
+{ 'command': 'ghes-cper',
+  'data': {
+    'cper': 'CommonPlatformErrorRecord'
+  },
+  'features': [ 'unstable' ]
+}
diff --git a/qapi/meson.build b/qapi/meson.build
index e7bc54e5d047..bd13cd7d40c9 100644
--- a/qapi/meson.build
+++ b/qapi/meson.build
@@ -35,6 +35,7 @@ qapi_all_modules = [
   'dump',
   'ebpf',
   'error',
+  'ghes-cper',
   'introspect',
   'job',
   'machine-common',
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index b1581988e4eb..54165c8d6655 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -81,3 +81,4 @@
 { 'include': 'vfio.json' }
 { 'include': 'cryptodev.json' }
 { 'include': 'cxl.json' }
+{ 'include': 'ghes-cper.json' }
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 6/7] acpi/ghes: add support for generic error injection via QAPI
  2024-08-01 14:47 [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
                   ` (4 preceding siblings ...)
  2024-08-01 14:47 ` [PATCH v4 5/7] qapi/ghes-cper: add an interface to do generic CPER error injection Mauro Carvalho Chehab
@ 2024-08-01 14:47 ` Mauro Carvalho Chehab
  2024-08-01 14:47 ` [PATCH v4 7/7] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab
  6 siblings, 0 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2024-08-01 14:47 UTC (permalink / raw)
  Cc: Jonathan Cameron, Shiju Jose, Mauro Carvalho Chehab,
	Michael S. Tsirkin, Ani Sinha, Dongjiu Geng, Igor Mammedov,
	linux-kernel, qemu-arm, qemu-devel

Provide a generic interface for error injection via GHESv2.

This patch is co-authored:
    - original ghes logic to inject a simple ARM record by Shiju Jose;
    - generic logic to handle block addresses by Jonathan Cameron;
    - generic GHESv2 error inject by Mauro Carvalho Chehab;

Co-authored-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-authored-by: Shiju Jose <shiju.jose@huawei.com>
Co-authored-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 hw/acpi/ghes.c         | 159 ++++++++++++++++++++++++++++++++++++++---
 hw/acpi/ghes_cper.c    |   2 +-
 include/hw/acpi/ghes.h |   2 +
 3 files changed, 151 insertions(+), 12 deletions(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index a745dcc7be5e..f5a05f4cb81d 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -395,23 +395,22 @@ void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
     ags->present = true;
 }
 
+static uint64_t ghes_get_state_start_address(void)
+{
+    AcpiGedState *acpi_ged_state =
+        ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED, NULL));
+    AcpiGhesState *ags = &acpi_ged_state->ghes_state;
+
+    return le64_to_cpu(ags->ghes_addr_le);
+}
+
 int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
 {
     uint64_t error_block_addr, read_ack_register_addr, read_ack_register = 0;
-    uint64_t start_addr;
+    uint64_t start_addr = ghes_get_state_start_address();
     bool ret = -1;
-    AcpiGedState *acpi_ged_state;
-    AcpiGhesState *ags;
-
     assert(source_id < ACPI_HEST_SRC_ID_RESERVED);
 
-    acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
-                                                       NULL));
-    g_assert(acpi_ged_state);
-    ags = &acpi_ged_state->ghes_state;
-
-    start_addr = le64_to_cpu(ags->ghes_addr_le);
-
     if (physical_address) {
         start_addr += source_id * sizeof(uint64_t);
 
@@ -448,9 +447,147 @@ int acpi_ghes_record_errors(uint8_t source_id, uint64_t physical_address)
     return ret;
 }
 
+/*
+ * Error register block data layout
+ *
+ * | +---------------------+ ges.ghes_addr_le
+ * | |error_block_address0 |
+ * | +---------------------+
+ * | |error_block_address1 |
+ * | +---------------------+ --+--
+ * | |    .............    | GHES_ADDRESS_SIZE
+ * | +---------------------+ --+--
+ * | |error_block_addressN |
+ * | +---------------------+
+ * | | read_ack0           |
+ * | +---------------------+ --+--
+ * | | read_ack1           | GHES_ADDRESS_SIZE
+ * | +---------------------+ --+--
+ * | |   .............     |
+ * | +---------------------+
+ * | | read_ackN           |
+ * | +---------------------+ --+--
+ * | |      CPER           |   |
+ * | |      ....           | GHES_MAX_RAW_DATA_LENGT
+ * | |      CPER           |   |
+ * | +---------------------+ --+--
+ * | |    ..........       |
+ * | +---------------------+
+ * | |      CPER           |
+ * | |      ....           |
+ * | |      CPER           |
+ * | +---------------------+
+ */
+
+/* Map from uint32_t notify to entry offset in GHES */
+static const uint8_t error_source_to_index[] = { 0xff, 0xff, 0xff, 0xff,
+                                                 0xff, 0xff, 0xff, 1, 0};
+
+static bool ghes_get_addr(uint32_t notify, uint64_t *error_block_addr,
+                          uint64_t *read_ack_addr)
+{
+    uint64_t base;
+
+    if (notify >= ACPI_GHES_NOTIFY_RESERVED) {
+        return false;
+    }
+
+    /* Find and check the source id for this new CPER */
+    if (error_source_to_index[notify] == 0xff) {
+        return false;
+    }
+
+    base = ghes_get_state_start_address();
+
+    *read_ack_addr = base +
+        ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t) +
+        error_source_to_index[notify] * sizeof(uint64_t);
+
+    /* Could also be read back from the error_block_address register */
+    *error_block_addr = base +
+        ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t) +
+        ACPI_GHES_ERROR_SOURCE_COUNT * sizeof(uint64_t) +
+        error_source_to_index[notify] * ACPI_GHES_MAX_RAW_DATA_LENGTH;
+
+    return true;
+}
+
 NotifierList generic_error_notifiers =
     NOTIFIER_LIST_INITIALIZER(error_device_notifiers);
 
+void ghes_record_cper_errors(AcpiGhesCper *cper, uint32_t notify)
+{
+    int read_ack = 0;
+    uint32_t i;
+    uint64_t read_ack_addr = 0;
+    uint64_t error_block_addr = 0;
+    uint32_t data_length;
+    GArray *block;
+
+    if (!ghes_get_addr(notify, &error_block_addr, &read_ack_addr)) {
+        error_report("GHES: Invalid error block/ack address(es)");
+        return;
+    }
+
+    cpu_physical_memory_read(read_ack_addr,
+                             &read_ack, sizeof(uint64_t));
+
+    /* zero means OSPM does not acknowledge the error */
+    if (!read_ack) {
+        error_report("Last time OSPM does not acknowledge the error,"
+                     " record CPER failed this time, set the ack value to"
+                     " avoid blocking next time CPER record! exit");
+        read_ack = 1;
+        cpu_physical_memory_write(read_ack_addr,
+                                  &read_ack, sizeof(uint64_t));
+        return;
+    }
+
+    read_ack = cpu_to_le64(0);
+    cpu_physical_memory_write(read_ack_addr,
+                              &read_ack, sizeof(uint64_t));
+
+    /* Build CPER record */
+
+    /*
+     * Invalid fru id: ACPI 4.0: 17.3.2.6.1 Generic Error Data,
+     * Table 17-13 Generic Error Data Entry
+     */
+    QemuUUID fru_id = {};
+
+    block = g_array_new(false, true /* clear */, 1);
+    data_length = ACPI_GHES_DATA_LENGTH + cper->data_len;
+
+    /*
+        * It should not run out of the preallocated memory if
+        * adding a new generic error data entry
+        */
+    assert((data_length + ACPI_GHES_GESB_SIZE) <=
+            ACPI_GHES_MAX_RAW_DATA_LENGTH);
+
+    /* Build the new generic error status block header */
+    acpi_ghes_generic_error_status(block, ACPI_GEBS_UNCORRECTABLE,
+                                    0, 0, data_length,
+                                    ACPI_CPER_SEV_RECOVERABLE);
+
+    /* Build this new generic error data entry header */
+    acpi_ghes_generic_error_data(block, cper->uuid,
+                                ACPI_CPER_SEV_RECOVERABLE, 0, 0,
+                                cper->data_len, fru_id, 0);
+
+    /* Add CPER data */
+    for (i = 0; i < cper->data_len; i++) {
+        build_append_int_noprefix(block, cper->data[i], 1);
+    }
+
+    /* Write the generic error data entry into guest memory */
+    cpu_physical_memory_write(error_block_addr, block->data, block->len);
+
+    g_array_free(block, true);
+
+    notifier_list_notify(&generic_error_notifiers, NULL);
+}
+
 bool acpi_ghes_present(void)
 {
     AcpiGedState *acpi_ged_state;
diff --git a/hw/acpi/ghes_cper.c b/hw/acpi/ghes_cper.c
index 0d32874a84f1..504ba43ad0cc 100644
--- a/hw/acpi/ghes_cper.c
+++ b/hw/acpi/ghes_cper.c
@@ -47,7 +47,7 @@ void qmp_ghes_cper(CommonPlatformErrorRecord *qmp_cper,
         cper.data[i] = p->value;
     }
 
-    /* TODO: call a function at ghes */
+    ghes_record_cper_errors(&cper, ACPI_GHES_NOTIFY_GPIO);
 
     g_free(cper.data);
 }
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index 02ea3715f655..a426c47f2b96 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -84,6 +84,8 @@ typedef struct AcpiGhesCper {
     uint32_t data_len;
 } AcpiGhesCper;
 
+void ghes_record_cper_errors(AcpiGhesCper *cper, uint32_t notify);
+
 /**
  * acpi_ghes_present: Report whether ACPI GHES table is present
  *
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 7/7] scripts/ghes_inject: add a script to generate GHES error inject
  2024-08-01 14:47 [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
                   ` (5 preceding siblings ...)
  2024-08-01 14:47 ` [PATCH v4 6/7] acpi/ghes: add support for generic error injection via QAPI Mauro Carvalho Chehab
@ 2024-08-01 14:47 ` Mauro Carvalho Chehab
  6 siblings, 0 replies; 9+ messages in thread
From: Mauro Carvalho Chehab @ 2024-08-01 14:47 UTC (permalink / raw)
  Cc: Jonathan Cameron, Shiju Jose, Mauro Carvalho Chehab, Cleber Rosa,
	John Snow, linux-kernel, qemu-devel

Using the QMP GHESv2 API requires preparing a raw data array
containing a CPER record.

Add a helper script with subcommands to prepare such data.

Currently, only ARM Processor error CPER record is supported.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 MAINTAINERS            |   1 +
 scripts/ghes_inject.py | 673 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 674 insertions(+)
 create mode 100755 scripts/ghes_inject.py

diff --git a/MAINTAINERS b/MAINTAINERS
index 655edcb6688c..9e4874bb552d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2081,6 +2081,7 @@ S: Maintained
 F: hw/arm/ghes_cper.c
 F: hw/acpi/ghes_cper_stub.c
 F: qapi/ghes-cper.json
+F: scripts/ghes_inject.py
 
 ppc4xx
 L: qemu-ppc@nongnu.org
diff --git a/scripts/ghes_inject.py b/scripts/ghes_inject.py
new file mode 100755
index 000000000000..99ee93bc2f34
--- /dev/null
+++ b/scripts/ghes_inject.py
@@ -0,0 +1,673 @@
+#!/usr/bin/env python3
+#
+# pylint: disable=C0301, C0114, R0912, R0913, R0915, W0511
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) 2024 Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
+
+# TODO: current implementation has dummy defaults.
+#
+# For a better implementation, a QMP addition/call is needed to
+# retrieve some data for ARM Processor Error injection:
+#
+#   - machine emulation architecture, as ARM current default is
+#     for AArch64;
+#   - ARM registers: power_state, midr, mpidr.
+
+import argparse
+import json
+import socket
+import sys
+
+EINJ_DESCRIPTION = """
+Handle ACPI GHESv2 error injection logic QEMU QMP interface.\n
+
+It allows using UEFI BIOS EINJ features to generate GHES records.
+
+It helps testing Linux CPER and GHES drivers and to test rasdaemon
+error handling logic.
+
+Currently, it support ARM processor error injection for ARM processor
+events, being compatible with UEFI 2.9A Errata.
+
+This small utility works together with those QEMU additions:
+- https://gitlab.com/mchehab_kernel/qemu/-/tree/arm-error-inject-v2
+"""
+
+
+#
+# Socket QMP send command
+#
+def qmp_command(host, port, commands):
+    """Send commands to QEMU though QMP TCP socket"""
+
+    # Needed to negotiate QMP and for QEMU to accept the command
+    commands.insert(0, '{ "execute": "qmp_capabilities" } ')
+
+    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    s.connect((host, port))
+
+    data = s.recv(1024)
+
+    print("\t", data.decode("utf-8"), end="")
+
+    for command in commands:
+        print(command)
+
+        s.sendall(command.encode("utf-8"))
+        data = s.recv(1024)
+        print("\t", data.decode("utf-8"), end="")
+
+    s.shutdown(socket.SHUT_WR)
+    while 1:
+        data = s.recv(1024)
+        if data == b"":
+            break
+        print("\t", data.decode("utf-8"))
+
+    s.close()
+
+
+#
+# Helper routines to handle multiple choice arguments
+#
+def get_choice(name, value, choices, suffixes=None):
+    """Produce a list from multiple choice argument"""
+
+    new_values = []
+
+    if not value:
+        return new_values
+
+    for val in value.split(","):
+        val = val.lower()
+
+        if suffixes:
+            for suffix in suffixes:
+                val = val.removesuffix(suffix)
+
+        if val not in choices.keys():
+            sys.exit(f"Error on '{name}': choice {val} is invalid.")
+
+        val = choices[val]
+
+        new_values.append(val)
+
+    return new_values
+
+
+def get_mult_array(mult, name, values, allow_zero=False, max_val=None):
+    """Add numbered hashes from integer lists"""
+
+    if not allow_zero:
+        if not values:
+            return
+    else:
+        if values is None:
+            return
+
+        if not values:
+            i = 0
+            if i not in mult:
+                mult[i] = {}
+
+            mult[i][name] = []
+            return
+
+    i = 0
+    for value in values:
+        for val in value.split(","):
+            try:
+                val = int(val, 0)
+            except ValueError:
+                sys.exit(f"Error on '{name}': {val} is not an integer")
+
+            if val < 0:
+                sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+            if max_val and val > max_val:
+                sys.exit(f"Error on '{name}': {val} is too little")
+
+            if i not in mult:
+                mult[i] = {}
+
+            if name not in mult[i]:
+                mult[i][name] = []
+
+            mult[i][name].append(val)
+
+        i += 1
+
+
+def get_mult_choices(mult, name, values, choices,
+                     suffixes=None, allow_zero=False):
+    """Add numbered hashes from multiple choice arguments"""
+
+    if not allow_zero:
+        if not values:
+            return
+    else:
+        if values is None:
+            return
+
+    i = 0
+    for val in values:
+        new_values = get_choice(name, val, choices, suffixes)
+
+        if i not in mult:
+            mult[i] = {}
+
+        mult[i][name] = new_values
+        i += 1
+
+
+def get_mult_int(mult, name, values, allow_zero=False):
+    """Add numbered hashes from integer arguments"""
+    if not allow_zero:
+        if not values:
+            return
+    else:
+        if values is None:
+            return
+
+    i = 0
+    for val in values:
+        try:
+            val = int(val, 0)
+        except ValueError:
+            sys.exit(f"Error on '{name}': {val} is not an integer")
+
+        if val < 0:
+            sys.exit(f"Error on '{name}': {val} is not unsigned")
+
+        if i not in mult:
+            mult[i] = {}
+
+        mult[i][name] = val
+        i += 1
+
+
+#
+# Data encode helper functions
+#
+def bit(b):
+    """Simple macro to define a bit on a bitmask"""
+    return 1 << b
+
+
+def data_add(data, value, num_bytes):
+    """
+    Adds bytes from value inside a bitarray.
+    If the value is a list, assume it to be a bitmap, where each
+    bit is on a different element inside a list. Such list is
+    converted to a single bitmap integer.
+    """
+
+    if isinstance(value, list):
+        bits = 0
+        for b in value:
+            bits |= b
+
+        value = bits
+
+    data.extend(value.to_bytes(num_bytes, byteorder="little"))
+
+
+def to_uuid(time_low, time_mid, time_high, nodes):
+    """Create an integer array with elements from an UUID"""
+
+    values = bytearray()
+
+    data_add(values, time_low, 4)
+    data_add(values, time_mid, 2)
+    data_add(values, time_high, 2)
+
+    for i in nodes:
+        data_add(values, i, 1)
+
+    return list(values)
+
+
+#
+# Arm processor EINJ logic
+#
+ACPI_GHES_ARM_CPER_LENGTH = 40
+ACPI_GHES_ARM_CPER_PEI_LENGTH = 32
+
+# TODO: query it from emulation. Current default valid only for Aarch64
+CONTEXT_AARCH64_EL1 = 5
+
+CPER_ARM_PROCESSOR_ERROR = to_uuid(0xE19E3D16, 0xBC11, 0x11E4,
+                                   [0x9C, 0xAA, 0xC2, 0x05,
+                                    0x1D, 0x5D, 0x46, 0xB0])
+
+
+class ArmProcessorEinj:
+    """
+    Implements ARM Processor Error injection via GHES
+    """
+
+    def __init__(self, args=None):
+        """
+        Initialize the error injection class. There are two possible
+        ways to initialize it:
+        1. passing a set of arguments;
+        2. passing a dict with error inject command parameters. Each
+            column is handled in separate.
+        """
+
+        # Valid choice values
+        self.arm_valid_bits = {
+            "mpidr":    bit(0),
+            "affinity": bit(1),
+            "running":  bit(2),
+            "vendor":   bit(3),
+        }
+
+        self.pei_flags = {
+            "first":        bit(0),
+            "last":         bit(1),
+            "propagated":   bit(2),
+            "overflow":     bit(3),
+        }
+
+        self.pei_error_types = {
+            "cache":        bit(1),
+            "tlb":          bit(2),
+            "bus":          bit(3),
+
+            "micro-arch":   bit(4),
+            "vendor":       bit(4),
+        }
+
+        self.pei_valid_bits = {
+            "multiple-error":   bit(0),
+            "flags":            bit(1),
+
+            "error":            bit(2),
+            "error-info":       bit(2),
+
+            "virt":             bit(3),
+            "virtual":          bit(3),
+
+            "phy":              bit(4),
+            "physical":         bit(4),
+        }
+
+        self.arm = {}
+
+        if not args:
+            self.args = args
+            return
+
+        pei = {}
+        ctx = {}
+        vendor = {}
+
+        # Handle global parameters
+        if args.arm:
+            arm_validation_init = False
+            self.arm["validation"] = get_choice(name="validation",
+                                                value=args.arm,
+                                                choices=self.arm_valid_bits,
+                                                suffixes=["-error", "-err"])
+        else:
+            self.arm["validation"] = []
+            arm_validation_init = True
+
+        if args.affinity:
+            self.arm["affinity-level"] = args.affinity
+            if arm_validation_init:
+                self.arm["validation"].append(self.arm_valid_bits["affinity"])
+        else:
+            self.arm["affinity-level"] = 0
+
+        if args.mpidr:
+            self.arm["mpidr-el1"] = args.mpidr
+            if arm_validation_init:
+                self.arm["validation"].append(self.arm_valid_bits["mpidr"])
+        else:
+            # TODO: query it from emulation
+            self.arm["mpidr-el1"] = 0
+
+        if args.midr:
+            self.arm["midr-el1"] = args.midr
+        else:
+            # TODO: query it from emulation
+            self.arm["midr-el1"] = 0
+
+        if args.running is not None:
+            if args.running:
+                self.arm["running-state"] = bit(0)
+            else:
+                self.arm["running-state"] = 0
+            if arm_validation_init:
+                self.arm["validation"].append(self.arm_valid_bits["running"])
+        else:
+            # TODO: query it from emulation
+            self.arm["running-state"] = 0
+
+        if args.psci:
+            self.arm["psci-state"] = args.psci
+            if arm_validation_init:
+                self.arm["validation"].append(self.arm_valid_bits["running"])
+        else:
+            # TODO: query it from emulation
+            self.arm["psci-state"] = 0
+
+        # Handle PEI
+        if not args.type:
+            args.type = ["cache-error"]
+
+        get_mult_choices(
+            pei,
+            name="validation",
+            values=args.pei_valid,
+            choices=self.pei_valid_bits,
+            suffixes=["-valid", "-info", "--information", "--addr"],
+        )
+        get_mult_choices(
+            pei,
+            name="type",
+            values=args.type,
+            choices=self.pei_error_types,
+            suffixes=["-error", "-err"],
+        )
+        get_mult_choices(
+            pei,
+            name="flags",
+            values=args.flags,
+            choices=self.pei_flags,
+            suffixes=["-error", "-cap"],
+        )
+        get_mult_int(pei, "error-info", args.error_info)
+        get_mult_int(pei, "multiple-error", args.multiple_error)
+        get_mult_int(pei, "phy-addr", args.physical_address)
+        get_mult_int(pei, "virt-addr", args.virtual_address)
+
+        for i, p in pei.items():        # pylint: disable=W0612
+            # UEFI 2.10 doesn't define how to encode error information
+            # when multiple types are raised. So, provide a default only
+            # if a single type is there
+            if "error-info" not in p:
+                if len(p["type"]) == 1:
+                    if p["type"][0] == bit(1):
+                        p["error-info"] = 0x0091000F
+                    if p["type"][0] == bit(2):
+                        p["error-info"] = 0x0054007F
+                    if p["type"][0] == bit(3):
+                        p["error-info"] = 0x80D6460FFF
+                    if p["type"][0] == bit(4):
+                        p["error-info"] = 0x78DA03FF
+
+            if "validation" not in p:
+                p["validation"] = []
+                if "multiple-error" in p:
+                    p["validation"].append(self.pei_valid_bits["multiple-error"])
+
+                if "flags" in p:
+                    p["validation"].append(self.pei_valid_bits["flags"])
+
+                if "error-info" in p:
+                    p["validation"].append(self.pei_valid_bits["error-info"])
+
+                if "phy-addr" in p:
+                    p["validation"].append(self.pei_valid_bits["phy-addr"])
+
+                if "virt-addr" in p:
+                    p["validation"].append(self.pei_valid_bits["virt-addr"])
+
+        # Handle context
+        get_mult_int(ctx, "type", args.ctx_type, allow_zero=True)
+        get_mult_int(ctx, "minimal-size", args.ctx_size, allow_zero=True)
+        get_mult_array(ctx, "register", args.ctx_array, allow_zero=True)
+
+        get_mult_array(vendor, "bytes", args.vendor, max_val=255)
+
+        # Store PEI
+        self.arm["error"] = []
+        for k in sorted(pei.keys()):
+            self.arm["error"].append(pei[k])
+
+        # Store Context
+        self.arm["context"] = []
+        if ctx:
+            for k in sorted(ctx.keys()):
+                self.arm["context"].append(ctx[k])
+
+        # Vendor-specific bytes are not grouped
+        self.arm["vendor-specific"] = []
+        if vendor:
+            for k in sorted(vendor.keys()):
+                self.arm["vendor-specific"] += vendor[k]["bytes"]
+
+    def encode_pei(self):
+        """Encode bytes at the PEI table"""
+
+        data = bytearray()
+
+        for pei in self.arm["error"]:
+            # Version
+            data_add(data, 0, 1)
+
+            data_add(data, ACPI_GHES_ARM_CPER_PEI_LENGTH, 1)
+
+            data_add(data, pei["validation"], 2)
+            data_add(data, pei["type"], 1)
+            data_add(data, pei.get("multiple-error", 1), 2)
+            data_add(data, pei.get("flags", 0), 1)
+            data_add(data, pei.get("error_info", 0), 8)
+            data_add(data, pei.get("virt_addr", 0xDEADBEEF), 8)
+            data_add(data, pei.get("phy_addr", 0xABBA0BAD), 8)
+
+        return data
+
+    def encode_vendor(self):
+        """Encode bytes at the vendor data"""
+
+        data = bytearray()
+
+        for i in self.arm.get("vendor-specific", []):
+            data_add(data, i, 1)
+
+        return data
+
+    def encode_context(self):
+        """Encode bytes for the register context"""
+
+        data = bytearray()
+
+        if "context" not in self.arm:
+            return data
+
+        for ctx in self.arm["context"]:
+            if "type" not in ctx:
+                ctx["type"] = CONTEXT_AARCH64_EL1
+
+            reg_size = len(ctx["register"])
+            size = 0
+
+            if "minimal-size" in ctx:
+                size = ctx["minimal-size"]
+
+            size = max(size, reg_size)
+
+            size = (size + 1) % 0xFFFE
+
+            # Version
+            data_add(data, 0, 2)
+
+            data_add(data, ctx["type"], 2)
+
+            data_add(data, 8 * size, 4)
+
+            for r in ctx["register"]:
+                data_add(data, r, 8)
+
+            for i in range(reg_size, size):   # pylint: disable=W0612
+                data_add(data, 0, 8)
+
+        return data
+
+    def encode(self):
+        """Encode bytes for the ARM processor Error record"""
+
+        pei_data = self.encode_pei()
+        vendor_data = self.encode_vendor()
+        context_data = self.encode_context()
+
+        data = bytearray()
+
+        data_add(data, self.arm["validation"], 4)
+
+        error_info_num = len(self.arm["error"])
+        data_add(data, error_info_num, 2)
+
+        context_info_num = len(self.arm["context"])
+        data_add(data, context_info_num, 2)
+
+        # Calculate the length of the CPER data
+        cper_length = ACPI_GHES_ARM_CPER_LENGTH
+        cper_length += len(pei_data)
+        cper_length += len(vendor_data)
+        cper_length += len(context_data)
+        data_add(data, cper_length, 4)
+
+        data_add(data, self.arm["affinity-level"], 1)
+
+        # Reserved
+        data_add(data, 0, 3)
+
+        data_add(data, self.arm["mpidr-el1"], 8)
+        data_add(data, self.arm["midr-el1"], 8)
+        data_add(data, self.arm["running-state"], 4)
+        data_add(data, self.arm["psci-state"], 4)
+
+        # Add PEI
+        data.extend(pei_data)
+        data.extend(context_data)
+        data.extend(vendor_data)
+
+        return list(data)
+
+    def run(self, host, port):
+        """Execute the QMP commands"""
+
+        # Fill the commands to be sent
+        commands = []
+
+        cmd_arg = {
+            'cper': {
+                'notification-type': CPER_ARM_PROCESSOR_ERROR,
+                "raw-data": ""
+            }
+        }
+
+        cmd_arg["cper"]["raw-data"] = self.encode()
+
+        command = '{ "execute": "ghes-cper", '
+        command += '"arguments": ' + json.dumps(cmd_arg) + " }"
+
+        commands.append(command)
+
+        qmp_command(host, port, commands)
+
+
+#
+# Argument parser for ARM Processor CPER
+#
+
+
+def arm_handle_args(subparsers):
+    """Add a subparser for ARM-specific arguments"""
+
+    parser = subparsers.add_parser("arm", help="Generate an ARM processor CPER")
+
+    # UEFI N.16 ARM Validation bits
+    g_arm = parser.add_argument_group("ARM processor")
+    g_arm.add_argument("--arm", "--arm-valid",
+                       help="ARM validation bits: mpidr,affinity,running,vendor")
+    g_arm.add_argument("-a", "--affinity",  "--level", "--affinity-level",
+                       type=lambda x: int(x, 0),
+                       help="Affinity level (when multiple levels apply)")
+    g_arm.add_argument("-l", "--mpidr", type=lambda x: int(x, 0),
+                       help="Multiprocessor Affinity Register")
+    g_arm.add_argument("-i", "--midr", type=lambda x: int(x, 0),
+                       help="Main ID Register")
+    g_arm.add_argument("-r", "--running",
+                       action=argparse.BooleanOptionalAction, default=None,
+                       help="Indicates if the processor is running or not")
+    g_arm.add_argument("--psci", "--psci-state", type=lambda x: int(x, 0),
+                       help="Power State Coordination Interface - PSCI state")
+
+    # TODO: Add vendor-specific support
+
+    # UEFI N.17 bitmaps (type and flags)
+    g_pei = parser.add_argument_group("ARM Processor Error Info (PEI)")
+    g_pei.add_argument("-t", "--type", nargs="+",
+                       help="one or more error types: cache,tlb,bus,vendor")
+    g_pei.add_argument("-f", "--flags", nargs="*",
+                       help="zero or more error flags: first-error,last-error,propagated,overflow")
+    g_arm.add_argument("-V", "--pei-valid", "--error-valid", nargs="*",
+                       help="zero or more PEI validation bits: multiple-error,flags,error-info,virt,phy")
+
+    # UEFI N.17 Integer values
+    g_pei.add_argument("-m", "--multiple-error", nargs="+",
+                       help="Number of errors: 0: Single error, 1: Multiple errors, 2-65535: Error count if known")
+    g_pei.add_argument("-e", "--error-info", nargs="+",
+                       help="Error information (UEFI 2.10 tables N.18 to N.20)")
+    g_pei.add_argument("-p", "--physical-address",  nargs="+",
+                       help="Physical address")
+    g_pei.add_argument("-v", "--virtual-address",  nargs="+",
+                       help="Virtual address")
+
+    # UEFI N.21 Context
+    g_pei.add_argument("--ctx-type", "--context-type", nargs="*",
+                       help="Type of the context (0=ARM32 GPR, 5=ARM64 EL1, other values supported)")
+    g_pei.add_argument("--ctx-size", "--context-size", nargs="*",
+                       help="Minimal size of the context")
+    g_pei.add_argument("--ctx-array", "--context-array", nargs="*",
+                       help="Comma-separated arrays for each context")
+
+    # Vendor-specific data
+    g_pei.add_argument("--vendor", "--vendor-specific", nargs="+",
+                       help="Vendor-specific byte arrays of data")
+
+
+#
+# Main Program. Each error injection logic is handled by a separate subparser
+#
+
+
+def main():
+    """Main program"""
+
+    # Main parser - handle generic args like QEMU QMP TCP socket options
+    parser = argparse.ArgumentParser(prog="einj.py",
+                                     formatter_class=argparse.RawDescriptionHelpFormatter,
+                                     usage="%(prog)s [options]",
+                                     description=EINJ_DESCRIPTION,
+                                     epilog="If a field is not defined, a default value will be applied by QEMU.")
+
+    g_options = parser.add_argument_group("QEMU QMP socket options")
+    g_options.add_argument("-H", "--host", default="localhost", type=str,
+                           help="host name")
+    g_options.add_argument("-P", "--port", default=4445, type=int,
+                           help="TCP port number")
+
+    # Call subparsers
+    subparsers = parser.add_subparsers()
+    arm_handle_args(subparsers)
+
+    args = parser.parse_args()
+
+    # Handle subparser commands
+    if "arm" in args:
+        einj = ArmProcessorEinj(args)
+        einj.run(args.host, args.port)
+    else:
+        sys.exit("Error: type of error injection missing.")
+
+
+if __name__ == "__main__":
+    main()
-- 
2.45.2



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 5/7] qapi/ghes-cper: add an interface to do generic CPER error injection
  2024-08-01 14:47 ` [PATCH v4 5/7] qapi/ghes-cper: add an interface to do generic CPER error injection Mauro Carvalho Chehab
@ 2024-08-02  9:44   ` Markus Armbruster
  0 siblings, 0 replies; 9+ messages in thread
From: Markus Armbruster @ 2024-08-02  9:44 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Jonathan Cameron, Shiju Jose, Michael S. Tsirkin, Ani Sinha,
	Dongjiu Geng, Eric Blake, Igor Mammedov, Michael Roth,
	Paolo Bonzini, Peter Maydell, linux-kernel, qemu-arm, qemu-devel

Mauro Carvalho Chehab <mchehab+huawei@kernel.org> writes:

> Creates a QAPI to be used for generic ACPI APEI hardware error
> injection (HEST) via GHESv2.

Creates a QMP command

> The actual GHES code will be added at the followup patch.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
>  MAINTAINERS            |  7 ++++++
>  hw/acpi/Kconfig        |  5 ++++
>  hw/acpi/ghes_cper.c    | 53 +++++++++++++++++++++++++++++++++++++++++
>  hw/acpi/meson.build    |  2 ++
>  hw/arm/Kconfig         |  5 ++++
>  include/hw/acpi/ghes.h |  6 +++++
>  qapi/ghes-cper.json    | 54 ++++++++++++++++++++++++++++++++++++++++++
>  qapi/meson.build       |  1 +
>  qapi/qapi-schema.json  |  1 +
>  9 files changed, 134 insertions(+)
>  create mode 100644 hw/acpi/ghes_cper.c
>  create mode 100644 qapi/ghes-cper.json
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 98eddf7ae155..655edcb6688c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2075,6 +2075,13 @@ F: hw/acpi/ghes.c
>  F: include/hw/acpi/ghes.h
>  F: docs/specs/acpi_hest_ghes.rst
>  
> +ACPI/HEST/GHES/ARM processor CPER
> +R: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> +S: Maintained
> +F: hw/arm/ghes_cper.c
> +F: hw/acpi/ghes_cper_stub.c
> +F: qapi/ghes-cper.json
> +
>  ppc4xx
>  L: qemu-ppc@nongnu.org
>  S: Orphan

[...]

> diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
> index fa5c07db9068..6cbf430eb66d 100644
> --- a/hw/acpi/meson.build
> +++ b/hw/acpi/meson.build
> @@ -34,4 +34,6 @@ endif
>  system_ss.add(when: 'CONFIG_ACPI', if_false: files('acpi-stub.c', 'aml-build-stub.c', 'ghes-stub.c', 'acpi_interface.c'))
>  system_ss.add(when: 'CONFIG_ACPI_PCI_BRIDGE', if_false: files('pci-bridge-stub.c'))
>  system_ss.add_all(when: 'CONFIG_ACPI', if_true: acpi_ss)
> +system_ss.add(when: 'CONFIG_GHES_CPER', if_true: files('ghes_cper.c'))
> +system_ss.add(when: 'CONFIG_GHES_CPER', if_false: files('ghes_cper_stub.c'))

../hw/acpi/meson.build:38:50: ERROR: File ghes_cper_stub.c does not exist.

>  system_ss.add(files('acpi-qmp-cmds.c'))

[...]

> diff --git a/qapi/ghes-cper.json b/qapi/ghes-cper.json
> new file mode 100644
> index 000000000000..2b0438431054
> --- /dev/null
> +++ b/qapi/ghes-cper.json

Would this fit into acpi.json?

If yes: would the maintainers of acpi.json be happy to take it?
Michael, Igor?

> @@ -0,0 +1,54 @@
> +# -*- Mode: Python -*-
> +# vim: filetype=python
> +
> +##
> +# = GHESv2 CPER Error Injection
> +#
> +# These are defined at
> +# ACPI 6.2: 18.3.2.8 Generic Hardware Error Source version 2
> +# (GHESv2 - Type 10)
> +##
> +
> +##
> +# @CommonPlatformErrorRecord:
> +#
> +# Common Platform Error Record - CPER - as defined at the UEFI
> +# specification. See

Separate sentences with two spaces for consistency, please.

> +# https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#record-header
> +# for more details.
> +#
> +# @notification-type: pre-assigned GUID value indicating the record
> +#   association with an error event notification type, as defined
> +#   at https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#record-header
> +#
> +# @raw-data: Contains the payload of the CPER.
> +#
> +# Since: 9.2
> +##
> +{ 'struct': 'CommonPlatformErrorRecord',
> +  'data': {
> +      'notification-type': ['uint8'],
> +      'raw-data': ['uint8']
> +  }
> +}

You use ['uint8'] for binary.  We commonly represent binary as
base64-encoded strings in QAPI.

If @notification-type is a UUID, we should use the usual text
representation.  qemu/uuid.h has functions to convert between binary and
text.

> +
> +##
> +# @ghes-cper:
> +#
> +# Inject ARM Processor error with data to be filled according with
> +# ACPI 6.2 GHESv2 spec.
> +#
> +# @cper: a single CPER record to be sent to the guest OS.
> +#
> +# Features:
> +#
> +# @unstable: This command is experimental.
> +#
> +# Since: 9.2
> +##
> +{ 'command': 'ghes-cper',
> +  'data': {
> +    'cper': 'CommonPlatformErrorRecord'
> +  },
> +  'features': [ 'unstable' ]
> +}
> diff --git a/qapi/meson.build b/qapi/meson.build
> index e7bc54e5d047..bd13cd7d40c9 100644
> --- a/qapi/meson.build
> +++ b/qapi/meson.build
> @@ -35,6 +35,7 @@ qapi_all_modules = [
>    'dump',
>    'ebpf',
>    'error',
> +  'ghes-cper',
>    'introspect',
>    'job',
>    'machine-common',
> diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
> index b1581988e4eb..54165c8d6655 100644
> --- a/qapi/qapi-schema.json
> +++ b/qapi/qapi-schema.json
> @@ -81,3 +81,4 @@
>  { 'include': 'vfio.json' }
>  { 'include': 'cryptodev.json' }
>  { 'include': 'cxl.json' }
> +{ 'include': 'ghes-cper.json' }

Put it next to acpi.json, perhaps?



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-08-02  9:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-01 14:47 [PATCH v4 0/7] Add ACPI CPER firmware first error injection on ARM emulation Mauro Carvalho Chehab
2024-08-01 14:47 ` [PATCH v4 1/7] arm/virt: place power button pin number on a define Mauro Carvalho Chehab
2024-08-01 14:47 ` [PATCH v4 2/7] acpi/generic_event_device: add an APEI error device Mauro Carvalho Chehab
2024-08-01 14:47 ` [PATCH v4 3/7] arm/virt: Wire up GPIO error source for ACPI / GHES Mauro Carvalho Chehab
2024-08-01 14:47 ` [PATCH v4 4/7] acpi/ghes: Support GPIO error source Mauro Carvalho Chehab
2024-08-01 14:47 ` [PATCH v4 5/7] qapi/ghes-cper: add an interface to do generic CPER error injection Mauro Carvalho Chehab
2024-08-02  9:44   ` Markus Armbruster
2024-08-01 14:47 ` [PATCH v4 6/7] acpi/ghes: add support for generic error injection via QAPI Mauro Carvalho Chehab
2024-08-01 14:47 ` [PATCH v4 7/7] scripts/ghes_inject: add a script to generate GHES error inject Mauro Carvalho Chehab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).