qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/3] Attempt to add GHES for x86
@ 2025-03-04 13:30 Mauro Carvalho Chehab
  2025-03-04 13:30 ` [PATCH RFC 1/3] acpi/ghes: move use_hest_addr out of acpi_build_hest() Mauro Carvalho Chehab
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-04 13:30 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Eduardo Habkost,
	Paolo Bonzini, Peter Maydell, Richard Henderson, Shannon Zhao

Hi Igor,

This patch series comes after:
   https://lore.kernel.org/qemu-devel/cover.1740903110.git.mchehab+huawei@kernel.org/

I'm basically trying to add support for error injection for x86,
without success so far.

On x86, the notification mechanism is different: it is via QMP.
Yet, from what I saw on Linux implementation, it works on
a similar way to GED. So, I ended implementing a notification
via GED.

HEST table build seems to be working OK on it, and the
error injection notification for x86 is called. Yet, OSPM is not
receiving any notifications.

Could you help me figuring out what are the missing bits?

PS.: there are some things at the code that require polishing,
plus there are some extra printf() there to help debugging.
Finally, we would need to add x86 CPU error event at the
ghes script, but I'd like to have at least something that the 
OSPM receives before improving it further.

It can be tested with:
   $ ./scripts/ghes_inject.py arm

(Ok, this would be generating an ARM processor event , but
I guess Linux would at least mark the event as read, even if
it doesn't recognize it)

Mauro Carvalho Chehab (3):
  acpi/ghes: move use_hest_addr out of acpi_build_hest()
  hw/i186: add support for HEST table with SCI
  Add a GED device for RAS notification

 hw/acpi/ghes.c           | 16 ++++-----
 hw/arm/virt-acpi-build.c | 12 ++++---
 hw/i386/Kconfig          |  1 +
 hw/i386/acpi-build.c     | 75 ++++++++++++++++++++++++++++++++++++++++
 hw/i386/pc.c             | 41 ++++++++++++++++++++++
 include/hw/acpi/ghes.h   | 25 +++++++-------
 include/hw/i386/pc.h     |  5 +++
 include/hw/i386/x86.h    |  2 ++
 8 files changed, 153 insertions(+), 24 deletions(-)

-- 
2.48.1




^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH RFC 1/3] acpi/ghes: move use_hest_addr out of acpi_build_hest()
  2025-03-04 13:30 [PATCH RFC 0/3] Attempt to add GHES for x86 Mauro Carvalho Chehab
@ 2025-03-04 13:30 ` Mauro Carvalho Chehab
  2025-05-30 14:35   ` Igor Mammedov
  2025-03-04 13:30 ` [PATCH RFC 2/3] hw/i186: add support for HEST table with SCI Mauro Carvalho Chehab
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-04 13:30 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Ani Sinha, Dongjiu Geng, Peter Maydell,
	Shannon Zhao, linux-kernel

The only reason why we're passing ags to acpi HEST table build
is to check if migration will be used or not.

Well, we only need migration for arm, as other architectures
will only use the new code. So, move this out of acpi_build_hest(),
as otherwise we can't use it for x86, as the hotplug logic there
may not initialize ags during acpi table build time.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 hw/acpi/ghes.c           | 16 ++++++++--------
 hw/arm/virt-acpi-build.c | 12 ++++++++----
 include/hw/acpi/ghes.h   | 25 +++++++++++++------------
 3 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 1fd5ba941771..ea00fed75c16 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -243,7 +243,7 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
  * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
  * See docs/specs/acpi_hest_ghes.rst for blobs format.
  */
-static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
+static void build_ghes_error_table(GArray *hardware_errors, bool use_hest_addr,
                                    BIOSLinker *linker, int num_sources)
 {
     int i, error_status_block_offset;
@@ -289,7 +289,7 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
                                        i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
     }
 
-    if (!ags->use_hest_addr) {
+    if (!use_hest_addr) {
         /*
          * Tell firmware to write hardware_errors GPA into
          * hardware_errors_addr fw_cfg, once the former has been initialized.
@@ -372,7 +372,7 @@ static void build_ghes_v2_entry(GArray *table_data,
 }
 
 /* Build Hardware Error Source Table */
-void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
+void acpi_build_hest(GArray *table_data, bool use_hest_addr,
                      GArray *hardware_errors,
                      BIOSLinker *linker,
                      const AcpiNotificationSourceId *notif_source,
@@ -386,7 +386,7 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
 
     hest_offset = table_data->len;
 
-    build_ghes_error_table(ags, hardware_errors, linker, num_sources);
+    build_ghes_error_table(hardware_errors, use_hest_addr, linker, num_sources);
 
     acpi_table_begin(&table, table_data);
 
@@ -398,7 +398,7 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
 
     acpi_table_end(linker, &table);
 
-    if (ags->use_hest_addr) {
+    if (use_hest_addr) {
         /*
          * Tell firmware to write into GPA the address of HEST via fw_cfg,
          * once initialized.
@@ -411,13 +411,13 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
 }
 
 void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
-                          GArray *hardware_error)
+                          bool use_hest_addr, GArray *hardware_error)
 {
     /* Create a read-only fw_cfg file for GHES */
     fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
                     hardware_error->len);
 
-    if (ags->use_hest_addr) {
+    if (use_hest_addr) {
         fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
             NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
     } else {
@@ -529,7 +529,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
         return;
     }
 
-    if (!ags->use_hest_addr) {
+    if (ags->hw_error_le) {
         get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
                              &cper_addr, &read_ack_register_addr);
     } else {
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 154337e1a77b..71da17b652b2 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -964,9 +964,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
         acpi_ged_state = ACPI_GED(vms->acpi_dev);
         ags = &acpi_ged_state->ghes_state;
         if (ags) {
+            bool use_hest_addr = ags->use_hest_addr;
+
             acpi_add_table(table_offsets, tables_blob);
 
-            if (!ags->use_hest_addr) {
+            if (!use_hest_addr) {
                 notify = hest_ghes_notify_9_2;
                 notify_sz = ARRAY_SIZE(hest_ghes_notify_9_2);
             } else {
@@ -974,7 +976,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
                 notify_sz = ARRAY_SIZE(hest_ghes_notify);
             }
 
-            acpi_build_hest(ags, tables_blob, tables->hardware_errors,
+            acpi_build_hest(tables_blob, use_hest_addr, tables->hardware_errors,
                             tables->linker, notify, notify_sz,
                             vms->oem_id, vms->oem_table_id);
         }
@@ -1143,8 +1145,10 @@ void virt_acpi_setup(VirtMachineState *vms)
     if (vms->ras) {
         assert(vms->acpi_dev);
         acpi_ged_state = ACPI_GED(vms->acpi_dev);
-        acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
-                             vms->fw_cfg, tables.hardware_errors);
+        AcpiGhesState *ags = &acpi_ged_state->ghes_state;
+
+        acpi_ghes_add_fw_cfg(ags, vms->fw_cfg, ags->use_hest_addr,
+                             tables.hardware_errors);
     }
 
     build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
index df2ecbf6e4a9..eae6d4d0a562 100644
--- a/include/hw/acpi/ghes.h
+++ b/include/hw/acpi/ghes.h
@@ -73,31 +73,32 @@ typedef struct AcpiNotificationSourceId {
     enum AcpiGhesNotifyType notify;
 } AcpiNotificationSourceId;
 
-/*
- * AcpiGhesState stores GPA values that will be used to fill HEST entries.
+/**
+ * struct AcpiGhesState - GPA values that will be used to fill HEST entries
  *
- * When use_hest_addr is false, the GPA of the etc/hardware_errors firmware
- * is stored at hw_error_le. This is the default on QEMU 9.x.
+ * @hest_addr_le: GPA of the HEST table. Used on QEMU 10.x and above.
+ * @hw_error_le: GPA of the etc/hardware_errors firmware.
+ *               Used only on arm64 virt-9.x to preserve compatibility
+ *               with QEMU 9.x.
+ * @use_hest_addr: True if HEST address is present. Used only on arm64,
+ *                 to identify if QEMU 9.x migration is needed.
  *
- * When use_hest_addr is true, the GPA of the HEST table is stored at
- * hest_addr_le. This is the default for QEMU 10.x and above.
- *
- * Whe both GPA values are equal to zero means that GHES is not present.
+ * When both GPA values are equal to zero means that GHES is not present.
  */
 typedef struct AcpiGhesState {
     uint64_t hest_addr_le;
     uint64_t hw_error_le;
-    bool use_hest_addr; /* True if HEST address is present */
+    bool use_hest_addr;
 } AcpiGhesState;
 
-void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
+void acpi_build_hest(GArray *table_data, bool use_hest_addr,
                      GArray *hardware_errors,
                      BIOSLinker *linker,
                      const AcpiNotificationSourceId * const notif_source,
                      int num_sources,
                      const char *oem_id, const char *oem_table_id);
-void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
-                          GArray *hardware_errors);
+void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
+                          bool use_hest_addr, GArray *hardware_error);
 int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
                             uint64_t error_physical_addr);
 void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH RFC 2/3] hw/i186: add support for HEST table with SCI
  2025-03-04 13:30 [PATCH RFC 0/3] Attempt to add GHES for x86 Mauro Carvalho Chehab
  2025-03-04 13:30 ` [PATCH RFC 1/3] acpi/ghes: move use_hest_addr out of acpi_build_hest() Mauro Carvalho Chehab
@ 2025-03-04 13:30 ` Mauro Carvalho Chehab
  2025-03-04 13:30 ` [PATCH RFC 3/3] Add a GED device for RAS notification Mauro Carvalho Chehab
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-04 13:30 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Ani Sinha, Eduardo Habkost,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson, linux-kernel

Can be tested by setting machine to:
	q35,ras=on

E.g.:

qemu-system-x86_64 --enable-kvm -cpu host -m 4g,maxmem=8G,slots=8 \
	-M q35,nvdimm=on,ras=on \
	-monitor stdio -no-reboot -drive if=pflash,file=OVMF_CODE.fd,format=raw \
	-kernel ../linux/arch/x86_64/boot/bzImage \
	-device pcie-root-port,id=root_port1 -device virtio-blk-pci,drive=hd \
	-device virtio-net-pci,netdev=mynet,id=bob \
	-drive if=none,file=debian.qcow2,format=qcow2,id=hd \
	-object memory-backend-ram,size=4G,id=mem0 \
	-netdev type=user,id=mynet,hostfwd=tcp::5555-:22 \
	-qmp tcp:localhost:4445,server=on,wait=off \
	-append 'earlycon nomodeset root=/dev/vda1 fsck.mode=skip tp_printk maxcpus=4 console=ttyS0 console=tty0'

TODO: add a notifier logic.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 hw/i386/Kconfig      |  1 +
 hw/i386/acpi-build.c | 33 +++++++++++++++++++++++++++++++++
 hw/i386/pc.c         | 42 ++++++++++++++++++++++++++++++++++++++++++
 include/hw/i386/pc.h |  5 +++++
 4 files changed, 81 insertions(+)

diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index d34ce07b215d..a07d5aa1a138 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -60,6 +60,7 @@ config PC_ACPI
     select ACPI_CPU_HOTPLUG
     select ACPI_MEMORY_HOTPLUG
     select ACPI_PCI_BRIDGE
+    select ACPI_APEI
     select ACPI_VIOT
     select SMBUS_EEPROM
     select PFLASH_CFI01
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 3fffa4a33280..cf11231b5fe6 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -38,6 +38,7 @@
 #include "hw/nvram/fw_cfg.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/acpi/acpi_aml_interface.h"
+#include "hw/acpi/generic_event_device.h"
 #include "hw/input/i8042.h"
 #include "hw/acpi/memory_hotplug.h"
 #include "system/tpm.h"
@@ -69,6 +70,7 @@
 #include "hw/acpi/utils.h"
 #include "hw/acpi/pci.h"
 #include "hw/acpi/cxl.h"
+#include "hw/acpi/ghes.h"
 
 #include "qom/qom-qobject.h"
 #include "hw/i386/amd_iommu.h"
@@ -2431,6 +2433,10 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg)
     return true;
 }
 
+static const AcpiNotificationSourceId hest_ghes_notify[] = {
+    {ACPI_HEST_SRC_ID_QMP, ACPI_GHES_NOTIFY_GPIO},
+};
+
 static
 void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 {
@@ -2587,6 +2593,15 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
         cxl_build_cedt(table_offsets, tables_blob, tables->linker,
                        x86ms->oem_id, x86ms->oem_table_id, &pcms->cxl_devices_state);
     }
+    if (pcms->ras) {
+        printf("ADD HEST\n");
+        acpi_add_table(table_offsets, tables_blob);
+        acpi_build_hest(tables_blob, true, tables->hardware_errors,
+                        tables->linker, hest_ghes_notify,
+                        ARRAY_SIZE(hest_ghes_notify),
+                        x86ms->oem_id, x86ms->oem_table_id);
+        printf("ADD HEST: added\n");
+    }
 
     acpi_add_table(table_offsets, tables_blob);
     build_waet(tables_blob, tables->linker, x86ms->oem_id, x86ms->oem_table_id);
@@ -2742,6 +2757,24 @@ void acpi_setup(void)
     }
 #endif
 
+    if (pcms->ras) {
+        AcpiGhesState *ags;
+        AcpiGedState *acpi_ged_state;
+
+        acpi_ged_state = ACPI_GED(object_resolve_path_type("", TYPE_ACPI_GED,
+                                                           NULL));
+        if (acpi_ged_state) {
+            printf("GHES FW_CFG: %p\n", acpi_ged_state);
+
+            ags = &acpi_ged_state->ghes_state;
+
+            acpi_ghes_add_fw_cfg(ags, x86ms->fw_cfg, true,
+                                 tables.hardware_errors);
+
+            printf("GHES FW_CFG: end\n");
+        }
+    }
+
     vmgenid_dev = find_vmgenid_dev();
     if (vmgenid_dev) {
         vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), x86ms->fw_cfg,
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 63a96cd23f84..b9c32dbdbcd8 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -47,6 +47,7 @@
 #include "qobject/qlist.h"
 #include "qemu/error-report.h"
 #include "hw/acpi/cpu_hotplug.h"
+#include "hw/acpi/ghes.h"
 #include "acpi-build.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/cxl/cxl_host.h"
@@ -1683,6 +1684,37 @@ static void pc_machine_set_max_fw_size(Object *obj, Visitor *v,
     pcms->max_fw_size = value;
 }
 
+static bool virt_get_ras(Object *obj, Error **errp)
+{
+    PCMachineState *pcms = PC_MACHINE(obj);
+
+    return pcms->ras;
+}
+
+static void virt_set_ras(Object *obj, bool value, Error **errp)
+{
+    PCMachineState *pcms = PC_MACHINE(obj);
+
+    pcms->ras = value;
+}
+
+static void pc_sci_error(Notifier *n, void *opaque)
+{
+    uint16_t *source_id = opaque;
+
+    fprintf(stderr, "GHES error for source ID: %d\n", *source_id);
+
+    /* Currently, only QMP injection is supported */
+    if (*source_id != ACPI_HEST_SRC_ID_QMP)
+        return;
+
+    fprintf(stderr, "GHES error notified for QMP\n");
+
+    // TODO: add something equivalent to:
+    // PCMachineState *s = container_of(n, PCMachineState, ghes_sci_notifier);
+    // acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
+    // by calling acpi_update_sci();
+}
 
 static void pc_machine_initfn(Object *obj)
 {
@@ -1717,6 +1749,10 @@ static void pc_machine_initfn(Object *obj)
     if (pcmc->pci_enabled) {
         cxl_machine_init(obj, &pcms->cxl_devices_state);
     }
+
+    pcms->ghes_sci_notifier.notify = pc_sci_error;
+    notifier_list_add(&acpi_generic_error_notifiers,
+                        &pcms->ghes_sci_notifier);
 }
 
 static void pc_machine_reset(MachineState *machine, ResetType type)
@@ -1853,6 +1889,12 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     object_class_property_set_description(oc, PC_MACHINE_SMBIOS_EP,
         "SMBIOS Entry Point type [32, 64]");
 
+    object_class_property_add_bool(oc, "ras", virt_get_ras,
+                                   virt_set_ras);
+    object_class_property_set_description(oc, "ras",
+                                          "Set on/off to enable/disable reporting host memory errors "
+                                          "to a KVM guest using ACPI and guest external abort exceptions");
+
     object_class_property_add_bool(oc, "fd-bootchk",
         pc_machine_get_fd_bootchk,
         pc_machine_set_fd_bootchk);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 103b54301f82..105b087e7af6 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -30,6 +30,9 @@ typedef struct PCMachineState {
     /* State for other subsystems/APIs: */
     Notifier machine_done;
 
+    /* Triggered when a new SCI GHES error raises */
+    Notifier ghes_sci_notifier;
+
     /* Pointers to devices and objects: */
     PCIBus *pcibus;
     I2CBus *smbus;
@@ -51,6 +54,8 @@ typedef struct PCMachineState {
     bool i8042_enabled;
     bool default_bus_bypass_iommu;
     bool fd_bootchk;
+    bool ras;
+
     uint64_t max_fw_size;
 
     /* ACPI Memory hotplug IO base address */
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH RFC 3/3] Add a GED device for RAS notification
  2025-03-04 13:30 [PATCH RFC 0/3] Attempt to add GHES for x86 Mauro Carvalho Chehab
  2025-03-04 13:30 ` [PATCH RFC 1/3] acpi/ghes: move use_hest_addr out of acpi_build_hest() Mauro Carvalho Chehab
  2025-03-04 13:30 ` [PATCH RFC 2/3] hw/i186: add support for HEST table with SCI Mauro Carvalho Chehab
@ 2025-03-04 13:30 ` Mauro Carvalho Chehab
  2025-04-04 12:19 ` [PATCH RFC 0/3] Attempt to add GHES for x86 Igor Mammedov
  2025-05-30 14:41 ` Igor Mammedov
  4 siblings, 0 replies; 7+ messages in thread
From: Mauro Carvalho Chehab @ 2025-03-04 13:30 UTC (permalink / raw)
  To: Igor Mammedov, Michael S . Tsirkin
  Cc: Jonathan Cameron, Shiju Jose, qemu-arm, qemu-devel,
	Mauro Carvalho Chehab, Ani Sinha, Eduardo Habkost,
	Marcel Apfelbaum, Paolo Bonzini, Richard Henderson, linux-kernel

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 hw/i386/acpi-build.c  | 42 ++++++++++++++++++++++++++++++++++++++++++
 hw/i386/pc.c          |  7 +++----
 include/hw/i386/x86.h |  2 ++
 3 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index cf11231b5fe6..ad829e2c75c8 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1352,6 +1352,28 @@ static void build_acpi0017(Aml *table)
     aml_append(table, scope);
 }
 
+#define GED_COMMON_MMIO_BASE         0xfee00000
+#define GED_COMMON_MMIO_BASE_REGS    (GED_COMMON_MMIO_BASE + 0x100)
+#define GED_MMIO_IRQ          9
+
+static DeviceState *create_acpi_ged(X86MachineState *x86ms)
+{
+    DeviceState *dev = qdev_new(TYPE_ACPI_GED);
+
+    fprintf(stderr, "creating GED\n");
+
+    assert(x86ms->gsi);
+
+    qdev_prop_set_uint32(dev, "ged-event", ACPI_GED_ERROR_EVT);
+    sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, GED_COMMON_MMIO_BASE);
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 1, GED_COMMON_MMIO_BASE_REGS);
+    sysbus_connect_irq(SYS_BUS_DEVICE(dev), 0, x86ms->gsi[GED_MMIO_IRQ]);
+
+    return dev;
+}
+
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker,
            AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -1794,6 +1816,26 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         aml_append(dsdt, scope);
     }
 
+    /* Notification type SCI requires GED */
+    if (pcms->ras) {
+        if (!x86ms->ged_dev) {
+            x86ms->ged_dev = create_acpi_ged(x86ms);
+        }
+
+        sb_scope = aml_scope("_SB");
+        build_ged_aml(sb_scope, GED_DEVICE, HOTPLUG_HANDLER(x86ms->ged_dev),
+                      GED_MMIO_IRQ, AML_SYSTEM_MEMORY, GED_COMMON_MMIO_BASE);
+        aml_append(sb_scope, aml_error_device());
+        aml_append(dsdt, sb_scope);
+
+        scope =  aml_scope("_GPE");
+        method = aml_method("L08", 0, AML_NOTSERIALIZED);
+        aml_append(method, aml_notify(aml_name("\\_SB.GEDD"),
+                                      aml_int(0x80)));
+        aml_append(scope, method);
+        aml_append(dsdt, scope);
+    }
+
     /* copy AML table into ACPI tables blob and patch header there */
     g_array_append_vals(table_data, dsdt->buf->data, dsdt->buf->len);
     acpi_table_end(linker, &table);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index b9c32dbdbcd8..b4a2fe61da49 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1700,6 +1700,8 @@ static void virt_set_ras(Object *obj, bool value, Error **errp)
 
 static void pc_sci_error(Notifier *n, void *opaque)
 {
+    PCMachineState *pcms = container_of(n, PCMachineState, ghes_sci_notifier);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     uint16_t *source_id = opaque;
 
     fprintf(stderr, "GHES error for source ID: %d\n", *source_id);
@@ -1710,10 +1712,7 @@ static void pc_sci_error(Notifier *n, void *opaque)
 
     fprintf(stderr, "GHES error notified for QMP\n");
 
-    // TODO: add something equivalent to:
-    // PCMachineState *s = container_of(n, PCMachineState, ghes_sci_notifier);
-    // acpi_send_event(s->acpi_dev, ACPI_GENERIC_ERROR);
-    // by calling acpi_update_sci();
+    acpi_send_event(x86ms->ged_dev, ACPI_GENERIC_ERROR);
 }
 
 static void pc_machine_initfn(Object *obj)
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index d43cb3908e65..06d62a944835 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -54,6 +54,8 @@ struct X86MachineState {
     GMappedFile *initrd_mapped_file;
     HotplugHandler *acpi_dev;
 
+    DeviceState *ged_dev;
+
     /*
      * Map the whole BIOS just underneath the 4 GiB address boundary. Only used
      * in the ROM (-bios) case.
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC 0/3] Attempt to add GHES for x86
  2025-03-04 13:30 [PATCH RFC 0/3] Attempt to add GHES for x86 Mauro Carvalho Chehab
                   ` (2 preceding siblings ...)
  2025-03-04 13:30 ` [PATCH RFC 3/3] Add a GED device for RAS notification Mauro Carvalho Chehab
@ 2025-04-04 12:19 ` Igor Mammedov
  2025-05-30 14:41 ` Igor Mammedov
  4 siblings, 0 replies; 7+ messages in thread
From: Igor Mammedov @ 2025-04-04 12:19 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Ani Sinha, Dongjiu Geng, Eduardo Habkost,
	Paolo Bonzini, Peter Maydell, Richard Henderson, Shannon Zhao

On Tue,  4 Mar 2025 14:30:55 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> Hi Igor,
> 
> This patch series comes after:
>    https://lore.kernel.org/qemu-devel/cover.1740903110.git.mchehab+huawei@kernel.org/
> 
> I'm basically trying to add support for error injection for x86,
> without success so far.
> 
> On x86, the notification mechanism is different: it is via QMP.
> Yet, from what I saw on Linux implementation, it works on
> a similar way to GED. So, I ended implementing a notification
> via GED.
> 
> HEST table build seems to be working OK on it, and the
> error injection notification for x86 is called. Yet, OSPM is not
> receiving any notifications.

microvm already uses GED, so GHES support with GED is fine there  

However ps/q35 have already acpi controller, and I'd rather
add GHES support there instead of adding extra acpi device.
and use GPE as event delivery mechanism (like it's done for
hotplug event delivery)

> Could you help me figuring out what are the missing bits?
> 
> PS.: there are some things at the code that require polishing,
> plus there are some extra printf() there to help debugging.

use tracing instead of printfs

> Finally, we would need to add x86 CPU error event at the
> ghes script, but I'd like to have at least something that the 
> OSPM receives before improving it further.
> 
> It can be tested with:
>    $ ./scripts/ghes_inject.py arm
> 
> (Ok, this would be generating an ARM processor event , but
> I guess Linux would at least mark the event as read, even if
> it doesn't recognize it)
> 
> Mauro Carvalho Chehab (3):
>   acpi/ghes: move use_hest_addr out of acpi_build_hest()
>   hw/i186: add support for HEST table with SCI
>   Add a GED device for RAS notification
> 
>  hw/acpi/ghes.c           | 16 ++++-----
>  hw/arm/virt-acpi-build.c | 12 ++++---
>  hw/i386/Kconfig          |  1 +
>  hw/i386/acpi-build.c     | 75 ++++++++++++++++++++++++++++++++++++++++
>  hw/i386/pc.c             | 41 ++++++++++++++++++++++
>  include/hw/acpi/ghes.h   | 25 +++++++-------
>  include/hw/i386/pc.h     |  5 +++
>  include/hw/i386/x86.h    |  2 ++
>  8 files changed, 153 insertions(+), 24 deletions(-)
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC 1/3] acpi/ghes: move use_hest_addr out of acpi_build_hest()
  2025-03-04 13:30 ` [PATCH RFC 1/3] acpi/ghes: move use_hest_addr out of acpi_build_hest() Mauro Carvalho Chehab
@ 2025-05-30 14:35   ` Igor Mammedov
  0 siblings, 0 replies; 7+ messages in thread
From: Igor Mammedov @ 2025-05-30 14:35 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Ani Sinha, Dongjiu Geng, Peter Maydell, Shannon Zhao,
	linux-kernel

On Tue,  4 Mar 2025 14:30:56 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> The only reason why we're passing ags to acpi HEST table build
> is to check if migration will be used or not.
> 
> Well, we only need migration for arm, as other architectures
> will only use the new code. So, move this out of acpi_build_hest(),

> as otherwise we can't use it for x86, as the hotplug logic there
> may not initialize ags during acpi table build time.

can you expand on what hotplug logic you are talking about?

> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
>  hw/acpi/ghes.c           | 16 ++++++++--------
>  hw/arm/virt-acpi-build.c | 12 ++++++++----
>  include/hw/acpi/ghes.h   | 25 +++++++++++++------------
>  3 files changed, 29 insertions(+), 24 deletions(-)
> 
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 1fd5ba941771..ea00fed75c16 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -243,7 +243,7 @@ ghes_gen_err_data_uncorrectable_recoverable(GArray *block,
>   * Initialize "etc/hardware_errors" and "etc/hardware_errors_addr" fw_cfg blobs.
>   * See docs/specs/acpi_hest_ghes.rst for blobs format.
>   */
> -static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
> +static void build_ghes_error_table(GArray *hardware_errors, bool use_hest_addr,
>                                     BIOSLinker *linker, int num_sources)
>  {
>      int i, error_status_block_offset;
> @@ -289,7 +289,7 @@ static void build_ghes_error_table(AcpiGhesState *ags, GArray *hardware_errors,
>                                         i * ACPI_GHES_MAX_RAW_DATA_LENGTH);
>      }
>  
> -    if (!ags->use_hest_addr) {
> +    if (!use_hest_addr) {
>          /*
>           * Tell firmware to write hardware_errors GPA into
>           * hardware_errors_addr fw_cfg, once the former has been initialized.
> @@ -372,7 +372,7 @@ static void build_ghes_v2_entry(GArray *table_data,
>  }
>  
>  /* Build Hardware Error Source Table */
> -void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> +void acpi_build_hest(GArray *table_data, bool use_hest_addr,
>                       GArray *hardware_errors,
>                       BIOSLinker *linker,
>                       const AcpiNotificationSourceId *notif_source,
> @@ -386,7 +386,7 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
>  
>      hest_offset = table_data->len;
>  
> -    build_ghes_error_table(ags, hardware_errors, linker, num_sources);
> +    build_ghes_error_table(hardware_errors, use_hest_addr, linker, num_sources);
>  
>      acpi_table_begin(&table, table_data);
>  
> @@ -398,7 +398,7 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
>  
>      acpi_table_end(linker, &table);
>  
> -    if (ags->use_hest_addr) {
> +    if (use_hest_addr) {
>          /*
>           * Tell firmware to write into GPA the address of HEST via fw_cfg,
>           * once initialized.
> @@ -411,13 +411,13 @@ void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
>  }
>  
>  void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> -                          GArray *hardware_error)
> +                          bool use_hest_addr, GArray *hardware_error)
>  {
>      /* Create a read-only fw_cfg file for GHES */
>      fw_cfg_add_file(s, ACPI_HW_ERROR_FW_CFG_FILE, hardware_error->data,
>                      hardware_error->len);
>  
> -    if (ags->use_hest_addr) {
> +    if (use_hest_addr) {
>          fw_cfg_add_file_callback(s, ACPI_HEST_ADDR_FW_CFG_FILE, NULL, NULL,
>              NULL, &(ags->hest_addr_le), sizeof(ags->hest_addr_le), false);
>      } else {
> @@ -529,7 +529,7 @@ void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,
>          return;
>      }
>  
> -    if (!ags->use_hest_addr) {
> +    if (ags->hw_error_le) {
>          get_hw_error_offsets(le64_to_cpu(ags->hw_error_le),
>                               &cper_addr, &read_ack_register_addr);
>      } else {
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 154337e1a77b..71da17b652b2 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -964,9 +964,11 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>          acpi_ged_state = ACPI_GED(vms->acpi_dev);
>          ags = &acpi_ged_state->ghes_state;
>          if (ags) {
> +            bool use_hest_addr = ags->use_hest_addr;
> +
>              acpi_add_table(table_offsets, tables_blob);
>  
> -            if (!ags->use_hest_addr) {
> +            if (!use_hest_addr) {
>                  notify = hest_ghes_notify_9_2;
>                  notify_sz = ARRAY_SIZE(hest_ghes_notify_9_2);
>              } else {
> @@ -974,7 +976,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables)
>                  notify_sz = ARRAY_SIZE(hest_ghes_notify);
>              }
>  
> -            acpi_build_hest(ags, tables_blob, tables->hardware_errors,
> +            acpi_build_hest(tables_blob, use_hest_addr, tables->hardware_errors,
>                              tables->linker, notify, notify_sz,
>                              vms->oem_id, vms->oem_table_id);
>          }
> @@ -1143,8 +1145,10 @@ void virt_acpi_setup(VirtMachineState *vms)
>      if (vms->ras) {
>          assert(vms->acpi_dev);
>          acpi_ged_state = ACPI_GED(vms->acpi_dev);
> -        acpi_ghes_add_fw_cfg(&acpi_ged_state->ghes_state,
> -                             vms->fw_cfg, tables.hardware_errors);
> +        AcpiGhesState *ags = &acpi_ged_state->ghes_state;
> +
> +        acpi_ghes_add_fw_cfg(ags, vms->fw_cfg, ags->use_hest_addr,
> +                             tables.hardware_errors);
>      }
>  
>      build_state->rsdp_mr = acpi_add_rom_blob(virt_acpi_build_update,
> diff --git a/include/hw/acpi/ghes.h b/include/hw/acpi/ghes.h
> index df2ecbf6e4a9..eae6d4d0a562 100644
> --- a/include/hw/acpi/ghes.h
> +++ b/include/hw/acpi/ghes.h
> @@ -73,31 +73,32 @@ typedef struct AcpiNotificationSourceId {
>      enum AcpiGhesNotifyType notify;
>  } AcpiNotificationSourceId;
>  
> -/*
> - * AcpiGhesState stores GPA values that will be used to fill HEST entries.
> +/**
> + * struct AcpiGhesState - GPA values that will be used to fill HEST entries
>   *
> - * When use_hest_addr is false, the GPA of the etc/hardware_errors firmware
> - * is stored at hw_error_le. This is the default on QEMU 9.x.
> + * @hest_addr_le: GPA of the HEST table. Used on QEMU 10.x and above.
> + * @hw_error_le: GPA of the etc/hardware_errors firmware.
> + *               Used only on arm64 virt-9.x to preserve compatibility
> + *               with QEMU 9.x.
> + * @use_hest_addr: True if HEST address is present. Used only on arm64,
> + *                 to identify if QEMU 9.x migration is needed.
>   *
> - * When use_hest_addr is true, the GPA of the HEST table is stored at
> - * hest_addr_le. This is the default for QEMU 10.x and above.
> - *
> - * Whe both GPA values are equal to zero means that GHES is not present.
> + * When both GPA values are equal to zero means that GHES is not present.
>   */
>  typedef struct AcpiGhesState {
>      uint64_t hest_addr_le;
>      uint64_t hw_error_le;
> -    bool use_hest_addr; /* True if HEST address is present */
> +    bool use_hest_addr;
>  } AcpiGhesState;
>  
> -void acpi_build_hest(AcpiGhesState *ags, GArray *table_data,
> +void acpi_build_hest(GArray *table_data, bool use_hest_addr,
>                       GArray *hardware_errors,
>                       BIOSLinker *linker,
>                       const AcpiNotificationSourceId * const notif_source,
>                       int num_sources,
>                       const char *oem_id, const char *oem_table_id);
> -void acpi_ghes_add_fw_cfg(AcpiGhesState *vms, FWCfgState *s,
> -                          GArray *hardware_errors);
> +void acpi_ghes_add_fw_cfg(AcpiGhesState *ags, FWCfgState *s,
> +                          bool use_hest_addr, GArray *hardware_error);
>  int acpi_ghes_memory_errors(AcpiGhesState *ags, uint16_t source_id,
>                              uint64_t error_physical_addr);
>  void ghes_record_cper_errors(AcpiGhesState *ags, const void *cper, size_t len,



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC 0/3] Attempt to add GHES for x86
  2025-03-04 13:30 [PATCH RFC 0/3] Attempt to add GHES for x86 Mauro Carvalho Chehab
                   ` (3 preceding siblings ...)
  2025-04-04 12:19 ` [PATCH RFC 0/3] Attempt to add GHES for x86 Igor Mammedov
@ 2025-05-30 14:41 ` Igor Mammedov
  4 siblings, 0 replies; 7+ messages in thread
From: Igor Mammedov @ 2025-05-30 14:41 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Michael S . Tsirkin, Jonathan Cameron, Shiju Jose, qemu-arm,
	qemu-devel, Ani Sinha, Dongjiu Geng, Eduardo Habkost,
	Paolo Bonzini, Peter Maydell, Richard Henderson, Shannon Zhao

On Tue,  4 Mar 2025 14:30:55 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> Hi Igor,
> 
> This patch series comes after:
>    https://lore.kernel.org/qemu-devel/cover.1740903110.git.mchehab+huawei@kernel.org/
> 
> I'm basically trying to add support for error injection for x86,
> without success so far.
> 
> On x86, the notification mechanism is different: it is via QMP.
> Yet, from what I saw on Linux implementation, it works on
> a similar way to GED. So, I ended implementing a notification
> via GED.

I'd use GED only for microvm.
pc/q35 already have ACPI controller builtin,
so I'd rather use that for hosting hest addr logic and event delivery.


> 
> HEST table build seems to be working OK on it, and the
> error injection notification for x86 is called. Yet, OSPM is not
> receiving any notifications.
> 
> Could you help me figuring out what are the missing bits?
> 
> PS.: there are some things at the code that require polishing,
> plus there are some extra printf() there to help debugging.
> Finally, we would need to add x86 CPU error event at the
> ghes script, but I'd like to have at least something that the 
> OSPM receives before improving it further.
> 
> It can be tested with:
>    $ ./scripts/ghes_inject.py arm
> 
> (Ok, this would be generating an ARM processor event , but
> I guess Linux would at least mark the event as read, even if
> it doesn't recognize it)
> 
> Mauro Carvalho Chehab (3):
>   acpi/ghes: move use_hest_addr out of acpi_build_hest()
>   hw/i186: add support for HEST table with SCI
>   Add a GED device for RAS notification
> 
>  hw/acpi/ghes.c           | 16 ++++-----
>  hw/arm/virt-acpi-build.c | 12 ++++---
>  hw/i386/Kconfig          |  1 +
>  hw/i386/acpi-build.c     | 75 ++++++++++++++++++++++++++++++++++++++++
>  hw/i386/pc.c             | 41 ++++++++++++++++++++++
>  include/hw/acpi/ghes.h   | 25 +++++++-------
>  include/hw/i386/pc.h     |  5 +++
>  include/hw/i386/x86.h    |  2 ++
>  8 files changed, 153 insertions(+), 24 deletions(-)
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-05-30 14:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-04 13:30 [PATCH RFC 0/3] Attempt to add GHES for x86 Mauro Carvalho Chehab
2025-03-04 13:30 ` [PATCH RFC 1/3] acpi/ghes: move use_hest_addr out of acpi_build_hest() Mauro Carvalho Chehab
2025-05-30 14:35   ` Igor Mammedov
2025-03-04 13:30 ` [PATCH RFC 2/3] hw/i186: add support for HEST table with SCI Mauro Carvalho Chehab
2025-03-04 13:30 ` [PATCH RFC 3/3] Add a GED device for RAS notification Mauro Carvalho Chehab
2025-04-04 12:19 ` [PATCH RFC 0/3] Attempt to add GHES for x86 Igor Mammedov
2025-05-30 14:41 ` Igor Mammedov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).