* [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture
@ 2025-10-29 11:26 Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 01/10] riscv: Define ioremap_cache for RISC-V Himanshu Chauhan
` (9 more replies)
0 siblings, 10 replies; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
RAS stands for Reliability, Availability and Serviceability.
This series implements RAS support for the RISC-V architecture using
the RISC-V RERI specification[1]. It conforms to the ACPI Platform Error
Interfaces (APEI). It uses the highest priority Supervisor Software
Events (SSE)[2] to deliver hardware error events to the kernel.
The SSE implementation has already been merged in OpenSBI. Clement
has sent a patch series for its implementation in the Linux kernel.[5]
The GHES driver framework is used as is with the following changes for RISC-V:
1. Register each GHES entry with the SSE layer. The GHES notification vector is the SSE event ID.
2. Add RISC-V specific entries for processor type and ISA string
3. Add fixmap indices for GHES SSE low and high priority to help map and read from
physical addresses present in a GHES entry.
4. Other changes to build/configure the RAS support
How to Use:
----------
The RAS stack consists of Qemu[3], OpenSBI, EDK2[4], the Linux kernel, and the devmem utility,
which is used to inject and trigger errors. Qemu[3] has support to emulate RISC-V RERI. The RAS
agent is implemented in OpenSBI, which creates the CPER records. EDK2 generates the HEST table and
populates it with GHES entries with the help of OpenSBI.
Qemu Command:
------------
<qemu-dir>/build/qemu-system-riscv64 \
-s -accel tcg -m 4096 -smp 2 \
-cpu rv64,smepmp=false \
-serial mon:stdio \
-d guest_errors -D ./qemu.log \
-bios <opensbi-dir>/build/platform/generic/firmware/fw_dynamic.bin \
-monitor telnet:127.0.0.1:55555,server,nowait \
-device virtio-gpu-pci -full-screen \
-device qemu-xhci \
-device usb-kbd \
-blockdev node-name=pflash0,driver=file,read-only=on,filename=<edk2-build-dir>/RiscVVirtQemu/RELEASE_GCC5/FV/RISCV_VIRT_CODE.fd \
-blockdev node-name=pflash1,driver=file,filename=<edk2-build-dir>/RiscVVirtQemu/RELEASE_GCC5/FV/RISCV_VIRT_VARS.fd \
-M virt,pflash0=pflash0,pflash1=pflash1,rpmi=true,reri=true,aia=aplic-imsic \
-kernel <kernel image> \
-initrd <rootfs image> \
-append "root=/dev/ram rw console=ttyS0 earlycon=uart8250,mmio,0x10000000"
Error Injection & Triggering:
----------------------------
devmem 0x4010040 32 0x2a1
devmem 0x4010048 32 0x9001404
devmem 0x4010044 8 1
The above commands inject a TLB error on CPU 0.
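For readers without a devmem binary in the rootfs, the writes above can be approximated with a small C helper. This is only a sketch: it assumes /dev/mem is available and writable in the guest, and the helper names (page_base, page_offset, devmem_write) are hypothetical. It makes no claim about what the individual register fields mean.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Page-aligned base of a physical address; page_size must be a power of two. */
static uint64_t page_base(uint64_t phys, uint64_t page_size)
{
	assert(page_size && !(page_size & (page_size - 1)));
	return phys & ~(page_size - 1);
}

/* Offset of a physical address within its page. */
static uint64_t page_offset(uint64_t phys, uint64_t page_size)
{
	return phys & (page_size - 1);
}

/*
 * Write `value` as an 8-bit or 32-bit access to physical address `phys`
 * through /dev/mem. Returns 0 on success and -1 on any failure.
 */
static int devmem_write(uint64_t phys, unsigned int width_bits, uint32_t value)
{
	long psz = sysconf(_SC_PAGESIZE);
	volatile uint8_t *reg;
	uint64_t page_size;
	void *map;
	int fd;

	if (psz <= 0 || (width_bits != 8 && width_bits != 32))
		return -1;
	/* A 32-bit access must be naturally aligned. */
	if (width_bits == 32 && (phys & 3))
		return -1;
	page_size = (uint64_t)psz;

	fd = open("/dev/mem", O_RDWR | O_SYNC);
	if (fd < 0)
		return -1;
	map = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
		   (off_t)page_base(phys, page_size));
	close(fd);	/* the mapping stays valid after close() */
	if (map == MAP_FAILED)
		return -1;

	reg = (volatile uint8_t *)map + page_offset(phys, page_size);
	if (width_bits == 8)
		*reg = (uint8_t)value;
	else
		*(volatile uint32_t *)reg = value;

	munmap(map, page_size);
	return 0;
}
```

With this helper, the three commands would correspond to devmem_write(0x4010040, 32, 0x2a1), devmem_write(0x4010048, 32, 0x9001404) and devmem_write(0x4010044, 8, 1), issued in that order.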
Sample Output (CPU 0):
---------------------
[ 34.370282] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 34.371375] {1}[Hardware Error]: event severity: recoverable
[ 34.372149] {1}[Hardware Error]: Error 0, type: recoverable
[ 34.372756] {1}[Hardware Error]: section_type: general processor error
[ 34.373357] {1}[Hardware Error]: processor_type: 3, RISCV
[ 34.373806] {1}[Hardware Error]: processor_isa: 6, RISCV64
[ 34.374294] {1}[Hardware Error]: error_type: 0x02
[ 34.374845] {1}[Hardware Error]: TLB error
[ 34.375448] {1}[Hardware Error]: operation: 1, data read
[ 34.376100] {1}[Hardware Error]: target_address: 0x0000000000000000
References:
----------
[1] RERI Specification: https://github.com/riscv-non-isa/riscv-ras-eri/releases/download/v1.0/riscv-reri.pdf
[2] SSE section in the SBI specification v3.0: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v3.0/riscv-sbi.pdf
[3] Qemu source (with RERI emulation support): https://github.com/ventanamicro/qemu.git (branch: dev-upstream)
[4] EDK2: https://github.com/ventanamicro/edk2.git (branch: dev-upstream)
[5] SSE Kernel Patches (v7): https://lore.kernel.org/all/20250908181717.1997461-1-cleger@rivosinc.com/
Changes in v2:
- Made changes to be conformant with SSE v7 patches
- Fixed some bot warnings
Himanshu Chauhan (10):
riscv: Define ioremap_cache for RISC-V
riscv: Define arch_apei_get_mem_attribute for RISC-V
acpi: Introduce SSE in HEST notification types
riscv: Add fixmap indices for GHES IRQ and SSE contexts
riscv: conditionally compile GHES NMI spool function
riscv: Add functions to register ghes having SSE notification
riscv: Add RISC-V entries in processor type and ISA strings
riscv: Introduce HEST SSE notification handlers
riscv: Select HAVE_ACPI_APEI required for RAS
riscv: Enable APEI GHES driver in defconfig
arch/riscv/Kconfig | 1 +
arch/riscv/configs/defconfig | 3 +
arch/riscv/include/asm/acpi.h | 20 ++++
arch/riscv/include/asm/fixmap.h | 8 ++
arch/riscv/include/asm/io.h | 3 +
drivers/acpi/apei/Kconfig | 5 +
drivers/acpi/apei/ghes.c | 103 +++++++++++++++--
drivers/firmware/efi/cper.c | 3 +
drivers/firmware/riscv/riscv_sbi_sse.c | 147 +++++++++++++++++++++++++
include/acpi/actbl1.h | 3 +-
include/linux/riscv_sbi_sse.h | 16 +++
11 files changed, 300 insertions(+), 12 deletions(-)
--
2.43.0
* [RFC PATCH v2 01/10] riscv: Define ioremap_cache for RISC-V
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 02/10] riscv: Define arch_apei_get_mem_attribute " Himanshu Chauhan
` (8 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
The BERT and EINJ drivers use ioremap_cache() for mapping entries,
but ioremap_cache() is not defined for RISC-V.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
arch/riscv/include/asm/io.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
index 09bb5f57a9d3..5550b28f38db 100644
--- a/arch/riscv/include/asm/io.h
+++ b/arch/riscv/include/asm/io.h
@@ -142,6 +142,9 @@ __io_writes_outs(outs, u64, q, __io_pbr(), __io_paw())
#ifdef CONFIG_MMU
#define arch_memremap_wb(addr, size, flags) \
((__force void *)ioremap_prot((addr), (size), __pgprot(_PAGE_KERNEL)))
+
+#define ioremap_cache(addr, size) \
+ ((__force void *)ioremap_prot((addr), (size), PAGE_KERNEL))
#endif
#endif /* _ASM_RISCV_IO_H */
--
2.43.0
* [RFC PATCH v2 02/10] riscv: Define arch_apei_get_mem_attribute for RISC-V
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 01/10] riscv: Define ioremap_cache for RISC-V Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 03/10] acpi: Introduce SSE in HEST notification types Himanshu Chauhan
` (7 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
ghes_map function uses arch_apei_get_mem_attribute to get the
protection bits for a given physical address. These protection
bits are then used to map the physical address.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
arch/riscv/include/asm/acpi.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/arch/riscv/include/asm/acpi.h b/arch/riscv/include/asm/acpi.h
index 6e13695120bc..0c599452ef48 100644
--- a/arch/riscv/include/asm/acpi.h
+++ b/arch/riscv/include/asm/acpi.h
@@ -27,6 +27,26 @@ extern int acpi_disabled;
extern int acpi_noirq;
extern int acpi_pci_disabled;
+#ifdef CONFIG_ACPI_APEI
+/*
+ * acpi_disable_cmcff is used in drivers/acpi/apei/hest.c for disabling
+ * IA-32 Architecture Corrected Machine Check (CMC) Firmware-First mode
+ * with a kernel command line parameter "acpi=nocmcoff". But we don't
+ * have this IA-32 specific feature on RISC-V, this definition is only
+ * for compatibility.
+ */
+#define acpi_disable_cmcff 1
+static inline pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
+{
+ /*
+ * Until we have a way to look for EFI memory attributes.
+ */
+ return PAGE_KERNEL;
+}
+#else /* CONFIG_ACPI_APEI */
+#define acpi_disable_cmcff 0
+#endif /* !CONFIG_ACPI_APEI */
+
static inline void disable_acpi(void)
{
acpi_disabled = 1;
--
2.43.0
* [RFC PATCH v2 03/10] acpi: Introduce SSE in HEST notification types
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 01/10] riscv: Define ioremap_cache for RISC-V Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 02/10] riscv: Define arch_apei_get_mem_attribute " Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-11-05 8:33 ` Clément Léger
2025-10-29 11:26 ` [RFC PATCH v2 04/10] riscv: Add fixmap indices for GHES IRQ and SSE contexts Himanshu Chauhan
` (6 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
Introduce a new HEST notification type for RISC-V SSE events.
The GHES entry's notification structure contains the notification
to be used for a given error source. For error sources delivering
events over SSE, it should contain the new SSE notification type.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
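As an illustration of how the renumbered range behaves, here is a minimal user-space sketch. It is not the ACPICA header: the enum and helper names are hypothetical, and only the values visible in the hunk below are reproduced.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as they appear in the hunk below; earlier entries are omitted. */
enum hest_notify_sketch {
	NOTIFY_SEI = 9,
	NOTIFY_GSIV = 10,
	NOTIFY_SOFTWARE_DELEGATED = 11,
	NOTIFY_SSE = 12,	/* new: RISC-V SSE */
	NOTIFY_RESERVED = 13,	/* 13 and greater are reserved */
};

/* The new type takes the first previously reserved value. */
_Static_assert(NOTIFY_SSE + 1 == NOTIFY_RESERVED,
	       "SSE must sit directly below the reserved range");

/* True for any value in the reserved range. */
static bool notify_type_is_reserved(unsigned int type)
{
	return type >= NOTIFY_RESERVED;
}

/* True only for the SSE notification type. */
static bool notify_type_is_sse(unsigned int type)
{
	return type == NOTIFY_SSE;
}
```

The point of the sketch is that value 12 moves out of the reserved range, so code that rejects everything at or above the reserved marker keeps working without further changes.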
include/acpi/actbl1.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
index 7f35eb0e8458..20b490227398 100644
--- a/include/acpi/actbl1.h
+++ b/include/acpi/actbl1.h
@@ -1535,7 +1535,8 @@ enum acpi_hest_notify_types {
ACPI_HEST_NOTIFY_SEI = 9, /* ACPI 6.1 */
ACPI_HEST_NOTIFY_GSIV = 10, /* ACPI 6.1 */
ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED = 11, /* ACPI 6.2 */
- ACPI_HEST_NOTIFY_RESERVED = 12 /* 12 and greater are reserved */
+ ACPI_HEST_NOTIFY_SSE = 12, /* RISCV SSE */
+ ACPI_HEST_NOTIFY_RESERVED = 13 /* 13 and greater are reserved */
};
/* Values for config_write_enable bitfield above */
--
2.43.0
* [RFC PATCH v2 04/10] riscv: Add fixmap indices for GHES IRQ and SSE contexts
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
` (2 preceding siblings ...)
2025-10-29 11:26 ` [RFC PATCH v2 03/10] acpi: Introduce SSE in HEST notification types Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-11-05 8:41 ` Clément Léger
2025-10-29 11:26 ` [RFC PATCH v2 05/10] riscv: conditionally compile GHES NMI spool function Himanshu Chauhan
` (5 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
GHES error handling requires fixmap entries for IRQ notifications.
Add fixmap indices for IRQ, SSE Low and High priority notifications.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
arch/riscv/include/asm/fixmap.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
index 0a55099bb734..e874fd952286 100644
--- a/arch/riscv/include/asm/fixmap.h
+++ b/arch/riscv/include/asm/fixmap.h
@@ -38,6 +38,14 @@ enum fixed_addresses {
FIX_TEXT_POKE0,
FIX_EARLYCON_MEM_BASE,
+#ifdef CONFIG_ACPI_APEI_GHES
+ /* Used for GHES mapping from assorted contexts */
+ FIX_APEI_GHES_IRQ,
+#ifdef CONFIG_RISCV_SBI_SSE
+ FIX_APEI_GHES_SSE_LOW_PRIORITY,
+ FIX_APEI_GHES_SSE_HIGH_PRIORITY,
+#endif /* CONFIG_RISCV_SBI_SSE */
+#endif /* CONFIG_ACPI_APEI_GHES */
__end_of_permanent_fixed_addresses,
/*
* Temporary boot-time mappings, used by early_ioremap(),
--
2.43.0
* [RFC PATCH v2 05/10] riscv: conditionally compile GHES NMI spool function
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
` (3 preceding siblings ...)
2025-10-29 11:26 ` [RFC PATCH v2 04/10] riscv: Add fixmap indices for GHES IRQ and SSE contexts Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 06/10] riscv: Add functions to register ghes having SSE notification Himanshu Chauhan
` (4 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
Compile ghes_in_nmi_spool_from_list only when NMI or SEA
is enabled. Otherwise compilation fails with a "defined but
not used" error.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
drivers/acpi/apei/ghes.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 97ee19f2cae0..f2cbd7414faf 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1356,6 +1356,7 @@ static int ghes_in_nmi_queue_one_entry(struct ghes *ghes,
return rc;
}
+#if defined(CONFIG_HAVE_ACPI_APEI_NMI) || defined(CONFIG_ACPI_APEI_SEA)
static int ghes_in_nmi_spool_from_list(struct list_head *rcu_list,
enum fixed_addresses fixmap_idx)
{
@@ -1374,6 +1375,7 @@ static int ghes_in_nmi_spool_from_list(struct list_head *rcu_list,
return ret;
}
+#endif
#ifdef CONFIG_ACPI_APEI_SEA
static LIST_HEAD(ghes_sea);
--
2.43.0
* [RFC PATCH v2 06/10] riscv: Add functions to register ghes having SSE notification
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
` (4 preceding siblings ...)
2025-10-29 11:26 ` [RFC PATCH v2 05/10] riscv: conditionally compile GHES NMI spool function Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-11-05 10:33 ` Clément Léger
2025-10-29 11:26 ` [RFC PATCH v2 07/10] riscv: Add RISC-V entries in processor type and ISA strings Himanshu Chauhan
` (3 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
Add functions to register the ghes entries which have SSE as
notification type. The vector inside the ghes is the SSE event
ID that should be registered.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
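The registration model described above (one SSE event shared by every GHES entry that names it as its vector) can be pictured with the following user-space sketch. It is deliberately simplified and is not the kernel code: fixed-size arrays stand in for the linked lists, there is no locking, and all names (event_slot, registry_add, registry_remove, registry_dispatch, count_hit) are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EVENTS  4	/* hypothetical capacities for the sketch */
#define MAX_SOURCES 4

typedef int (*handler_fn)(uint32_t event_num, void *source);

struct event_slot {
	int used;
	uint32_t event_num;
	size_t nr;
	void *source[MAX_SOURCES];
	handler_fn cb[MAX_SOURCES];
};

static struct event_slot slots[MAX_EVENTS];

/* Return the slot for event_num, else a free slot, else NULL. */
static struct event_slot *find_slot(uint32_t event_num)
{
	struct event_slot *free_slot = NULL;
	size_t i;

	for (i = 0; i < MAX_EVENTS; i++) {
		if (slots[i].used && slots[i].event_num == event_num)
			return &slots[i];
		if (!slots[i].used && !free_slot)
			free_slot = &slots[i];
	}
	return free_slot;
}

/* Attach a source to an event; sources naming the same event share a slot. */
static int registry_add(uint32_t event_num, void *source, handler_fn cb)
{
	struct event_slot *slot = find_slot(event_num);
	size_t i;

	if (!source || !cb)
		return -EINVAL;
	if (!slot)
		return -ENOMEM;
	if (!slot->used) {	/* first source for this event */
		slot->used = 1;
		slot->event_num = event_num;
	}
	for (i = 0; i < slot->nr; i++)
		if (slot->source[i] == source)
			return -EALREADY;
	if (slot->nr == MAX_SOURCES)
		return -ENOMEM;
	slot->source[slot->nr] = source;
	slot->cb[slot->nr++] = cb;
	return 0;
}

/* Detach a source; a slot is released once its last source is gone. */
static int registry_remove(void *source)
{
	int found = 0;
	size_t i, j;

	for (i = 0; i < MAX_EVENTS; i++) {
		struct event_slot *slot = &slots[i];

		for (j = 0; j < slot->nr; j++) {
			if (slot->source[j] != source)
				continue;
			/* Move the last entry into the hole. */
			slot->nr--;
			slot->source[j] = slot->source[slot->nr];
			slot->cb[j] = slot->cb[slot->nr];
			found = 1;
			break;
		}
		if (slot->nr == 0)
			slot->used = 0;
	}
	return found ? 0 : -ENOENT;
}

/* Deliver one event to every attached source; returns the number called. */
static int registry_dispatch(uint32_t event_num)
{
	struct event_slot *slot = find_slot(event_num);
	size_t i;

	if (!slot || !slot->used)
		return 0;
	for (i = 0; i < slot->nr; i++)
		slot->cb[i](event_num, slot->source[i]);
	return (int)slot->nr;
}

/* Example handler: treats the source pointer as an int hit counter. */
static int count_hit(uint32_t event_num, void *source)
{
	(void)event_num;
	++*(int *)source;
	return 0;
}
```

Registering the same source twice for one event is rejected with -EALREADY, mirroring the duplicate check in the patch, and the event slot is released only when its last source is removed.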
drivers/firmware/riscv/riscv_sbi_sse.c | 147 +++++++++++++++++++++++++
include/linux/riscv_sbi_sse.h | 16 +++
2 files changed, 163 insertions(+)
diff --git a/drivers/firmware/riscv/riscv_sbi_sse.c b/drivers/firmware/riscv/riscv_sbi_sse.c
index 6561c7acdaaa..46ebc9e9651c 100644
--- a/drivers/firmware/riscv/riscv_sbi_sse.c
+++ b/drivers/firmware/riscv/riscv_sbi_sse.c
@@ -5,6 +5,8 @@
#define pr_fmt(fmt) "sse: " fmt
+#include <acpi/ghes.h>
+#include <linux/acpi.h>
#include <linux/cpu.h>
#include <linux/cpuhotplug.h>
#include <linux/cpu_pm.h>
@@ -700,3 +702,148 @@ static int __init sse_init(void)
return ret;
}
arch_initcall(sse_init);
+
+struct sse_ghes_callback {
+ struct list_head head;
+ struct ghes *ghes;
+ sse_event_handler_fn *callback;
+};
+
+struct sse_ghes_event_data {
+ struct list_head head;
+ u32 event_num;
+ struct list_head callback_list;
+ struct sse_event *event;
+};
+
+static DEFINE_SPINLOCK(sse_ghes_event_list_lock);
+static LIST_HEAD(sse_ghes_event_list);
+
+static int sse_ghes_handler(u32 event_num, void *arg, struct pt_regs *regs)
+{
+ struct sse_ghes_event_data *ev_data = arg;
+ struct sse_ghes_callback *cb = NULL;
+
+ list_for_each_entry(cb, &ev_data->callback_list, head) {
+ if (cb && cb->ghes && cb->callback) {
+ cb->callback(ev_data->event_num, cb->ghes, regs);
+ }
+ }
+
+ return 0;
+}
+
+int sse_register_ghes(struct ghes *ghes, sse_event_handler_fn *lo_cb,
+ sse_event_handler_fn *hi_cb)
+{
+ struct sse_ghes_event_data *ev_data, *evd;
+ struct sse_ghes_callback *cb;
+ u32 ev_num;
+ int err;
+
+ if (!sse_available)
+ return -EOPNOTSUPP;
+ if (!ghes || !lo_cb || !hi_cb)
+ return -EINVAL;
+
+ ev_num = ghes->generic->notify.vector;
+
+ ev_data = NULL;
+ spin_lock(&sse_ghes_event_list_lock);
+ list_for_each_entry(evd, &sse_ghes_event_list, head) {
+ if (evd->event_num == ev_num) {
+ ev_data = evd;
+ break;
+ }
+ }
+ spin_unlock(&sse_ghes_event_list_lock);
+
+ if (!ev_data) {
+ ev_data = kzalloc(sizeof(*ev_data), GFP_KERNEL);
+ if (!ev_data)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&ev_data->head);
+ ev_data->event_num = ev_num;
+
+ INIT_LIST_HEAD(&ev_data->callback_list);
+
+ ev_data->event = sse_event_register(ev_num, ev_num,
+ sse_ghes_handler, ev_data);
+ if (IS_ERR(ev_data->event)) {
+ pr_err("%s: Couldn't register event 0x%x\n", __func__, ev_num);
+ kfree(ev_data);
+ return -ENOMEM;
+ }
+
+ err = sse_event_enable(ev_data->event);
+ if (err) {
+ pr_err("%s: Couldn't enable event 0x%x\n", __func__, ev_num);
+ sse_event_unregister(ev_data->event);
+ kfree(ev_data);
+ return err;
+ }
+
+ spin_lock(&sse_ghes_event_list_lock);
+ list_add_tail(&ev_data->head, &sse_ghes_event_list);
+ spin_unlock(&sse_ghes_event_list_lock);
+ }
+
+ list_for_each_entry(cb, &ev_data->callback_list, head) {
+ if (cb->ghes == ghes)
+ return -EALREADY;
+ }
+
+ cb = kzalloc(sizeof(*cb), GFP_KERNEL);
+ if (!cb)
+ return -ENOMEM;
+ INIT_LIST_HEAD(&cb->head);
+ cb->ghes = ghes;
+ cb->callback = lo_cb;
+ list_add_tail(&cb->head, &ev_data->callback_list);
+
+ return 0;
+}
+
+int sse_unregister_ghes(struct ghes *ghes)
+{
+ struct sse_ghes_event_data *ev_data, *tmp;
+ struct sse_ghes_callback *cb;
+ int free_ev_data = 0;
+
+ if (!ghes)
+ return -EINVAL;
+
+ spin_lock(&sse_ghes_event_list_lock);
+
+ list_for_each_entry_safe(ev_data, tmp, &sse_ghes_event_list, head) {
+ list_for_each_entry(cb, &ev_data->callback_list, head) {
+ if (cb->ghes != ghes)
+ continue;
+
+ list_del(&cb->head);
+ kfree(cb);
+ break;
+ }
+
+ if (list_empty(&ev_data->callback_list))
+ free_ev_data = 1;
+
+ if (free_ev_data) {
+ spin_unlock(&sse_ghes_event_list_lock);
+
+ sse_event_disable(ev_data->event);
+ sse_event_unregister(ev_data->event);
+ ev_data->event = NULL;
+
+ spin_lock(&sse_ghes_event_list_lock);
+
+ list_del(&ev_data->head);
+ kfree(ev_data);
+ }
+ }
+
+ spin_unlock(&sse_ghes_event_list_lock);
+
+ return 0;
+}
diff --git a/include/linux/riscv_sbi_sse.h b/include/linux/riscv_sbi_sse.h
index a1b58e89dd19..cd615b479f82 100644
--- a/include/linux/riscv_sbi_sse.h
+++ b/include/linux/riscv_sbi_sse.h
@@ -11,6 +11,7 @@
struct sse_event;
struct pt_regs;
+struct ghes;
typedef int (sse_event_handler_fn)(u32 event_num, void *arg,
struct pt_regs *regs);
@@ -24,6 +25,10 @@ void sse_event_unregister(struct sse_event *evt);
int sse_event_set_target_cpu(struct sse_event *sse_evt, unsigned int cpu);
+int sse_register_ghes(struct ghes *ghes, sse_event_handler_fn *lo_cb,
+ sse_event_handler_fn *hi_cb);
+int sse_unregister_ghes(struct ghes *ghes);
+
int sse_event_enable(struct sse_event *sse_evt);
void sse_event_disable(struct sse_event *sse_evt);
@@ -47,6 +52,17 @@ static inline int sse_event_set_target_cpu(struct sse_event *sse_evt,
return -EOPNOTSUPP;
}
+static inline int sse_register_ghes(struct ghes *ghes, sse_event_handler_fn *lo_cb,
+ sse_event_handler_fn *hi_cb)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline int sse_unregister_ghes(struct ghes *ghes)
+{
+ return -EOPNOTSUPP;
+}
+
static inline int sse_event_enable(struct sse_event *sse_evt)
{
return -EOPNOTSUPP;
--
2.43.0
* [RFC PATCH v2 07/10] riscv: Add RISC-V entries in processor type and ISA strings
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
` (5 preceding siblings ...)
2025-10-29 11:26 ` [RFC PATCH v2 06/10] riscv: Add functions to register ghes having SSE notification Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 08/10] riscv: Introduce HEST SSE notification handlers Himanshu Chauhan
` (2 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
Add RISCV and RISCV32/64 strings to the processor type and ISA string
tables respectively. These are defined for CPER records.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
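The index-to-string mapping can be checked against the sample output in the cover letter with a short user-space sketch. The helper names are hypothetical, and the first two ISA entries ("IA32", "IA64") are assumed from the existing table in cper.c since they fall outside the hunk context below.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Processor type table with the new RISC-V entry appended. */
static const char * const proc_type_strs[] = {
	"IA32/X64",
	"IA64",
	"ARM",
	"RISCV",
};

/* ISA table; the first two entries are assumed from the existing file. */
static const char * const proc_isa_strs[] = {
	"IA32",
	"IA64",
	"X64",
	"ARM A32/T32",
	"ARM A64",
	"RISCV32",
	"RISCV64",
};

/* Bounds-checked lookups: out-of-range indices map to "unknown". */
static const char *proc_type_name(unsigned int idx)
{
	return idx < ARRAY_SIZE(proc_type_strs) ? proc_type_strs[idx] : "unknown";
}

static const char *proc_isa_name(unsigned int idx)
{
	return idx < ARRAY_SIZE(proc_isa_strs) ? proc_isa_strs[idx] : "unknown";
}

/* Check the two values shown in the cover letter's sample output. */
static void check_sample_output(void)
{
	assert(strcmp(proc_type_name(3), "RISCV") == 0);
	assert(strcmp(proc_isa_name(6), "RISCV64") == 0);
}
```

Because both tables are indexed directly by the value in the CPER record, the new strings have to be appended in the order of their assigned values: RISCV at index 3, RISCV32 at 5 and RISCV64 at 6.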
drivers/firmware/efi/cper.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 928409199a1a..ebdd92ba1e15 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -110,6 +110,7 @@ static const char * const proc_type_strs[] = {
"IA32/X64",
"IA64",
"ARM",
+ "RISCV",
};
static const char * const proc_isa_strs[] = {
@@ -118,6 +119,8 @@ static const char * const proc_isa_strs[] = {
"X64",
"ARM A32/T32",
"ARM A64",
+ "RISCV32",
+ "RISCV64",
};
const char * const cper_proc_error_type_strs[] = {
--
2.43.0
* [RFC PATCH v2 08/10] riscv: Introduce HEST SSE notification handlers
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
` (6 preceding siblings ...)
2025-10-29 11:26 ` [RFC PATCH v2 07/10] riscv: Add RISC-V entries in processor type and ISA strings Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 09/10] riscv: Select HAVE_ACPI_APEI required for RAS Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 10/10] riscv: Enable APEI GHES driver in defconfig Himanshu Chauhan
9 siblings, 0 replies; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
Add a config option to enable SSE in APEI. When it is enabled, functions
to register/unregister a ghes entry with SSE are available along with
low and high priority event handlers. If an SSE notification type is
determined, a ghes common handler to handle an error event is registered.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
drivers/acpi/apei/Kconfig | 5 ++
drivers/acpi/apei/ghes.c | 101 +++++++++++++++++++++++++++++++++-----
2 files changed, 95 insertions(+), 11 deletions(-)
diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
index 070c07d68dfb..ada95a50805f 100644
--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -46,6 +46,11 @@ config ACPI_APEI_SEA
depends on ARM64 && ACPI_APEI_GHES
default y
+config ACPI_APEI_SSE
+ bool
+ depends on RISCV && RISCV_SBI_SSE && ACPI_APEI_GHES
+ default y
+
config ACPI_APEI_MEMORY_FAILURE
bool "APEI memory error recovering support"
depends on ACPI_APEI && MEMORY_FAILURE
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index f2cbd7414faf..3c47249245d1 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -17,6 +17,8 @@
* Author: Huang Ying <ying.huang@intel.com>
*/
+#include <linux/err.h>
+#include <linux/riscv_sbi_sse.h>
#include <linux/arm_sdei.h>
#include <linux/kernel.h>
#include <linux/moduleparam.h>
@@ -97,6 +99,11 @@
#define FIX_APEI_GHES_SDEI_CRITICAL __end_of_fixed_addresses
#endif
+#ifndef CONFIG_RISCV_SBI_SSE
+#define FIX_APEI_GHES_SSE_LOW_PRIORITY __end_of_fixed_addresses
+#define FIX_APEI_GHES_SSE_HIGH_PRIORITY __end_of_fixed_addresses
+#endif
+
static ATOMIC_NOTIFIER_HEAD(ghes_report_chain);
static inline bool is_hest_type_generic_v2(struct ghes *ghes)
@@ -1530,6 +1537,63 @@ static int apei_sdei_unregister_ghes(struct ghes *ghes)
return sdei_unregister_ghes(ghes);
}
+#if defined(CONFIG_ACPI_APEI_SSE)
+/* SSE Handlers */
+static int __ghes_sse_callback(struct ghes *ghes,
+ enum fixed_addresses fixmap_idx)
+{
+ if (!ghes_in_nmi_queue_one_entry(ghes, fixmap_idx)) {
+ irq_work_queue(&ghes_proc_irq_work);
+
+ return 0;
+ }
+
+ return -ENOENT;
+}
+
+/* Low priority */
+static int ghes_sse_lo_callback(u32 event_num, void *arg, struct pt_regs *regs)
+{
+ static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sse_lo);
+ struct ghes *ghes = arg;
+ int err;
+
+ raw_spin_lock(&ghes_notify_lock_sse_lo);
+ err = __ghes_sse_callback(ghes, FIX_APEI_GHES_SSE_LOW_PRIORITY);
+ raw_spin_unlock(&ghes_notify_lock_sse_lo);
+
+ return err;
+}
+
+/* High priority */
+static int ghes_sse_hi_callback(u32 event_num, void *arg, struct pt_regs *regs)
+{
+ static DEFINE_RAW_SPINLOCK(ghes_notify_lock_sse_hi);
+ struct ghes *ghes = arg;
+ int err;
+
+ raw_spin_lock(&ghes_notify_lock_sse_hi);
+ err = __ghes_sse_callback(ghes, FIX_APEI_GHES_SSE_HIGH_PRIORITY);
+ raw_spin_unlock(&ghes_notify_lock_sse_hi);
+
+ return err;
+}
+
+static int apei_sse_register_ghes(struct ghes *ghes)
+{
+ return sse_register_ghes(ghes, ghes_sse_lo_callback,
+ ghes_sse_hi_callback);
+}
+
+static int apei_sse_unregister_ghes(struct ghes *ghes)
+{
+ return sse_unregister_ghes(ghes);
+}
+#else /* CONFIG_ACPI_APEI_SSE */
+static int apei_sse_register_ghes(struct ghes *ghes) { return -ENOTSUPP; }
+static int apei_sse_unregister_ghes(struct ghes *ghes) { return -ENOTSUPP; }
+#endif
+
static int ghes_probe(struct platform_device *ghes_dev)
{
struct acpi_hest_generic *generic;
@@ -1576,6 +1640,15 @@ static int ghes_probe(struct platform_device *ghes_dev)
pr_warn(GHES_PFX "Generic hardware error source: %d notified via local interrupt is not supported!\n",
generic->header.source_id);
goto err;
+ case ACPI_HEST_NOTIFY_SSE:
+ if (!IS_ENABLED(CONFIG_ACPI_APEI_SSE)) {
+ pr_warn(GHES_PFX "Generic hardware error source: %d "
+ "notified via SSE is not supported\n",
+ generic->header.source_id);
+ rc = -ENOTSUPP;
+ goto err;
+ }
+ break;
default:
pr_warn(FW_WARN GHES_PFX "Unknown notification type: %u for generic hardware error source: %d\n",
generic->notify.type, generic->header.source_id);
@@ -1639,6 +1712,18 @@ static int ghes_probe(struct platform_device *ghes_dev)
if (rc)
goto err;
break;
+
+ case ACPI_HEST_NOTIFY_SSE:
+ rc = apei_sse_register_ghes(ghes);
+ if (rc) {
+ pr_err(GHES_PFX "Failed to register for SSE notification"
+ " on vector %d\n",
+ generic->notify.vector);
+ goto err;
+ }
+ pr_info(GHES_PFX "Registered SSE notification on vector %d\n",
+ generic->notify.vector);
+ break;
default:
BUG();
}
@@ -1668,7 +1753,6 @@ static int ghes_probe(struct platform_device *ghes_dev)
static void ghes_remove(struct platform_device *ghes_dev)
{
- int rc;
struct ghes *ghes;
struct acpi_hest_generic *generic;
@@ -1702,16 +1786,11 @@ static void ghes_remove(struct platform_device *ghes_dev)
ghes_nmi_remove(ghes);
break;
case ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED:
- rc = apei_sdei_unregister_ghes(ghes);
- if (rc) {
- /*
- * Returning early results in a resource leak, but we're
- * only here if stopping the hardware failed.
- */
- dev_err(&ghes_dev->dev, "Failed to unregister ghes (%pe)\n",
- ERR_PTR(rc));
- return;
- }
+ apei_sdei_unregister_ghes(ghes);
+ break;
+
+ case ACPI_HEST_NOTIFY_SSE:
+ apei_sse_unregister_ghes(ghes);
break;
default:
BUG();
--
2.43.0
* [RFC PATCH v2 09/10] riscv: Select HAVE_ACPI_APEI required for RAS
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
` (7 preceding siblings ...)
2025-10-29 11:26 ` [RFC PATCH v2 08/10] riscv: Introduce HEST SSE notification handlers Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 10/10] riscv: Enable APEI GHES driver in defconfig Himanshu Chauhan
9 siblings, 0 replies; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
Select the HAVE_ACPI_APEI option so that APEI GHES config options
are visible.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
arch/riscv/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 22cda9c452d2..97aa3726e9f6 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -185,6 +185,7 @@ config RISCV
select HAVE_MOVE_PUD
select HAVE_PAGE_SIZE_4KB
select HAVE_PCI
+ select HAVE_ACPI_APEI if ACPI
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
--
2.43.0
* [RFC PATCH v2 10/10] riscv: Enable APEI GHES driver in defconfig
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
` (8 preceding siblings ...)
2025-10-29 11:26 ` [RFC PATCH v2 09/10] riscv: Select HAVE_ACPI_APEI required for RAS Himanshu Chauhan
@ 2025-10-29 11:26 ` Himanshu Chauhan
9 siblings, 0 replies; 14+ messages in thread
From: Himanshu Chauhan @ 2025-10-29 11:26 UTC (permalink / raw)
To: linux-riscv, linux-kernel, linux-acpi, linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
cleger, robert.moore, sunilvl, apatel, Himanshu Chauhan
The APEI GHES driver is very important for error handling on ACPI-based
platforms, so enable it in defconfig.
Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
---
arch/riscv/configs/defconfig | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
index fc2725cbca18..3e62484e148f 100644
--- a/arch/riscv/configs/defconfig
+++ b/arch/riscv/configs/defconfig
@@ -44,6 +44,9 @@ CONFIG_ACPI_CPPC_CPUFREQ=m
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=m
CONFIG_ACPI=y
+CONFIG_ACPI_APEI=y
+CONFIG_ACPI_APEI_GHES=y
+CONFIG_ACPI_APEI_ERST_DEBUG=y
CONFIG_JUMP_LABEL=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
--
2.43.0
* Re: [RFC PATCH v2 03/10] acpi: Introduce SSE in HEST notification types
2025-10-29 11:26 ` [RFC PATCH v2 03/10] acpi: Introduce SSE in HEST notification types Himanshu Chauhan
@ 2025-11-05 8:33 ` Clément Léger
0 siblings, 0 replies; 14+ messages in thread
From: Clément Léger @ 2025-11-05 8:33 UTC (permalink / raw)
To: Himanshu Chauhan, linux-riscv, linux-kernel, linux-acpi,
linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
robert.moore, sunilvl, apatel
On 10/29/25 12:26, Himanshu Chauhan wrote:
> Introduce a new HEST notification type for RISC-V SSE events.
> The GHES entry's notification structure contains the notification
> to be used for a given error source. For error sources delivering
> events over SSE, it should contain the new SSE notification type.
>
> Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
> ---
> include/acpi/actbl1.h | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
> index 7f35eb0e8458..20b490227398 100644
> --- a/include/acpi/actbl1.h
> +++ b/include/acpi/actbl1.h
> @@ -1535,7 +1535,8 @@ enum acpi_hest_notify_types {
> ACPI_HEST_NOTIFY_SEI = 9, /* ACPI 6.1 */
> ACPI_HEST_NOTIFY_GSIV = 10, /* ACPI 6.1 */
> ACPI_HEST_NOTIFY_SOFTWARE_DELEGATED = 11, /* ACPI 6.2 */
> - ACPI_HEST_NOTIFY_RESERVED = 12 /* 12 and greater are reserved */
> + ACPI_HEST_NOTIFY_SSE = 12, /* RISCV SSE */
> + ACPI_HEST_NOTIFY_RESERVED = 13 /* 13 and greater are reserved */
> };
Hi Himanshu,
Looks good to me,
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Thanks,
Clément
>
> /* Values for config_write_enable bitfield above */
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 04/10] riscv: Add fixmap indices for GHES IRQ and SSE contexts
2025-10-29 11:26 ` [RFC PATCH v2 04/10] riscv: Add fixmap indices for GHES IRQ and SSE contexts Himanshu Chauhan
@ 2025-11-05 8:41 ` Clément Léger
0 siblings, 0 replies; 14+ messages in thread
From: Clément Léger @ 2025-11-05 8:41 UTC (permalink / raw)
To: Himanshu Chauhan, linux-riscv, linux-kernel, linux-acpi,
linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
robert.moore, sunilvl, apatel
On 10/29/25 12:26, Himanshu Chauhan wrote:
> GHES error handling requires fixmap entries for IRQ notifications.
> Add fixmap indices for IRQ, SSE Low and High priority notifications.
>
> Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
> ---
> arch/riscv/include/asm/fixmap.h | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
> index 0a55099bb734..e874fd952286 100644
> --- a/arch/riscv/include/asm/fixmap.h
> +++ b/arch/riscv/include/asm/fixmap.h
> @@ -38,6 +38,14 @@ enum fixed_addresses {
> FIX_TEXT_POKE0,
> FIX_EARLYCON_MEM_BASE,
>
> +#ifdef CONFIG_ACPI_APEI_GHES
> + /* Used for GHES mapping from assorted contexts */
> + FIX_APEI_GHES_IRQ,
> +#ifdef CONFIG_RISCV_SBI_SSE
> + FIX_APEI_GHES_SSE_LOW_PRIORITY,
> + FIX_APEI_GHES_SSE_HIGH_PRIORITY,
> +#endif /* CONFIG_RISCV_SBI_SSE */
> +#endif /* CONFIG_ACPI_APEI_GHES */
> __end_of_permanent_fixed_addresses,
> /*
> * Temporary boot-time mappings, used by early_ioremap(),
Hi Himanshu,
Reviewed-By: Clément Léger <cleger@rivosinc.com>
Thanks,
Clément
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 06/10] riscv: Add functions to register ghes having SSE notification
2025-10-29 11:26 ` [RFC PATCH v2 06/10] riscv: Add functions to register ghes having SSE notification Himanshu Chauhan
@ 2025-11-05 10:33 ` Clément Léger
0 siblings, 0 replies; 14+ messages in thread
From: Clément Léger @ 2025-11-05 10:33 UTC (permalink / raw)
To: Himanshu Chauhan, linux-riscv, linux-kernel, linux-acpi,
linux-efi, acpica-devel
Cc: paul.walmsley, palmer, lenb, james.morse, tony.luck, ardb, conor,
robert.moore, sunilvl, apatel
On 10/29/25 12:26, Himanshu Chauhan wrote:
> Add functions to register the ghes entries which have SSE as
> notification type. The vector inside the ghes is the SSE event
> ID that should be registered.
>
> Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
> ---
> drivers/firmware/riscv/riscv_sbi_sse.c | 147 +++++++++++++++++++++++++
> include/linux/riscv_sbi_sse.h | 16 +++
> 2 files changed, 163 insertions(+)
>
> diff --git a/drivers/firmware/riscv/riscv_sbi_sse.c b/drivers/firmware/riscv/riscv_sbi_sse.c
> index 6561c7acdaaa..46ebc9e9651c 100644
> --- a/drivers/firmware/riscv/riscv_sbi_sse.c
> +++ b/drivers/firmware/riscv/riscv_sbi_sse.c
> @@ -5,6 +5,8 @@
>
> #define pr_fmt(fmt) "sse: " fmt
>
> +#include <acpi/ghes.h>
> +#include <linux/acpi.h>
> #include <linux/cpu.h>
> #include <linux/cpuhotplug.h>
> #include <linux/cpu_pm.h>
> @@ -700,3 +702,148 @@ static int __init sse_init(void)
> return ret;
> }
> arch_initcall(sse_init);
> +
> +struct sse_ghes_callback {
> + struct list_head head;
> + struct ghes *ghes;
> + sse_event_handler_fn *callback;
> +};
> +
> +struct sse_ghes_event_data {
> + struct list_head head;
> + u32 event_num;
> + struct list_head callback_list;
> + struct sse_event *event;
> +};
> +
> +static DEFINE_SPINLOCK(sse_ghes_event_list_lock);
> +static LIST_HEAD(sse_ghes_event_list);
Hi Himanshu,
Please declare these structs/functions at the beginning of the file.
> +
> +static int sse_ghes_handler(u32 event_num, void *arg, struct pt_regs *regs)
> +{
> + struct sse_ghes_event_data *ev_data = arg;
> + struct sse_ghes_callback *cb = NULL;
> +
> + list_for_each_entry(cb, &ev_data->callback_list, head) {
> + if (cb && cb->ghes && cb->callback) {
> + cb->callback(ev_data->event_num, cb->ghes, regs);
> + }
> + }
> +
> + return 0;
> +}
> +
> +int sse_register_ghes(struct ghes *ghes, sse_event_handler_fn *lo_cb,
> + sse_event_handler_fn *hi_cb)
> +{
> + struct sse_ghes_event_data *ev_data, *evd;
> + struct sse_ghes_callback *cb;
> + u32 ev_num;
> + int err;
> +
> + if (!sse_available)
> + return -EOPNOTSUPP;
> + if (!ghes || !lo_cb || !hi_cb)
> + return -EINVAL;
> +
> + ev_num = ghes->generic->notify.vector;
> +
> + ev_data = NULL;
> + spin_lock(&sse_ghes_event_list_lock);
> + list_for_each_entry(evd, &sse_ghes_event_list, head) {
> + if (evd->event_num == ev_num) {
> + ev_data = evd;
> + break;
> + }
> + }
> + spin_unlock(&sse_ghes_event_list_lock);
That lock should cover the whole ev_data creation, because if two CPUs
enter this function at the same time, the following scenario can occur:

CPU0                    CPU1
lock
ev_data = NULL
unlock
                        lock
                        ev_data = NULL
                        unlock
create ev_data          create ev_data

-> Both will have read ev_data == NULL and each will create an ev_data.
The lock should be kept and released at the end of the function; you can
use a guard() for that.
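To illustrate the point above, here is a minimal userspace sketch of the "look up or create under a single lock hold" pattern. It is not the kernel code: a pthread mutex stands in for the kernel lock, and struct ev_entry, ev_list, ev_count and ev_find_or_create() are hypothetical names invented for this example. In the kernel, whether a spinlock can be held across the allocation and the SSE registration depends on whether those calls may sleep, so a respin might need a mutex instead; that part is an assumption and is not addressed here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sse_ghes_event_data. */
struct ev_entry {
	uint32_t event_num;
	struct ev_entry *next;
};

static pthread_mutex_t ev_lock = PTHREAD_MUTEX_INITIALIZER;
static struct ev_entry *ev_list;
static int ev_count;	/* number of entries created so far */

/*
 * Return the entry for event_num, creating it if it does not exist.
 * The lookup and the creation happen under the same lock hold, so two
 * callers can never both observe "not found" and both create an entry.
 * Returns NULL if the allocation fails.
 */
struct ev_entry *ev_find_or_create(uint32_t event_num)
{
	struct ev_entry *e;

	pthread_mutex_lock(&ev_lock);

	for (e = ev_list; e; e = e->next)
		if (e->event_num == event_num)
			goto out;

	e = calloc(1, sizeof(*e));
	if (e) {
		e->event_num = event_num;
		e->next = ev_list;
		ev_list = e;
		ev_count++;
	}
out:
	/* Whatever we return must describe the requested event. */
	assert(!e || e->event_num == event_num);
	pthread_mutex_unlock(&ev_lock);
	return e;
}
```

Calling ev_find_or_create() twice with the same event number returns the same entry, which is the property the split lock/unlock in the patch does not guarantee.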
> +
> + if (!ev_data) {
> + ev_data = kzalloc(sizeof(*ev_data), GFP_KERNEL);
> + if (!ev_data)
> + return -ENOMEM;
> +
> + INIT_LIST_HEAD(&ev_data->head);
I think this isn't necessary since list_add_tail() will overwrite the
node's next/prev fields anyway. BTW it's confusing to call this member
"head" since it is used as a node in the list. It could probably be
renamed node/list.
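For reference, a small userspace sketch of why the initialisation is redundant for a node that is about to be added. The list_head helpers below are simplified copies written for this example, not the kernel's include/linux/list.h, and add_uninitialised_node() is a hypothetical test helper. The node is deliberately filled with garbage before the add.

```c
#include <assert.h>
#include <string.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert node before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
	struct list_head *last = head->prev;

	/* Both pointers of the new node are written unconditionally. */
	node->next = head;
	node->prev = last;
	last->next = node;
	head->prev = node;
}

/* Returns 1 if a never-initialised node ends up correctly linked. */
int add_uninitialised_node(void)
{
	struct list_head head, node;

	INIT_LIST_HEAD(&head);
	memset(&node, 0xa5, sizeof(node));	/* simulate stale contents */
	list_add_tail(&node, &head);

	return head.next == &node && head.prev == &node &&
	       node.next == &head && node.prev == &head;
}
```

Only the list head itself (callback_list in the patch) needs INIT_LIST_HEAD(); the member used to link an entry into another list does not.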
> + ev_data->event_num = ev_num;
> +
> + INIT_LIST_HEAD(&ev_data->callback_list);
> +
> + ev_data->event = sse_event_register(ev_num, ev_num,
> + sse_ghes_handler, ev_data);
> + if (IS_ERR(ev_data->event)) {
> + pr_err("%s: Couldn't register event 0x%x\n", __func__, ev_num);
> + kfree(ev_data);
> + return -ENOMEM;
> + }
> +
> + err = sse_event_enable(ev_data->event);
> + if (err) {
> + pr_err("%s: Couldn't enable event 0x%x\n", __func__, ev_num);
> + sse_event_unregister(ev_data->event);
> + kfree(ev_data);
> + return err;
> + }
> +
> + spin_lock(&sse_ghes_event_list_lock);
> + list_add_tail(&ev_data->head, &sse_ghes_event_list);
> + spin_unlock(&sse_ghes_event_list_lock);
> + }
> +
> + list_for_each_entry(cb, &ev_data->callback_list, head) {
> + if (cb->ghes == ghes)
> + return -EALREADY;
> + }
> +
> + cb = kzalloc(sizeof(*cb), GFP_KERNEL);
> + if (!cb)
> + return -ENOMEM;
> + INIT_LIST_HEAD(&cb->head);
> + cb->ghes = ghes;
> + cb->callback = lo_cb;
> + list_add_tail(&cb->head, &ev_data->callback_list);
AFAIU, the SSE event is already enabled at this point, which means
sse_ghes_handler() can be called. The handler can then walk
&ev_data->callback_list concurrently with this insertion, which would
result in a corrupted list. You should mask/disable the SSE event while
adding the callback to this list.
BTW, modifying ev_data->callback_list here without holding a lock means
that if multiple CPUs call this function at the same time, the list
could also get corrupted. Not sure if it can happen but better be safe
than sorry.
> +
> + return 0;
> +}
> +
> +int sse_unregister_ghes(struct ghes *ghes)
> +{
> + struct sse_ghes_event_data *ev_data, *tmp;
> + struct sse_ghes_callback *cb;
> + int free_ev_data = 0;
> +
> + if (!ghes)
> + return -EINVAL;
> +
> + spin_lock(&sse_ghes_event_list_lock);
> +
> + list_for_each_entry_safe(ev_data, tmp, &sse_ghes_event_list, head) {
> + list_for_each_entry(cb, &ev_data->callback_list, head) {
> + if (cb->ghes != ghes)
> + continue;
> +
> + list_del(&cb->head);
> + kfree(cb);
> + break;
> + }
> +
> + if (list_empty(&ev_data->callback_list))
> + free_ev_data = 1;
> +
> + if (free_ev_data) {
Remove free_ev_data and use the following:
if (list_empty(&ev_data->callback_list)) {
> + spin_unlock(&sse_ghes_event_list_lock);
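A side note on the free_ev_data suggestion above: in the quoted hunk the flag is declared once outside the loop and never reset, so as far as I can tell, once one event's callback list becomes empty, every event visited afterwards would be torn down too. The userspace sketch below models just that control flow; remaining[], count_freed_sticky() and count_freed_direct() are hypothetical names for this example, with remaining[i] being the number of callbacks left on event i.

```c
#include <assert.h>
#include <stddef.h>

/* Models the quoted loop: the flag is sticky across iterations. */
size_t count_freed_sticky(const size_t *remaining, size_t n)
{
	size_t freed = 0;
	int free_ev_data = 0;	/* set once, never cleared */

	assert(remaining != NULL);
	for (size_t i = 0; i < n; i++) {
		if (remaining[i] == 0)
			free_ev_data = 1;
		if (free_ev_data)
			freed++;
	}
	return freed;
}

/* Models the suggested form: test the list state directly. */
size_t count_freed_direct(const size_t *remaining, size_t n)
{
	size_t freed = 0;

	assert(remaining != NULL);
	for (size_t i = 0; i < n; i++)
		if (remaining[i] == 0)
			freed++;
	return freed;
}
```

With remaining = {0, 2, 1}, the sticky version reports three events freed while the direct check reports one, so testing list_empty() directly also avoids that problem.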
> +
> + sse_event_disable(ev_data->event);
> + sse_event_unregister(ev_data->event);
> + ev_data->event = NULL;
> +
> + spin_lock(&sse_ghes_event_list_lock);
> +
> + list_del(&ev_data->head);
> + kfree(ev_data);
> + }
> + }
> +
> + spin_unlock(&sse_ghes_event_list_lock);
> +
> + return 0;
> +}
Please declare this above the arch_initcall() function
> diff --git a/include/linux/riscv_sbi_sse.h b/include/linux/riscv_sbi_sse.h
> index a1b58e89dd19..cd615b479f82 100644
> --- a/include/linux/riscv_sbi_sse.h
> +++ b/include/linux/riscv_sbi_sse.h
> @@ -11,6 +11,7 @@
>
> struct sse_event;
> struct pt_regs;
> +struct ghes;
>
> typedef int (sse_event_handler_fn)(u32 event_num, void *arg,
> struct pt_regs *regs);
> @@ -24,6 +25,10 @@ void sse_event_unregister(struct sse_event *evt);
>
> int sse_event_set_target_cpu(struct sse_event *sse_evt, unsigned int cpu);
>
> +int sse_register_ghes(struct ghes *ghes, sse_event_handler_fn *lo_cb,
> + sse_event_handler_fn *hi_cb);
> +int sse_unregister_ghes(struct ghes *ghes);
> +
> int sse_event_enable(struct sse_event *sse_evt);
>
> void sse_event_disable(struct sse_event *sse_evt);
> @@ -47,6 +52,17 @@ static inline int sse_event_set_target_cpu(struct sse_event *sse_evt,
> return -EOPNOTSUPP;
> }
>
> +static inline int sse_register_ghes(struct ghes *ghes, sse_event_handler_fn *lo_cb,
> + sse_event_handler_fn *hi_cb)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static inline int sse_unregister_ghes(struct ghes *ghes)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> static inline int sse_event_enable(struct sse_event *sse_evt)
> {
> return -EOPNOTSUPP;
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-11-05 10:33 UTC | newest]
Thread overview: 14+ messages
2025-10-29 11:26 [RFC PATCH v2 00/10] Add RAS support for RISC-V architecture Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 01/10] riscv: Define ioremap_cache for RISC-V Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 02/10] riscv: Define arch_apei_get_mem_attribute " Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 03/10] acpi: Introduce SSE in HEST notification types Himanshu Chauhan
2025-11-05 8:33 ` Clément Léger
2025-10-29 11:26 ` [RFC PATCH v2 04/10] riscv: Add fixmap indices for GHES IRQ and SSE contexts Himanshu Chauhan
2025-11-05 8:41 ` Clément Léger
2025-10-29 11:26 ` [RFC PATCH v2 05/10] riscv: conditionally compile GHES NMI spool function Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 06/10] riscv: Add functions to register ghes having SSE notification Himanshu Chauhan
2025-11-05 10:33 ` Clément Léger
2025-10-29 11:26 ` [RFC PATCH v2 07/10] riscv: Add RISC-V entries in processor type and ISA strings Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 08/10] riscv: Introduce HEST SSE notification handlers Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 09/10] riscv: Select HAVE_ACPI_APEI required for RAS Himanshu Chauhan
2025-10-29 11:26 ` [RFC PATCH v2 10/10] riscv: Enable APEI GHES driver in defconfig Himanshu Chauhan