From: Ruidong Tian <tianruidong@linux.alibaba.com>
To: Himanshu Chauhan <hchauhan@ventanamicro.com>,
linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
linux-acpi@vger.kernel.org, linux-efi@vger.kernel.org,
acpica-devel@lists.linux.dev
Cc: paul.walmsley@sifive.com, palmer@dabbelt.com, lenb@kernel.org,
james.morse@arm.com, tony.luck@intel.com, ardb@kernel.org,
conor@kernel.org, cleger@rivosinc.com, robert.moore@intel.com,
sunilvl@ventanamicro.com, apatel@ventanamicro.com,
xueshuai@linux.alibaba.com
Subject: Re: [RFC PATCH v1 00/10] Add RAS support for RISC-V architecture
Date: Fri, 12 Sep 2025 15:30:41 +0800
Message-ID: <72563756-a53a-4f50-9bf4-87f6b26af036@linux.alibaba.com>
In-Reply-To: <20250227123628.2931490-1-hchauhan@ventanamicro.com>
On 2025/2/27 20:36, Himanshu Chauhan wrote:
> This series implements RAS (Reliability, Availability and Serviceability)
> support for the RISC-V architecture using the RISC-V RERI specification[1]. It conforms
> to the ACPI Platform Error Interfaces (APEI). It uses the highest-priority
> Supervisor Software Events (SSE)[2] to deliver hardware error events to the kernel.
> The SSE implementation has already been merged in OpenSBI. Clement has sent a patch series for
> its implementation in the Linux kernel[5].
>
> The GHES driver framework is used as is, with the following changes for RISC-V:
> 1. Register each GHES entry with the SSE layer. The GHES notification vector is the SSE event.
> 2. Add RISC-V specific entries for processor type and ISA string.
> 3. Add fixmap indices for GHES SSE low and high priority to help map and read from
> the physical addresses present in a GHES entry.
> 4. Other changes to build/configure the RAS support.
>
> How to Use:
> ----------
> This RAS stack consists of Qemu[3], OpenSBI, EDK2[4], the Linux kernel, and the devmem utility, which is
> used to inject and trigger errors. Qemu[3] can emulate RISC-V RERI. The RAS agent is implemented in
> OpenSBI, which creates the CPER records. EDK2 generates the HEST table and populates it with GHES
> entries with the help of OpenSBI.
>
> Qemu Command:
> ------------
> <qemu-dir>/build/qemu-system-riscv64 \
> -s -accel tcg -m 4096 -smp 2 \
> -cpu rv64,smepmp=false \
> -serial mon:stdio \
> -d guest_errors -D ./qemu.log \
> -bios <opensbi-dir>/build/platform/generic/firmware/fw_dynamic.bin \
> -monitor telnet:127.0.0.1:55555,server,nowait \
> -device virtio-gpu-pci -full-screen \
> -device qemu-xhci \
> -device usb-kbd \
> -blockdev node-name=pflash0,driver=file,read-only=on,filename=<edk2-build-dir>/RiscVVirtQemu/RELEASE_GCC5/FV/RISCV_VIRT_CODE.fd \
> -blockdev node-name=pflash1,driver=file,filename=<edk2-build-dir>/RiscVVirtQemu/RELEASE_GCC5/FV/RISCV_VIRT_VARS.fd \
> -M virt,pflash0=pflash0,pflash1=pflash1,rpmi=true,reri=true,aia=aplic-imsic \
> -kernel <kernel image> \
> -initrd <rootfs image> \
> -append "root=/dev/ram rw console=ttyS0 earlycon=uart8250,mmio,0x10000000"
>
> Error Injection & Triggering:
> ----------------------------
> devmem 0x4010040 32 0x2a1
> devmem 0x4010048 32 0x9001404
> devmem 0x4010044 8 1
>
> The above commands inject a TLB error on CPU 0.
>
> Sample Output (CPU 0):
> ---------------------
> [ 34.370282] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
> [ 34.371375] {1}[Hardware Error]: event severity: recoverable
> [ 34.372149] {1}[Hardware Error]: Error 0, type: recoverable
> [ 34.372756] {1}[Hardware Error]: section_type: general processor error
> [ 34.373357] {1}[Hardware Error]: processor_type: 3, RISCV
> [ 34.373806] {1}[Hardware Error]: processor_isa: 6, RISCV64
> [ 34.374294] {1}[Hardware Error]: error_type: 0x02
> [ 34.374845] {1}[Hardware Error]: TLB error
> [ 34.375448] {1}[Hardware Error]: operation: 1, data read
> [ 34.376100] {1}[Hardware Error]: target_address: 0x0000000000000000
>
> References:
> ----------
> [1] RERI Specification: https://github.com/riscv-non-isa/riscv-ras-eri/releases/download/v1.0/riscv-reri.pdf
> [2] SSE section in the SBI specification v3.0-rc3: https://github.com/riscv-non-isa/riscv-sbi-doc/releases/download/v3.0-rc3/riscv-sbi.pdf
> [3] Qemu source (with RERI emulation support): https://github.com/ventanamicro/qemu.git (branch: dev-upstream)
> [4] EDK2: https://github.com/ventanamicro/edk2.git (branch: dev-upstream)
> [5] SSE Kernel Patches: https://lore.kernel.org/linux-riscv/649fdead-09b0-4f94-a6ff-099fc970d890@rivosinc.com/T/
Hi,
Thanks for this series.
I'm doing some work related to your patch. Besides SSE, I'm working on support
for another notification type for synchronous hardware errors (e.g., on a poison
read), which is called Hardware Error Exception (HEE) in Dhaval Sharma's UEFI
proposal[0] in PRS-TG. I have a patch for HEE support, which I've sent out
separately[1].
Perhaps we could merge my work into your patchset to bring a complete RAS
solution to the RISC-V architecture? Alternatively, I'm happy to wait for your
patches to land and then continue my work on top.
Let me know what you think would be best.
Cheers,
Ruidong Tian
[0]: https://lists.riscv.org/g/tech-prs/topic/risc_v_ras_related_ecrs/113685653
[1]: https://lore.kernel.org/all/20250910093347.75822-6-tianruidong@linux.alibaba.com/
> Himanshu Chauhan (10):
> riscv: Define ioremap_cache for RISC-V
> riscv: Define arch_apei_get_mem_attribute for RISC-V
> acpi: Introduce SSE in HEST notification types
> riscv: Add fixmap indices for GHES IRQ and SSE contexts
> riscv: conditionally compile GHES NMI spool function
> riscv: Add functions to register ghes having SSE notification
> riscv: Add RISC-V entries in processor type and ISA strings
> riscv: Introduce HEST SSE notification handlers
> riscv: Add config option to enable APEI SSE handler
> riscv: Enable APEI and NMI safe cmpxchg options required for RAS
>
> arch/riscv/Kconfig | 2 +
> arch/riscv/include/asm/acpi.h | 20 ++++
> arch/riscv/include/asm/fixmap.h | 8 ++
> arch/riscv/include/asm/io.h | 3 +
> drivers/acpi/apei/Kconfig | 5 +
> drivers/acpi/apei/ghes.c | 102 +++++++++++++++++---
> drivers/firmware/efi/cper.c | 3 +
> drivers/firmware/riscv/riscv_sse.c | 147 +++++++++++++++++++++++++++++
> include/acpi/actbl1.h | 3 +-
> include/linux/riscv_sse.h | 15 +++
> 10 files changed, 296 insertions(+), 12 deletions(-)
>