From: sashiko-bot@kernel.org
To: "Ahmed Tiba" <ahmed.tiba@arm.com>
Cc: devicetree@vger.kernel.org, conor+dt@kernel.org, robh@kernel.org
Subject: Re: [PATCH v5 10/10] RAS: add firmware-first CPER provider
Date: Fri, 29 May 2026 11:07:08 +0000
Message-ID: <20260529110709.0B3C01F00893@smtp.kernel.org>
In-Reply-To: <20260529-topics-ahmtib01-ras_ffh_arm_internal_review-v5-10-2e0500d42642@arm.com>

Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:
- [High] Handling synchronous errors via an asynchronous threaded IRQ causes force_sig(SIGBUS) to kill the kernel IRQ thread instead of the faulting user task.
- [High] The driver fails to validate or clear the hardware CPER 'block_status' field, leading to the processing of invalid/stale records and violating the firmware state machine.
- [High] The driver blindly trusts the memory-mapped region size and firmware-provided CPER record lengths, leading to out-of-bounds heap reads.
--

commit 079bcf67c79972332f188b664beb5efc21cf8796
Author: Ahmed Tiba <ahmed.tiba@arm.com>

RAS: add firmware-first CPER provider

Add a firmware-first CPER provider that reuses the shared GHES helpers, wire
it into the RAS Kconfig/Makefile and document it in the admin guide.

> diff --git a/drivers/ras/cper-esource.c b/drivers/ras/cper-esource.c
> new file mode 100644
> index 0000000000000..83f7a910e50a7
> --- /dev/null
> +++ b/drivers/ras/cper-esource.c
[ ... ]
> +static int cper_esource_copy_status(struct cper_esource *ctx)
> +{
> +	memcpy_fromio(ctx->estatus, ctx->status, ctx->status_len);
> +	return 0;
> +}
[ ... ]
> +static void cper_esource_process(struct cper_esource *ctx)
> +{
> +	int sev;
> +
> +	guard(spinlock_irqsave)(&ctx->lock);
> +
> +	if (cper_esource_copy_status(ctx))
> +		return;

[Severity: High]
Should this code check and clear the CPER block_status field?

The CPER/HEST specification uses block_status to indicate if a valid error
record is present. Since cper_esource_copy_status() unconditionally copies
the buffer without checking ctx->estatus->block_status, could a spurious
interrupt cause the driver to process invalid or stale records?

Also, does the firmware interface require the OS to clear block_status in
IO memory after consuming a record, as ghes_clear_estatus() does for GHES?
Without that, the firmware might treat the buffer as permanently full, or
the OS might process the same stale error again on subsequent interrupts.
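
To make the question concrete, here is a small userspace model of the
check-then-clear pattern. It is a sketch, not driver code: struct
status_block and consume_status() are hypothetical names, the field order
is assumed to match the Generic Error Status Block header, and a real
driver would go through the IO accessors rather than plain loads and
stores.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed header layout of a generic error status block. */
struct status_block {
	uint32_t block_status;
	uint32_t raw_data_offset;
	uint32_t raw_data_length;
	uint32_t data_length;
	uint32_t error_severity;
};

/*
 * Hypothetical helper: reject the shared block when no record is flagged,
 * otherwise snapshot it and clear the flag in the shared copy.
 */
static int consume_status(volatile struct status_block *shared,
			  struct status_block *local)
{
	assert(shared && local);

	/* A zero block_status means the producer reported nothing. */
	if (shared->block_status == 0)
		return -ENOENT;

	/* Copy field by field because the source is volatile. */
	local->block_status = shared->block_status;
	local->raw_data_offset = shared->raw_data_offset;
	local->raw_data_length = shared->raw_data_length;
	local->data_length = shared->data_length;
	local->error_severity = shared->error_severity;

	/* Hand the buffer back so the same record is not seen twice. */
	shared->block_status = 0;
	return 0;
}
```

In this model a second call on the same buffer returns -ENOENT, which is
the behaviour a spurious interrupt would need. Whether the clear belongs
right after the copy or only after the record has been fully handled
depends on the firmware contract.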

> +
> +	sev = ghes_severity(ctx->estatus->error_severity);
> +	if (sev >= GHES_SEV_PANIC)
> +		cper_esource_fatal(ctx);
> +
> +	if (!ghes_estatus_cached(ctx->estatus) &&

[Severity: High]
Can this lead to out-of-bounds heap reads if the firmware provides
unexpected lengths?

The estatus buffer is dynamically allocated with a size of ctx->status_len,
but the ACPI GHES core helpers (like ghes_estatus_cached() and
apei_estatus_for_each_section()) rely on the firmware-provided data_length
and raw_data_length fields to calculate the total record size.

If cper_estatus_len(ctx->estatus) evaluates to a size larger than
ctx->status_len, these helpers could perform out-of-bounds reads when
comparing memory or iterating over sections. Should the total length
be validated against ctx->status_len before passing it to these helpers?
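
As an illustration of the kind of check meant here, the following
userspace sketch bounds the firmware-reported length by the buffer size
before anything walks the record. The length rule is modeled on how
cper_estatus_len() is understood to work (raw data end if raw data is
present, otherwise header plus data_length); struct status_block and
status_len_ok() are hypothetical names, not existing kernel helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header layout of a generic error status block. */
struct status_block {
	uint32_t block_status;
	uint32_t raw_data_offset;
	uint32_t raw_data_length;
	uint32_t data_length;
	uint32_t error_severity;
};

/*
 * Hypothetical check: return true only if the record described by the
 * header fits entirely inside a buffer of buf_len bytes.
 */
static bool status_len_ok(const struct status_block *s, size_t buf_len)
{
	/* 64-bit arithmetic so two 32-bit fields cannot wrap when added. */
	uint64_t sections_end = (uint64_t)sizeof(*s) + s->data_length;
	uint64_t total = sections_end;

	assert(s);

	/* The buffer must at least hold the header itself. */
	if (buf_len < sizeof(*s))
		return false;

	if (s->raw_data_length) {
		/* Raw data must not overlap the header or the sections. */
		if (s->raw_data_offset < sections_end)
			return false;
		total = (uint64_t)s->raw_data_offset + s->raw_data_length;
	}

	return total <= buf_len;
}
```

In the driver, the equivalent test would compare against ctx->status_len
and bail out before ghes_estatus_cached() or the section iterator runs.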

> +	    ghes_print_estatus(NULL, ctx->generic, ctx->estatus))
> +		ghes_estatus_cache_add(ctx->generic, ctx->estatus);
> +
> +	ghes_cper_handle_status(ctx->dev, ctx->generic, ctx->estatus, ctx->sync);

[Severity: High]
For synchronous errors (when ctx->sync is true), will this signal the wrong
task?

ghes_cper_handle_status() uses force_sig(SIGBUS) to kill the current process
for unrecoverable synchronous errors:

drivers/acpi/apei/ghes_cper.c:ghes_cper_handle_status() {
    ...
    if (sync && !queued) {
        ...
        force_sig(SIGBUS);
    }
}

Because cper_esource_irq() is registered below as a threaded IRQ via
devm_request_threaded_irq(), it executes in the context of a kernel IRQ
thread. Doesn't this mean force_sig(SIGBUS) will kill the IRQ thread instead
of the user task that actually triggered the fault?
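
The concern can be shown with a toy userspace model in which a global
pointer stands in for the kernel's notion of the current task. Everything
here is hypothetical (struct task, current_task, the handler names); it
only illustrates why signalling "whoever is current" goes wrong once the
work has moved to another thread, and why the faulting task would have to
be captured at exception time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
	const char *name;
	bool got_sigbus;
};

/* Stand-in for "current": the task running on this CPU right now. */
static struct task *current_task;

/* Error record handed from exception entry to the deferred handler. */
struct pending_error {
	struct task *origin;
};

/* Runs in the faulting task's context: remember who took the fault. */
static void capture_origin(struct pending_error *err)
{
	assert(err && current_task);
	err->origin = current_task;
}

/* Naive deferred handler: signals whichever task is current now. */
static void handle_naive(void)
{
	assert(current_task);
	current_task->got_sigbus = true;
}

/* Targeted deferred handler: signals the task recorded at fault time. */
static void handle_targeted(const struct pending_error *err)
{
	assert(err && err->origin);
	err->origin->got_sigbus = true;
}
```

With current_task switched to the IRQ thread between capture and
handling, handle_naive() marks the IRQ thread while handle_targeted()
marks the original task. How the real driver should identify and signal
the faulting task is a separate design question.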

> +	cper_esource_ack(ctx);
> +}
[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260529-topics-ahmtib01-ras_ffh_arm_internal_review-v5-0-2e0500d42642@arm.com?part=10
