From: sashiko-bot@kernel.org
To: "Ahmed Tiba" <ahmed.tiba@arm.com>
Cc: conor+dt@kernel.org, robh@kernel.org, linux-cxl@vger.kernel.org,
devicetree@vger.kernel.org
Subject: Re: [PATCH v6 10/10] RAS: add firmware-first CPER provider
Date: Wed, 17 Jun 2026 14:10:56 +0000 [thread overview]
Message-ID: <20260617141057.75CAF1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260617-topics-ahmtib01-ras_ffh_arm_internal_review-v6-10-91f725174aa0@arm.com>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [High] Unbounded memory leak of the global GHES estatus pool backing memory on probe failure or repeated module bind/unbind.
- [High] Unnecessary `spinlock_irqsave` entirely defeats the threaded IRQ handler and causes high interrupt latency.
- [High] The threaded IRQ handler unconditionally returns `IRQ_HANDLED` even when no error block is present, bypassing the kernel's spurious interrupt detector.
- [Medium] Missing Kconfig dependency on `OF_RESERVED_MEM` causes guaranteed probe failure.
Pre-existing issues:
- [Critical] Integer overflow in `data_length` validation allows massive out-of-bounds kernel memory read when `raw_data_length` is non-zero.
--
commit d0750a54f0673f163ace66f8e9921ee0f1ba6bd4
Author: Ahmed Tiba <ahmed.tiba@arm.com>
RAS: add firmware-first CPER provider
Add a firmware-first CPER provider that reuses the shared GHES helpers,
wire it into the RAS Kconfig/Makefile and document it in the admin guide.
> diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
> index fc4f4bb94a4c6..3c1c63b2fefc3 100644
> --- a/drivers/ras/Kconfig
> +++ b/drivers/ras/Kconfig
> @@ -34,6 +34,17 @@ if RAS
> source "arch/x86/ras/Kconfig"
> source "drivers/ras/amd/atl/Kconfig"
>
> +config RAS_CPER_ESOURCE
> + bool "Firmware-first CPER error source block provider"
> + select GHES_CPER_HELPERS
[Severity: Medium]
Will this fail to probe on configurations without OF_RESERVED_MEM?
The driver calls of_reserved_mem_region_to_resource() unconditionally, but
if CONFIG_OF_RESERVED_MEM is disabled, that function is an inline stub that
always returns -ENOSYS. Should there be a "depends on OF_RESERVED_MEM" here
so it doesn't build broken configurations?
> + help
> + Enable support for firmware-first Common Platform Error Record
[ ... ]
> diff --git a/drivers/ras/cper-esource.c b/drivers/ras/cper-esource.c
> new file mode 100644
> index 0000000000000..cc9f5f522400d
> --- /dev/null
> +++ b/drivers/ras/cper-esource.c
[ ... ]
> +static int cper_esource_validate_status(struct cper_esource *ctx)
> +{
> + size_t estatus_len;
> +
> + if (!ctx->estatus->block_status)
> + return -ENOENT;
> +
> + if (cper_estatus_check_header(ctx->estatus))
> + return -EINVAL;
> +
> + if (ctx->estatus->raw_data_length &&
> + (ctx->estatus->raw_data_offset > ctx->status_len ||
> + ctx->estatus->raw_data_length >
> + ctx->status_len - ctx->estatus->raw_data_offset))
> + return -EINVAL;
[Severity: Critical]
This isn't a bug introduced by this patch, but does the validation logic
have an integer overflow vulnerability?
When raw_data_length is non-zero, this function relies on
cper_estatus_check_header() to validate data_length.
In drivers/firmware/efi/cper.c:cper_estatus_check_header():
if (estatus->raw_data_length &&
estatus->raw_data_offset < sizeof(*estatus) + estatus->data_length)
return -EINVAL;
If firmware provides a massive data_length (e.g., 0xFFFFFFF0), the addition
wraps around to a very small integer, bypassing the check. The subsequent
cper_estatus_check() will then iterate over this unvalidated length, allowing
a massive out-of-bounds kernel memory read past the ctx->estatus buffer.
Should this driver independently verify that
sizeof(*estatus) + data_length <= ctx->status_len to protect against this?
> +
> + estatus_len = cper_esource_estatus_len(ctx->estatus);
[ ... ]
> +static void cper_esource_process(struct cper_esource *ctx)
> +{
> + int rc;
> + int sev;
> +
> + guard(spinlock_irqsave)(&ctx->lock);
[Severity: High]
Does this unnecessarily disable local interrupts?
The cper_esource_irq() handler is requested as a threaded IRQ with
IRQF_ONESHOT, which naturally serializes execution. Disabling interrupts
across this entire function defeats the purpose of delegating work to a
threaded context and introduces high interrupt latency, especially since it
contains memory copies and potentially blocking reporting helpers.
> +
> + memcpy(ctx->estatus, ctx->status, ctx->status_len);
[ ... ]
> +static irqreturn_t cper_esource_irq(int irq, void *data)
> +{
> + struct cper_esource *ctx = data;
> +
> + cper_esource_process(ctx);
> +
> + return IRQ_HANDLED;
> +}
[Severity: High]
If cper_esource_process() returns early without doing any work, shouldn't
this propagate an IRQ_NONE return?
If the firmware hasn't set block_status, cper_esource_process() bails out
early without clearing device state or acknowledging the interrupt. By
always returning IRQ_HANDLED here, the spurious interrupt detector is
bypassed. If this is a level-triggered interrupt, won't this result in an
infinite loop of executing the threaded handler and cause a hard CPU lockup?
> +
> +static int cper_esource_init_ack(struct cper_esource *ctx)
[ ... ]
> +static int cper_esource_probe(struct platform_device *pdev)
> +{
> + struct device *dev = &pdev->dev;
> + struct cper_esource *ctx;
> + size_t size;
> + int source_id;
> + int rc;
[ ... ]
> + rc = cper_esource_init_ack(ctx);
> + if (rc)
> + return rc;
> +
> + rc = cper_esource_init_pool();
> + if (rc)
> + return rc;
[Severity: High]
Does this cause an unbounded memory leak if the probe sequence fails or
the module is repeatedly bound/unbound?
cper_esource_init_pool() calls ghes_estatus_pool_init(1).
In drivers/acpi/apei/ghes_cper.c:ghes_estatus_pool_init():
addr = (unsigned long)vmalloc(len);
...
rc = gen_pool_add(ghes_estatus_pool, addr, len, -1);
This dynamically allocates and adds memory to the global pool on every call,
even if the pool already exists. If the driver is unbound and rebound, or if
platform_get_irq() later returns -EPROBE_DEFER, the allocated memory is
permanently leaked.
> +
> + ctx->estatus = devm_kzalloc(dev, ctx->status_len, GFP_KERNEL);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260617-topics-ahmtib01-ras_ffh_arm_internal_review-v6-0-91f725174aa0@arm.com?part=10
prev parent reply other threads:[~2026-06-17 14:10 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-17 13:54 [PATCH v6 00/10] ACPI: APEI: share GHES CPER helpers and add DT FFH provider Ahmed Tiba
2026-06-17 13:54 ` [PATCH v6 01/10] ACPI: APEI: GHES: share macros via a private header Ahmed Tiba
2026-06-17 14:09 ` sashiko-bot
2026-06-17 13:54 ` [PATCH v6 02/10] ACPI: APEI: GHES: move CPER read helpers Ahmed Tiba
2026-06-17 14:15 ` sashiko-bot
2026-06-17 13:54 ` [PATCH v6 03/10] ACPI: APEI: GHES: move GHESv2 ack and alloc helpers Ahmed Tiba
2026-06-17 13:54 ` [PATCH v6 04/10] ACPI: APEI: GHES: move estatus cache helpers Ahmed Tiba
2026-06-17 14:10 ` sashiko-bot
2026-06-17 13:54 ` [PATCH v6 05/10] ACPI: APEI: GHES: move vendor record helpers Ahmed Tiba
2026-06-17 13:54 ` [PATCH v6 06/10] ACPI: APEI: GHES: move CXL CPER helpers Ahmed Tiba
2026-06-17 14:14 ` sashiko-bot
2026-06-17 13:54 ` [PATCH v6 07/10] ACPI: APEI: introduce GHES helper Ahmed Tiba
2026-06-17 13:54 ` [PATCH v6 08/10] ACPI: APEI: share GHES CPER helpers Ahmed Tiba
2026-06-17 14:11 ` sashiko-bot
2026-06-17 13:54 ` [PATCH v6 09/10] dt-bindings: firmware: add arm,ras-cper Ahmed Tiba
2026-06-17 13:54 ` [PATCH v6 10/10] RAS: add firmware-first CPER provider Ahmed Tiba
2026-06-17 14:10 ` sashiko-bot [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260617141057.75CAF1F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=ahmed.tiba@arm.com \
--cc=conor+dt@kernel.org \
--cc=devicetree@vger.kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=robh@kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox