* [PATCH v4 0/4] apei/ghes: don't OOPS with bad ARM error CPER records
@ 2026-01-06 10:01 Mauro Carvalho Chehab
2026-01-06 10:01 ` [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory Mauro Carvalho Chehab
0 siblings, 1 reply; 3+ messages in thread
From: Mauro Carvalho Chehab @ 2026-01-06 10:01 UTC (permalink / raw)
To: Rafael J. Wysocki, Robert Moore
Cc: Mauro Carvalho Chehab, Ard Biesheuvel, Ankit Agrawal,
Borislav Petkov, Breno Leitao, Dan Williams, Dave Jiang,
Gregory Price, Hanjun Guo, Jason Tian, Jonathan Cameron,
Jonathan Cameron, Len Brown, Mauro Carvalho Chehab, Shuai Xue,
Smita Koralahalli, Tony Luck, acpica-devel, linux-acpi,
linux-edac, linux-efi, linux-kernel, pengdonglin
Rafael,
The current parsing logic in apei/ghes for ARM Processor Error
records assumes that the record sizes are correct. Yet, a buggy
BIOS might produce malformed GHES reports.
Worse than that, it may end up exposing data from other memory
addresses, as the logic may end up dumping large portions of
memory.
Avoid that by checking the buffer sizes where needed.
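As a rough illustration of the kind of size checking described above, here is a minimal userspace sketch. The structure layouts and the function name count_valid_entries() are hypothetical and simplified; they are not the real CPER definitions nor the code in this series. The idea shown is that both the fixed-size header and each entry's self-declared length are validated against the bytes actually available before they are used.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts: NOT the real CPER structures. */
struct rec_hdr {
	uint32_t num_entries;	/* entries the firmware claims follow the header */
};

struct rec_entry {
	uint8_t version;
	uint8_t length;		/* claimed size of this entry, header included */
	uint16_t flags;
};

/*
 * Count how many entries fit entirely inside the buf_len bytes that were
 * really provided. Returns -1 if even the record header does not fit.
 */
int count_valid_entries(const uint8_t *buf, size_t buf_len)
{
	struct rec_hdr hdr;
	size_t off;
	uint32_t i;

	if (buf_len < sizeof(hdr))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));

	off = sizeof(hdr);
	for (i = 0; i < hdr.num_entries; i++) {
		struct rec_entry ent;
		size_t remaining = buf_len - off;

		/* The entry header must fit before any of its fields is read */
		if (remaining < sizeof(ent))
			break;
		memcpy(&ent, buf + off, sizeof(ent));

		/* The claimed length must cover the header and stay in bounds */
		if (ent.length < sizeof(ent) || ent.length > remaining)
			break;

		off += ent.length;
	}

	return (int)i;
}
```

With a record that claims three entries but only carries two, the walk stops after the second entry instead of reading past the buffer.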
---
v4:
- addressed Jonathan's comments;
- added two extra patches to prevent other OOM issues.
v3:
- addressed Shuai's feedback;
- moved all ghes code to one patch;
- fixed a typo and a bad indent;
- cleaned up the size check logic at ghes.c.
Mauro Carvalho Chehab (4):
apei/ghes: ARM processor Error: don't go past allocated memory
efi/cper: don't go past the ARM processor CPER record buffer
apei/ghes: ensure that won't go past CPER allocated record
efi/cper: don't dump the entire memory region
drivers/acpi/apei/ghes.c | 38 ++++++++++++++++++++++++++++-----
drivers/firmware/efi/cper-arm.c | 12 +++++++----
drivers/firmware/efi/cper.c | 8 ++++++-
drivers/ras/ras.c | 6 +++++-
include/acpi/ghes.h | 1 +
include/linux/cper.h | 3 ++-
6 files changed, 56 insertions(+), 12 deletions(-)
--
2.52.0
* [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory
2026-01-06 10:01 [PATCH v4 0/4] apei/ghes: don't OOPS with bad ARM error CPER records Mauro Carvalho Chehab
@ 2026-01-06 10:01 ` Mauro Carvalho Chehab
2026-01-06 15:57 ` Jonathan Cameron
0 siblings, 1 reply; 3+ messages in thread
From: Mauro Carvalho Chehab @ 2026-01-06 10:01 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Mauro Carvalho Chehab, Ankit Agrawal, Borislav Petkov,
Breno Leitao, Hanjun Guo, Jason Tian, Jonathan Cameron, Len Brown,
Mauro Carvalho Chehab, Shuai Xue, Smita Koralahalli, Tony Luck,
linux-acpi, linux-edac, linux-kernel, pengdonglin
If the BIOS generates a very small ARM Processor Error, or
an incomplete one, the current logic will fail when dereferencing
err->section_length
and
ctx_info->size
Add checks to avoid that. With these changes, such GHESv2
records won't cause OOPSes like this:
[ 1.492129] Internal error: Oops: 0000000096000005 [#1] SMP
[ 1.495449] Modules linked in:
[ 1.495820] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.18.0-rc1-00017-gabadcc3553dd-dirty #18 PREEMPT
[ 1.496125] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022
[ 1.496433] Workqueue: kacpi_notify acpi_os_execute_deferred
[ 1.496967] pstate: 814000c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 1.497199] pc : log_arm_hw_error+0x5c/0x200
[ 1.497380] lr : ghes_handle_arm_hw_error+0x94/0x220
0xffff8000811c5324 is in log_arm_hw_error (../drivers/ras/ras.c:75).
70 err_info = (struct cper_arm_err_info *)(err + 1);
71 ctx_info = (struct cper_arm_ctx_info *)(err_info + err->err_info_num);
72 ctx_err = (u8 *)ctx_info;
73
74 for (n = 0; n < err->context_info_num; n++) {
75 sz = sizeof(struct cper_arm_ctx_info) + ctx_info->size;
76 ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz);
77 ctx_len += sz;
78 }
79
and similar ones while trying to access section_length on an
error dump that is too small.
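For illustration only, a bounded version of a context walk like the one at ras.c:70-78 could look like the userspace sketch below. The struct ctx_hdr layout and the helper bounded_ctx_len() are hypothetical stand-ins, not the kernel's struct cper_arm_ctx_info nor the exact approach taken by this patch. The sketch stops at the first entry whose header or payload would cross the section limit, so the size field is never read from outside the section.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a context header: not the kernel structure. */
struct ctx_hdr {
	uint16_t version;
	uint16_t type;
	uint32_t size;		/* payload bytes that follow this header */
};

/*
 * Sum the bytes used by up to `count` context entries that begin at offset
 * `start` inside `sec`, never reading at or beyond offset `limit`.
 */
size_t bounded_ctx_len(const uint8_t *sec, size_t start, size_t limit,
		       unsigned int count)
{
	size_t off = start;
	unsigned int n;

	if (start > limit)
		return 0;

	for (n = 0; n < count; n++) {
		struct ctx_hdr hdr;

		/* The header must be fully inside the section before reading it */
		if (limit - off < sizeof(hdr))
			break;
		memcpy(&hdr, sec + off, sizeof(hdr));

		/* The claimed payload must also fit in what is left */
		if (hdr.size > limit - off - sizeof(hdr))
			break;

		off += sizeof(hdr) + hdr.size;
	}

	return off - start;
}
```

Here `limit` plays the role of a section length that has already been validated against the size of the data the firmware really provided.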
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
drivers/acpi/apei/ghes.c | 32 ++++++++++++++++++++++++++++----
drivers/ras/ras.c | 6 +++++-
2 files changed, 33 insertions(+), 5 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 0dc767392a6c..fc3f8aed99d5 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -552,21 +552,45 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
{
struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
int flags = sync ? MF_ACTION_REQUIRED : 0;
+ int length = gdata->error_data_length;
char error_type[120];
bool queued = false;
int sec_sev, i;
char *p;
sec_sev = ghes_severity(gdata->error_severity);
- log_arm_hw_error(err, sec_sev);
+ if (length >= sizeof(*err)) {
+ log_arm_hw_error(err, sec_sev);
+ } else {
+ pr_warn(FW_BUG "arm error length: %d\n", length);
+ pr_warn(FW_BUG "length is too small\n");
+ pr_warn(FW_BUG "firmware-generated error record is incorrect\n");
+ return false;
+ }
+
if (sev != GHES_SEV_RECOVERABLE || sec_sev != GHES_SEV_RECOVERABLE)
return false;
p = (char *)(err + 1);
+ length -= sizeof(*err);
+
for (i = 0; i < err->err_info_num; i++) {
- struct cper_arm_err_info *err_info = (struct cper_arm_err_info *)p;
- bool is_cache = err_info->type & CPER_ARM_CACHE_ERROR;
- bool has_pa = (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR);
+ struct cper_arm_err_info *err_info;
+ bool is_cache, has_pa;
+
+ /* Ensure we have enough data for the error info header */
+ if (length < sizeof(*err_info))
+ break;
+
+ err_info = (struct cper_arm_err_info *)p;
+
+ /* Validate the claimed length before using it */
+ length -= err_info->length;
+ if (length < 0)
+ break;
+
+ is_cache = err_info->type & CPER_ARM_CACHE_ERROR;
+ has_pa = (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR);
/*
* The field (err_info->error_info & BIT(26)) is fixed to set to
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index 2a5b5a9fdcb3..03df3db62334 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -72,7 +72,11 @@ void log_arm_hw_error(struct cper_sec_proc_arm *err, const u8 sev)
ctx_err = (u8 *)ctx_info;
for (n = 0; n < err->context_info_num; n++) {
- sz = sizeof(struct cper_arm_ctx_info) + ctx_info->size;
+ sz = sizeof(struct cper_arm_ctx_info);
+
+ if (sz + (long)ctx_info - (long)err <= err->section_length)
+ sz += ctx_info->size;
+
ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz);
ctx_len += sz;
}
--
2.52.0
* Re: [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory
2026-01-06 10:01 ` [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory Mauro Carvalho Chehab
@ 2026-01-06 15:57 ` Jonathan Cameron
0 siblings, 0 replies; 3+ messages in thread
From: Jonathan Cameron @ 2026-01-06 15:57 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Rafael J. Wysocki, Ankit Agrawal, Borislav Petkov, Breno Leitao,
Hanjun Guo, Jason Tian, Len Brown, Mauro Carvalho Chehab,
Shuai Xue, Smita Koralahalli, Tony Luck, linux-acpi, linux-edac,
linux-kernel, pengdonglin
On Tue, 6 Jan 2026 11:01:35 +0100
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> If the BIOS generates a very small ARM Processor Error, or
> an incomplete one, the current logic will fail to dereference
>
> err->section_length
> and
> ctx_info->size
>
> Add checks to avoid that. With such changes, such GHESv2
> records won't cause OOPSes like this:
>
> [ 1.492129] Internal error: Oops: 0000000096000005 [#1] SMP
> [ 1.495449] Modules linked in:
> [ 1.495820] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.18.0-rc1-00017-gabadcc3553dd-dirty #18 PREEMPT
> [ 1.496125] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022
> [ 1.496433] Workqueue: kacpi_notify acpi_os_execute_deferred
> [ 1.496967] pstate: 814000c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [ 1.497199] pc : log_arm_hw_error+0x5c/0x200
> [ 1.497380] lr : ghes_handle_arm_hw_error+0x94/0x220
>
> 0xffff8000811c5324 is in log_arm_hw_error (../drivers/ras/ras.c:75).
> 70 err_info = (struct cper_arm_err_info *)(err + 1);
> 71 ctx_info = (struct cper_arm_ctx_info *)(err_info + err->err_info_num);
> 72 ctx_err = (u8 *)ctx_info;
> 73
> 74 for (n = 0; n < err->context_info_num; n++) {
> 75 sz = sizeof(struct cper_arm_ctx_info) + ctx_info->size;
> 76 ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz);
> 77 ctx_len += sz;
> 78 }
> 79
>
> and similar ones while trying to access section_length on an
> error dump with too small size.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Hi Mauro,
This resolves the issues I pointed out in the previous version, so LGTM.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>