* [PATCH v4 0/4] apei/ghes: don't OOPS with bad ARM error CPER records @ 2026-01-06 10:01 Mauro Carvalho Chehab 2026-01-06 10:01 ` [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory Mauro Carvalho Chehab 2026-01-06 10:01 ` [PATCH v4 3/4] apei/ghes: ensure that won't go past CPER allocated record Mauro Carvalho Chehab 0 siblings, 2 replies; 5+ messages in thread From: Mauro Carvalho Chehab @ 2026-01-06 10:01 UTC (permalink / raw) To: Rafael J. Wysocki, Robert Moore Cc: Mauro Carvalho Chehab, Ard Biesheuvel, Ankit Agrawal, Borislav Petkov, Breno Leitao, Dan Williams, Dave Jiang, Gregory Price, Hanjun Guo, Jason Tian, Jonathan Cameron, Jonathan Cameron, Len Brown, Mauro Carvalho Chehab, Shuai Xue, Smita Koralahalli, Tony Luck, acpica-devel, linux-acpi, linux-edac, linux-efi, linux-kernel, pengdonglin Rafael, Current parsing logic at apei/ghes for ARM Processor Error assumes that the record sizes are correct. Yet, a bad BIOS might produce malformed GHES reports. Worse than that, it may end exposing data from other memory addresses, as the logic may end dumping large portions of the memory. Avoid that by checking the buffer sizes where needed. --- v4: - addressed Jonathan comments; - added two extra patches to prevent other OOM issues. v3: - addressed Shuai feedback; - moved all ghes code to one patch; - fixed a typo and a bad indent; - cleanup the size check logic at ghes.c. Mauro Carvalho Chehab (4): apei/ghes: ARM processor Error: don't go past allocated memory efi/cper: don't go past the ARM processor CPER record buffer apei/ghes: ensure that won't go past CPER allocated record efi/cper: don't dump the entire memory region drivers/acpi/apei/ghes.c | 38 ++++++++++++++++++++++++++++----- drivers/firmware/efi/cper-arm.c | 12 +++++++---- drivers/firmware/efi/cper.c | 8 ++++++- drivers/ras/ras.c | 6 +++++- include/acpi/ghes.h | 1 + include/linux/cper.h | 3 ++- 6 files changed, 56 insertions(+), 12 deletions(-) -- 2.52.0 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory 2026-01-06 10:01 [PATCH v4 0/4] apei/ghes: don't OOPS with bad ARM error CPER records Mauro Carvalho Chehab @ 2026-01-06 10:01 ` Mauro Carvalho Chehab 2026-01-06 15:57 ` Jonathan Cameron 2026-01-06 10:01 ` [PATCH v4 3/4] apei/ghes: ensure that won't go past CPER allocated record Mauro Carvalho Chehab 1 sibling, 1 reply; 5+ messages in thread From: Mauro Carvalho Chehab @ 2026-01-06 10:01 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Mauro Carvalho Chehab, Ankit Agrawal, Borislav Petkov, Breno Leitao, Hanjun Guo, Jason Tian, Jonathan Cameron, Len Brown, Mauro Carvalho Chehab, Shuai Xue, Smita Koralahalli, Tony Luck, linux-acpi, linux-edac, linux-kernel, pengdonglin If the BIOS generates a very small ARM Processor Error, or an incomplete one, the current logic will fail to deferrence err->section_length and ctx_info->size Add checks to avoid that. With such changes, such GHESv2 records won't cause OOPSes like this: [ 1.492129] Internal error: Oops: 0000000096000005 [#1] SMP [ 1.495449] Modules linked in: [ 1.495820] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.18.0-rc1-00017-gabadcc3553dd-dirty #18 PREEMPT [ 1.496125] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022 [ 1.496433] Workqueue: kacpi_notify acpi_os_execute_deferred [ 1.496967] pstate: 814000c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 1.497199] pc : log_arm_hw_error+0x5c/0x200 [ 1.497380] lr : ghes_handle_arm_hw_error+0x94/0x220 0xffff8000811c5324 is in log_arm_hw_error (../drivers/ras/ras.c:75). 70 err_info = (struct cper_arm_err_info *)(err + 1); 71 ctx_info = (struct cper_arm_ctx_info *)(err_info + err->err_info_num); 72 ctx_err = (u8 *)ctx_info; 73 74 for (n = 0; n < err->context_info_num; n++) { 75 sz = sizeof(struct cper_arm_ctx_info) + ctx_info->size; 76 ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz); 77 ctx_len += sz; 78 } 79 and similar ones while trying to access section_length on an error dump with too small size. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> --- drivers/acpi/apei/ghes.c | 32 ++++++++++++++++++++++++++++---- drivers/ras/ras.c | 6 +++++- 2 files changed, 33 insertions(+), 5 deletions(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 0dc767392a6c..fc3f8aed99d5 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -552,21 +552,45 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata, { struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata); int flags = sync ? MF_ACTION_REQUIRED : 0; + int length = gdata->error_data_length; char error_type[120]; bool queued = false; int sec_sev, i; char *p; sec_sev = ghes_severity(gdata->error_severity); - log_arm_hw_error(err, sec_sev); + if (length >= sizeof(*err)) { + log_arm_hw_error(err, sec_sev); + } else { + pr_warn(FW_BUG "arm error length: %d\n", length); + pr_warn(FW_BUG "length is too small\n"); + pr_warn(FW_BUG "firmware-generated error record is incorrect\n"); + return false; + } + if (sev != GHES_SEV_RECOVERABLE || sec_sev != GHES_SEV_RECOVERABLE) return false; p = (char *)(err + 1); + length -= sizeof(err); + for (i = 0; i < err->err_info_num; i++) { - struct cper_arm_err_info *err_info = (struct cper_arm_err_info *)p; - bool is_cache = err_info->type & CPER_ARM_CACHE_ERROR; - bool has_pa = (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR); + struct cper_arm_err_info *err_info; + bool is_cache, has_pa; + + /* Ensure we have enough data for the error info header */ + if (length < sizeof(*err_info)) + break; + + err_info = (struct cper_arm_err_info *)p; + + /* Validate the claimed length before using it */ + length -= err_info->length; + if (length < 0) + break; + + is_cache = err_info->type & CPER_ARM_CACHE_ERROR; + has_pa = (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR); /* * The field (err_info->error_info & BIT(26)) is fixed to set to diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c index 2a5b5a9fdcb3..03df3db62334 100644 --- a/drivers/ras/ras.c +++ b/drivers/ras/ras.c @@ -72,7 +72,11 @@ void log_arm_hw_error(struct cper_sec_proc_arm *err, const u8 sev) ctx_err = (u8 *)ctx_info; for (n = 0; n < err->context_info_num; n++) { - sz = sizeof(struct cper_arm_ctx_info) + ctx_info->size; + sz = sizeof(struct cper_arm_ctx_info); + + if (sz + (long)ctx_info - (long)err >= err->section_length) + sz += ctx_info->size; + ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz); ctx_len += sz; } -- 2.52.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory 2026-01-06 10:01 ` [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory Mauro Carvalho Chehab @ 2026-01-06 15:57 ` Jonathan Cameron 0 siblings, 0 replies; 5+ messages in thread From: Jonathan Cameron @ 2026-01-06 15:57 UTC (permalink / raw) To: Mauro Carvalho Chehab Cc: Rafael J. Wysocki, Ankit Agrawal, Borislav Petkov, Breno Leitao, Hanjun Guo, Jason Tian, Len Brown, Mauro Carvalho Chehab, Shuai Xue, Smita Koralahalli, Tony Luck, linux-acpi, linux-edac, linux-kernel, pengdonglin On Tue, 6 Jan 2026 11:01:35 +0100 Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote: > If the BIOS generates a very small ARM Processor Error, or > an incomplete one, the current logic will fail to deferrence > > err->section_length > and > ctx_info->size > > Add checks to avoid that. With such changes, such GHESv2 > records won't cause OOPSes like this: > > [ 1.492129] Internal error: Oops: 0000000096000005 [#1] SMP > [ 1.495449] Modules linked in: > [ 1.495820] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.18.0-rc1-00017-gabadcc3553dd-dirty #18 PREEMPT > [ 1.496125] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022 > [ 1.496433] Workqueue: kacpi_notify acpi_os_execute_deferred > [ 1.496967] pstate: 814000c5 (Nzcv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--) > [ 1.497199] pc : log_arm_hw_error+0x5c/0x200 > [ 1.497380] lr : ghes_handle_arm_hw_error+0x94/0x220 > > 0xffff8000811c5324 is in log_arm_hw_error (../drivers/ras/ras.c:75). > 70 err_info = (struct cper_arm_err_info *)(err + 1); > 71 ctx_info = (struct cper_arm_ctx_info *)(err_info + err->err_info_num); > 72 ctx_err = (u8 *)ctx_info; > 73 > 74 for (n = 0; n < err->context_info_num; n++) { > 75 sz = sizeof(struct cper_arm_ctx_info) + ctx_info->size; > 76 ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + sz); > 77 ctx_len += sz; > 78 } > 79 > > and similar ones while trying to access section_length on an > error dump with too small size. > > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Hi Mauro, Resolved the stuff I pointed out in previous, so LGTM. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v4 3/4] apei/ghes: ensure that won't go past CPER allocated record 2026-01-06 10:01 [PATCH v4 0/4] apei/ghes: don't OOPS with bad ARM error CPER records Mauro Carvalho Chehab 2026-01-06 10:01 ` [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory Mauro Carvalho Chehab @ 2026-01-06 10:01 ` Mauro Carvalho Chehab 2026-01-06 16:07 ` Jonathan Cameron 1 sibling, 1 reply; 5+ messages in thread From: Mauro Carvalho Chehab @ 2026-01-06 10:01 UTC (permalink / raw) To: Rafael J. Wysocki, Robert Moore Cc: Mauro Carvalho Chehab, Ankit Agrawal, Borislav Petkov, Breno Leitao, Hanjun Guo, Jason Tian, Jonathan Cameron, Len Brown, Mauro Carvalho Chehab, Shuai Xue, Smita Koralahalli, Tony Luck, acpica-devel, linux-acpi, linux-kernel The logic at ghes_new() prevents allocating too large records, by checking if they're bigger than GHES_ESTATUS_MAX_SIZE (currently, 64KB). Yet, the allocation is done with the actual number of pages from the CPER bios table location, which can be smaller. Yet, a bad firmware could send data with a different size, which might be bigger than the allocated memory, causing an OOPS: [13095.899926] Unable to handle kernel paging request at virtual address fff00000f9b40000 [13095.899961] Mem abort info: [13095.900017] ESR = 0x0000000096000007 [13095.900088] EC = 0x25: DABT (current EL), IL = 32 bits [13095.900156] SET = 0, FnV = 0 [13095.900181] EA = 0, S1PTW = 0 [13095.900211] FSC = 0x07: level 3 translation fault [13095.900255] Data abort info: [13095.900421] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000 [13095.900486] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [13095.900525] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [13095.900713] swapper pgtable: 4k pages, 52-bit VAs, pgdp=000000008ba16000 [13095.900752] [fff00000f9b40000] pgd=180000013ffff403, p4d=180000013fffe403, pud=180000013f85b403, pmd=180000013f68d403, pte=0000000000000000 [13095.901312] Internal error: Oops: 0000000096000007 [#1] SMP [13095.901659] Modules linked in: [13095.902201] CPU: 0 UID: 0 PID: 303 Comm: kworker/0:1 Not tainted 6.19.0-rc1-00002-gda407d200220 #34 PREEMPT [13095.902461] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022 [13095.902719] Workqueue: kacpi_notify acpi_os_execute_deferred [13095.903778] pstate: 214020c5 (nzCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [13095.903892] pc : hex_dump_to_buffer+0x30c/0x4a0 [13095.904146] lr : hex_dump_to_buffer+0x328/0x4a0 [13095.904204] sp : ffff800080e13880 [13095.904291] x29: ffff800080e13880 x28: ffffac9aba86f6a8 x27: 0000000000000083 [13095.904704] x26: fff00000f9b3fffc x25: 0000000000000004 x24: 0000000000000004 [13095.905335] x23: ffff800080e13905 x22: 0000000000000010 x21: 0000000000000083 [13095.905483] x20: 0000000000000001 x19: 0000000000000008 x18: 0000000000000010 [13095.905617] x17: 0000000000000001 x16: 00000007c7f20fec x15: 0000000000000020 [13095.905850] x14: 0000000000000008 x13: 0000000000081020 x12: 0000000000000008 [13095.906175] x11: ffff800080e13905 x10: ffff800080e13988 x9 : 0000000000000000 [13095.906733] x8 : 0000000000000000 x7 : 0000000000000001 x6 : 0000000000000020 [13095.907197] x5 : 0000000000000030 x4 : 00000000fffffffe x3 : 0000000000000000 [13095.907623] x2 : ffffac9aba78c1c8 x1 : ffffac9aba76d0a8 x0 : 0000000000000008 [13095.908284] Call trace: [13095.908866] hex_dump_to_buffer+0x30c/0x4a0 (P) [13095.909135] print_hex_dump+0xac/0x170 [13095.909179] cper_estatus_print_section+0x90c/0x968 [13095.909336] cper_estatus_print+0xf0/0x158 [13095.909348] __ghes_print_estatus+0xa0/0x148 [13095.909656] ghes_proc+0x1bc/0x220 [13095.909883] ghes_notify_hed+0x5c/0xb8 [13095.909957] notifier_call_chain+0x78/0x148 [13095.910180] blocking_notifier_call_chain+0x4c/0x80 [13095.910246] acpi_hed_notify+0x28/0x40 [13095.910558] acpi_ev_notify_dispatch+0x50/0x80 [13095.910576] acpi_os_execute_deferred+0x24/0x48 [13095.911161] process_one_work+0x15c/0x3b0 [13095.911326] worker_thread+0x2d0/0x400 [13095.911775] kthread+0x148/0x228 [13095.912082] ret_from_fork+0x10/0x20 [13095.912687] Code: 6b14033f 540001ad a94707e2 f100029f (b8747b44) [13095.914085] ---[ end trace 0000000000000000 ]--- Prevent that by taking the actual allocated are into account when checking for CPER length. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> --- drivers/acpi/apei/ghes.c | 6 +++++- include/acpi/ghes.h | 1 + 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index fc3f8aed99d5..350f666b7783 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -29,6 +29,7 @@ #include <linux/cper.h> #include <linux/cleanup.h> #include <linux/platform_device.h> +#include <linux/minmax.h> #include <linux/mutex.h> #include <linux/ratelimit.h> #include <linux/vmalloc.h> @@ -294,6 +295,7 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic) error_block_length = GHES_ESTATUS_MAX_SIZE; } ghes->estatus = kmalloc(error_block_length, GFP_KERNEL); + ghes->error_block_length = error_block_length; if (!ghes->estatus) { rc = -ENOMEM; goto err_unmap_status_addr; @@ -365,13 +367,15 @@ static int __ghes_check_estatus(struct ghes *ghes, struct acpi_hest_generic_status *estatus) { u32 len = cper_estatus_len(estatus); + u32 max_len = min(ghes->generic->error_block_length, + ghes->error_block_length); if (len < sizeof(*estatus)) { pr_warn_ratelimited(FW_WARN GHES_PFX "Truncated error status block!\n"); return -EIO; } - if (len > ghes->generic->error_block_length) { + if (!len || len > max_len) { pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid error status block length!\n"); return -EIO; } diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h index ebd21b05fe6e..5866f50bac0c 100644 --- a/include/acpi/ghes.h +++ b/include/acpi/ghes.h @@ -27,6 +27,7 @@ struct ghes { struct timer_list timer; unsigned int irq; }; + unsigned int error_block_length; struct device *dev; struct list_head elist; }; -- 2.52.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v4 3/4] apei/ghes: ensure that won't go past CPER allocated record 2026-01-06 10:01 ` [PATCH v4 3/4] apei/ghes: ensure that won't go past CPER allocated record Mauro Carvalho Chehab @ 2026-01-06 16:07 ` Jonathan Cameron 0 siblings, 0 replies; 5+ messages in thread From: Jonathan Cameron @ 2026-01-06 16:07 UTC (permalink / raw) To: Mauro Carvalho Chehab Cc: Rafael J. Wysocki, Robert Moore, Ankit Agrawal, Borislav Petkov, Breno Leitao, Hanjun Guo, Jason Tian, Len Brown, Mauro Carvalho Chehab, Shuai Xue, Smita Koralahalli, Tony Luck, acpica-devel, linux-acpi, linux-kernel On Tue, 6 Jan 2026 11:01:37 +0100 Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote: > The logic at ghes_new() prevents allocating too large records, by > checking if they're bigger than GHES_ESTATUS_MAX_SIZE (currently, 64KB). > Yet, the allocation is done with the actual number of pages from the > CPER bios table location, which can be smaller. > > Yet, a bad firmware could send data with a different size, which might > be bigger than the allocated memory, causing an OOPS: > > [13095.899926] Unable to handle kernel paging request at virtual address fff00000f9b40000 > [13095.899961] Mem abort info: > [13095.900017] ESR = 0x0000000096000007 > [13095.900088] EC = 0x25: DABT (current EL), IL = 32 bits > [13095.900156] SET = 0, FnV = 0 > [13095.900181] EA = 0, S1PTW = 0 > [13095.900211] FSC = 0x07: level 3 translation fault > [13095.900255] Data abort info: > [13095.900421] ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000 > [13095.900486] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 > [13095.900525] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 > [13095.900713] swapper pgtable: 4k pages, 52-bit VAs, pgdp=000000008ba16000 > [13095.900752] [fff00000f9b40000] pgd=180000013ffff403, p4d=180000013fffe403, pud=180000013f85b403, pmd=180000013f68d403, pte=0000000000000000 > [13095.901312] Internal error: Oops: 0000000096000007 [#1] SMP > [13095.901659] Modules linked in: > [13095.902201] CPU: 0 UID: 0 PID: 303 Comm: kworker/0:1 Not tainted 6.19.0-rc1-00002-gda407d200220 #34 PREEMPT > [13095.902461] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022 > [13095.902719] Workqueue: kacpi_notify acpi_os_execute_deferred > [13095.903778] pstate: 214020c5 (nzCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--) > [13095.903892] pc : hex_dump_to_buffer+0x30c/0x4a0 > [13095.904146] lr : hex_dump_to_buffer+0x328/0x4a0 > [13095.904204] sp : ffff800080e13880 > [13095.904291] x29: ffff800080e13880 x28: ffffac9aba86f6a8 x27: 0000000000000083 > [13095.904704] x26: fff00000f9b3fffc x25: 0000000000000004 x24: 0000000000000004 > [13095.905335] x23: ffff800080e13905 x22: 0000000000000010 x21: 0000000000000083 > [13095.905483] x20: 0000000000000001 x19: 0000000000000008 x18: 0000000000000010 > [13095.905617] x17: 0000000000000001 x16: 00000007c7f20fec x15: 0000000000000020 > [13095.905850] x14: 0000000000000008 x13: 0000000000081020 x12: 0000000000000008 > [13095.906175] x11: ffff800080e13905 x10: ffff800080e13988 x9 : 0000000000000000 > [13095.906733] x8 : 0000000000000000 x7 : 0000000000000001 x6 : 0000000000000020 > [13095.907197] x5 : 0000000000000030 x4 : 00000000fffffffe x3 : 0000000000000000 > [13095.907623] x2 : ffffac9aba78c1c8 x1 : ffffac9aba76d0a8 x0 : 0000000000000008 > [13095.908284] Call trace: > [13095.908866] hex_dump_to_buffer+0x30c/0x4a0 (P) > [13095.909135] print_hex_dump+0xac/0x170 > [13095.909179] cper_estatus_print_section+0x90c/0x968 > [13095.909336] cper_estatus_print+0xf0/0x158 > [13095.909348] __ghes_print_estatus+0xa0/0x148 > [13095.909656] ghes_proc+0x1bc/0x220 > [13095.909883] ghes_notify_hed+0x5c/0xb8 > [13095.909957] notifier_call_chain+0x78/0x148 > [13095.910180] blocking_notifier_call_chain+0x4c/0x80 > [13095.910246] acpi_hed_notify+0x28/0x40 > [13095.910558] acpi_ev_notify_dispatch+0x50/0x80 > [13095.910576] acpi_os_execute_deferred+0x24/0x48 > [13095.911161] process_one_work+0x15c/0x3b0 > [13095.911326] worker_thread+0x2d0/0x400 > [13095.911775] kthread+0x148/0x228 > [13095.912082] ret_from_fork+0x10/0x20 > [13095.912687] Code: 6b14033f 540001ad a94707e2 f100029f (b8747b44) > [13095.914085] ---[ end trace 0000000000000000 ]--- > > Prevent that by taking the actual allocated are into account when > checking for CPER length. > > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> A naming comment inline. The bikeshed must be blue! Might have been better to +CC these to linux-edac like earlier ones. If nothing else that wouldn't have broken my filtering and I'd have gotten patch 4 (didn't as don't subscribe to efi or lkml lists). FWIW patch 4 looks fine to me. > --- > drivers/acpi/apei/ghes.c | 6 +++++- > include/acpi/ghes.h | 1 + > 2 files changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c > index fc3f8aed99d5..350f666b7783 100644 > --- a/drivers/acpi/apei/ghes.c > +++ b/drivers/acpi/apei/ghes.c > @@ -29,6 +29,7 @@ > #include <linux/cper.h> > #include <linux/cleanup.h> > #include <linux/platform_device.h> > +#include <linux/minmax.h> > #include <linux/mutex.h> > #include <linux/ratelimit.h> > #include <linux/vmalloc.h> > @@ -294,6 +295,7 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic) > error_block_length = GHES_ESTATUS_MAX_SIZE; > } > ghes->estatus = kmalloc(error_block_length, GFP_KERNEL); > + ghes->error_block_length = error_block_length; Maybe it would be clearer to call it after what we care about which is the length of estatus. So ghes->estatus_length or something like that. The reason I raise this is it feels a bit random stashed in the ghes struct and there is no documentation to say what it is the length of wrt to other elements of that structure. > if (!ghes->estatus) { > rc = -ENOMEM; > goto err_unmap_status_addr; > @@ -365,13 +367,15 @@ static int __ghes_check_estatus(struct ghes *ghes, > struct acpi_hest_generic_status *estatus) > { > u32 len = cper_estatus_len(estatus); > + u32 max_len = min(ghes->generic->error_block_length, > + ghes->error_block_length); > > if (len < sizeof(*estatus)) { > pr_warn_ratelimited(FW_WARN GHES_PFX "Truncated error status block!\n"); > return -EIO; > } > > - if (len > ghes->generic->error_block_length) { > + if (!len || len > max_len) { > pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid error status block length!\n"); > return -EIO; > } > diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h > index ebd21b05fe6e..5866f50bac0c 100644 > --- a/include/acpi/ghes.h > +++ b/include/acpi/ghes.h > @@ -27,6 +27,7 @@ struct ghes { > struct timer_list timer; > unsigned int irq; > }; > + unsigned int error_block_length; Would it cause a big hole if this moved up near estatus? If not I would be tempted to do that given it's effectively the length of that. > struct device *dev; > struct list_head elist; > }; ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-01-06 16:07 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-01-06 10:01 [PATCH v4 0/4] apei/ghes: don't OOPS with bad ARM error CPER records Mauro Carvalho Chehab 2026-01-06 10:01 ` [PATCH v4 1/4] apei/ghes: ARM processor Error: don't go past allocated memory Mauro Carvalho Chehab 2026-01-06 15:57 ` Jonathan Cameron 2026-01-06 10:01 ` [PATCH v4 3/4] apei/ghes: ensure that won't go past CPER allocated record Mauro Carvalho Chehab 2026-01-06 16:07 ` Jonathan Cameron
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox