From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 63E243BD63B for ; Fri, 29 May 2026 10:40:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780051239; cv=none; b=IjzZi+1kSFPlP59vPdGap3tx8R1qaItIWaOtfQcwLRtQWYsNHQt7X4azyhf5juoFGdWaDcBgSVegCvoamPYyStYn3uf6eal/OHA2gmc/+mDD2CGY/yU6TMiqDpqLNTH63THRcHT1IyAdYX4OJJtYR+LrWT+6VS2yZ35hGceT26Q= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780051239; c=relaxed/simple; bh=KLF11ChEzJICtnL6Wogz+Q1HvAHnThVriwwEkcG4y2k=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=op8kPXBb0eDZ8mulcOmZb7V4RbvjIL5KtIQZ2VGNiAGUzR25CFx0yisS5pdfGAT3gC1e+sJ6ncfWjOMf3QnHH4HPOpZTtaaGhadVfPMpxRd1x7E9aKMk/lutmDI+rAQJXXczIQcx7XT7yibMjrvjBzmmwq6bj024wLSYMLP7cJM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ObHFXI8I; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ObHFXI8I" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 978111F00893; Fri, 29 May 2026 10:40:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780051238; bh=Bvhofl3GMa57cwEXjxtsGcUlUeVU/Jx6nSkcEUWho64=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=ObHFXI8IVujRiPGE/fAAAJPDtpEt0gDu0UFMRzsvQPviV1xo+tCNPuZnuTLNEFrTw pJ4rAOPZHKDfa+j7dLWNx6E5tbECjxyA4hhh2HV3x4h8Q77OzmFNkpLNIYv47QYKBW dUKT7q+2JNFJNEg8+AkXUMRd54FS68CJAG3tc3g2e8oa2Eg0Lz3YKZAr/A8qVpcNhm xwWjf7Zw9yInDujHle1lw2QiNOG4+q9uq7ylh4Ygka7rPRXiHLdKx5I2vjP96IhpXq 6yuf7uttt1y6xzKBL6uwtGzhYQw2F63SeEvceHp2Z8YMxOGMHm1lkspHH3ujNzCaNU UJ+KbEOQt4nzA== From: sashiko-bot@kernel.org Subject: Re: [PATCH v5 08/10] ACPI: APEI: share GHES CPER helpers Reply-To: sashiko-reviews@lists.linux.dev To: "Ahmed Tiba" Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org In-Reply-To: <20260529-topics-ahmtib01-ras_ffh_arm_internal_review-v5-8-2e0500d42642@arm.com> References: <20260529-topics-ahmtib01-ras_ffh_arm_internal_review-v5-8-2e0500d42642@arm.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Fri, 29 May 2026 10:40:36 +0000 Message-Id: <20260529104037.978111F00893@smtp.kernel.org> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 6 potential issue(= s) to consider: New issues: - [Low] The `generic` parameter in `ghes_cper_handle_status()` is completel= y unused. - [Low] Internal helper functions are unnecessarily exposed in the global `= ` header. Pre-existing issues: - [High] The `queued` boolean is unconditionally overwritten during iterati= on over multiple error sections, potentially masking a successful recovery = queueing and causing an unintended SIGBUS. - [High] Incorrect pointer math using `sizeof(err)` where `err` is a pointe= r defeats bounds checking, allowing out-of-bounds reads. - [High] The `twcb->pfn` field is accessed after its containing memory chun= k has been freed, causing a Use-After-Free. - [High] The global `ghes_estatus_pool` pointer is left dangling if its ini= tialization fails, leading to a Use-After-Free. -- commit 286897690a5dfe45676a004c4969ff7b921f6c22 Author: Ahmed Tiba ACPI: APEI: share GHES CPER helpers =20 Wire GHES up to the helper routines in ghes_cper.c and remove the local copies from ghes.c. This keeps the control flow identical while letting the helpers be shared with other firmware-first providers. > diff --git a/drivers/acpi/apei/ghes_cper.c b/drivers/acpi/apei/ghes_cper.c > index d7a666a163c3a..0ff9d06eb78f9 100644 > --- a/drivers/acpi/apei/ghes_cper.c > +++ b/drivers/acpi/apei/ghes_cper.c [ ... ] > +int ghes_estatus_pool_init(unsigned int num_ghes) > +{ > + unsigned long addr, len; > + int rc; > + > + ghes_estatus_pool =3D gen_pool_create(GHES_ESTATUS_POOL_MIN_ALLOC_ORDER= , -1); > + if (!ghes_estatus_pool) > + return -ENOMEM; > + > + len =3D GHES_ESTATUS_CACHE_AVG_SIZE * GHES_ESTATUS_CACHE_ALLOCED_MAX; > + len +=3D (num_ghes * GHES_ESOURCE_PREALLOC_MAX_SIZE); > + > + addr =3D (unsigned long)vmalloc(PAGE_ALIGN(len)); > + if (!addr) > + goto err_pool_alloc; > + > + rc =3D gen_pool_add(ghes_estatus_pool, addr, PAGE_ALIGN(len), -1); > + if (rc) > + goto err_pool_add; > + > + return 0; > + > +err_pool_add: > + vfree((void *)addr); > + > +err_pool_alloc: > + gen_pool_destroy(ghes_estatus_pool); > + > + return -ENOMEM; > +} [Severity: High] This is a pre-existing issue, but does this leave ghes_estatus_pool as a dangling pointer? If vmalloc or gen_pool_add fails, gen_pool_destroy is called, but ghes_estatus_pool is not set to NULL. If other code checks for ghes_estatus_pool to avoid double initialization, it might use the destroyed pool. [ ... ] > +static void memory_failure_cb(struct callback_head *twork) > +{ > + struct ghes_task_work *twcb =3D container_of(twork, struct ghes_task_wo= rk, twork); > + int ret; > + > + ret =3D memory_failure(twcb->pfn, twcb->flags); > + gen_pool_free(ghes_estatus_pool, (unsigned long)twcb, sizeof(*twcb)); > + > + if (!ret || ret =3D=3D -EHWPOISON || ret =3D=3D -EOPNOTSUPP) > + return; > + > + pr_err("%#llx: Sending SIGBUS to %s:%d due to hardware memory corruptio= n\n", > + twcb->pfn, current->comm, task_pid_nr(current)); [Severity: High] This isn't a bug introduced by this patch, but is it safe to access twcb->pfn after twcb has been freed via gen_pool_free? If the memory chunk is reallocated and overwritten by another thread before the pr_err call, this could log corrupted data. [ ... ] > +bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata, > + int sev, bool sync) > +{ > + struct cper_sec_proc_arm *err =3D acpi_hest_get_payload(gdata); > + int flags =3D sync ? MF_ACTION_REQUIRED : 0; > + int length =3D gdata->error_data_length; > + char error_type[120]; > + bool queued =3D false; > + int sec_sev, i; > + char *p; [ ... ] > + p =3D (char *)(err + 1); > + length -=3D sizeof(err); [Severity: High] This is a pre-existing issue, but does length -=3D sizeof(err) subtract the correct amount here? Since err is a pointer, sizeof(err) evaluates to the pointer size (8 bytes) rather than the size of the underlying struct cper_sec_proc_arm. This might artificially inflate the remaining length and bypass the bounds checks in the loop below. > + > + for (i =3D 0; i < err->err_info_num; i++) { > + struct cper_arm_err_info *err_info; > + bool is_cache, has_pa; > + > + /* Ensure we have enough data for the error info header */ > + if (length < sizeof(*err_info)) > + break; [ ... ] > + if (is_cache && has_pa) { > + queued =3D ghes_do_memory_failure(err_info->physical_fault_addr, flag= s); [Severity: High] This is a pre-existing issue, but does assigning directly to queued overwrite successful recovery queueings? If an earlier iteration successfully sets queued to true, but a later one fails and sets it to false, the function might incorrectly return false and cause an unintended SIGBUS. Would a logical OR (queued |=3D ...) be more appropriate here? [ ... ] > +void ghes_cper_handle_status(struct device *dev, > + const struct acpi_hest_generic *generic, [Severity: Low] Is the generic parameter used anywhere in this function body? It looks like it could be removed to improve clarity. > + const struct acpi_hest_generic_status *estatus, > + bool sync) > +{ > + int sev, sec_sev; > + struct acpi_hest_generic_data *gdata; > + guid_t *sec_type; > + const guid_t *fru_id =3D &guid_null; > + char *fru_text =3D ""; > + bool queued =3D false; > + > + sev =3D ghes_severity(estatus->error_severity); > + apei_estatus_for_each_section(estatus, gdata) { > + sec_type =3D (guid_t *)gdata->section_type; [ ... ] > + if (guid_equal(sec_type, &CPER_SEC_PLATFORM_MEM)) { > + struct cper_sec_mem_err *mem_err =3D acpi_hest_get_payload(gdata); > + > + atomic_notifier_call_chain(&ghes_report_chain, sev, mem_err); > + > + arch_apei_report_mem_error(sev, mem_err); > + queued =3D ghes_handle_memory_failure(gdata, sev, sync); > + } else if (guid_equal(sec_type, &CPER_SEC_PCIE)) { > + ghes_handle_aer(gdata); > + } else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) { > + queued =3D ghes_handle_arm_hw_error(gdata, sev, sync); [Severity: High] This is a pre-existing issue, but similar to the loop above, does assigning directly to queued overwrite successful queueings from earlier sections? [ ... ] > diff --git a/include/acpi/ghes_cper.h b/include/acpi/ghes_cper.h > index dd49e9179b63a..511b95b509111 100644 > --- a/include/acpi/ghes_cper.h > +++ b/include/acpi/ghes_cper.h [ ... ] > @@ -107,6 +110,23 @@ void ghes_estatus_cache_add(struct acpi_hest_generic= *generic, > struct acpi_hest_generic_status *estatus); > void ghes_defer_non_standard_event(struct acpi_hest_generic_data *gdata, > int sev); > +int ghes_severity(int severity); > +bool ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, > + int sev, bool sync); > +bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata, > + int sev, bool sync); > +void ghes_handle_aer(struct acpi_hest_generic_data *gdata); [Severity: Low] Do these helper functions need to be exposed in the global header? Since they are only invoked locally by ghes_cper_handle_status, could they be marked static in ghes_cper.c to prevent namespace pollution? --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260529-topics-ahm= tib01-ras_ffh_arm_internal_review-v5-0-2e0500d42642@arm.com?part=3D8