From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BE53E37F735; Wed, 17 Jun 2026 14:14:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781705686; cv=none; b=dCXkWj5RPTaEwPe1AJvNwOcvrZPTf5P4r0AeBNkFC3sjLszlcaxi7FR1x18pGSNjflHZnPbsTBAwsQADS+nY8Os3VOKUVv8xKGC/WRjoEFCSQ7Me9xP36aKa/sG1Dep5nTUrQGajlG/tYQZ1/DfbX/IxxyCSWJDxqquIGi+VLkY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781705686; c=relaxed/simple; bh=bDKA5uVnGOhKpvgaV78zmV5TCN7LOJw1yhiqRK5NNRg=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=otDtabmGOJBNGKI8EyqN8X0U2CIe3ABmBKV23SQ7LHgHN6TxJ/V31DLZYqiJKzUrv5qf4aKSnY0l7tWsE9HWASK6LFvJ5BjEMMNSVD76s6fg7li1YzH9RBQ9iRpxlThdSAH/5Blkitlmko65YROl2mYhNxKzrQHCGB3Is3mA4Mg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=H0X4jx7K; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="H0X4jx7K" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 386901F000E9; Wed, 17 Jun 2026 14:14:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1781705685; bh=3BL1znwxwyF+/w91p1vW3R6pP3XesgLUhUfFfV7/vrs=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=H0X4jx7Keg2AA8bvmRknDU24m3M1F3lvtTdPecrGAeySPgufTJUHZVF/HYHzdx+rT D8YYW8PQEgpLb9D8kLK2MYV/cgGe+pgYT6HZ5kUE+2yX7GM8n92mrfy8uPnlRt/ewh YV/msCTia3/o+7Z+fiSr+Phgp44+blhGdPOBBWYiD/RpeK5/TyCJveu32LAfecP6LF XybR76u5N8Bic0rfF5RthkvWFrjOTuJgM85MXLH6BzsuxUeDJsscb+YqkzQueF73rO qisbmwminu8Ak1hKjZ30sYDL//jBU7dDfmzkoendVaO7z8Cd6zMkqH0czXjnTAm9f0 Eu5ETF0uPaHgg== From: sashiko-bot@kernel.org Subject: Re: [PATCH v6 06/10] ACPI: APEI: GHES: move CXL CPER helpers Reply-To: sashiko-reviews@lists.linux.dev To: "Ahmed Tiba" Cc: robh@kernel.org, devicetree@vger.kernel.org, linux-cxl@vger.kernel.org, conor+dt@kernel.org In-Reply-To: <20260617-topics-ahmtib01-ras_ffh_arm_internal_review-v6-6-91f725174aa0@arm.com> References: <20260617-topics-ahmtib01-ras_ffh_arm_internal_review-v6-0-91f725174aa0@arm.com> <20260617-topics-ahmtib01-ras_ffh_arm_internal_review-v6-6-91f725174aa0@arm.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Wed, 17 Jun 2026 14:14:44 +0000 Message-Id: <20260617141445.386901F000E9@smtp.kernel.org> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 3 potential issue(= s) to consider: Pre-existing issues: - [High] Inconsistent spinlock usage between process context and interrupt = context for `cxl_cper_work_lock` and `cxl_cper_prot_err_work_lock` causes a= potential deadlock. - [High] Unconditional fixed-size copy leads to out-of-bounds read if the A= CPI payload is short. - [High] Lack of bounds checking for DVSEC length leads to out-of-bounds re= ad in CXL protocol error handling. -- commit bb34e6b6044e94825caae9bb7ad029942aef5c28 Author: Ahmed Tiba ACPI: APEI: GHES: move CXL CPER helpers Move the CXL CPER handling paths out of ghes.c and into ghes_cper.c so the helpers can be reused. The code is moved as-is, with the public prototypes updated so GHES keeps calling into the new translation unit. > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c [ ... ] > diff --git a/drivers/acpi/apei/ghes_cper.c b/drivers/acpi/apei/ghes_cper.c > index b26943eafd792..66bf1af4db00b 100644 > --- a/drivers/acpi/apei/ghes_cper.c > +++ b/drivers/acpi/apei/ghes_cper.c [ ... ] > @@ -336,6 +339,138 @@ void ghes_defer_non_standard_event(struct acpi_hest= _generic_data *gdata, [ ... ] > +void cxl_cper_post_prot_err(struct cxl_cper_sec_prot_err *prot_err, > + int severity) > +{ > +#ifdef CONFIG_ACPI_APEI_PCIEAER > + struct cxl_cper_prot_err_work_data wd; > + > + if (cxl_cper_sec_prot_err_valid(prot_err)) > + return; > + > + guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock); > + > + if (!cxl_cper_prot_err_work) > + return; > + > + if (cxl_cper_setup_prot_err_work_data(&wd, prot_err, severity)) [Severity: High] This is a pre-existing issue, but could this setup step read out of bounds? The cxl_cper_setup_prot_err_work_data() function uses the firmware-provided dvsec_len to compute a pointer for copying the RAS capability registers: ghes_helpers.c:cxl_cper_setup_prot_err_work_data() { ... cap_start =3D dvsec_start + prot_err->dvsec_len; memcpy(&wd->ras_cap, cap_start, sizeof(wd->ras_cap)); Since cxl_cper_post_prot_err() and its callers do not validate that dvsec_len falls within the maximum bounds of the ACPI payload (gdata->error_data_length), could a malformed firmware record cause cap_sta= rt to point beyond the valid buffer? > + return; > + > + if (!kfifo_put(&cxl_cper_prot_err_fifo, wd)) { > + pr_err_ratelimited("CXL CPER kfifo overflow\n"); > + return; > + } > + > + schedule_work(cxl_cper_prot_err_work); > +#endif > +} > + > +int cxl_cper_register_prot_err_work(struct work_struct *work) > +{ > + if (cxl_cper_prot_err_work) > + return -EINVAL; > + > + guard(spinlock)(&cxl_cper_prot_err_work_lock); [Severity: High] This isn't a bug introduced by this patch, but is there a potential deadlock between process context and interrupt context here? The registration function acquires the lock with guard(spinlock), leaving local interrupts enabled. If a hardware interrupt occurs while this lock is held, and the interrupt handler calls cxl_cper_post_prot_err() which attempts to acquire the same lock using guard(spinlock_irqsave): cxl_cper_post_prot_err() { ... guard(spinlock_irqsave)(&cxl_cper_prot_err_work_lock); Could this cause the interrupt handler to spin forever on the local CPU? > + cxl_cper_prot_err_work =3D work; > + return 0; > +} [ ... ] > +void cxl_cper_post_event(enum cxl_event_type event_type, > + struct cxl_cper_event_rec *rec) > +{ > + struct cxl_cper_work_data wd; > + > + if (rec->hdr.length <=3D sizeof(rec->hdr) || > + rec->hdr.length > sizeof(*rec)) { > + pr_err(FW_WARN "CXL CPER Invalid section length (%u)\n", > + rec->hdr.length); > + return; > + } > + > + if (!(rec->hdr.validation_bits & CPER_CXL_COMP_EVENT_LOG_VALID)) { > + pr_err(FW_WARN "CXL CPER invalid event\n"); > + return; > + } > + > + guard(spinlock_irqsave)(&cxl_cper_work_lock); > + > + if (!cxl_cper_work) > + return; > + > + wd.event_type =3D event_type; > + memcpy(&wd.rec, rec, sizeof(wd.rec)); [Severity: High] This is a pre-existing issue, but can this copy read out of bounds if the ACPI payload is unexpectedly short? The memcpy unconditionally copies sizeof(wd.rec) bytes from the firmware-provided payload. Since gdata->error_data_length is never verified to be at least the size of the record before being passed to this function by ghes_cper_handle_status(), would a malformed payload smaller than the struct cause an uncontrolled read of kernel memory? > + > + if (!kfifo_put(&cxl_cper_fifo, wd)) { [ ... ] --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260617-topics-ahm= tib01-ras_ffh_arm_internal_review-v6-0-91f725174aa0@arm.com?part=3D6