public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>,
	Jonathan Cameron <jonathan.cameron@huawei.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Hanjun Guo <guohanjun@huawei.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Sasha Levin <sashal@kernel.org>,
	rafael@kernel.org, robert.moore@intel.com,
	xueshuai@linux.alibaba.com, tony.luck@intel.com,
	fabio.m.de.francesco@linux.intel.com, leitao@debian.org,
	Smita.KoralahalliChannabasappa@amd.com,
	jason@os.amperecomputing.com, linux-acpi@vger.kernel.org,
	acpica-devel@lists.linux.dev
Subject: [PATCH AUTOSEL 6.19-5.10] APEI/GHES: ensure that won't go past CPER allocated record
Date: Wed, 11 Feb 2026 07:30:43 -0500	[thread overview]
Message-ID: <20260211123112.1330287-33-sashal@kernel.org> (raw)
In-Reply-To: <20260211123112.1330287-1-sashal@kernel.org>

From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

[ Upstream commit fa2408a24f8f0db14d9cfc613ef162dc267d7ad4 ]

The logic at ghes_new() prevents allocating too large records, by
checking if they're bigger than GHES_ESTATUS_MAX_SIZE (currently, 64KB).
Yet, the allocation is done with the actual number of pages from the
CPER bios table location, which can be smaller.

Yet, a bad firmware could send data with a different size, which might
be bigger than the allocated memory, causing an OOPS:

    Unable to handle kernel paging request at virtual address fff00000f9b40000
    Mem abort info:
      ESR = 0x0000000096000007
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      FSC = 0x07: level 3 translation fault
    Data abort info:
      ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
      CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    swapper pgtable: 4k pages, 52-bit VAs, pgdp=000000008ba16000
    [fff00000f9b40000] pgd=180000013ffff403, p4d=180000013fffe403, pud=180000013f85b403, pmd=180000013f68d403, pte=0000000000000000
    Internal error: Oops: 0000000096000007 [#1]  SMP
    Modules linked in:
    CPU: 0 UID: 0 PID: 303 Comm: kworker/0:1 Not tainted 6.19.0-rc1-00002-gda407d200220 #34 PREEMPT
    Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 02/02/2022
    Workqueue: kacpi_notify acpi_os_execute_deferred
    pstate: 214020c5 (nzCv daIF +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
    pc : hex_dump_to_buffer+0x30c/0x4a0
    lr : hex_dump_to_buffer+0x328/0x4a0
    sp : ffff800080e13880
    x29: ffff800080e13880 x28: ffffac9aba86f6a8 x27: 0000000000000083
    x26: fff00000f9b3fffc x25: 0000000000000004 x24: 0000000000000004
    x23: ffff800080e13905 x22: 0000000000000010 x21: 0000000000000083
    x20: 0000000000000001 x19: 0000000000000008 x18: 0000000000000010
    x17: 0000000000000001 x16: 00000007c7f20fec x15: 0000000000000020
    x14: 0000000000000008 x13: 0000000000081020 x12: 0000000000000008
    x11: ffff800080e13905 x10: ffff800080e13988 x9 : 0000000000000000
    x8 : 0000000000000000 x7 : 0000000000000001 x6 : 0000000000000020
    x5 : 0000000000000030 x4 : 00000000fffffffe x3 : 0000000000000000
    x2 : ffffac9aba78c1c8 x1 : ffffac9aba76d0a8 x0 : 0000000000000008
    Call trace:
     hex_dump_to_buffer+0x30c/0x4a0 (P)
     print_hex_dump+0xac/0x170
     cper_estatus_print_section+0x90c/0x968
     cper_estatus_print+0xf0/0x158
     __ghes_print_estatus+0xa0/0x148
     ghes_proc+0x1bc/0x220
     ghes_notify_hed+0x5c/0xb8
     notifier_call_chain+0x78/0x148
     blocking_notifier_call_chain+0x4c/0x80
     acpi_hed_notify+0x28/0x40
     acpi_ev_notify_dispatch+0x50/0x80
     acpi_os_execute_deferred+0x24/0x48
     process_one_work+0x15c/0x3b0
     worker_thread+0x2d0/0x400
     kthread+0x148/0x228
     ret_from_fork+0x10/0x20
    Code: 6b14033f 540001ad a94707e2 f100029f (b8747b44)
    ---[ end trace 0000000000000000 ]---

Prevent that by taking the actual allocated are into account when
checking for CPER length.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject tweaks ]
Link: https://patch.msgid.link/4e70310a816577fabf37d94ed36cde4ad62b1e0a.1767871950.git.mchehab+huawei@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

Now I have a complete picture. Let me compile this detailed analysis.

---

## Detailed Analysis of APEI/GHES CPER Buffer Overflow Fix

### 1. COMMIT MESSAGE ANALYSIS

The commit message clearly describes a **buffer overflow bug** in the
ACPI APEI/GHES subsystem. Key indicators:

- **Concrete crash trace**: A full kernel OOPS with call trace is
  included, proving this is a real, reproducible crash
- **Clear root cause explanation**: The commit explains the discrepancy
  between the allocated buffer size and the validation check limit
- **Crash path**: `ghes_proc` -> `__ghes_print_estatus` ->
  `cper_estatus_print_section` -> `print_hex_dump` ->
  `hex_dump_to_buffer` — a pagefault when reading past the allocated
  buffer
- **Multiple reviewers**: Reviewed-by Jonathan Cameron, Acked-by Ard
  Biesheuvel (prominent EFI/ACPI maintainer), Reviewed-by Hanjun Guo —
  strong confidence signals

### 2. CODE CHANGE ANALYSIS — The Bug

The vulnerability exists in the interaction between two functions:

**`ghes_new()` (allocation)**:

```288:296:drivers/acpi/apei/ghes.c
        error_block_length = generic->error_block_length;
        if (error_block_length > GHES_ESTATUS_MAX_SIZE) {
                pr_warn(FW_WARN GHES_PFX
                        "Error status block length is too long: %u for "
                        "generic hardware error source: %d.\n",
                        error_block_length, generic->header.source_id);
                error_block_length = GHES_ESTATUS_MAX_SIZE;
        }
        ghes->estatus = kmalloc(error_block_length, GFP_KERNEL);
```

Here, if the BIOS HEST table declares `generic->error_block_length =
128KB`, the local variable `error_block_length` is capped to
`GHES_ESTATUS_MAX_SIZE` (64KB), and only 64KB is allocated for
`ghes->estatus`. Critically, `ghes->generic->error_block_length` retains
the original uncapped value of 128KB.

**`__ghes_check_estatus()` (validation)**:

```364:385:drivers/acpi/apei/ghes.c
static int __ghes_check_estatus(struct ghes *ghes,
                                struct acpi_hest_generic_status
*estatus)
{
        u32 len = cper_estatus_len(estatus);

        if (len < sizeof(*estatus)) {
                pr_warn_ratelimited(FW_WARN GHES_PFX "Truncated error
status block!\n");
                return -EIO;
        }

        if (len > ghes->generic->error_block_length) {
                pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid error
status block length!\n");
                return -EIO;
        }
        // ...
}
```

The validation at line 374 checks `len >
ghes->generic->error_block_length` — comparing against the **uncapped
BIOS-declared value** (128KB in our example), NOT the actual allocated
buffer size (64KB).

**`ghes_read_estatus()` (the dangerous read)**:

```430:446:drivers/acpi/apei/ghes.c
static int ghes_read_estatus(struct ghes *ghes,
                             struct acpi_hest_generic_status *estatus,
                             u64 *buf_paddr, enum fixed_addresses
fixmap_idx)
{
        int rc;
        rc = __ghes_peek_estatus(ghes, estatus, buf_paddr, fixmap_idx);
        if (rc) return rc;
        rc = __ghes_check_estatus(ghes, estatus);
        if (rc) return rc;
        return __ghes_read_estatus(estatus, *buf_paddr, fixmap_idx,
                                   cper_estatus_len(estatus));
}
```

After `__ghes_check_estatus()` erroneously allows a 70KB CPER record
(because 70KB < 128KB), `__ghes_read_estatus()` copies 70KB from
firmware into the 64KB `ghes->estatus` buffer — **a 6KB buffer
overflow**.

**`ghes_proc()`** is the main caller, passing `ghes->estatus` (the
undersized buffer) directly:

```1166:1172:drivers/acpi/apei/ghes.c
static int ghes_proc(struct ghes *ghes)
{
        struct acpi_hest_generic_status *estatus = ghes->estatus;
        u64 buf_paddr;
        int rc;
        rc = ghes_read_estatus(ghes, estatus, &buf_paddr,
FIX_APEI_GHES_IRQ);
```

### 3. THE FIX

The fix is minimal and surgical — 3 logical changes:

1. **New field in `struct ghes`**: `unsigned int estatus_length` to
   track the actual allocated size
2. **Store actual allocation size**: `ghes->estatus_length =
   error_block_length` right after `kmalloc`
3. **Fix the validation check**: Change `len >
   ghes->generic->error_block_length` to `!len || len >
   min(ghes->generic->error_block_length, ghes->estatus_length)` — now
   validates against the **minimum** of the declared and allocated sizes

The fix also adds a `!len` zero-length check, preventing issues with
zero-length CPER records that previously weren't caught.

### 4. CLASSIFICATION

- **Bug type**: Out-of-bounds write / buffer overflow
- **Trigger**: Bad firmware (BIOS declares `error_block_length >
  GHES_ESTATUS_MAX_SIZE`, then sends CPER record between the allocation
  size and the declared size)
- **This is NOT theoretical**: The commit includes a real OOPS trace on
  QEMU demonstrating the crash
- **Security relevance**: Firmware-controlled data causes kernel memory
  corruption. While firmware is generally trusted, this is still a
  defense-in-depth issue, and buggy firmware is common in practice

### 5. SCOPE AND RISK ASSESSMENT

- **Files changed**: 2 (`drivers/acpi/apei/ghes.c`,
  `include/acpi/ghes.h`)
- **Lines changed**: ~10 meaningful lines (adding a struct field,
  storing it, and a `min()` check)
- **Risk of regression**: **Extremely low**. The fix only makes the
  validation check more restrictive — it can only reject records that
  would have previously been accepted. No record that was correctly
  handled before will be rejected now.
- **Subsystem**: ACPI APEI/GHES — critical hardware error reporting used
  on servers, enterprise systems, and ARM platforms. This is important
  infrastructure.

### 6. BUG AGE AND AFFECTED VERSIONS

The vulnerable pattern has existed since the original GHES code was
introduced in commit `d334a49113a4a` from **2010** (Linux v2.6.35). The
`error_block_length` capping in `ghes_new()` and the check against the
uncapped value in the validation function have coexisted since the
beginning. This means **every stable kernel tree** with GHES support is
affected.

### 7. DEPENDENCY CHECK

The fix is **fully self-contained**. It:
- Adds a new `unsigned int` field to `struct ghes` (header change)
- Stores the allocation size in `ghes_new()` (initialization)
- Uses `min()` in `__ghes_check_estatus()` (validation fix)
- Includes `<linux/minmax.h>` for the `min()` macro (though this is
  likely already available through other includes in older kernels)

No other patches are needed for this fix to work. The fix applies to the
GHES code that has been stable for many years.

### 8. USER IMPACT

- **Who is affected**: Anyone with ACPI GHES hardware error reporting,
  primarily:
  - Enterprise servers (HP, Dell, Lenovo)
  - ARM platforms with SDEI/GHES
  - Systems with buggy firmware that declares large error blocks
- **Severity**: **Kernel OOPS/crash** — system becomes unstable or
  crashes entirely
- **Trigger likelihood**: Requires firmware with `error_block_length >
  64KB` AND a CPER record larger than 64KB. While not every system will
  trigger this, the crash trace proves it happens in the real world
  (tested on QEMU with specific firmware).

### 9. REVIEW QUALITY

- **Reviewed-by: Jonathan Cameron** — experienced APEI contributor
- **Acked-by: Ard Biesheuvel** — prominent EFI/ACPI/ARM maintainer
- **Reviewed-by: Hanjun Guo** — ACPI subsystem reviewer
- **Committed by: Rafael J. Wysocki** — ACPI subsystem maintainer

This is an exceptionally well-reviewed patch.

### 10. SUMMARY

| Criterion | Assessment |
|-----------|------------|
| Fixes real bug | YES — buffer overflow causing kernel OOPS |
| Obviously correct | YES — adds allocation tracking and min()
validation |
| Small and contained | YES — ~10 lines of meaningful change |
| Tested | YES — crash trace demonstrates the bug; multiple reviews
confirm the fix |
| No new features | YES — pure bug fix |
| Risk of regression | VERY LOW — only makes validation stricter |
| Affects stable users | YES — long-standing bug in all stable trees |
| Self-contained | YES — no dependencies on other patches |

This is a textbook stable kernel backport candidate: a small, surgical,
well-reviewed fix for a real buffer overflow that causes kernel crashes,
with no risk of regression. The bug has existed since 2010 and affects
all stable kernel trees.

**YES**

 drivers/acpi/apei/ghes.c | 6 +++++-
 include/acpi/ghes.h      | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 0dc767392a6c6..a37c8fb574832 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -29,6 +29,7 @@
 #include <linux/cper.h>
 #include <linux/cleanup.h>
 #include <linux/platform_device.h>
+#include <linux/minmax.h>
 #include <linux/mutex.h>
 #include <linux/ratelimit.h>
 #include <linux/vmalloc.h>
@@ -294,6 +295,7 @@ static struct ghes *ghes_new(struct acpi_hest_generic *generic)
 		error_block_length = GHES_ESTATUS_MAX_SIZE;
 	}
 	ghes->estatus = kmalloc(error_block_length, GFP_KERNEL);
+	ghes->estatus_length = error_block_length;
 	if (!ghes->estatus) {
 		rc = -ENOMEM;
 		goto err_unmap_status_addr;
@@ -365,13 +367,15 @@ static int __ghes_check_estatus(struct ghes *ghes,
 				struct acpi_hest_generic_status *estatus)
 {
 	u32 len = cper_estatus_len(estatus);
+	u32 max_len = min(ghes->generic->error_block_length,
+			  ghes->estatus_length);
 
 	if (len < sizeof(*estatus)) {
 		pr_warn_ratelimited(FW_WARN GHES_PFX "Truncated error status block!\n");
 		return -EIO;
 	}
 
-	if (len > ghes->generic->error_block_length) {
+	if (!len || len > max_len) {
 		pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid error status block length!\n");
 		return -EIO;
 	}
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index ebd21b05fe6ed..93db60da5934e 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -21,6 +21,7 @@ struct ghes {
 		struct acpi_hest_generic_v2 *generic_v2;
 	};
 	struct acpi_hest_generic_status *estatus;
+	unsigned int estatus_length;
 	unsigned long flags;
 	union {
 		struct list_head list;
-- 
2.51.0


  parent reply	other threads:[~2026-02-11 12:32 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-11 12:30 [PATCH AUTOSEL 6.19-5.10] s390/perf: Disable register readout on sampling events Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] arm64: Add support for TSV110 Spectre-BHB mitigation Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] xenbus: Use .freeze/.thaw to handle xenbus devices Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] s390/purgatory: Add -Wno-default-const-init-unsafe to KBUILD_CFLAGS Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.18] s390/boot: " Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.1] perf/arm-cmn: Support CMN-600AE Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.18] ntfs: ->d_compare() must not block Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.12] ACPI: x86: s2idle: Invoke Microsoft _DSM Function 9 (Turn On Display) Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.12] block: decouple secure erase size limit from discard size limit Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] sparc: don't reference obsolete termio struct for TC* constants Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] EFI/CPER: don't go past the ARM processor CPER record buffer Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19] ACPI: scan: Use async schedule function in acpi_scan_clear_dep_fn() Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.6] cpufreq: dt-platdev: Block the driver from probing on more QC platforms Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] EFI/CPER: don't dump the entire memory region Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.12] ACPI: battery: fix incorrect charging status when current is zero Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.18] rust: cpufreq: always inline functions using build_assert with arguments Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.18] blk-mq-sched: unify elevators checking for async requests Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] x86/xen/pvh: Enable PAE mode for 32-bit guest only when CONFIG_X86_PAE is set Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.12] APEI/GHES: ARM processor Error: don't go past allocated memory Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.18] md raid: fix hang when stopping arrays with metadata through dm-raid Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] tools/power cpupower: Reset errno before strtoull() Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] sparc: Synchronize user stack on fork and clone Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs() Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] rnbd-srv: Zero the rsp buffer before using it Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.12] alpha: fix user-space corruption during memory compaction Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.10] ACPICA: Abort AML bytecode execution when executing AML_FATAL_OP Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19] arm64: mte: Set TCMA1 whenever MTE is present in the kernel Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.18] tools/cpupower: Fix inverted APERF capability check Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-5.15] ACPI: processor: Fix NULL-pointer dereference in acpi_processor_errata_piix4() Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.12] ACPI: resource: Add JWIPC JVC9100 to irq1_level_low_skip_override[] Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.6] perf/cxlpmu: Replace IRQF_ONESHOT with IRQF_NO_THREAD Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.6] md-cluster: fix NULL pointer dereference in process_metadata_update Sasha Levin
2026-02-11 12:30 ` Sasha Levin [this message]
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.12] powercap: intel_rapl: Add PL4 support for Ice Lake Sasha Levin
2026-02-11 12:30 ` [PATCH AUTOSEL 6.19-6.18] io_uring/timeout: annotate data race in io_flush_timeouts() Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260211123112.1330287-33-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=Smita.KoralahalliChannabasappa@amd.com \
    --cc=acpica-devel@lists.linux.dev \
    --cc=ardb@kernel.org \
    --cc=fabio.m.de.francesco@linux.intel.com \
    --cc=guohanjun@huawei.com \
    --cc=jason@os.amperecomputing.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=leitao@debian.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=mchehab+huawei@kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rafael@kernel.org \
    --cc=robert.moore@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=tony.luck@intel.com \
    --cc=xueshuai@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox