LinuxPPC-Dev Archive on lore.kernel.org
From: Jinjie Ruan <ruanjinjie@huawei.com>
To: Breno Leitao <leitao@debian.org>
Cc: <corbet@lwn.net>, <skhan@linuxfoundation.org>,
	<catalin.marinas@arm.com>, <will@kernel.org>,
	<chenhuacai@kernel.org>, <kernel@xen0n.name>,
	<maddy@linux.ibm.com>, <mpe@ellerman.id.au>, <npiggin@gmail.com>,
	<chleroy@kernel.org>, <pjw@kernel.org>, <palmer@dabbelt.com>,
	<aou@eecs.berkeley.edu>, <alex@ghiti.fr>, <tglx@kernel.org>,
	<mingo@redhat.com>, <bp@alien8.de>, <dave.hansen@linux.intel.com>,
	<hpa@zytor.com>, <robh@kernel.org>, <saravanak@kernel.org>,
	<akpm@linux-foundation.org>, <bhe@redhat.com>, <rppt@kernel.org>,
	<pasha.tatashin@soleen.com>, <pratyush@kernel.org>,
	<ruirui.yang@linux.dev>, <rdunlap@infradead.org>,
	<pmladek@suse.com>, <dapeng1.mi@linux.intel.com>,
	<kees@kernel.org>, <elver@google.com>, <kuba@kernel.org>,
	<ebiggers@kernel.org>, <lirongqing@baidu.com>,
	<paulmck@kernel.org>, <sourabhjain@linux.ibm.com>,
	<coxu@redhat.com>, <jbohac@suse.cz>, <ryan.roberts@arm.com>,
	<osandov@fb.com>, <cfsworks@gmail.com>, <tangyouling@kylinos.cn>,
	<ritesh.list@gmail.com>, <adityag@linux.ibm.com>,
	<guoren@kernel.org>, <songshuaishuai@tinylab.org>,
	<kevin.brodsky@arm.com>, <vishal.moola@gmail.com>,
	<junhui.liu@pigmoral.tech>, <wangruikang@iscas.ac.cn>,
	<namcao@linutronix.de>, <chao.gao@intel.com>, <seanjc@google.com>,
	<fuqiang.wang@easystack.cn>, <ardb@kernel.org>,
	<chenjiahao16@huawei.com>, <hbathini@linux.ibm.com>,
	<takahiro.akashi@linaro.org>, <james.morse@arm.com>,
	<lizhengyu3@huawei.com>, <x86@kernel.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<loongarch@lists.linux.dev>, <linuxppc-dev@lists.ozlabs.org>,
	<linux-riscv@lists.infradead.org>, <devicetree@vger.kernel.org>,
	<kexec@lists.infradead.org>
Subject: Re: [PATCH v13 04/15] arm64: kexec_file: Fix potential buffer overflow in prepare_elf_headers()
Date: Tue, 19 May 2026 20:42:04 +0800
Message-ID: <3ca700ed-e081-4c62-8289-5bbd4248e630@huawei.com>
In-Reply-To: <agGkvrg06KNDNfDi@gmail.com>



On 5/11/2026 5:46 PM, Breno Leitao wrote:
> On Mon, May 11, 2026 at 11:04:43AM +0800, Jinjie Ruan wrote:
>> There is a race condition between the kexec_load() system call
>> (crash kernel loading path) and memory hotplug operations that can
>> lead to buffer overflow and potential kernel crash.
>>
>> During prepare_elf_headers(), the following steps occur:
>> 1. The first for_each_mem_range() queries current System RAM memory ranges
>> 2. Allocates buffer based on queried count
>> 3. The second for_each_mem_range() populates ranges from memblock
>>
>> If memory hotplug occurs between step 1 and step 3, the number of ranges
>> can increase, causing out-of-bounds write when populating cmem->ranges[].
>>
>> This happens because kexec_load() uses kexec_trylock (atomic_t) while
>> memory hotplug uses device_hotplug_lock (mutex), so they don't serialize
>> with each other.
>>
>> Add the explicit bounds checking to prevent out-of-bounds access.
> 
> It seems you have a TOCTOU type of issue, and this seems to be shrinking
> the window, but not fully solving it?

I plan to fix this issue as follows, and would appreciate your feedback
on whether this is reasonable.

Sashiko AI code review pointed out that there is a TOCTOU (Time-of-Check to
Time-of-Use) race condition in prepare_elf_headers() between the initial
pass that counts System RAM ranges and the second pass that populates them.
If a memory hotplug event occurs between these two steps, the number of
memory regions may increase, causing an out-of-bounds write to
the cmem->ranges[] array.

To resolve this and ensure data consistency, this patch:

1. Wraps the counting and population passes with get_online_mems() and
   crash_hotplug_lock(). This serializes the kexec_file_load() path
   with concurrent memory hotplug operations, ensuring the memory
   map remains consistent throughout the header preparation.

2. Adds an explicit boundary check to the population loop in
   __prepare_elf_headers(). If the loop finds more ranges than were
   allocated, it now returns -EAGAIN, which indicates a transient race
   and signals userspace kexec-tools to retry the syscall instead of
   leaving the system without a loaded crash kernel.

diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
index daf81a873bbd..546be6261177 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -15,6 +15,7 @@
 #include <linux/kexec.h>
 #include <linux/libfdt.h>
 #include <linux/memblock.h>
+#include <linux/memory_hotplug.h>
 #include <linux/of.h>
 #include <linux/of_fdt.h>
 #include <linux/slab.h>
@@ -40,7 +41,7 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
 }

 #ifdef CONFIG_CRASH_DUMP
-int prepare_elf_headers(void **addr, unsigned long *sz)
+static int __prepare_elf_headers(void **addr, unsigned long *sz)
 {
 	struct crash_mem *cmem;
 	unsigned int nr_ranges;
@@ -59,6 +60,11 @@ int prepare_elf_headers(void **addr, unsigned long *sz)
 	cmem->max_nr_ranges = nr_ranges;
 	cmem->nr_ranges = 0;
 	for_each_mem_range(i, &start, &end) {
+		if (cmem->nr_ranges >= cmem->max_nr_ranges) {
+			ret = -EAGAIN;
+			goto out;
+		}
+
 		cmem->ranges[cmem->nr_ranges].start = start;
 		cmem->ranges[cmem->nr_ranges].end = end - 1;
 		cmem->nr_ranges++;
@@ -81,6 +87,21 @@ int prepare_elf_headers(void **addr, unsigned long *sz)
 	kfree(cmem);
 	return ret;
 }
+
+int prepare_elf_headers(void **addr, unsigned long *sz)
+{
+	int ret;
+
+	crash_hotplug_lock();
+	get_online_mems();
+
+	ret = __prepare_elf_headers(addr, sz);
+
+	put_online_mems();
+	crash_hotplug_unlock();
+
+	return ret;
+}
 #endif
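
The bounded population pass can also be modeled outside the kernel. The
following is only a rough userspace sketch, not kernel code: struct
demo_range, struct demo_mem, struct demo_map, demo_alloc() and
demo_populate() are hypothetical stand-ins for struct range, struct
crash_mem, memblock and the loop in __prepare_elf_headers(). It assumes
the "memory map" can hold more regions at population time than were
counted at allocation time, and shows that the check turns the would-be
out-of-bounds write into -EAGAIN.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct range. */
struct demo_range {
	unsigned long long start;
	unsigned long long end;
};

/* Hypothetical stand-in for struct crash_mem (flexible array of ranges). */
struct demo_mem {
	unsigned int max_nr_ranges;
	unsigned int nr_ranges;
	struct demo_range ranges[];
};

/* Hypothetical model of a memory map whose region count can grow. */
struct demo_map {
	const struct demo_range *regions;
	unsigned int nr_regions;
};

/* Allocate room for exactly nr ranges, as the counting pass would. */
static struct demo_mem *demo_alloc(unsigned int nr)
{
	struct demo_mem *cmem;

	cmem = calloc(1, sizeof(*cmem) + nr * sizeof(cmem->ranges[0]));
	if (cmem)
		cmem->max_nr_ranges = nr;
	return cmem;
}

/*
 * Population pass: copy regions into cmem, but never write past
 * max_nr_ranges. If the map grew after the count was taken, report
 * the transient condition with -EAGAIN instead of overflowing.
 */
static int demo_populate(struct demo_mem *cmem, const struct demo_map *map)
{
	cmem->nr_ranges = 0;
	for (unsigned int i = 0; i < map->nr_regions; i++) {
		if (cmem->nr_ranges >= cmem->max_nr_ranges)
			return -EAGAIN;

		cmem->ranges[cmem->nr_ranges].start = map->regions[i].start;
		/* Regions are [start, end); stored ranges use an inclusive end. */
		cmem->ranges[cmem->nr_ranges].end = map->regions[i].end - 1;
		cmem->nr_ranges++;
	}
	return 0;
}
```

In this model a caller that sees -EAGAIN would free the buffer, recount
and try again. The check alone only bounds the write; keeping the map
stable between the two passes is what the locking in the wrapper above
is meant to provide.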

> 
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Breno Leitao <leitao@debian.org>
>> Cc: stable@vger.kernel.org
>> Fixes: 3751e728cef2 ("arm64: kexec_file: add crash dump support")
>> Closes: https://sashiko.dev/#/patchset/20260323072745.2481719-1-ruanjinjie%40huawei.com
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> ---
>>  arch/arm64/kernel/machine_kexec_file.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec_file.c b/arch/arm64/kernel/machine_kexec_file.c
>> index e31fabed378a..a67e7b1abbab 100644
>> --- a/arch/arm64/kernel/machine_kexec_file.c
>> +++ b/arch/arm64/kernel/machine_kexec_file.c
>> @@ -59,6 +59,11 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
>>  	cmem->max_nr_ranges = nr_ranges;
>>  	cmem->nr_ranges = 0;
>>  	for_each_mem_range(i, &start, &end) {
>> +		if (cmem->nr_ranges >= cmem->max_nr_ranges) {
>> +			ret = -ENOMEM;
> 
> -ENOMEM seems to be the wrong errno. This isn't an allocation
> failure; it's a transient race. -EBUSY or -EAGAIN would be more honest.


