Date: Wed, 3 Dec 2025 19:12:55 +0100
From: Ingo Molnar
To: James Le Cuirot
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H . Peter Anvin", Ard Biesheuvel
Subject: Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
References: <20251126173209.374755-2-chewi@gentoo.org>
In-Reply-To: <20251126173209.374755-2-chewi@gentoo.org>
X-Mailing-List: linux-kernel@vger.kernel.org

* James Le Cuirot wrote:

> kexec on x86 passes initrd details via the boot_params. If no initrd is
> supplied, then ramdisk_size is 0. When determining whether to reserve
> memory for the initrd on the subsequent boot, ramdisk_size being 0
> causes the logic to fall back to phys_initrd_start and phys_initrd_size
> set from the EFI tables in efi.c. This is stale information from the
> initial boot. The system continues to boot and has even been seen to
> function under heavy load for days, but allocating very large amounts of
> memory reliably triggers an oops rather than the OOM killer.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000008
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 0 P4D 0
> Oops: Oops: 0002 [#1] SMP NOPTI
>
> This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
> when the EFI stub initrd loading was unified between architectures.
>
> Avoid the issue by checking whether the bootloader is not kexec before
> falling back to the EFI table values.
>
> I strongly suspect this also affects other architectures.
> A different fix would be required there, and I do have a fix in mind,
> but I was unable to reproduce the issue under QEMU's aarch64 virt
> machine. I think this is at least partly because it relies on ACPI
> while kexec passes the initrd details via the device tree.
>
> Signed-off-by: James Le Cuirot
> ---
>  arch/x86/kernel/setup.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 1b2edd07a3e1..8aa65daf121f 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)
>
>  	ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
>
> -	if (ramdisk_image == 0)
> +	/* Don't fall back for kexec as phys_initrd_start will be stale */
> +	if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
>  		ramdisk_image = phys_initrd_start;
>
>  	return ramdisk_image;
> @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)
>
>  	ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
>
> -	if (ramdisk_size == 0)
> +	/* Don't fall back for kexec as phys_initrd_start will be stale */
> +	if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
>  		ramdisk_size = phys_initrd_size;

Yeah, so this looks like a good fix - but please let's introduce some
sort of enum for the bootloader IDs in
arch/x86/include/uapi/asm/bootparam.h, I had to search way too long to
figure out what 0xD is and where it was defined :-)

Also, please introduce an "x86_bootloader_is_kexec()" kind of helper
inline function as well.

Thanks,

	Ingo
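
For illustration only, the enum plus helper requested above might look
roughly like the user-space sketch below. Only the 0xD loader ID, the
">> 4" shift and the x86_bootloader_is_kexec() name come from this
thread. The enum name, its constant, struct sketch_setup_header (a
stand-in for boot_params.hdr) and the sketch_* functions are all
hypothetical, and the real helper would presumably read boot_params
directly rather than take a pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enum; only the value 0xD for kexec is taken from the patch. */
enum x86_bootloader_id {
	X86_BOOTLOADER_ID_KEXEC = 0xD,
};

/* Stand-in for the single boot_params.hdr field used here, not the real struct. */
struct sketch_setup_header {
	uint8_t type_of_loader;
};

/* The loader ID is the upper nibble of type_of_loader, as in the patch. */
static inline bool x86_bootloader_is_kexec(const struct sketch_setup_header *hdr)
{
	return (hdr->type_of_loader >> 4) == X86_BOOTLOADER_ID_KEXEC;
}

/*
 * Mirrors the patched get_ramdisk_image() logic with the inputs passed in
 * explicitly: fall back to the EFI-provided address only when the boot
 * protocol supplied none and the loader is not kexec.
 */
static inline uint64_t sketch_get_ramdisk_image(const struct sketch_setup_header *hdr,
						uint64_t ramdisk_image,
						uint64_t phys_initrd_start)
{
	if (ramdisk_image == 0 && !x86_bootloader_is_kexec(hdr))
		ramdisk_image = phys_initrd_start;

	return ramdisk_image;
}

/* Small self-check showing the intended behaviour of the sketch. */
void sketch_selftest(void)
{
	struct sketch_setup_header kexec_hdr = { .type_of_loader = 0xD1 };
	struct sketch_setup_header other_hdr = { .type_of_loader = 0x72 };

	/* kexec with no initrd: the stale EFI value must not be used. */
	assert(sketch_get_ramdisk_image(&kexec_hdr, 0, 0x1000) == 0);
	/* Any other loader with no initrd: the EFI fallback still applies. */
	assert(sketch_get_ramdisk_image(&other_hdr, 0, 0x1000) == 0x1000);
}
```

The point of the helper is that the two call sites in setup.c would read
as "!x86_bootloader_is_kexec()" instead of repeating the shift and the
magic number.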