public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86: fix oops caused by old EFI info on kexec boot
@ 2025-11-26 17:32 James Le Cuirot
  2025-12-03 18:12 ` Ingo Molnar
  0 siblings, 1 reply; 9+ messages in thread
From: James Le Cuirot @ 2025-11-26 17:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, Ard Biesheuvel, James Le Cuirot

kexec on x86 passes initrd details via the boot_params. If no initrd is
supplied, then ramdisk_size is 0. When determining whether to reserve
memory for the initrd on the subsequent boot, ramdisk_size being 0
causes the logic to fall back to phys_initrd_start and phys_initrd_size
set from the EFI tables in efi.c. This is stale information from the
initial boot. The system continues to boot and has even been seen to
function under heavy load for days, but allocating very large amounts of
memory reliably triggers an oops rather than the OOM killer.

  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 0 P4D 0
  Oops: Oops: 0002 [#1] SMP NOPTI

This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
when the EFI stub initrd loading was unified between architectures.

Avoid the issue by checking whether the bootloader is not kexec before
falling back to the EFI table values.

I strongly suspect this also affects other architectures. A different
fix would be required there, and I do have a fix in mind, but I was
unable to reproduce the issue under QEMU's aarch64 virt machine. I think
this is at least partly because it relies on ACPI while kexec passes the
initd details via the device tree.

Signed-off-by: James Le Cuirot <chewi@gentoo.org>
---
 arch/x86/kernel/setup.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 1b2edd07a3e1..8aa65daf121f 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)
 
 	ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
 
-	if (ramdisk_image == 0)
+	/* Don't fall back for kexec as phys_initrd_start will be stale */
+	if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
 		ramdisk_image = phys_initrd_start;
 
 	return ramdisk_image;
@@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)
 
 	ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
 
-	if (ramdisk_size == 0)
+	/* Don't fall back for kexec as phys_initrd_start will be stale */
+	if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
 		ramdisk_size = phys_initrd_size;
 
 	return ramdisk_size;
-- 
2.51.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
  2025-11-26 17:32 [PATCH] x86: fix oops caused by old EFI info on kexec boot James Le Cuirot
@ 2025-12-03 18:12 ` Ingo Molnar
  2025-12-03 20:59   ` H. Peter Anvin
  2025-12-03 22:57   ` Ard Biesheuvel
  0 siblings, 2 replies; 9+ messages in thread
From: Ingo Molnar @ 2025-12-03 18:12 UTC (permalink / raw)
  To: James Le Cuirot
  Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, Ard Biesheuvel

* James Le Cuirot <chewi@gentoo.org> wrote:

> kexec on x86 passes initrd details via the boot_params. If no initrd is
> supplied, then ramdisk_size is 0. When determining whether to reserve
> memory for the initrd on the subsequent boot, ramdisk_size being 0
> causes the logic to fall back to phys_initrd_start and phys_initrd_size
> set from the EFI tables in efi.c. This is stale information from the
> initial boot. The system continues to boot and has even been seen to
> function under heavy load for days, but allocating very large amounts of
> memory reliably triggers an oops rather than the OOM killer.
>
>   BUG: kernel NULL pointer dereference, address: 0000000000000008
>   #PF: supervisor write access in kernel mode
>   #PF: error_code(0x0002) - not-present page
>   PGD 0 P4D 0
>   Oops: Oops: 0002 [#1] SMP NOPTI
>
> This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
> when the EFI stub initrd loading was unified between architectures.
>
> Avoid the issue by checking whether the bootloader is not kexec before
> falling back to the EFI table values.
>
> I strongly suspect this also affects other architectures. A different
> fix would be required there, and I do have a fix in mind, but I was
> unable to reproduce the issue under QEMU's aarch64 virt machine. I think
> this is at least partly because it relies on ACPI while kexec passes the
> initd details via the device tree.
>
> Signed-off-by: James Le Cuirot <chewi@gentoo.org>
> ---
>  arch/x86/kernel/setup.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 1b2edd07a3e1..8aa65daf121f 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)
>
>  	ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
>
> -	if (ramdisk_image == 0)
> +	/* Don't fall back for kexec as phys_initrd_start will be stale */
> +	if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
>  		ramdisk_image = phys_initrd_start;
>
>  	return ramdisk_image;
> @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)
>
>  	ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
>
> -	if (ramdisk_size == 0)
> +	/* Don't fall back for kexec as phys_initrd_start will be stale */
> +	if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
>  		ramdisk_size = phys_initrd_size;

Yeah, so this looks like a good fix - but please let's
introduce some sort of enum for the bootloader IDs
in arch/x86/include/uapi/asm/bootparam.h, I had to search
way too long to figure out what 0xD is and where it
was defined :-)

Also, please introduce a "x86_bootloader_is_kexec()" kind
of helper inline function as well.

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
  2025-12-03 18:12 ` Ingo Molnar
@ 2025-12-03 20:59   ` H. Peter Anvin
  2025-12-03 21:27     ` H. Peter Anvin
  2025-12-03 22:57   ` Ard Biesheuvel
  1 sibling, 1 reply; 9+ messages in thread
From: H. Peter Anvin @ 2025-12-03 20:59 UTC (permalink / raw)
  To: Ingo Molnar, James Le Cuirot
  Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Ard Biesheuvel

On December 3, 2025 10:12:55 AM PST, Ingo Molnar <mingo@kernel.org> wrote:
>* James Le Cuirot <chewi@gentoo.org> wrote:
>
>> kexec on x86 passes initrd details via the boot_params. If no initrd is
>> supplied, then ramdisk_size is 0. When determining whether to reserve
>> memory for the initrd on the subsequent boot, ramdisk_size being 0
>> causes the logic to fall back to phys_initrd_start and phys_initrd_size
>> set from the EFI tables in efi.c. This is stale information from the
>> initial boot. The system continues to boot and has even been seen to
>> function under heavy load for days, but allocating very large amounts of
>> memory reliably triggers an oops rather than the OOM killer.
>>
>>   BUG: kernel NULL pointer dereference, address: 0000000000000008
>>   #PF: supervisor write access in kernel mode
>>   #PF: error_code(0x0002) - not-present page
>>   PGD 0 P4D 0
>>   Oops: Oops: 0002 [#1] SMP NOPTI
>>
>> This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
>> when the EFI stub initrd loading was unified between architectures.
>>
>> Avoid the issue by checking whether the bootloader is not kexec before
>> falling back to the EFI table values.
>>
>> I strongly suspect this also affects other architectures. A different
>> fix would be required there, and I do have a fix in mind, but I was
>> unable to reproduce the issue under QEMU's aarch64 virt machine. I think
>> this is at least partly because it relies on ACPI while kexec passes the
>> initd details via the device tree.
>>
>> Signed-off-by: James Le Cuirot <chewi@gentoo.org>
>> ---
>>  arch/x86/kernel/setup.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>> index 1b2edd07a3e1..8aa65daf121f 100644
>> --- a/arch/x86/kernel/setup.c
>> +++ b/arch/x86/kernel/setup.c
>> @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)
>>
>>  	ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
>>
>> -	if (ramdisk_image == 0)
>> +	/* Don't fall back for kexec as phys_initrd_start will be stale */
>> +	if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
>>  		ramdisk_image = phys_initrd_start;
>>
>>  	return ramdisk_image;
>> @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)
>>
>>  	ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
>>
>> -	if (ramdisk_size == 0)
>> +	/* Don't fall back for kexec as phys_initrd_start will be stale */
>> +	if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
>>  		ramdisk_size = phys_initrd_size;
>
>Yeah, so this looks like a good fix - but please let's
>introduce some sort of enum for the bootloader IDs
>in arch/x86/include/uapi/asm/bootparam.h, I had to search
>way too long to figure out what 0xD is and where it
>was defined :-)
>
>Also, please introduce a "x86_bootloader_is_kexec()" kind
>of helper inline function as well.
>
>Thanks,
>
>	Ingo
>

An enum and an accessor, please, that decodes the extended ID if necessary.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
  2025-12-03 20:59   ` H. Peter Anvin
@ 2025-12-03 21:27     ` H. Peter Anvin
  2025-12-06 11:45       ` Ingo Molnar
  0 siblings, 1 reply; 9+ messages in thread
From: H. Peter Anvin @ 2025-12-03 21:27 UTC (permalink / raw)
  To: Ingo Molnar, James Le Cuirot
  Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, Ard Biesheuvel

[-- Attachment #1: Type: text/plain, Size: 528 bytes --]

On 2025-12-03 12:59, H. Peter Anvin wrote:
>>
>> Yeah, so this looks like a good fix - but please let's
>> introduce some sort of enum for the bootloader IDs
>> in arch/x86/include/uapi/asm/bootparam.h, I had to search
>> way too long to figure out what 0xD is and where it
>> was defined :-)
>>
>> Also, please introduce a "x86_bootloader_is_kexec()" kind
>> of helper inline function as well.
>>
> 
> An enum and an accessor, please, that decodes the extended ID if necessary.
> 

Maybe something like this (untested.)

	-hpa

[-- Attachment #2: 0001-x86-boot-add-enumeration-and-accessors-for-boot-load.patch --]
[-- Type: text/x-patch, Size: 2802 bytes --]

From b65a27b0096bd953f56facee7c41213bce5cb13e Mon Sep 17 00:00:00 2001
From: "H. Peter Anvin" <hpa@zytor.com>
Date: Wed, 3 Dec 2025 13:20:07 -0800
Subject: [PATCH 1/1] x86/boot: add enumeration and accessors for boot loader
 ID and version

Add an enumeration of boot loader type IDs; since both the type ID and
the version number is split between two fields, add accessors to read
and write these fields (the latter is intended for boot loaders to use
as this is a uapi header.)

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
---
 arch/x86/include/uapi/asm/bootparam.h | 44 +++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index dafbf581c515..38f98012fd2b 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -26,6 +26,28 @@
 #define XLF_5LEVEL_ENABLED		(1<<6)
 #define XLF_MEM_ENCRYPTION		(1<<7)
 
+/* bootloader ID */
+#define BOOTLOADER_LILO		0x00
+#define BOOTLOADER_LOADLIN	0x01
+/* 0x02 used by kernel internal boot sector - long since obsolete */
+#define BOOTLOADER_SYSLINUX	0x03
+#define BOOTLOADER_IPXE		0x04
+#define BOOTLOADER_ELILO	0x05
+/* 0x06 unknown user */
+#define BOOTLOADER_GRUB		0x07
+#define BOOTLOADER_UBOOT	0x08
+#define BOOTLOADER_XEN		0x09
+#define BOOTLOADER_GUJIN	0x0a
+#define BOOTLOADER_QEMU		0x0b
+#define BOOTLOADER_ARCTURUS	0x0c	/* "Arcturus Networks uCbootloader" */
+#define BOOTLOADER_KEXEC_TOOLS	0x0d
+#define BOOTLOADER_EXTENDED_ID	0x0e	/* Escape into the 0x10+ space */
+#define BOOTLOADER_MISSING_ID	0x0f	/* Bootloader unknown */
+#define BOOTLOADER_EXTID_BASE	0x10	/* Error: ext_loader_type never set */
+#define BOOTLOADER_MINIMAL	0x11	/* "Minimal Linux Bootloader" */
+#define BOOTLOADER_OVMF		0x12
+#define BOOTLOADER_BAREBOX	0x13
+
 #ifndef __ASSEMBLER__
 
 #include <linux/types.h>
@@ -162,6 +184,28 @@ struct boot_params {
 	__u8  _pad9[276];				/* 0xeec */
 } __attribute__((packed));
 
+static inline unsigned int boot_loader_type(const struct boot_params *_bp)
+{
+	unsigned int _type = _bp->type_of_loader >> 4;
+	if (_type == BOOTLOADER_EXTENDED_ID)
+		_type = _bp->ext_loader_type + BOOTLOADER_EXTID_BASE
+	return _type;
+}
+static inline unsigned int boot_loader_ver(const struct boot_params *_bp)
+{
+	return (_bp->type_of_loader & 0x0f) + (_bp->ext_loader_ver << 4);
+}
+static inline void set_boot_loader(struct boot_params *_bp,
+				   unsigned int _type, unsigned int _ver)
+{
+	if (_type >= BOOTLOADER_EXTID_BASE) {
+		_bp->ext_loader_type = _type - BOOTLOADER_EXTID_BASE;
+		_type = BOOTLOADER_EXTENDED_ID;
+	}
+	_bp->ext_loader_ver = _ver >> 4;
+	_bp->type_of_loader = _type + (_ver & 0x0f);
+}
+
 /**
  * enum x86_hardware_subarch - x86 hardware subarchitecture
  *
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
  2025-12-03 18:12 ` Ingo Molnar
  2025-12-03 20:59   ` H. Peter Anvin
@ 2025-12-03 22:57   ` Ard Biesheuvel
  2025-12-03 23:01     ` Ard Biesheuvel
  1 sibling, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2025-12-03 22:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: James Le Cuirot, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin

On Wed, 3 Dec 2025 at 19:13, Ingo Molnar <mingo@kernel.org> wrote:
>
> * James Le Cuirot <chewi@gentoo.org> wrote:
>
> > kexec on x86 passes initrd details via the boot_params. If no initrd is
> > supplied, then ramdisk_size is 0. When determining whether to reserve
> > memory for the initrd on the subsequent boot, ramdisk_size being 0
> > causes the logic to fall back to phys_initrd_start and phys_initrd_size
> > set from the EFI tables in efi.c. This is stale information from the
> > initial boot. The system continues to boot and has even been seen to
> > function under heavy load for days, but allocating very large amounts of
> > memory reliably triggers an oops rather than the OOM killer.
> >
> >   BUG: kernel NULL pointer dereference, address: 0000000000000008
> >   #PF: supervisor write access in kernel mode
> >   #PF: error_code(0x0002) - not-present page
> >   PGD 0 P4D 0
> >   Oops: Oops: 0002 [#1] SMP NOPTI
> >
> > This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
> > when the EFI stub initrd loading was unified between architectures.
> >
> > Avoid the issue by checking whether the bootloader is not kexec before
> > falling back to the EFI table values.
> >
> > I strongly suspect this also affects other architectures. A different
> > fix would be required there, and I do have a fix in mind, but I was
> > unable to reproduce the issue under QEMU's aarch64 virt machine. I think
> > this is at least partly because it relies on ACPI while kexec passes the
> > initd details via the device tree.
> >
> > Signed-off-by: James Le Cuirot <chewi@gentoo.org>
> > ---
> >  arch/x86/kernel/setup.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 1b2edd07a3e1..8aa65daf121f 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)
> >
> >       ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
> >
> > -     if (ramdisk_image == 0)
> > +     /* Don't fall back for kexec as phys_initrd_start will be stale */
> > +     if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
> >               ramdisk_image = phys_initrd_start;
> >
> >       return ramdisk_image;
> > @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)
> >
> >       ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
> >
> > -     if (ramdisk_size == 0)
> > +     /* Don't fall back for kexec as phys_initrd_start will be stale */
> > +     if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
> >               ramdisk_size = phys_initrd_size;
>
> Yeah, so this looks like a good fix - but please let's
> introduce some sort of enum for the bootloader IDs
> in arch/x86/include/uapi/asm/bootparam.h, I had to search
> way too long to figure out what 0xD is and where it
> was defined :-)
>
> Also, please introduce a "x86_bootloader_is_kexec()" kind
> of helper inline function as well.
>

It might be better to fix this in the generic EFI code, and simply
wipe the EFI config table that the EFI stub created to pass the initrd
info. That way, it works for all architectures, and there is no need
for special x86 hacks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
  2025-12-03 22:57   ` Ard Biesheuvel
@ 2025-12-03 23:01     ` Ard Biesheuvel
  2025-12-04 21:40       ` James Le Cuirot
  0 siblings, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2025-12-03 23:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: James Le Cuirot, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H . Peter Anvin

On Wed, 3 Dec 2025 at 23:57, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 3 Dec 2025 at 19:13, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * James Le Cuirot <chewi@gentoo.org> wrote:
> >
> > > kexec on x86 passes initrd details via the boot_params. If no initrd is
> > > supplied, then ramdisk_size is 0. When determining whether to reserve
> > > memory for the initrd on the subsequent boot, ramdisk_size being 0
> > > causes the logic to fall back to phys_initrd_start and phys_initrd_size
> > > set from the EFI tables in efi.c. This is stale information from the
> > > initial boot. The system continues to boot and has even been seen to
> > > function under heavy load for days, but allocating very large amounts of
> > > memory reliably triggers an oops rather than the OOM killer.
> > >
> > >   BUG: kernel NULL pointer dereference, address: 0000000000000008
> > >   #PF: supervisor write access in kernel mode
> > >   #PF: error_code(0x0002) - not-present page
> > >   PGD 0 P4D 0
> > >   Oops: Oops: 0002 [#1] SMP NOPTI
> > >
> > > This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
> > > when the EFI stub initrd loading was unified between architectures.
> > >
> > > Avoid the issue by checking whether the bootloader is not kexec before
> > > falling back to the EFI table values.
> > >
> > > I strongly suspect this also affects other architectures. A different
> > > fix would be required there, and I do have a fix in mind, but I was
> > > unable to reproduce the issue under QEMU's aarch64 virt machine. I think
> > > this is at least partly because it relies on ACPI while kexec passes the
> > > initd details via the device tree.
> > >
> > > Signed-off-by: James Le Cuirot <chewi@gentoo.org>
> > > ---
> > >  arch/x86/kernel/setup.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > > index 1b2edd07a3e1..8aa65daf121f 100644
> > > --- a/arch/x86/kernel/setup.c
> > > +++ b/arch/x86/kernel/setup.c
> > > @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)
> > >
> > >       ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
> > >
> > > -     if (ramdisk_image == 0)
> > > +     /* Don't fall back for kexec as phys_initrd_start will be stale */
> > > +     if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
> > >               ramdisk_image = phys_initrd_start;
> > >
> > >       return ramdisk_image;
> > > @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)
> > >
> > >       ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
> > >
> > > -     if (ramdisk_size == 0)
> > > +     /* Don't fall back for kexec as phys_initrd_start will be stale */
> > > +     if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
> > >               ramdisk_size = phys_initrd_size;
> >
> > Yeah, so this looks like a good fix - but please let's
> > introduce some sort of enum for the bootloader IDs
> > in arch/x86/include/uapi/asm/bootparam.h, I had to search
> > way too long to figure out what 0xD is and where it
> > was defined :-)
> >
> > Also, please introduce a "x86_bootloader_is_kexec()" kind
> > of helper inline function as well.
> >
>
> It might be better to fix this in the generic EFI code, and simply
> wipe the EFI config table that the EFI stub created to pass the initrd
> info. That way, it works for all architectures, and there is no need
> for special x86 hacks.

I.e.,

--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -818,6 +818,7 @@
                if (tbl) {
                        phys_initrd_start = tbl->base;
                        phys_initrd_size = tbl->size;
+                       tbl->base = tbl->size = 0;
                        early_memunmap(tbl, sizeof(*tbl));
                }
        }

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
  2025-12-03 23:01     ` Ard Biesheuvel
@ 2025-12-04 21:40       ` James Le Cuirot
  2025-12-04 23:18         ` H. Peter Anvin
  0 siblings, 1 reply; 9+ messages in thread
From: James Le Cuirot @ 2025-12-04 21:40 UTC (permalink / raw)
  To: Ard Biesheuvel, Ingo Molnar
  Cc: x86, linux-kernel, Thomas Gleixner, Borislav Petkov, Dave Hansen,
	H . Peter Anvin

On Thu, 2025-12-04 at 00:01 +0100, Ard Biesheuvel wrote:
> On Wed, 3 Dec 2025 at 23:57, Ard Biesheuvel <ardb@kernel.org> wrote:
> > 
> > On Wed, 3 Dec 2025 at 19:13, Ingo Molnar <mingo@kernel.org> wrote:
> > > 
> > > * James Le Cuirot <chewi@gentoo.org> wrote:
> > > 
> > > > kexec on x86 passes initrd details via the boot_params. If no initrd is
> > > > supplied, then ramdisk_size is 0. When determining whether to reserve
> > > > memory for the initrd on the subsequent boot, ramdisk_size being 0
> > > > causes the logic to fall back to phys_initrd_start and phys_initrd_size
> > > > set from the EFI tables in efi.c. This is stale information from the
> > > > initial boot. The system continues to boot and has even been seen to
> > > > function under heavy load for days, but allocating very large amounts of
> > > > memory reliably triggers an oops rather than the OOM killer.
> > > > 
> > > >   BUG: kernel NULL pointer dereference, address: 0000000000000008
> > > >   #PF: supervisor write access in kernel mode
> > > >   #PF: error_code(0x0002) - not-present page
> > > >   PGD 0 P4D 0
> > > >   Oops: Oops: 0002 [#1] SMP NOPTI
> > > > 
> > > > This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
> > > > when the EFI stub initrd loading was unified between architectures.
> > > > 
> > > > Avoid the issue by checking whether the bootloader is not kexec before
> > > > falling back to the EFI table values.
> > > > 
> > > > I strongly suspect this also affects other architectures. A different
> > > > fix would be required there, and I do have a fix in mind, but I was
> > > > unable to reproduce the issue under QEMU's aarch64 virt machine. I think
> > > > this is at least partly because it relies on ACPI while kexec passes the
> > > > initd details via the device tree.
> > > > 
> > > > Signed-off-by: James Le Cuirot <chewi@gentoo.org>
> > > > ---
> > > >  arch/x86/kernel/setup.c | 6 ++++--
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > > > index 1b2edd07a3e1..8aa65daf121f 100644
> > > > --- a/arch/x86/kernel/setup.c
> > > > +++ b/arch/x86/kernel/setup.c
> > > > @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)
> > > > 
> > > >       ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
> > > > 
> > > > -     if (ramdisk_image == 0)
> > > > +     /* Don't fall back for kexec as phys_initrd_start will be stale */
> > > > +     if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
> > > >               ramdisk_image = phys_initrd_start;
> > > > 
> > > >       return ramdisk_image;
> > > > @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)
> > > > 
> > > >       ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
> > > > 
> > > > -     if (ramdisk_size == 0)
> > > > +     /* Don't fall back for kexec as phys_initrd_start will be stale */
> > > > +     if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
> > > >               ramdisk_size = phys_initrd_size;
> > > 
> > > Yeah, so this looks like a good fix - but please let's
> > > introduce some sort of enum for the bootloader IDs
> > > in arch/x86/include/uapi/asm/bootparam.h, I had to search
> > > way too long to figure out what 0xD is and where it
> > > was defined :-)
> > > 
> > > Also, please introduce a "x86_bootloader_is_kexec()" kind
> > > of helper inline function as well.
> > > 
> > 
> > It might be better to fix this in the generic EFI code, and simply
> > wipe the EFI config table that the EFI stub created to pass the initrd
> > info. That way, it works for all architectures, and there is no need
> > for special x86 hacks.
> 
> I.e.,
> 
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -818,6 +818,7 @@
>                 if (tbl) {
>                         phys_initrd_start = tbl->base;
>                         phys_initrd_size = tbl->size;
> +                       tbl->base = tbl->size = 0;
>                         early_memunmap(tbl, sizeof(*tbl));
>                 }
>         }

I can confirm that this fixes the problem. I had considered wiping the table,
but I was trying to do it later, which seemed harder to do. I didn't consider
wiping it immediately, but I now realise this data isn't needed afterwards. I
only tested amd64, but I trust it will work for other architectures. Please go
ahead with this.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
  2025-12-04 21:40       ` James Le Cuirot
@ 2025-12-04 23:18         ` H. Peter Anvin
  0 siblings, 0 replies; 9+ messages in thread
From: H. Peter Anvin @ 2025-12-04 23:18 UTC (permalink / raw)
  To: James Le Cuirot, Ard Biesheuvel, Ingo Molnar
  Cc: x86, linux-kernel, Thomas Gleixner, Borislav Petkov, Dave Hansen

On 2025-12-04 13:40, James Le Cuirot wrote:
>>>
>>> It might be better to fix this in the generic EFI code, and simply
>>> wipe the EFI config table that the EFI stub created to pass the initrd
>>> info. That way, it works for all architectures, and there is no need
>>> for special x86 hacks.
>>
>> I.e.,
>>
>> --- a/drivers/firmware/efi/efi.c
>> +++ b/drivers/firmware/efi/efi.c
>> @@ -818,6 +818,7 @@
>>                 if (tbl) {
>>                         phys_initrd_start = tbl->base;
>>                         phys_initrd_size = tbl->size;
>> +                       tbl->base = tbl->size = 0;
>>                         early_memunmap(tbl, sizeof(*tbl));
>>                 }
>>         }
> 
> I can confirm that this fixes the problem. I had considered wiping the table,
> but I was trying to do it later, which seemed harder to do. I didn't consider
> wiping it immediately, but I now realise this data isn't needed afterwards. I
> only tested amd64, but I trust it will work for other architectures. Please go
> ahead with this.
> 

Acked-by: H. Peter Anvin (Intel) <hpa@zytor.com>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot
  2025-12-03 21:27     ` H. Peter Anvin
@ 2025-12-06 11:45       ` Ingo Molnar
  0 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2025-12-06 11:45 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: James Le Cuirot, x86, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, Ard Biesheuvel

* H. Peter Anvin <hpa@zytor.com> wrote:

> --- a/arch/x86/include/uapi/asm/bootparam.h
> +++ b/arch/x86/include/uapi/asm/bootparam.h
> @@ -26,6 +26,28 @@
>  #define XLF_5LEVEL_ENABLED		(1<<6)
>  #define XLF_MEM_ENCRYPTION		(1<<7)
>
> +/* bootloader ID */
> +#define BOOTLOADER_LILO		0x00
> +#define BOOTLOADER_LOADLIN	0x01
> +/* 0x02 used by kernel internal boot sector - long since obsolete */
> +#define BOOTLOADER_SYSLINUX	0x03
> +#define BOOTLOADER_IPXE		0x04
> +#define BOOTLOADER_ELILO	0x05
> +/* 0x06 unknown user */
> +#define BOOTLOADER_GRUB		0x07
> +#define BOOTLOADER_UBOOT	0x08
> +#define BOOTLOADER_XEN		0x09
> +#define BOOTLOADER_GUJIN	0x0a
> +#define BOOTLOADER_QEMU		0x0b
> +#define BOOTLOADER_ARCTURUS	0x0c	/* "Arcturus Networks uCbootloader" */
> +#define BOOTLOADER_KEXEC_TOOLS	0x0d
> +#define BOOTLOADER_EXTENDED_ID	0x0e	/* Escape into the 0x10+ space */
> +#define BOOTLOADER_MISSING_ID	0x0f	/* Bootloader unknown */
> +#define BOOTLOADER_EXTID_BASE	0x10	/* Error: ext_loader_type never set */
> +#define BOOTLOADER_MINIMAL	0x11	/* "Minimal Linux Bootloader" */
> +#define BOOTLOADER_OVMF		0x12
> +#define BOOTLOADER_BAREBOX	0x13
> +
>  #ifndef __ASSEMBLER__
>
>  #include <linux/types.h>
> @@ -162,6 +184,28 @@ struct boot_params {
>	__u8  _pad9[276];				/* 0xeec */
>  } __attribute__((packed));
>
> +static inline unsigned int boot_loader_type(const struct boot_params *_bp)
> +{
> +	unsigned int _type = _bp->type_of_loader >> 4;
> +	if (_type == BOOTLOADER_EXTENDED_ID)
> +		_type = _bp->ext_loader_type + BOOTLOADER_EXTID_BASE
> +	return _type;
> +}
> +static inline unsigned int boot_loader_ver(const struct boot_params *_bp)
> +{
> +	return (_bp->type_of_loader & 0x0f) + (_bp->ext_loader_ver << 4);
> +}
> +static inline void set_boot_loader(struct boot_params *_bp,
> +				   unsigned int _type, unsigned int _ver)
> +{
> +	if (_type >= BOOTLOADER_EXTID_BASE) {
> +		_bp->ext_loader_type = _type - BOOTLOADER_EXTID_BASE;
> +		_type = BOOTLOADER_EXTENDED_ID;
> +	}
> +	_bp->ext_loader_ver = _ver >> 4;
> +	_bp->type_of_loader = _type + (_ver & 0x0f);
> +}

Ack, with a s/_bp/bp, s/_type/type and s/_ver/ver side note, as this
isn't a macro and thus there's no need to obfuscate the inline
function's parameter name to avoid unintended shadowing?

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-12-06 11:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-26 17:32 [PATCH] x86: fix oops caused by old EFI info on kexec boot James Le Cuirot
2025-12-03 18:12 ` Ingo Molnar
2025-12-03 20:59   ` H. Peter Anvin
2025-12-03 21:27     ` H. Peter Anvin
2025-12-06 11:45       ` Ingo Molnar
2025-12-03 22:57   ` Ard Biesheuvel
2025-12-03 23:01     ` Ard Biesheuvel
2025-12-04 21:40       ` James Le Cuirot
2025-12-04 23:18         ` H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox