* [PATCH] x86: fix oops caused by old EFI info on kexec boot @ 2025-11-26 17:32 James Le Cuirot 2025-12-03 18:12 ` Ingo Molnar 0 siblings, 1 reply; 9+ messages in thread From: James Le Cuirot @ 2025-11-26 17:32 UTC (permalink / raw) To: x86 Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, H . Peter Anvin, Ard Biesheuvel, James Le Cuirot kexec on x86 passes initrd details via the boot_params. If no initrd is supplied, then ramdisk_size is 0. When determining whether to reserve memory for the initrd on the subsequent boot, ramdisk_size being 0 causes the logic to fall back to phys_initrd_start and phys_initrd_size set from the EFI tables in efi.c. This is stale information from the initial boot. The system continues to boot and has even been seen to function under heavy load for days, but allocating very large amounts of memory reliably triggers an oops rather than the OOM killer. BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: Oops: 0002 [#1] SMP NOPTI This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8 when the EFI stub initrd loading was unified between architectures. Avoid the issue by checking whether the bootloader is not kexec before falling back to the EFI table values. I strongly suspect this also affects other architectures. A different fix would be required there, and I do have a fix in mind, but I was unable to reproduce the issue under QEMU's aarch64 virt machine. I think this is at least partly because it relies on ACPI while kexec passes the initd details via the device tree. Signed-off-by: James Le Cuirot <chewi@gentoo.org> --- arch/x86/kernel/setup.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 1b2edd07a3e1..8aa65daf121f 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void) ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32; - if (ramdisk_image == 0) + /* Don't fall back for kexec as phys_initrd_start will be stale */ + if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) ramdisk_image = phys_initrd_start; return ramdisk_image; @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void) ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32; - if (ramdisk_size == 0) + /* Don't fall back for kexec as phys_initrd_start will be stale */ + if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) ramdisk_size = phys_initrd_size; return ramdisk_size; -- 2.51.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot 2025-11-26 17:32 [PATCH] x86: fix oops caused by old EFI info on kexec boot James Le Cuirot @ 2025-12-03 18:12 ` Ingo Molnar 2025-12-03 20:59 ` H. Peter Anvin 2025-12-03 22:57 ` Ard Biesheuvel 0 siblings, 2 replies; 9+ messages in thread From: Ingo Molnar @ 2025-12-03 18:12 UTC (permalink / raw) To: James Le Cuirot Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, H . Peter Anvin, Ard Biesheuvel * James Le Cuirot <chewi@gentoo.org> wrote: > kexec on x86 passes initrd details via the boot_params. If no initrd is > supplied, then ramdisk_size is 0. When determining whether to reserve > memory for the initrd on the subsequent boot, ramdisk_size being 0 > causes the logic to fall back to phys_initrd_start and phys_initrd_size > set from the EFI tables in efi.c. This is stale information from the > initial boot. The system continues to boot and has even been seen to > function under heavy load for days, but allocating very large amounts of > memory reliably triggers an oops rather than the OOM killer. > > BUG: kernel NULL pointer dereference, address: 0000000000000008 > #PF: supervisor write access in kernel mode > #PF: error_code(0x0002) - not-present page > PGD 0 P4D 0 > Oops: Oops: 0002 [#1] SMP NOPTI > > This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8 > when the EFI stub initrd loading was unified between architectures. > > Avoid the issue by checking whether the bootloader is not kexec before > falling back to the EFI table values. > > I strongly suspect this also affects other architectures. A different > fix would be required there, and I do have a fix in mind, but I was > unable to reproduce the issue under QEMU's aarch64 virt machine. I think > this is at least partly because it relies on ACPI while kexec passes the > initd details via the device tree. > > Signed-off-by: James Le Cuirot <chewi@gentoo.org> > --- > arch/x86/kernel/setup.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > index 1b2edd07a3e1..8aa65daf121f 100644 > --- a/arch/x86/kernel/setup.c > +++ b/arch/x86/kernel/setup.c > @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void) > > ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32; > > - if (ramdisk_image == 0) > + /* Don't fall back for kexec as phys_initrd_start will be stale */ > + if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) > ramdisk_image = phys_initrd_start; > > return ramdisk_image; > @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void) > > ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32; > > - if (ramdisk_size == 0) > + /* Don't fall back for kexec as phys_initrd_start will be stale */ > + if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) > ramdisk_size = phys_initrd_size; Yeah, so this looks like a good fix - but please let's introduce some sort of enum for the bootloader IDs in arch/x86/include/uapi/asm/bootparam.h, I had to search way too long to figure out what 0xD is and where it was defined :-) Also, please introduce a "x86_bootloader_is_kexec()" kind of helper inline function as well. Thanks, Ingo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot 2025-12-03 18:12 ` Ingo Molnar @ 2025-12-03 20:59 ` H. Peter Anvin 2025-12-03 21:27 ` H. Peter Anvin 2025-12-03 22:57 ` Ard Biesheuvel 1 sibling, 1 reply; 9+ messages in thread From: H. Peter Anvin @ 2025-12-03 20:59 UTC (permalink / raw) To: Ingo Molnar, James Le Cuirot Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Ard Biesheuvel On December 3, 2025 10:12:55 AM PST, Ingo Molnar <mingo@kernel.org> wrote: >* James Le Cuirot <chewi@gentoo.org> wrote: > >> kexec on x86 passes initrd details via the boot_params. If no initrd is >> supplied, then ramdisk_size is 0. When determining whether to reserve >> memory for the initrd on the subsequent boot, ramdisk_size being 0 >> causes the logic to fall back to phys_initrd_start and phys_initrd_size >> set from the EFI tables in efi.c. This is stale information from the >> initial boot. The system continues to boot and has even been seen to >> function under heavy load for days, but allocating very large amounts of >> memory reliably triggers an oops rather than the OOM killer. >> >> BUG: kernel NULL pointer dereference, address: 0000000000000008 >> #PF: supervisor write access in kernel mode >> #PF: error_code(0x0002) - not-present page >> PGD 0 P4D 0 >> Oops: Oops: 0002 [#1] SMP NOPTI >> >> This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8 >> when the EFI stub initrd loading was unified between architectures. >> >> Avoid the issue by checking whether the bootloader is not kexec before >> falling back to the EFI table values. >> >> I strongly suspect this also affects other architectures. A different >> fix would be required there, and I do have a fix in mind, but I was >> unable to reproduce the issue under QEMU's aarch64 virt machine. I think >> this is at least partly because it relies on ACPI while kexec passes the >> initd details via the device tree. >> >> Signed-off-by: James Le Cuirot <chewi@gentoo.org> >> --- >> arch/x86/kernel/setup.c | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c >> index 1b2edd07a3e1..8aa65daf121f 100644 >> --- a/arch/x86/kernel/setup.c >> +++ b/arch/x86/kernel/setup.c >> @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void) >> >> ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32; >> >> - if (ramdisk_image == 0) >> + /* Don't fall back for kexec as phys_initrd_start will be stale */ >> + if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) >> ramdisk_image = phys_initrd_start; >> >> return ramdisk_image; >> @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void) >> >> ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32; >> >> - if (ramdisk_size == 0) >> + /* Don't fall back for kexec as phys_initrd_start will be stale */ >> + if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) >> ramdisk_size = phys_initrd_size; > >Yeah, so this looks like a good fix - but please let's >introduce some sort of enum for the bootloader IDs >in arch/x86/include/uapi/asm/bootparam.h, I had to search >way too long to figure out what 0xD is and where it >was defined :-) > >Also, please introduce a "x86_bootloader_is_kexec()" kind >of helper inline function as well. > >Thanks, > > Ingo > An enum and an accessor, please, that decodes the extended ID if necessary. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot 2025-12-03 20:59 ` H. Peter Anvin @ 2025-12-03 21:27 ` H. Peter Anvin 2025-12-06 11:45 ` Ingo Molnar 0 siblings, 1 reply; 9+ messages in thread From: H. Peter Anvin @ 2025-12-03 21:27 UTC (permalink / raw) To: Ingo Molnar, James Le Cuirot Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Ard Biesheuvel [-- Attachment #1: Type: text/plain, Size: 528 bytes --] On 2025-12-03 12:59, H. Peter Anvin wrote: >> >> Yeah, so this looks like a good fix - but please let's >> introduce some sort of enum for the bootloader IDs >> in arch/x86/include/uapi/asm/bootparam.h, I had to search >> way too long to figure out what 0xD is and where it >> was defined :-) >> >> Also, please introduce a "x86_bootloader_is_kexec()" kind >> of helper inline function as well. >> > > An enum and an accessor, please, that decodes the extended ID if necessary. > Maybe something like this (untested.) -hpa [-- Attachment #2: 0001-x86-boot-add-enumeration-and-accessors-for-boot-load.patch --] [-- Type: text/x-patch, Size: 2802 bytes --] From b65a27b0096bd953f56facee7c41213bce5cb13e Mon Sep 17 00:00:00 2001 From: "H. Peter Anvin" <hpa@zytor.com> Date: Wed, 3 Dec 2025 13:20:07 -0800 Subject: [PATCH 1/1] x86/boot: add enumeration and accessors for boot loader ID and version Add an enumeration of boot loader type IDs; since both the type ID and the version number is split between two fields, add accessors to read and write these fields (the latter is intended for boot loaders to use as this is a uapi header.) Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> --- arch/x86/include/uapi/asm/bootparam.h | 44 +++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h index dafbf581c515..38f98012fd2b 100644 --- a/arch/x86/include/uapi/asm/bootparam.h +++ b/arch/x86/include/uapi/asm/bootparam.h @@ -26,6 +26,28 @@ #define XLF_5LEVEL_ENABLED (1<<6) #define XLF_MEM_ENCRYPTION (1<<7) +/* bootloader ID */ +#define BOOTLOADER_LILO 0x00 +#define BOOTLOADER_LOADLIN 0x01 +/* 0x02 used by kernel internal boot sector - long since obsolete */ +#define BOOTLOADER_SYSLINUX 0x03 +#define BOOTLOADER_IPXE 0x04 +#define BOOTLOADER_ELILO 0x05 +/* 0x06 unknown user */ +#define BOOTLOADER_GRUB 0x07 +#define BOOTLOADER_UBOOT 0x08 +#define BOOTLOADER_XEN 0x09 +#define BOOTLOADER_GUJIN 0x0a +#define BOOTLOADER_QEMU 0x0b +#define BOOTLOADER_ARCTURUS 0x0c /* "Arcturus Networks uCbootloader" */ +#define BOOTLOADER_KEXEC_TOOLS 0x0d +#define BOOTLOADER_EXTENDED_ID 0x0e /* Escape into the 0x10+ space */ +#define BOOTLOADER_MISSING_ID 0x0f /* Bootloader unknown */ +#define BOOTLOADER_EXTID_BASE 0x10 /* Error: ext_loader_type never set */ +#define BOOTLOADER_MINIMAL 0x11 /* "Minimal Linux Bootloader" */ +#define BOOTLOADER_OVMF 0x12 +#define BOOTLOADER_BAREBOX 0x13 + #ifndef __ASSEMBLER__ #include <linux/types.h> @@ -162,6 +184,28 @@ struct boot_params { __u8 _pad9[276]; /* 0xeec */ } __attribute__((packed)); +static inline unsigned int boot_loader_type(const struct boot_params *_bp) +{ + unsigned int _type = _bp->type_of_loader >> 4; + if (_type == BOOTLOADER_EXTENDED_ID) + _type = _bp->ext_loader_type + BOOTLOADER_EXTID_BASE + return _type; +} +static inline unsigned int boot_loader_ver(const struct boot_params *_bp) +{ + return (_bp->type_of_loader & 0x0f) + (_bp->ext_loader_ver << 4); +} +static inline void set_boot_loader(struct boot_params *_bp, + unsigned int _type, unsigned int _ver) +{ + if (_type >= BOOTLOADER_EXTID_BASE) { + _bp->ext_loader_type = _type - BOOTLOADER_EXTID_BASE; + _type = BOOTLOADER_EXTENDED_ID; + } + _bp->ext_loader_ver = _ver >> 4; + _bp->type_of_loader = _type + (_ver & 0x0f); +} + /** * enum x86_hardware_subarch - x86 hardware subarchitecture * -- 2.52.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot 2025-12-03 21:27 ` H. Peter Anvin @ 2025-12-06 11:45 ` Ingo Molnar 0 siblings, 0 replies; 9+ messages in thread From: Ingo Molnar @ 2025-12-06 11:45 UTC (permalink / raw) To: H. Peter Anvin Cc: James Le Cuirot, x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, Ard Biesheuvel * H. Peter Anvin <hpa@zytor.com> wrote: > --- a/arch/x86/include/uapi/asm/bootparam.h > +++ b/arch/x86/include/uapi/asm/bootparam.h > @@ -26,6 +26,28 @@ > #define XLF_5LEVEL_ENABLED (1<<6) > #define XLF_MEM_ENCRYPTION (1<<7) > > +/* bootloader ID */ > +#define BOOTLOADER_LILO 0x00 > +#define BOOTLOADER_LOADLIN 0x01 > +/* 0x02 used by kernel internal boot sector - long since obsolete */ > +#define BOOTLOADER_SYSLINUX 0x03 > +#define BOOTLOADER_IPXE 0x04 > +#define BOOTLOADER_ELILO 0x05 > +/* 0x06 unknown user */ > +#define BOOTLOADER_GRUB 0x07 > +#define BOOTLOADER_UBOOT 0x08 > +#define BOOTLOADER_XEN 0x09 > +#define BOOTLOADER_GUJIN 0x0a > +#define BOOTLOADER_QEMU 0x0b > +#define BOOTLOADER_ARCTURUS 0x0c /* "Arcturus Networks uCbootloader" */ > +#define BOOTLOADER_KEXEC_TOOLS 0x0d > +#define BOOTLOADER_EXTENDED_ID 0x0e /* Escape into the 0x10+ space */ > +#define BOOTLOADER_MISSING_ID 0x0f /* Bootloader unknown */ > +#define BOOTLOADER_EXTID_BASE 0x10 /* Error: ext_loader_type never set */ > +#define BOOTLOADER_MINIMAL 0x11 /* "Minimal Linux Bootloader" */ > +#define BOOTLOADER_OVMF 0x12 > +#define BOOTLOADER_BAREBOX 0x13 > + > #ifndef __ASSEMBLER__ > > #include <linux/types.h> > @@ -162,6 +184,28 @@ struct boot_params { > __u8 _pad9[276]; /* 0xeec */ > } __attribute__((packed)); > > +static inline unsigned int boot_loader_type(const struct boot_params *_bp) > +{ > + unsigned int _type = _bp->type_of_loader >> 4; > + if (_type == BOOTLOADER_EXTENDED_ID) > + _type = _bp->ext_loader_type + BOOTLOADER_EXTID_BASE > + return _type; > +} > +static inline unsigned int boot_loader_ver(const struct boot_params *_bp) > +{ > + return (_bp->type_of_loader & 0x0f) + (_bp->ext_loader_ver << 4); > +} > +static inline void set_boot_loader(struct boot_params *_bp, > + unsigned int _type, unsigned int _ver) > +{ > + if (_type >= BOOTLOADER_EXTID_BASE) { > + _bp->ext_loader_type = _type - BOOTLOADER_EXTID_BASE; > + _type = BOOTLOADER_EXTENDED_ID; > + } > + _bp->ext_loader_ver = _ver >> 4; > + _bp->type_of_loader = _type + (_ver & 0x0f); > +} Ack, with a s/_bp/bp, s/_type/type and s/_ver/ver side note, as this isn't a macro and thus there's no need to obfuscate the inline function's parameter name to avoid unintended shadowing? Thanks, Ingo ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot 2025-12-03 18:12 ` Ingo Molnar 2025-12-03 20:59 ` H. Peter Anvin @ 2025-12-03 22:57 ` Ard Biesheuvel 2025-12-03 23:01 ` Ard Biesheuvel 1 sibling, 1 reply; 9+ messages in thread From: Ard Biesheuvel @ 2025-12-03 22:57 UTC (permalink / raw) To: Ingo Molnar Cc: James Le Cuirot, x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, H . Peter Anvin On Wed, 3 Dec 2025 at 19:13, Ingo Molnar <mingo@kernel.org> wrote: > > * James Le Cuirot <chewi@gentoo.org> wrote: > > > kexec on x86 passes initrd details via the boot_params. If no initrd is > > supplied, then ramdisk_size is 0. When determining whether to reserve > > memory for the initrd on the subsequent boot, ramdisk_size being 0 > > causes the logic to fall back to phys_initrd_start and phys_initrd_size > > set from the EFI tables in efi.c. This is stale information from the > > initial boot. The system continues to boot and has even been seen to > > function under heavy load for days, but allocating very large amounts of > > memory reliably triggers an oops rather than the OOM killer. > > > > BUG: kernel NULL pointer dereference, address: 0000000000000008 > > #PF: supervisor write access in kernel mode > > #PF: error_code(0x0002) - not-present page > > PGD 0 P4D 0 > > Oops: Oops: 0002 [#1] SMP NOPTI > > > > This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8 > > when the EFI stub initrd loading was unified between architectures. > > > > Avoid the issue by checking whether the bootloader is not kexec before > > falling back to the EFI table values. > > > > I strongly suspect this also affects other architectures. A different > > fix would be required there, and I do have a fix in mind, but I was > > unable to reproduce the issue under QEMU's aarch64 virt machine. I think > > this is at least partly because it relies on ACPI while kexec passes the > > initd details via the device tree. > > > > Signed-off-by: James Le Cuirot <chewi@gentoo.org> > > --- > > arch/x86/kernel/setup.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > > index 1b2edd07a3e1..8aa65daf121f 100644 > > --- a/arch/x86/kernel/setup.c > > +++ b/arch/x86/kernel/setup.c > > @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void) > > > > ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32; > > > > - if (ramdisk_image == 0) > > + /* Don't fall back for kexec as phys_initrd_start will be stale */ > > + if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) > > ramdisk_image = phys_initrd_start; > > > > return ramdisk_image; > > @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void) > > > > ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32; > > > > - if (ramdisk_size == 0) > > + /* Don't fall back for kexec as phys_initrd_start will be stale */ > > + if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) > > ramdisk_size = phys_initrd_size; > > Yeah, so this looks like a good fix - but please let's > introduce some sort of enum for the bootloader IDs > in arch/x86/include/uapi/asm/bootparam.h, I had to search > way too long to figure out what 0xD is and where it > was defined :-) > > Also, please introduce a "x86_bootloader_is_kexec()" kind > of helper inline function as well. > It might be better to fix this in the generic EFI code, and simply wipe the EFI config table that the EFI stub created to pass the initrd info. That way, it works for all architectures, and there is no need for special x86 hacks. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot 2025-12-03 22:57 ` Ard Biesheuvel @ 2025-12-03 23:01 ` Ard Biesheuvel 2025-12-04 21:40 ` James Le Cuirot 0 siblings, 1 reply; 9+ messages in thread From: Ard Biesheuvel @ 2025-12-03 23:01 UTC (permalink / raw) To: Ingo Molnar Cc: James Le Cuirot, x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, H . Peter Anvin On Wed, 3 Dec 2025 at 23:57, Ard Biesheuvel <ardb@kernel.org> wrote: > > On Wed, 3 Dec 2025 at 19:13, Ingo Molnar <mingo@kernel.org> wrote: > > > > * James Le Cuirot <chewi@gentoo.org> wrote: > > > > > kexec on x86 passes initrd details via the boot_params. If no initrd is > > > supplied, then ramdisk_size is 0. When determining whether to reserve > > > memory for the initrd on the subsequent boot, ramdisk_size being 0 > > > causes the logic to fall back to phys_initrd_start and phys_initrd_size > > > set from the EFI tables in efi.c. This is stale information from the > > > initial boot. The system continues to boot and has even been seen to > > > function under heavy load for days, but allocating very large amounts of > > > memory reliably triggers an oops rather than the OOM killer. > > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000008 > > > #PF: supervisor write access in kernel mode > > > #PF: error_code(0x0002) - not-present page > > > PGD 0 P4D 0 > > > Oops: Oops: 0002 [#1] SMP NOPTI > > > > > > This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8 > > > when the EFI stub initrd loading was unified between architectures. > > > > > > Avoid the issue by checking whether the bootloader is not kexec before > > > falling back to the EFI table values. > > > > > > I strongly suspect this also affects other architectures. A different > > > fix would be required there, and I do have a fix in mind, but I was > > > unable to reproduce the issue under QEMU's aarch64 virt machine. I think > > > this is at least partly because it relies on ACPI while kexec passes the > > > initd details via the device tree. > > > > > > Signed-off-by: James Le Cuirot <chewi@gentoo.org> > > > --- > > > arch/x86/kernel/setup.c | 6 ++++-- > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > > > index 1b2edd07a3e1..8aa65daf121f 100644 > > > --- a/arch/x86/kernel/setup.c > > > +++ b/arch/x86/kernel/setup.c > > > @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void) > > > > > > ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32; > > > > > > - if (ramdisk_image == 0) > > > + /* Don't fall back for kexec as phys_initrd_start will be stale */ > > > + if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) > > > ramdisk_image = phys_initrd_start; > > > > > > return ramdisk_image; > > > @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void) > > > > > > ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32; > > > > > > - if (ramdisk_size == 0) > > > + /* Don't fall back for kexec as phys_initrd_start will be stale */ > > > + if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) > > > ramdisk_size = phys_initrd_size; > > > > Yeah, so this looks like a good fix - but please let's > > introduce some sort of enum for the bootloader IDs > > in arch/x86/include/uapi/asm/bootparam.h, I had to search > > way too long to figure out what 0xD is and where it > > was defined :-) > > > > Also, please introduce a "x86_bootloader_is_kexec()" kind > > of helper inline function as well. > > > > It might be better to fix this in the generic EFI code, and simply > wipe the EFI config table that the EFI stub created to pass the initrd > info. That way, it works for all architectures, and there is no need > for special x86 hacks. I.e., --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -818,6 +818,7 @@ if (tbl) { phys_initrd_start = tbl->base; phys_initrd_size = tbl->size; + tbl->base = tbl->size = 0; early_memunmap(tbl, sizeof(*tbl)); } } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot 2025-12-03 23:01 ` Ard Biesheuvel @ 2025-12-04 21:40 ` James Le Cuirot 2025-12-04 23:18 ` H. Peter Anvin 0 siblings, 1 reply; 9+ messages in thread From: James Le Cuirot @ 2025-12-04 21:40 UTC (permalink / raw) To: Ard Biesheuvel, Ingo Molnar Cc: x86, linux-kernel, Thomas Gleixner, Borislav Petkov, Dave Hansen, H . Peter Anvin On Thu, 2025-12-04 at 00:01 +0100, Ard Biesheuvel wrote: > On Wed, 3 Dec 2025 at 23:57, Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Wed, 3 Dec 2025 at 19:13, Ingo Molnar <mingo@kernel.org> wrote: > > > > > > * James Le Cuirot <chewi@gentoo.org> wrote: > > > > > > > kexec on x86 passes initrd details via the boot_params. If no initrd is > > > > supplied, then ramdisk_size is 0. When determining whether to reserve > > > > memory for the initrd on the subsequent boot, ramdisk_size being 0 > > > > causes the logic to fall back to phys_initrd_start and phys_initrd_size > > > > set from the EFI tables in efi.c. This is stale information from the > > > > initial boot. The system continues to boot and has even been seen to > > > > function under heavy load for days, but allocating very large amounts of > > > > memory reliably triggers an oops rather than the OOM killer. > > > > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000008 > > > > #PF: supervisor write access in kernel mode > > > > #PF: error_code(0x0002) - not-present page > > > > PGD 0 P4D 0 > > > > Oops: Oops: 0002 [#1] SMP NOPTI > > > > > > > > This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8 > > > > when the EFI stub initrd loading was unified between architectures. > > > > > > > > Avoid the issue by checking whether the bootloader is not kexec before > > > > falling back to the EFI table values. > > > > > > > > I strongly suspect this also affects other architectures. A different > > > > fix would be required there, and I do have a fix in mind, but I was > > > > unable to reproduce the issue under QEMU's aarch64 virt machine. I think > > > > this is at least partly because it relies on ACPI while kexec passes the > > > > initd details via the device tree. > > > > > > > > Signed-off-by: James Le Cuirot <chewi@gentoo.org> > > > > --- > > > > arch/x86/kernel/setup.c | 6 ++++-- > > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > > > > index 1b2edd07a3e1..8aa65daf121f 100644 > > > > --- a/arch/x86/kernel/setup.c > > > > +++ b/arch/x86/kernel/setup.c > > > > @@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void) > > > > > > > > ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32; > > > > > > > > - if (ramdisk_image == 0) > > > > + /* Don't fall back for kexec as phys_initrd_start will be stale */ > > > > + if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) > > > > ramdisk_image = phys_initrd_start; > > > > > > > > return ramdisk_image; > > > > @@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void) > > > > > > > > ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32; > > > > > > > > - if (ramdisk_size == 0) > > > > + /* Don't fall back for kexec as phys_initrd_start will be stale */ > > > > + if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD) > > > > ramdisk_size = phys_initrd_size; > > > > > > Yeah, so this looks like a good fix - but please let's > > > introduce some sort of enum for the bootloader IDs > > > in arch/x86/include/uapi/asm/bootparam.h, I had to search > > > way too long to figure out what 0xD is and where it > > > was defined :-) > > > > > > Also, please introduce a "x86_bootloader_is_kexec()" kind > > > of helper inline function as well. > > > > > > > It might be better to fix this in the generic EFI code, and simply > > wipe the EFI config table that the EFI stub created to pass the initrd > > info. That way, it works for all architectures, and there is no need > > for special x86 hacks. > > I.e., > > --- a/drivers/firmware/efi/efi.c > +++ b/drivers/firmware/efi/efi.c > @@ -818,6 +818,7 @@ > if (tbl) { > phys_initrd_start = tbl->base; > phys_initrd_size = tbl->size; > + tbl->base = tbl->size = 0; > early_memunmap(tbl, sizeof(*tbl)); > } > } I can confirm that this fixes the problem. I had considered wiping the table, but I was trying to do it later, which seemed harder to do. I didn't consider wiping it immediately, but I now realise this data isn't needed afterwards. I only tested amd64, but I trust it will work for other architectures. Please go ahead with this. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] x86: fix oops caused by old EFI info on kexec boot 2025-12-04 21:40 ` James Le Cuirot @ 2025-12-04 23:18 ` H. Peter Anvin 0 siblings, 0 replies; 9+ messages in thread From: H. Peter Anvin @ 2025-12-04 23:18 UTC (permalink / raw) To: James Le Cuirot, Ard Biesheuvel, Ingo Molnar Cc: x86, linux-kernel, Thomas Gleixner, Borislav Petkov, Dave Hansen On 2025-12-04 13:40, James Le Cuirot wrote: >>> >>> It might be better to fix this in the generic EFI code, and simply >>> wipe the EFI config table that the EFI stub created to pass the initrd >>> info. That way, it works for all architectures, and there is no need >>> for special x86 hacks. >> >> I.e., >> >> --- a/drivers/firmware/efi/efi.c >> +++ b/drivers/firmware/efi/efi.c >> @@ -818,6 +818,7 @@ >> if (tbl) { >> phys_initrd_start = tbl->base; >> phys_initrd_size = tbl->size; >> + tbl->base = tbl->size = 0; >> early_memunmap(tbl, sizeof(*tbl)); >> } >> } > > I can confirm that this fixes the problem. I had considered wiping the table, > but I was trying to do it later, which seemed harder to do. I didn't consider > wiping it immediately, but I now realise this data isn't needed afterwards. I > only tested amd64, but I trust it will work for other architectures. Please go > ahead with this. > Acked-by: H. Peter Anvin (Intel) <hpa@zytor.com> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-12-06 11:45 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-11-26 17:32 [PATCH] x86: fix oops caused by old EFI info on kexec boot James Le Cuirot 2025-12-03 18:12 ` Ingo Molnar 2025-12-03 20:59 ` H. Peter Anvin 2025-12-03 21:27 ` H. Peter Anvin 2025-12-06 11:45 ` Ingo Molnar 2025-12-03 22:57 ` Ard Biesheuvel 2025-12-03 23:01 ` Ard Biesheuvel 2025-12-04 21:40 ` James Le Cuirot 2025-12-04 23:18 ` H. Peter Anvin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox