* Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
@ 2022-10-28 13:02 ns
  2022-11-05  3:10 ` Baoquan He
  0 siblings, 1 reply; 10+ messages in thread
From: ns @ 2022-10-28 13:02 UTC
To: Eric Biederman, kexec, linux-kernel, Ard Biesheuvel, linux-efi

Greetings,

I've been hitting a bug on my Lenovo ThinkPad T480 where kexecing will
cause EFI mode (if that's the right term for it) to be unconditionally
disabled, even when not using the --noefi option to kexec.

What I mean by "EFI mode" being disabled, more than just EFI runtime
services, is that basically nothing about the system's EFI is visible
post-kexec. Normally you have a message like this in dmesg when the
system is booted in EFI mode:

[ 0.000000] efi: EFI v2.70 by EDK II
[ 0.000000] efi: SMBIOS=0x7f98a000 ACPI=0x7fb7e000 ACPI 2.0=0x7fb7e014 MEMATTR=0x7ec63018
(obviously not the real firmware of the machine I'm talking about, but I
can also send that if it would be of any help)

No such message pops up in my dmesg as a result of this bug, & this
causes some fallout like being unable to find the system's DMI
information:

<6>[ 0.000000] DMI not present or invalid.

The efivarfs module also fails to load with -ENODEV.

I've tried also booting with efi=runtime explicitly but it doesn't
change anything. The kernel still does not print the name of the EFI
firmware, DMI is still missing, & efivarfs still fails to load.

I've been using the kexec_load syscall for all these tests, if it's
important.

Also, to make it very clear, all this only ever happens post-kexec. When
booting straight from UEFI (with the EFI stub), all the aforementioned
stuff that fails works perfectly fine (i.e. name of firmware is printed,
DMI is properly found, & efivarfs loads & mounts just fine).

This is reproducible with a vanilla 6.1-rc2 kernel. I've been trying to
bisect it, but it seems like it goes pretty far back. I've got vanilla
mainline kernel builds dating back to 5.17 that have the exact same
issue. It might be worth noting that during this testing, I made sure
the version of the kernel being kexeced & the kernel kexecing were the
same version. It may not have been a problem in older kernels, but that
would be difficult to test for me (a pretty important driver for this
machine was only merged during v5.17-rc4). So it may not have been a
regression & just a hidden problem since time immemorial.

I am willing to test any patches I may get to further debug or fix
this issue, preferably based on the current state of torvalds/linux.git.
I can build & test kernels quite a few times per day.

I can also send any important materials (kernel config, dmesg, firmware
information, so on & so forth) on request. I'll also just mention I'm
using kexec-tools 2.0.24 upfront, if it matters.

Regards,
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: Baoquan He @ 2022-11-05 3:10 UTC
To: ns, dyoung; +Cc: Eric Biederman, kexec, linux-kernel, Ard Biesheuvel, linux-efi

Add Dave to CC

On 10/28/22 at 01:02pm, ns@tfwno.gf wrote:
> Greetings,
>
> I've been hitting a bug on my Lenovo ThinkPad T480 where kexecing will
> cause EFI mode (if that's the right term for it) to be unconditionally
> disabled, even when not using the --noefi option to kexec.
> [...]
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: Dave Young @ 2022-11-05 5:49 UTC
To: Baoquan He
Cc: ns, Eric Biederman, kexec, linux-kernel, Ard Biesheuvel, linux-efi

Baoquan, thanks for CCing me.

On Sat, 5 Nov 2022 at 11:10, Baoquan He <bhe@redhat.com> wrote:
>
> Add Dave to CC
>
> On 10/28/22 at 01:02pm, ns@tfwno.gf wrote:
> > [...]
> > I can also send any important materials (kernel config, dmesg, firmware
> > information, so on & so forth) on request. I'll also just mention I'm
> > using kexec-tools 2.0.24 upfront, if it matters.

Can you check the EFI runtime map in sysfs:

    ls /sys/firmware/efi/runtime-map/

If nothing is there, then maybe you did not enable
CONFIG_EFI_RUNTIME_MAP=y, which is needed for kexec UEFI boot on x86_64.

Otherwise you can add a debug printf in the kexec-tools EFI error path
to see what is wrong:

    kexec/arch/i386/x86-linux-setup.c, function setup_efi_data

And if it still does not work, please post your kernel config. I can
give it a try, although I do not have the T480 now.
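The sysfs check suggested above can also be scripted. What follows is only a minimal sketch in Python, assuming sysfs is mounted at /sys and that the kernel config is available as plain text (for example a copy of .config). The function names `runtime_map_entries` and `config_enabled` are made up for illustration and are not part of kexec-tools or the kernel.

```python
import os


def runtime_map_entries(sysfs_root="/sys"):
    """Return the sorted entry names under firmware/efi/runtime-map.

    An empty list means the directory is absent or empty, which is what
    one would expect when CONFIG_EFI_RUNTIME_MAP is not enabled (or when
    the system was not booted via EFI at all).
    """
    path = os.path.join(sysfs_root, "firmware", "efi", "runtime-map")
    try:
        return sorted(os.listdir(path))
    except (FileNotFoundError, NotADirectoryError):
        return []


def config_enabled(config_text, option):
    """Return True if a line `<option>=y` appears in the kernel config text."""
    wanted = f"{option}=y"
    return any(line.strip() == wanted for line in config_text.splitlines())


if __name__ == "__main__":
    entries = runtime_map_entries()
    if entries:
        print(f"runtime-map present with {len(entries)} entries")
    else:
        print("runtime-map missing or empty; check CONFIG_EFI_RUNTIME_MAP")
```

Run on the kernel that performs the kexec (the "pre-kexec" kernel), an empty result would point at the same config option as the manual `ls` does.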
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: ns @ 2022-11-05 14:16 UTC
To: Dave Young
Cc: Baoquan He, Eric Biederman, kexec, linux-kernel, Ard Biesheuvel, linux-efi

On 2022-11-05 05:49, Dave Young wrote:
> Can you check the efi runtime in sysfs:
> ls /sys/firmware/efi/runtime-map/
>
> If nothing then maybe you did not enable CONFIG_EFI_RUNTIME_MAP=y, it
> is needed for kexec UEFI boot on x86_64.

Oh my, it really is that simple.

Indeed, enabling this in the pre-kexec kernel fixes it all up. I had
blindly disabled it in my quest to downsize the pre-kexec kernel to
reduce boot time (it only runs a bootloader). In hindsight, the firmware
drivers section is not really a good section to tweak on a whim.

I'm terribly sorry to have taken your time to "fix" this "bug". But I
must ask, is there any reason why this is a visible config option, or at
least not gated behind CONFIG_EXPERT? drivers/firmware/efi/runtime-map.c
is pretty tiny, & considering it depends on CONFIG_KEXEC_CORE, one
probably wants to have kexec work properly if they can even enable it.

I admit the help text for it is arguably pretty good, but I feel like
the config option is only really useful for embedded, the same
environments where people would disable stuff like CONFIG_DMI -- a
config option that I would argue is pretty justifiably gated behind
CONFIG_EXPERT, because far too many systems break without it & it's
pretty small code, so really not worth it unless you absolutely know
what you're doing.

Similarly, I don't really think there's much value in disabling the
ability to kexec without the firmware except if you're heavily informed
& must have the size reduction, especially since in EFI land that's
where your DMI info comes from, if I were to argue for it on the basis
of CONFIG_DMI being gated.

In summary, it can cause quite a bit of unnecessary confusion despite
only being useful to a very small minority of users.

Thank you!
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: Dave Young @ 2022-11-07 6:54 UTC
To: ns
Cc: Baoquan He, Eric Biederman, kexec, linux-kernel, Ard Biesheuvel, linux-efi

Hi,

On Sat, 5 Nov 2022 at 22:16, <ns@tfwno.gf> wrote:
> Oh my, it really is that simple.
>
> Indeed, enabling this in the pre-kexec kernel fixes it all up. I had
> blindly disabled it in my quest to downsize the pre-kexec kernel to
> reduce boot time (it only runs a bootloader). In hindsight, the firmware
> drivers section is not really a good section to tweak on a whim.
>
> I'm terribly sorry to have taken your time to "fix" this "bug". But I
> must ask, is there any reason why this is a visible config option, or at
> least not gated behind CONFIG_EXPERT? drivers/firmware/efi/runtime-map.c
> is pretty tiny, & considering it depends on CONFIG_KEXEC_CORE, one
> probably wants to have kexec work properly if they can even enable it.

Glad to know it works with the .config tweak. I cannot recall any
reason for that, though.

Since it sits in the EFI code path, let's see what Ard thinks about
your proposal.

Thanks
Dave
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: Ard Biesheuvel @ 2022-11-07 7:29 UTC
To: Dave Young; +Cc: ns, Baoquan He, Eric Biederman, kexec, linux-kernel, linux-efi

On Mon, 7 Nov 2022 at 07:55, Dave Young <dyoung@redhat.com> wrote:
> On Sat, 5 Nov 2022 at 22:16, <ns@tfwno.gf> wrote:
> > I'm terribly sorry to have taken your time to "fix" this "bug". But I
> > must ask, is there any reason why this is a visible config option, or at
> > least not gated behind CONFIG_EXPERT? drivers/firmware/efi/runtime-map.c
> > is pretty tiny, & considering it depends on CONFIG_KEXEC_CORE, one
> > probably wants to have kexec work properly if they can even enable it.
>
> Glad to know it works with the .config tweaking. I can not recall any
> reason for that though.
>
> Since it sits in the efi code path, let's see how Ard thinks about
> your proposal.
>

I don't understand why EFI_RUNTIME_MAP should depend on KEXEC_CORE at
all: it is documented as a feature that can be enabled for debugging
as well, and kexec does not work as expected without it.

Should we just change it like this perhaps?

--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -28,8 +28,8 @@ config EFI_VARS_PSTORE_DEFAULT_DISABLE

 config EFI_RUNTIME_MAP
 	bool "Export efi runtime maps to sysfs"
-	depends on X86 && EFI && KEXEC_CORE
-	default y
+	depends on X86 && EFI
+	default KEXEC_CORE
 	help

and maybe add an 'if EXPERT' so that the option is only visible to
modify when CONFIG_EXPERT=y ?

In any case, I intend to move this code into arch/x86 as well, so I'll
have a couple of patches out shortly.
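To see what the Kconfig change proposed above would mean for a configuration like the reporter's, the resolved value of the symbol can be modelled by hand. This is a rough sketch in Python, not real Kconfig evaluation: it assumes a bool symbol takes the user's choice when its prompt is visible and its default otherwise, and that an unmet `depends on` always forces it off. The function names and the `expert_only` parameter (standing in for the suggested 'if EXPERT' on the prompt) are hypothetical.

```python
def old_rule(x86, efi, kexec_core, user_choice=None):
    """Current Kconfig: depends on X86 && EFI && KEXEC_CORE, default y.

    The prompt is always visible, so an explicit user choice wins.
    """
    if not (x86 and efi and kexec_core):
        return False
    return True if user_choice is None else user_choice


def new_rule(x86, efi, kexec_core, user_choice=None,
             expert=False, expert_only=False):
    """Proposed Kconfig: depends on X86 && EFI, default KEXEC_CORE.

    With expert_only=True the prompt is modelled as 'if EXPERT': the
    user's choice is honoured only when expert is True; otherwise the
    symbol falls back to its default.
    """
    if not (x86 and efi):
        return False
    prompt_visible = expert or not expert_only
    if prompt_visible and user_choice is not None:
        return user_choice
    return kexec_core


if __name__ == "__main__":
    # The reporter's situation: option switched off by hand.
    print(old_rule(True, True, True, user_choice=False))
    # Same choice with the prompt hidden behind EXPERT (EXPERT unset).
    print(new_rule(True, True, True, user_choice=False, expert_only=True))
    # Debugging use without kexec becomes possible under the new rule.
    print(new_rule(True, True, False, user_choice=True))
```

Under this model, the option could no longer be turned off by accident on a non-EXPERT config with KEXEC_CORE enabled, while it would become selectable for debugging on kernels built without kexec.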
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: Dave Young @ 2022-11-07 7:36 UTC
To: Ard Biesheuvel
Cc: ns, Baoquan He, Eric Biederman, kexec, linux-kernel, linux-efi

Hi Ard,

On Mon, 7 Nov 2022 at 15:30, Ard Biesheuvel <ardb@kernel.org> wrote:
> I don't understand why EFI_RUNTIME_MAP should depend on KEXEC_CORE at
> all: it is documented as a feature that can be enabled for debugging
> as well, and kexec does not work as expected without it.

Debugging is probably only mentioned in the help text and was not
considered in the Kconfig logic :(

> Should we just change it like this perhaps?
>
> --- a/drivers/firmware/efi/Kconfig
> +++ b/drivers/firmware/efi/Kconfig
> @@ -28,8 +28,8 @@ config EFI_VARS_PSTORE_DEFAULT_DISABLE
>
>  config EFI_RUNTIME_MAP
>  	bool "Export efi runtime maps to sysfs"
> -	depends on X86 && EFI && KEXEC_CORE
> -	default y
> +	depends on X86 && EFI
> +	default KEXEC_CORE
>  	help
>
> and maybe add an 'if EXPERT' so that the option is only visible to
> modify when CONFIG_EXPERT=y ?

The above changes look good to me.

> In any case, I intend to move this code into arch/x86 as well, so I'll
> have a couple of patches out shortly.

That would be better since it is x86 only.

Thanks, Ard.
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: Dave Young @ 2022-11-07 7:39 UTC
To: Ard Biesheuvel
Cc: ns, Baoquan He, Eric Biederman, kexec, linux-kernel, linux-efi

On Mon, 7 Nov 2022 at 15:36, Dave Young <dyoung@redhat.com> wrote:
> On Mon, 7 Nov 2022 at 15:30, Ard Biesheuvel <ardb@kernel.org> wrote:
> > In any case, I intend to move this code into arch/x86 as well, so I'll
> > have a couple of patches out shortly.
>
> That would be better since it is X86 only. Thanks, Ard.

Hmm, before doing that, do you think it is useful for debugging
purposes? That could be a reason for it to sit in the EFI code instead
of x86.
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: Ard Biesheuvel @ 2022-11-07 7:54 UTC
To: Dave Young
Cc: ns, Baoquan He, Eric Biederman, kexec, linux-kernel, linux-efi

On Mon, 7 Nov 2022 at 08:40, Dave Young <dyoung@redhat.com> wrote:
> On Mon, 7 Nov 2022 at 15:36, Dave Young <dyoung@redhat.com> wrote:
> > On Mon, 7 Nov 2022 at 15:30, Ard Biesheuvel <ardb@kernel.org> wrote:
> > > In any case, I intend to move this code into arch/x86 as well, so I'll
> > > have a couple of patches out shortly.
> >
> > That would be better since it is X86 only. Thanks, Ard.
>
> Hmm, before doing that, do you think it is useful for debugging
> purposes? That could be a reason to sit in efi code instead of x86 ..
>

This code was only ever enabled on x86, and on ARM/arm64, we can
capture the memory map via efi=debug on any kernel build, and capture
the virtual mappings using PTDUMP (which also gives us the exact
attributes for each mapped region).

So I don't think it has that much value on non-x86 tbh.
* Re: Bug: kexec on Lenovo ThinkPad T480 disables EFI mode
From: Dave Young @ 2022-11-07 8:08 UTC
To: Ard Biesheuvel
Cc: ns, Baoquan He, Eric Biederman, kexec, linux-kernel, linux-efi

On Mon, 7 Nov 2022 at 15:55, Ard Biesheuvel <ardb@kernel.org> wrote:
> On Mon, 7 Nov 2022 at 08:40, Dave Young <dyoung@redhat.com> wrote:
> > Hmm, before doing that, do you think it is useful for debugging
> > purposes? That could be a reason to sit in efi code instead of x86 ..
> >
>
> This code was only ever enabled on x86, and on ARM/arm64, we can
> capture the memory map via efi=debug on any kernel build, and capture
> the virtual mappings using PTDUMP (which also gives us the exact
> attributes for each mapped region)
>
> So I don't think it has that much value on non-x86 tbh.

Ok, fair enough.

Thanks
Dave
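For anyone landing on this thread with the same symptoms, the diagnosis boils down to two checks on the pre-kexec (first) kernel: whether /sys/firmware/efi/runtime-map/ has any entries, and whether its config has CONFIG_EFI_RUNTIME_MAP=y. A minimal Python sketch of both checks follows; the helper names `runtime_map_entries` and `config_enables_runtime_map` are hypothetical, and the `sysfs_root` parameter exists only so the function can be pointed at a test directory.

```python
import os
from typing import List


def runtime_map_entries(sysfs_root: str = "/sys") -> List[str]:
    """Return the entries under <sysfs_root>/firmware/efi/runtime-map.

    An empty list means the directory is missing or empty, which is the
    condition the 'ls' check in this thread is looking for.
    """
    path = os.path.join(sysfs_root, "firmware", "efi", "runtime-map")
    try:
        return sorted(os.listdir(path))
    except (FileNotFoundError, NotADirectoryError):
        return []


def config_enables_runtime_map(config_text: str) -> bool:
    """Check kernel config text for CONFIG_EFI_RUNTIME_MAP=y."""
    return any(line.strip() == "CONFIG_EFI_RUNTIME_MAP=y"
               for line in config_text.splitlines())


if __name__ == "__main__":
    entries = runtime_map_entries()
    if entries:
        print(f"runtime-map exported ({len(entries)} entries)")
    else:
        print("no runtime-map in sysfs; check CONFIG_EFI_RUNTIME_MAP "
              "in the kernel that performs the kexec")
```

Where the config text comes from is left to the caller (for example the .config used for the build), since the location of a running kernel's config varies between setups.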
Thread overview: 10+ messages
2022-10-28 13:02 Bug: kexec on Lenovo ThinkPad T480 disables EFI mode ns
2022-11-05  3:10 ` Baoquan He
2022-11-05  5:49   ` Dave Young
2022-11-05 14:16     ` ns
2022-11-07  6:54       ` Dave Young
2022-11-07  7:29         ` Ard Biesheuvel
2022-11-07  7:36           ` Dave Young
2022-11-07  7:39             ` Dave Young
2022-11-07  7:54               ` Ard Biesheuvel
2022-11-07  8:08                 ` Dave Young