* 3.12 to 3.13 boot regression bisected - still applies to 3.16
@ 2014-08-04  9:34 Bruno Prémont
  2014-08-04 12:27 ` Matt Fleming
  0 siblings, 1 reply; 13+ messages in thread
From: Bruno Prémont @ 2014-08-04  9:34 UTC (permalink / raw)
  To: P J P; +Cc: Andrew Morton, linux-kernel, linux-efi, Matt Fleming

Hi,

Since 3.13 kernels with built-in initrd fail to boot on Fujitsu hardware
in EFI mode (efi stub) though the exact same kernel binary does boot in
BIOS mode (grub).
Interestingly EFI kernels with different config do boot under VMWare.

Your patch "initramfs: read CONFIG_RD_ variables for initramfs
compression" is the trigger.


Is something missing in EFI stub or why do things behave differently?


Looking at compiled kernel with and without this patch the resulting
bzImage is similar in size but in build directory I get:

Vanilla 3.13, CONFIG_INITRAMFS_COMPRESSION_NONE=y:
-rw-r--r-- 1 kbuild kbuild      399 Aug  4 10:26 usr/built-in.mod.c
-rw-r--r-- 1 kbuild kbuild  7062309 Aug  4 10:26 usr/built-in.o
-rwxr-xr-x 1 kbuild kbuild    22670 Aug  4 10:26 usr/gen_init_cpio
-rw-r--r-- 1 kbuild kbuild  7061260 Aug  4 10:26 usr/initramfs_data.cpio.gz
-rw-r--r-- 1 kbuild kbuild  7062240 Aug  4 10:26 usr/initramfs_data.o
-rw-r--r-- 1 kbuild kbuild        0 Aug  4 10:26 usr/modules.builtin
-rw-r--r-- 1 kbuild kbuild        0 Aug  4 10:26 usr/modules.order

Does not boot, reboots after exiting EFI stub

Vanilla 3.13, CONFIG_INITRAMFS_COMPRESSION_GZIP=y:
-rw-r--r-- 1 kbuild kbuild      399 Aug  4 10:26 usr/built-in.mod.c
-rw-r--r-- 1 kbuild kbuild  7062309 Aug  4 10:26 usr/built-in.o
-rwxr-xr-x 1 kbuild kbuild    22670 Aug  4 10:26 usr/gen_init_cpio
-rw-r--r-- 1 kbuild kbuild  7061260 Aug  4 10:26 usr/initramfs_data.cpio.gz
-rw-r--r-- 1 kbuild kbuild  7062240 Aug  4 10:26 usr/initramfs_data.o
-rw-r--r-- 1 kbuild kbuild        0 Aug  4 10:37 usr/modules.builtin
-rw-r--r-- 1 kbuild kbuild        0 Aug  4 10:37 usr/modules.order

Does not boot, reboots after exiting EFI stub

3.13 with patch reverted, CONFIG_INITRAMFS_COMPRESSION_NONE=y:
-rw-r--r-- 1 kbuild kbuild      399 Aug  4 10:16 usr/built-in.mod.c
-rw-r--r-- 1 kbuild kbuild 16931869 Aug  4 10:16 usr/built-in.o
-rwxr-xr-x 1 kbuild kbuild    22670 Aug  4 10:16 usr/gen_init_cpio
-rw-r--r-- 1 kbuild kbuild 16930816 Aug  4 10:16 usr/initramfs_data.cpio
-rw-r--r-- 1 kbuild kbuild 16931800 Aug  4 10:16 usr/initramfs_data.o
-rw-r--r-- 1 kbuild kbuild        0 Aug  4 10:16 usr/modules.builtin
-rw-r--r-- 1 kbuild kbuild        0 Aug  4 10:16 usr/modules.order

Boots successfully.


Related config options:
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_INITRAMFS_SOURCE="/usr/src/initrd64-20131127.cpio"
CONFIG_INITRAMFS_ROOT_UID=0
CONFIG_INITRAMFS_ROOT_GID=0
CONFIG_INITRAMFS_COMPRESSION_NONE=y
# CONFIG_INITRAMFS_COMPRESSION_GZIP is not set
# CONFIG_INITRAMFS_COMPRESSION_BZIP2 is not set
# CONFIG_INITRAMFS_COMPRESSION_LZMA is not set
# CONFIG_INITRAMFS_COMPRESSION_XZ is not set
# CONFIG_INITRAMFS_COMPRESSION_LZO is not set


Bruno

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-04  9:34 3.12 to 3.13 boot regression bisected - still applies to 3.16 Bruno Prémont
@ 2014-08-04 12:27 ` Matt Fleming
  2014-08-04 13:06 ` Bruno Prémont
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Fleming @ 2014-08-04 12:27 UTC (permalink / raw)
  To: Bruno Prémont; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

On Mon, 04 Aug, at 11:34:35AM, Bruno Prémont wrote:
> Hi,
>
> Since 3.13 kernels with built-in initrd fail to boot on Fujitsu hardware
> in EFI mode (efi stub) though the exact same kernel binary does boot in
> BIOS mode (grub).
> Interestingly EFI kernels with different config do boot under VMWare.
>
> Your patch "initramfs: read CONFIG_RD_ variables for initramfs
> compression" is the trigger.
>
>
> Is something missing in EFI stub or why do things behave differently?

Nuts. I suspect it's an EFI boot stub bug. Have you definitely tried out
3.16? In particular the following commit might make a difference,

  commit c7fb93ec51d4
  Author: Michael Brown <mbrown@fensystems.co.uk>
  Date:   Thu Jul 10 12:26:20 2014 +0100

      x86/efi: Include a .bss section within the PE/COFF headers

      The PE/COFF headers currently describe only the initialised-data
      portions of the image, and result in no space being allocated for the
      uninitialised-data portions. Consequently, the EFI boot stub will end
      up overwriting unexpected areas of memory, with unpredictable results.

      Fix by including a .bss section in the PE/COFF headers (functionally
      equivalent to the init_size field in the bzImage header).

      Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
      Cc: Thomas Bächler <thomas@archlinux.org>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>

> Looking at compiled kernel with and without this patch the resulting
> bzImage is similar in size but in build directory I get:

Could you send me the initrd image? I'd like to try and reproduce this
on my end, even though I regularly boot with a built-in initrd.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-04 12:27 ` Matt Fleming
@ 2014-08-04 13:06 ` Bruno Prémont
  2014-08-04 13:54 ` Matt Fleming
  0 siblings, 1 reply; 13+ messages in thread
From: Bruno Prémont @ 2014-08-04 13:06 UTC (permalink / raw)
  To: Matt Fleming; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

Hi Matt,

On Mon, 4 Aug 2014 13:27:28 +0100 Matt Fleming wrote:
> On Mon, 04 Aug, at 11:34:35AM, Bruno Prémont wrote:
> > Hi,
> >
> > Since 3.13 kernels with built-in initrd fail to boot on Fujitsu hardware
> > in EFI mode (efi stub) though the exact same kernel binary does boot in
> > BIOS mode (grub).
> > Interestingly EFI kernels with different config do boot under VMWare.
> >
> > Your patch "initramfs: read CONFIG_RD_ variables for initramfs
> > compression" is the trigger.
> >
> >
> > Is something missing in EFI stub or why do things behave differently?
>
> Nuts. I suspect it's an EFI boot stub bug. Have you definitely tried out
> 3.16? In particular the following commit might make a difference,
>
>   commit c7fb93ec51d4
>   Author: Michael Brown <mbrown@fensystems.co.uk>
>   Date:   Thu Jul 10 12:26:20 2014 +0100
>
>       x86/efi: Include a .bss section within the PE/COFF headers
>
>       The PE/COFF headers currently describe only the initialised-data
>       portions of the image, and result in no space being allocated for the
>       uninitialised-data portions. Consequently, the EFI boot stub will end
>       up overwriting unexpected areas of memory, with unpredictable results.
>
>       Fix by including a .bss section in the PE/COFF headers (functionally
>       equivalent to the init_size field in the bzImage header).
>
>       Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
>       Cc: Thomas Bächler <thomas@archlinux.org>
>       Cc: Josh Boyer <jwboyer@fedoraproject.org>
>       Cc: <stable@vger.kernel.org>
>       Signed-off-by: Matt Fleming <matt.fleming@intel.com>

Yes, I did as I have seen that patch flying by, but it did not help
(I tried at 3.16-rc7).

On 3.16-rc7 I even tried adding earlyprintk=efi,keep, console=efi,
ignore_loglevel and added some efi_printk() in EFI stub (in the spirit
of https://bugzilla.kernel.org/show_bug.cgi?id=68761)
The last message I get is my efi_printk() right before exiting boot
services. Without my efi_printk() there is no output at all.

Then system reboots.

There is no output on serial console either (via BMC),
(earlycon=uart,io,0x3f8,115200 or earlyprintk=serial,ttyS0,115200)


I even tried without initrd (setting CONFIG_INITRAMFS_SOURCE="")
and got the same end-result.

> > Looking at compiled kernel with and without this patch the resulting
> > bzImage is similar in size but in build directory I get:
>
> Could you send me the initrd image? I'd like to try and reproduce this
> on my end, even though I regularly boot with a built-in initrd.

I could share a slightly modified one, replacing the
contained /etc/passwd. It's about 16MiB in size due to RAID controller
management blobs for recovery. Except for that it just tries to find
ROOT partition, setting up dmcrypt if needed.


Note: System is Fujitsu Primergy RX200 S7 server, BIOS revisions
tested: R2.21.0 and R2.25.0.

The same initrd works on a Fujitsu laptop (Lifebook U904, EFI, other
kernel config) though.


Any hint on how to find out what fails would be nice!
initrd issues tend not to be easy to debug (it would help if initrd
issues could be reported at the time kernel tries to start init - e.g.
when console outputs are up and running).

Bruno

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-04 13:06 ` Bruno Prémont
@ 2014-08-04 13:54 ` Matt Fleming
  2014-08-05  8:02 ` Bruno Prémont
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Fleming @ 2014-08-04 13:54 UTC (permalink / raw)
  To: Bruno Prémont; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

On Mon, 04 Aug, at 03:06:27PM, Bruno Prémont wrote:
>
> Yes, I did as I have seen that patch flying by, but it did not help
> (I tried at 3.16-rc7).

:-( Thanks for testing.

> On 3.16-rc7 I even tried adding earlyprintk=efi,keep, console=efi,
> ignore_loglevel and added some efi_printk() in EFI stub (in the spirit
> of https://bugzilla.kernel.org/show_bug.cgi?id=68761)
> The last message I get is my efi_printk() right before exiting boot
> services. Without my efi_printk() there is no output at all.
>
> Then system reboots.

OK, so the fact that the system reboots suggests that the boot
stub/kernel caused a fault.

> There is no output on serial console either (via BMC),
> (earlycon=uart,io,0x3f8,115200 or earlyprintk=serial,ttyS0,115200)
>
>
> I even tried without initrd (setting CONFIG_INITRAMFS_SOURCE="")
> and got the same end-result.

Oh that's interesting.

> I could share a slightly modified one, replacing the
> contained /etc/passwd. It's about 16MiB in size due to RAID controller
> management blobs for recovery. Except for that it just tries to find
> ROOT partition, setting up dmcrypt if needed.

This shouldn't be necessary if you can reproduce the issue without an
initrd as you stated above.

> Any hint on how to find out what fails would be nice!
> initrd issues tend not to be easy to debug (it would help if initrd
> issues could be reported at the time kernel tries to start init - e.g.
> when console outputs are up and running).

I don't think this is necessarily an initrd issue.

The way that I would debug this is to insert while(1); into strategic
places. Yes, it's lame and time consuming, but it's effective.

My first suggestion would be setup_arch(). In particular, because your
machine is resetting, I'd guess that the kernel's early trap handlers
haven't yet been installed.

So throw a,

        while (1);

in there and see if you can get your machine to hang instead of reset.
If it doesn't hang, the reset occurs earlier in boot - work backwards.
If it does hang then you know that execution gets at least that far -
work forwards. Like I said, lame but effective.

Meanwhile I'm going to go and stare at the EFI boot stub code and
instrument OVMF to check for more memory corruption bugs like the one
Michael found in commit c7fb93ec51d4 ("x86/efi: Include a .bss section
within the PE/COFF headers").

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-04 13:54 ` Matt Fleming
@ 2014-08-05  8:02 ` Bruno Prémont
  2014-08-05  8:45 ` Matt Fleming
  0 siblings, 1 reply; 13+ messages in thread
From: Bruno Prémont @ 2014-08-05  8:02 UTC (permalink / raw)
  To: Matt Fleming; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

On Mon, 4 Aug 2014 14:54:52 +0100 Matt Fleming wrote:
> On Mon, 04 Aug, at 03:06:27PM, Bruno Prémont wrote:
> >
> > Yes, I did as I have seen that patch flying by, but it did not help
> > (I tried at 3.16-rc7).
>
> :-( Thanks for testing.
>
> > On 3.16-rc7 I even tried adding earlyprintk=efi,keep, console=efi,
> > ignore_loglevel and added some efi_printk() in EFI stub (in the spirit
> > of https://bugzilla.kernel.org/show_bug.cgi?id=68761)
> > The last message I get is my efi_printk() right before exiting boot
> > services. Without my efi_printk() there is no output at all.
> >
> > Then system reboots.
>
> OK, so the fact that the system reboots suggests that the boot
> stub/kernel caused a fault.
>
> > There is no output on serial console either (via BMC),
> > (earlycon=uart,io,0x3f8,115200 or earlyprintk=serial,ttyS0,115200)
> >
> >
> > I even tried without initrd (setting CONFIG_INITRAMFS_SOURCE="")
> > and got the same end-result.
>
> Oh that's interesting.
>
> > I could share a slightly modified one, replacing the
> > contained /etc/passwd. It's about 16MiB in size due to RAID controller
> > management blobs for recovery. Except for that it just tries to find
> > ROOT partition, setting up dmcrypt if needed.
>
> This shouldn't be necessary if you can reproduce the issue without an
> initrd as you stated above.

I just verified CONFIG_INITRAMFS_SOURCE="" on 3.16 and it reboots.

> > Any hint on how to find out what fails would be nice!
> > initrd issues tend not to be easy to debug (it would help if initrd
> > issues could be reported at the time kernel tries to start init - e.g.
> > when console outputs are up and running).
>
> I don't think this is necessarily an initrd issue.
>
> The way that I would debug this is to insert while(1); into strategic
> places. Yes, it's lame and time consuming, but it's effective.
>
> My first suggestion would be setup_arch(). In particular, because your
> machine is resetting, I'd guess that the kernel's early trap handlers
> haven't yet been installed.
>
> So throw a,
>
>         while (1);
>
> in there and see if you can get your machine to hang instead of reset.
> If it doesn't hang, the reset occurs earlier in boot - work backwards.
> If it does hang then you know that execution gets at least that far -
> work forwards. Like I said, lame but effective.

I tried in setup_arch(), but system still keeps rebooting.

Working backwards I got to x86_64_start_kernel() in
arch/x86/kernel/head64.c but system is still rebooting.

Not sure what happens before x86_64_start_kernel() is called, it seems
to be called from ASM code in arch/x86/kernel/head_64.S.

> Meanwhile I'm going to go and stare at the EFI boot stub code and
> instrument OVMF to check for more memory corruption bugs like the one
> Michael found in commit c7fb93ec51d4 ("x86/efi: Include a .bss section
> within the PE/COFF headers").

If there are places between exit_boot() in
arch/x86/boot/compressed/eboot.c and x86_64_start_kernel() where I
should include such loops, please tell!

Bruno

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-05  8:02 ` Bruno Prémont
@ 2014-08-05  8:45 ` Matt Fleming
  2014-08-05  9:13 ` Bruno Prémont
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Fleming @ 2014-08-05  8:45 UTC (permalink / raw)
  To: Bruno Prémont; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

On Tue, 05 Aug, at 10:02:42AM, Bruno Prémont wrote:
>
> I tried in setup_arch(), but system still keeps rebooting.
>
> Working backwards I got to x86_64_start_kernel() in
> arch/x86/kernel/head64.c but system is still rebooting.

Thanks for doing this. I'm sure it was a major PITA ;-)

> Not sure what happens before x86_64_start_kernel() is called, it seems
> to be called from ASM code in arch/x86/kernel/head_64.S.

Yep. Roughly the code flow goes like this (chronologically),

        efi_pe_entry()          [arch/x86/boot/compressed/head_64.S]
        efi_main()              [arch/x86/boot/compressed/eboot.c]
        startup_64              [arch/x86/kernel/head_64.S]
        secondary_startup_64    [arch/x86/kernel/head_64.S]
        x86_64_start_kernel()   [arch/x86/kernel/head64.c]

> > Meanwhile I'm going to go and stare at the EFI boot stub code and
> > instrument OVMF to check for more memory corruption bugs like the one
> > Michael found in commit c7fb93ec51d4 ("x86/efi: Include a .bss section
> > within the PE/COFF headers").
>
> If there are places between exit_boot() in
> arch/x86/boot/compressed/eboot.c and x86_64_start_kernel() where I
> should include such loops, please tell!

I guess we need to verify efi_main() actually exits correctly. So a
while (1); loop at the end of that function would be useful.

Assuming that does actually hang, you get the fun of rummaging around in
the early assembly code, where you can use something like this,

        bruno:
                hlt
                jmp bruno

to try and force a hang.

Could you also attach your .config? In particular I'm wondering whether
you've got CONFIG_RELOCATABLE enabled.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-05  8:45 ` Matt Fleming
@ 2014-08-05  9:13 ` Bruno Prémont
  2014-08-05  9:18 ` Matt Fleming
  0 siblings, 1 reply; 13+ messages in thread
From: Bruno Prémont @ 2014-08-05  9:13 UTC (permalink / raw)
  To: Matt Fleming; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

[-- Attachment #1: Type: text/plain, Size: 2179 bytes --]

On Tue, 5 Aug 2014 09:45:42 +0100 Matt Fleming wrote:
> On Tue, 05 Aug, at 10:02:42AM, Bruno Prémont wrote:
> >
> > I tried in setup_arch(), but system still keeps rebooting.
> >
> > Working backwards I got to x86_64_start_kernel() in
> > arch/x86/kernel/head64.c but system is still rebooting.
>
> Thanks for doing this. I'm sure it was a major PITA ;-)

Fortunately, since I can build the kernels on a separate system and try
a dozen of them in a row from the EFI shell, it's survivable.

> > Not sure what happens before x86_64_start_kernel() is called, it seems
> > to be called from ASM code in arch/x86/kernel/head_64.S.
>
> Yep. Roughly the code flow goes like this (chronologically),
>
>         efi_pe_entry()          [arch/x86/boot/compressed/head_64.S]
>         efi_main()              [arch/x86/boot/compressed/eboot.c]

I get at least to just before
  status = efi_call_early(exit_boot_services, handle, key);
in eboot.c on line 1310. An efi_printk() inserted there is displayed.

>         startup_64              [arch/x86/kernel/head_64.S]
>         secondary_startup_64    [arch/x86/kernel/head_64.S]
>         x86_64_start_kernel()   [arch/x86/kernel/head64.c]
>
> > > Meanwhile I'm going to go and stare at the EFI boot stub code and
> > > instrument OVMF to check for more memory corruption bugs like the one
> > > Michael found in commit c7fb93ec51d4 ("x86/efi: Include a .bss section
> > > within the PE/COFF headers").
> >
> > If there are places between exit_boot() in
> > arch/x86/boot/compressed/eboot.c and x86_64_start_kernel() where I
> > should include such loops, please tell!
>
> I guess we need to verify efi_main() actually exits correctly. So a
> while (1); loop at the end of that function would be useful.
>
> Assuming that does actually hang, you get the fun of rummaging around in
> the early assembly code, where you can use something like this,
>
>         bruno:
>                 hlt
>                 jmp bruno
>
> to try and force a hang.

Will spin a few attempts and see what I get.

> Could you also attach your .config? In particular I'm wondering whether
> you've got CONFIG_RELOCATABLE enabled.

Config attached (gzipped). CONFIG_RELOCATABLE is not enabled.

Bruno

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 18362 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-05  9:13 ` Bruno Prémont
@ 2014-08-05  9:18 ` Matt Fleming
  2014-08-05 11:51 ` Bruno Prémont
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Fleming @ 2014-08-05  9:18 UTC (permalink / raw)
  To: Bruno Prémont; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

On Tue, 05 Aug, at 11:13:30AM, Bruno Prémont wrote:
>
> I get at least to just before
>   status = efi_call_early(exit_boot_services, handle, key);
> in eboot.c on line 1310. A efi_printk inserted there is displayed.

This is worth pointing out in case you're unaware, but do you know that
it's not valid to call efi_printk() after ExitBootServices()? Doing so
will almost certainly cause your machine to fault.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-05  9:18 ` Matt Fleming
@ 2014-08-05 11:51 ` Bruno Prémont
  2014-08-05 12:11 ` Bruno Prémont
  0 siblings, 1 reply; 13+ messages in thread
From: Bruno Prémont @ 2014-08-05 11:51 UTC (permalink / raw)
  To: Matt Fleming; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

On Tue, 5 Aug 2014 10:18:48 +0100 Matt Fleming wrote:
> On Tue, 05 Aug, at 11:13:30AM, Bruno Prémont wrote:
> >
> > I get at least to just before
> >   status = efi_call_early(exit_boot_services, handle, key);
> > in eboot.c on line 1310. A efi_printk inserted there is displayed.
>
> This is worth pointing out in case you're unaware, but do you know that
> it's not valid to call efi_printk() after ExitBootServices()? Doing so
> will almost certainly cause your machine to fault.

I am aware that efi_printk() uses boot services!


Now I tried out loops at many places and have gotten up to line 340 in

  arch/x86/kernel/head_64.S

System reboots within the following assembler instructions (does not
reach line 359).

So efi_main() returns successfully but the assembler code following it
gets something wrong.

I'm going to try further to determine which line between 340 and 359 is
the "bad" one.

Bruno

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-05 11:51 ` Bruno Prémont
@ 2014-08-05 12:11 ` Bruno Prémont
  2014-08-05 12:55 ` Matt Fleming
  0 siblings, 1 reply; 13+ messages in thread
From: Bruno Prémont @ 2014-08-05 12:11 UTC (permalink / raw)
  To: Matt Fleming; +Cc: P J P, Andrew Morton, linux-kernel, linux-efi

On Tue, 5 Aug 2014 13:51:30 +0200 Bruno Prémont wrote:
> On Tue, 5 Aug 2014 10:18:48 +0100 Matt Fleming wrote:
> > On Tue, 05 Aug, at 11:13:30AM, Bruno Prémont wrote:
> > >
> > > I get at least to just before
> > >   status = efi_call_early(exit_boot_services, handle, key);
> > > in eboot.c on line 1310. A efi_printk inserted there is displayed.
> >
> > This is worth pointing out in case you're unaware, but do you know that
> > it's not valid to call efi_printk() after ExitBootServices()? Doing so
> > will almost certainly cause your machine to fault.
>
> I am aware that efi_printk() uses boot services!
>
>
> Now I tried out loops at many places and have gotten up to line 340 in
>
>   arch/x86/kernel/head_64.S

oops, bad copy&paste, should have been
  arch/x86/boot/compressed/head_64.S

> System reboots within the following assembler instructions (does not
> reach line 359).
>
> So efi_main() returns successfully but the assembler code following it
> gets something wrong.
>
> I'm going to try further to determine which line between 340 and 359 is
> the "bad" one.

arch/x86/boot/compressed/head_64.S
341 /*
342  * Copy the compressed kernel to the end of our buffer
343  * where decompression in place becomes safe.
344  */
345         pushq   %rsi
346         leaq    (_bss-8)(%rip), %rsi
347         leaq    (_bss-8)(%rbx), %rdi
348         movq    $_bss /* - $startup_32 */, %rcx
349         shrq    $3, %rcx
350         std

code gets up to here

351         rep     movsq

this location is never reached but instead system reboots

352         cld
353         popq    %rsi
354
355 /*
356  * Jump to the relocated address.
357  */


Bruno

^ permalink raw reply	[flat|nested] 13+ messages in thread
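[Editorial note] For readers less used to x86 string instructions, the
copy at lines 345-352 of the listing above can be modelled in C. This is
only an illustrative sketch, not kernel code: the function name
copy_qwords_backward() is made up for this note, and it assumes the usual
semantics of std followed by rep movsq, namely that %rcx eight-byte words
are copied starting at the highest address and moving downwards.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of "std; rep movsq": copy `count` 8-byte words from
 * src to dst, starting with the LAST word and walking towards the first.
 * Going top-down keeps the data intact when dst lies above src and the
 * two ranges overlap, which is the case for an in-place relocation to a
 * higher address.
 */
static void copy_qwords_backward(uint64_t *dst, const uint64_t *src,
                                 size_t count)
{
    for (size_t i = count; i-- > 0; )
        dst[i] = src[i];
}
```

For example, with a six-word buffer holding 1, 2, 3, 4, 0, 0, calling
copy_qwords_backward(buf + 2, buf, 4) leaves words 2..5 holding 1, 2, 3, 4
even though source and destination overlap. The copy direction only
protects the source data; whether the destination memory is safe to write
to at all is a separate question.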
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-05 12:11 ` Bruno Prémont
@ 2014-08-05 12:55 ` Matt Fleming
  2014-08-05 14:21 ` Bruno Prémont
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Fleming @ 2014-08-05 12:55 UTC (permalink / raw)
  To: Bruno Prémont
  Cc: P J P, Andrew Morton, linux-kernel, linux-efi, H. Peter Anvin

On Tue, 05 Aug, at 02:11:42PM, Bruno Prémont wrote:
>
> arch/x86/boot/compressed/head_64.S
> 341 /*
> 342  * Copy the compressed kernel to the end of our buffer
> 343  * where decompression in place becomes safe.
> 344  */
> 345         pushq   %rsi
> 346         leaq    (_bss-8)(%rip), %rsi
> 347         leaq    (_bss-8)(%rbx), %rdi
> 348         movq    $_bss /* - $startup_32 */, %rcx
> 349         shrq    $3, %rcx
> 350         std
>
> code gets up to here
>
> 351         rep     movsq
>
> this location is never reached but instead system reboots
>
> 352         cld
> 353         popq    %rsi
> 354
> 355 /*
> 356  * Jump to the relocated address.
> 357  */

Excellent. Thanks for doing this, it's all starting to make sense now.

I suspect if you enable CONFIG_RELOCATABLE things will work just fine.
I've actually got a patch to force that option if CONFIG_EFI_STUB is
enabled to mitigate this exact problem. Could you try it out (see
below)?

Without CONFIG_RELOCATABLE the early boot code will try and decompress
the kernel image to LOAD_PHYSICAL_ADDR. That may have worked in the days
of BIOS, when it was reasonable to assume that nothing important would
be sitting in the 0x10000000 region, but that's just not so for UEFI.

For UEFI we need to request memory from the firmware and not stray
outside the bounds of those allocations. Otherwise there's the potential
that we'll trash bits of the firmware's code/data.

---

From b9ffb908e18b11be1d89e4c39bee5d4671d3fa6f Mon Sep 17 00:00:00 2001
From: Matt Fleming <matt.fleming@intel.com>
Date: Fri, 11 Jul 2014 08:45:25 +0100
Subject: [PATCH] x86/efi: Enforce CONFIG_RELOCATABLE for EFI boot stub
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Without CONFIG_RELOCATABLE the early boot code will decompress the
kernel to LOAD_PHYSICAL_ADDR. While this may have been fine in the BIOS
days, that isn't going to fly with UEFI since parts of the firmware
code/data may be located at LOAD_PHYSICAL_ADDR.

Straying outside of the bounds of the regions we've explicitly requested
from the firmware will cause all sorts of trouble. Bruno reports that
his machine resets while trying to decompress the kernel image.

We already go to great pains to ensure the kernel is loaded into a
suitably aligned buffer, it's just that the address isn't necessarily
LOAD_PHYSICAL_ADDR, because we can't guarantee that address isn't in-use
by the firmware.

Explicitly enforce CONFIG_RELOCATABLE for the EFI boot stub, so that we
can load the kernel at any address with the correct alignment.

Reported-by: Bruno Prémont <bonbons@linux-vserver.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 801ed36c2e49..f31b8ae8c81c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1537,6 +1537,7 @@ config EFI
 config EFI_STUB
        bool "EFI stub support"
        depends on EFI
+       select RELOCATABLE
        ---help---
           This kernel feature allows a bzImage to be loaded directly
           by EFI firmware without the use of a bootloader.
-- 
1.9.0

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply related	[flat|nested] 13+ messages in thread
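[Editorial note] The hazard described in the message above, writing to a
fixed physical address without knowing who owns it, can be illustrated
with a small overlap check. This is a sketch under stated assumptions:
struct mem_region and both function names are hypothetical
simplifications invented for this note, not the UEFI memory descriptor
layout and not any kernel API, and the addresses used in the example
below are made up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one firmware memory map entry. */
struct mem_region {
    uint64_t start;  /* first byte of the region */
    uint64_t end;    /* one past the last byte of the region */
    bool     free;   /* true if the firmware reports it as free memory */
};

/* Two half-open ranges [a_start, a_end) and [b_start, b_end) overlap? */
static bool ranges_overlap(uint64_t a_start, uint64_t a_end,
                           uint64_t b_start, uint64_t b_end)
{
    return a_start < b_end && b_start < a_end;
}

/*
 * Returns false if [start, start + size) touches any region that is
 * marked as in use. It does not check that the range is fully covered
 * by the map; it only detects collisions with in-use regions.
 */
static bool fixed_load_is_safe(const struct mem_region *map, size_t n,
                               uint64_t start, uint64_t size)
{
    uint64_t end = start + size;

    for (size_t i = 0; i < n; i++) {
        if (!map[i].free &&
            ranges_overlap(start, end, map[i].start, map[i].end))
            return false;
    }
    return true;
}
```

With a made-up map of three regions, 0x100000-0x1000000 free,
0x1000000-0x1100000 in use and 0x1100000-0x2000000 free, an 8 MiB write
at the fixed address 0x1000000 is reported as unsafe, while the same
write at 0x1100000 is not. A relocatable kernel sidesteps the question by
running from wherever the firmware actually granted memory.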
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-05 12:55 ` Matt Fleming
@ 2014-08-05 14:21 ` Bruno Prémont
  2014-08-05 15:07 ` Matt Fleming
  0 siblings, 1 reply; 13+ messages in thread
From: Bruno Prémont @ 2014-08-05 14:21 UTC (permalink / raw)
  To: Matt Fleming
  Cc: P J P, Andrew Morton, linux-kernel, linux-efi, H. Peter Anvin

On Tue, 5 Aug 2014 13:55:48 +0100 Matt Fleming wrote:
> I suspect if you enable CONFIG_RELOCATABLE things will work just fine.
> I've actually got a patch to force that option if CONFIG_EFI_STUB is
> enabled to mitigate this exact problem. Could you try it out (see
> below)?
>
> Without CONFIG_RELOCATABLE the early boot code will try and decompress
> the kernel image to LOAD_PHYSICAL_ADDR. That may have worked in the days
> of BIOS, when it was reasonable to assume that nothing important would
> be sitting in the 0x10000000 region, but that's just not so for UEFI.
>
> For UEFI we need to request memory from the firmware and not stray
> outside the bounds of those allocations. Otherwise there's the potential
> that we'll trash bits of the firmware's code/data.

Thanks, enabling CONFIG_RELOCATABLE allows kernel to successfully boot!

So you can add my tested-by to the patch.
If of interest, memory layout information as reported by 3.16 with
CONFIG_RELOCATABLE enabled:

[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007da97fff] usable
[    0.000000] BIOS-e820: [mem 0x000000007da98000-0x000000007dae5fff] reserved
[    0.000000] BIOS-e820: [mem 0x000000007dae6000-0x000000007db80fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x000000007db81000-0x000000007dd89fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000007dd8a000-0x000000007f362fff] reserved
[    0.000000] BIOS-e820: [mem 0x000000007f363000-0x000000007f363fff] usable
[    0.000000] BIOS-e820: [mem 0x000000007f364000-0x000000007f3e9fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x000000007f3ea000-0x000000007f7fffff] usable
[    0.000000] BIOS-e820: [mem 0x0000000080000000-0x000000008fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000047fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] e820: update [mem 0x6d6e3018-0x6d6f7657] usable ==> usable
[    0.000000] e820: update [mem 0x6d6da018-0x6d6e2057] usable ==> usable
[    0.000000] extended physical RAM map:
[    0.000000] reserve setup_data: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] reserve setup_data: [mem 0x0000000000100000-0x000000006d6da017] usable
[    0.000000] reserve setup_data: [mem 0x000000006d6da018-0x000000006d6e2057] usable
[    0.000000] reserve setup_data: [mem 0x000000006d6e2058-0x000000006d6e3017] usable
[    0.000000] reserve setup_data: [mem 0x000000006d6e3018-0x000000006d6f7657] usable
[    0.000000] reserve setup_data: [mem 0x000000006d6f7658-0x000000007da97fff] usable
[    0.000000] reserve setup_data: [mem 0x000000007da98000-0x000000007dae5fff] reserved
[    0.000000] reserve setup_data: [mem 0x000000007dae6000-0x000000007db80fff] ACPI data
[    0.000000] reserve setup_data: [mem 0x000000007db81000-0x000000007dd89fff] ACPI NVS
[    0.000000] reserve setup_data: [mem 0x000000007dd8a000-0x000000007f362fff] reserved
[    0.000000] reserve setup_data: [mem 0x000000007f363000-0x000000007f363fff] usable
[    0.000000] reserve setup_data: [mem 0x000000007f364000-0x000000007f3e9fff] ACPI NVS
[    0.000000] reserve setup_data: [mem 0x000000007f3ea000-0x000000007f7fffff] usable
[    0.000000] reserve setup_data: [mem 0x0000000080000000-0x000000008fffffff] reserved
[    0.000000] reserve setup_data: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] reserve setup_data: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] reserve setup_data: [mem 0x0000000100000000-0x000000047fffffff] usable
[    0.000000] efi: EFI v2.31 by American Megatrends
[    0.000000] efi: ACPI=0x7db05000 ACPI 2.0=0x7db05000 SMBIOS=0xf04c0 MPS=0xfd4b0
[    0.000000] efi: mem00: type=3, attr=0xf, range=[0x0000000000000000-0x0000000000008000) (0MB)
[    0.000000] efi: mem01: type=2, attr=0xf, range=[0x0000000000008000-0x000000000000e000) (0MB)
[    0.000000] efi: mem02: type=7, attr=0xf, range=[0x000000000000e000-0x000000000003f000) (0MB)
[    0.000000] efi: mem03: type=4, attr=0xf, range=[0x000000000003f000-0x0000000000050000) (0MB)
[    0.000000] efi: mem04: type=3, attr=0xf, range=[0x0000000000050000-0x00000000000a0000) (0MB)
[    0.000000] efi: mem05: type=7, attr=0xf, range=[0x0000000000100000-0x0000000001000000) (15MB)
[    0.000000] efi: mem06: type=2, attr=0xf, range=[0x0000000001000000-0x0000000001100000) (1MB)
[    0.000000] efi: mem07: type=7, attr=0xf, range=[0x0000000001100000-0x0000000002000000) (15MB)
[    0.000000] efi: mem08: type=2, attr=0xf, range=[0x0000000002000000-0x00000000038e7000) (24MB)
[    0.000000] efi: mem09: type=7, attr=0xf, range=[0x00000000038e7000-0x000000006d6d9000) (1693MB)
[    0.000000] efi: mem10: type=2, attr=0xf, range=[0x000000006d6d9000-0x000000006d6f9000) (0MB)
[    0.000000] efi: mem11: type=1, attr=0xf, range=[0x000000006d6f9000-0x000000006efe0000) (24MB)
[    0.000000] efi: mem12: type=7, attr=0xf, range=[0x000000006efe0000-0x0000000071d8d000) (45MB)
[    0.000000] efi: mem13: type=4, attr=0xf, range=[0x0000000071d8d000-0x0000000071e5c000) (0MB)
[    0.000000] efi: mem14: type=7, attr=0xf, range=[0x0000000071e5c000-0x0000000071e60000) (0MB)
[    0.000000] efi: mem15: type=4, attr=0xf, range=[0x0000000071e60000-0x0000000071e90000) (0MB)
[    0.000000] efi: mem16: type=7, attr=0xf, range=[0x0000000071e90000-0x0000000071eb5000) (0MB)
[    0.000000] efi: mem17: type=4, attr=0xf, range=[0x0000000071eb5000-0x0000000071fe6000) (1MB)
[    0.000000] efi: mem18: type=7, attr=0xf, range=[0x0000000071fe6000-0x0000000071fef000) (0MB)
[    0.000000] efi: mem19: type=4, attr=0xf, range=[0x0000000071fef000-0x0000000071ff1000) (0MB)
[    0.000000] efi: mem20: type=7, attr=0xf, range=[0x0000000071ff1000-0x0000000071ff5000) (0MB)
[    0.000000] efi: mem21: type=4, attr=0xf, range=[0x0000000071ff5000-0x0000000071ff6000) (0MB)
[    0.000000] efi: mem22: type=7, attr=0xf, range=[0x0000000071ff6000-0x000000007200f000) (0MB)
[    0.000000] efi: mem23: type=4, attr=0xf, range=[0x000000007200f000-0x0000000072010000) (0MB)
[    0.000000] efi: mem24: type=7, attr=0xf, range=[0x0000000072010000-0x0000000072013000) (0MB)
[    0.000000] efi: mem25: type=4, attr=0xf, range=[0x0000000072013000-0x0000000072017000) (0MB)
[    0.000000] efi: mem26: type=7, attr=0xf, range=[0x0000000072017000-0x0000000072018000) (0MB)
[    0.000000] efi: mem27: type=4, attr=0xf, range=[0x0000000072018000-0x0000000072019000) (0MB)
[    0.000000] efi: mem28: type=7, attr=0xf, range=[0x0000000072019000-0x0000000072061000) (0MB)
[    0.000000] efi: mem29: type=4, attr=0xf, range=[0x0000000072061000-0x00000000720a6000) (0MB)
[    0.000000] efi: mem30: type=7, attr=0xf, range=[0x00000000720a6000-0x00000000720ab000) (0MB)
[    0.000000] efi: mem31: type=4, attr=0xf, range=[0x00000000720ab000-0x00000000720d1000) (0MB)
[    0.000000] efi: mem32: type=7, attr=0xf, range=[0x00000000720d1000-0x00000000720dc000) (0MB)
[    0.000000] efi: mem33: type=4, attr=0xf, range=[0x00000000720dc000-0x00000000720dd000) (0MB)
[    0.000000] efi: mem34: type=7, attr=0xf, range=[0x00000000720dd000-0x00000000720f9000) (0MB)
[    0.000000] efi: mem35: type=4, attr=0xf, range=[0x00000000720f9000-0x000000007232e000) (2MB)
[    0.000000] efi: mem36: type=7, attr=0xf, range=[0x000000007232e000-0x0000000072332000) (0MB)
[    0.000000] efi: mem37: type=4, attr=0xf, range=[0x0000000072332000-0x0000000072357000) (0MB)
[    0.000000] efi: mem38: type=7, attr=0xf, range=[0x0000000072357000-0x000000007235e000) (0MB)
[    0.000000] efi: mem39: type=4, attr=0xf, range=[0x000000007235e000-0x000000007235f000) (0MB)
[    0.000000] efi: mem40: type=7, attr=0xf, range=[0x000000007235f000-0x0000000072365000) (0MB)
[    0.000000] efi: mem41: type=4, attr=0xf, range=[0x0000000072365000-0x0000000072370000) (0MB)
[    0.000000] efi: mem42: type=7, attr=0xf, range=[0x0000000072370000-0x0000000072376000) (0MB)
[    0.000000] efi: mem43: type=4, attr=0xf, range=[0x0000000072376000-0x000000007237f000) (0MB)
[    0.000000] efi: mem44: type=7, attr=0xf, range=[0x000000007237f000-0x0000000072380000) (0MB)
[    0.000000] efi: mem45: type=4, attr=0xf, range=[0x0000000072380000-0x000000007d0ff000) (173MB)
[    0.000000] efi: mem46: type=7, attr=0xf, range=[0x000000007d0ff000-0x000000007d5e4000) (4MB)
[    0.000000] efi: mem47: type=3, attr=0xf, range=[0x000000007d5e4000-0x000000007da98000) (4MB)
[    0.000000] efi: mem48: type=0, attr=0xf, range=[0x000000007da98000-0x000000007daa7000) (0MB)
[    0.000000] efi: mem49: type=0, attr=0xf, range=[0x000000007daa7000-0x000000007dae6000) (0MB)
[    0.000000] efi: mem50: type=9, attr=0xf, range=[0x000000007dae6000-0x000000007db05000) (0MB)
[    0.000000] efi: mem51: type=9, attr=0xf, range=[0x000000007db05000-0x000000007db81000) (0MB)
[    0.000000] efi: mem52: type=10, attr=0xf, range=[0x000000007db81000-0x000000007dc6f000) (0MB)
[    0.000000] efi: mem53: type=10, attr=0xf, range=[0x000000007dc6f000-0x000000007dd8a000) (1MB)
[    0.000000] efi: mem54: type=6, attr=0x800000000000000f, range=[0x000000007dd8a000-0x000000007e1d2000) (4MB)
[    0.000000] efi: mem55: type=6, attr=0x800000000000000f, range=[0x000000007e1d2000-0x000000007e23e000) (0MB)
[    0.000000] efi: mem56: type=6, attr=0x800000000000000f, range=[0x000000007e23e000-0x000000007e240000) (0MB)
[    0.000000] efi: mem57: type=6, attr=0x800000000000000f, range=[0x000000007e240000-0x000000007f2fc000) (16MB)
[    0.000000] efi: mem58: type=5, attr=0x800000000000000f, range=[0x000000007f2fc000-0x000000007f310000) (0MB)
[    0.000000] efi: mem59: type=5, attr=0x800000000000000f, range=[0x000000007f310000-0x000000007f363000) (0MB)
[    0.000000] efi: mem60: type=4, attr=0xf, range=[0x000000007f363000-0x000000007f364000) (0MB)
[    0.000000] efi: mem61: type=10, attr=0xf, range=[0x000000007f364000-0x000000007f3ea000) (0MB)
[    0.000000] efi: mem62: type=4, attr=0xf, range=[0x000000007f3ea000-0x000000007f539000) (1MB)
[    0.000000] efi: mem63: type=3, attr=0xf, range=[0x000000007f539000-0x000000007f7d7000) (2MB)
[    0.000000] efi: mem64: type=4, attr=0xf, range=[0x000000007f7d7000-0x000000007f7dd000) (0MB)
[    0.000000] efi: mem65: type=3, attr=0xf, range=[0x000000007f7dd000-0x000000007f7e1000) (0MB)
[    0.000000] efi: mem66: type=4, attr=0xf, range=[0x000000007f7e1000-0x000000007f800000) (0MB)
[    0.000000] efi: mem67: type=7, attr=0xf, range=[0x0000000100000000-0x0000000480000000) (14336MB)
[    0.000000] efi: mem68: type=11, attr=0x8000000000000001, range=[0x0000000080000000-0x0000000090000000) (256MB)
[    0.000000] efi: mem69: type=11, attr=0x8000000000000001, range=[0x00000000fed1c000-0x00000000fed20000) (0MB)
[    0.000000] efi: mem70: type=11, attr=0x8000000000000001, range=[0x00000000ff000000-0x0000000100000000) (16MB)
[    0.000000] SMBIOS 2.7 present.
[    0.000000] DMI: FUJITSU PRIMERGY RX200 S7/D3032-A1, BIOS V4.6.5.3 R2.21.0 for D3032-A1x 04/05/2013
* Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16
  2014-08-05 14:21 ` Bruno Prémont
@ 2014-08-05 15:07 ` Matt Fleming
From: Matt Fleming @ 2014-08-05 15:07 UTC
To: Bruno Prémont
Cc: P J P, Andrew Morton, linux-kernel, linux-efi, H. Peter Anvin

On Tue, 05 Aug, at 04:21:07PM, Bruno Prémont wrote:
>
> Thanks, enabling CONFIG_RELOCATABLE allows kernel to successfully boot!
>
> So you can add my tested-by to the patch.

Great, thanks for testing! I've tagged the patch for stable and I'll get
it sent to tip quickly.

> If of interest, memory layout information as reported by 3.16 with
> CONFIG_RELOCATABLE enabled:

[...]

> [    0.000000] efi: mem06: type=2, attr=0xf, range=[0x0000000001000000-0x0000000001100000) (1MB)

Bingo, this is likely to be the reason for the resets. Overwriting
EFI_LOADER_DATA regions (well, any region other than
EFI_CONVENTIONAL_MEMORY) is gonna cause some issues.

--
Matt Fleming, Intel Open Source Technology Center
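
To make the diagnosis above concrete, here is a minimal sketch (not code from the thread or from the kernel) that parses a few of the `efi: memNN` lines from the posted log and reports which regions a fixed-address kernel image would overwrite. The load address 0x1000000 and the 16 MiB image size are assumptions chosen for illustration: 0x1000000 is the usual x86 default for CONFIG_PHYSICAL_START, but the thread does not state this machine's value or the image size. The type numbers 2 and 7 are the EFI_LOADER_DATA and EFI_CONVENTIONAL_MEMORY values that the reply refers to; the helper names are hypothetical.

```python
import re
from typing import List, NamedTuple

# EFI memory type numbers named in the reply above.
EFI_LOADER_DATA = 2
EFI_CONVENTIONAL_MEMORY = 7

# A few lines copied from the memory map posted in the thread.
LOG = """\
[    0.000000] efi: mem05: type=7, attr=0xf, range=[0x0000000000100000-0x0000000001000000) (15MB)
[    0.000000] efi: mem06: type=2, attr=0xf, range=[0x0000000001000000-0x0000000001100000) (1MB)
[    0.000000] efi: mem07: type=7, attr=0xf, range=[0x0000000001100000-0x0000000002000000) (15MB)
[    0.000000] efi: mem08: type=2, attr=0xf, range=[0x0000000002000000-0x00000000038e7000) (24MB)
[    0.000000] efi: mem09: type=7, attr=0xf, range=[0x00000000038e7000-0x000000006d6d9000) (1693MB)
"""


class Region(NamedTuple):
    index: int
    type: int
    start: int
    end: int  # exclusive, matching the "[start-end)" notation in the log


MEM_RE = re.compile(
    r"efi:\s+mem(\d+): type=(\d+), attr=0x[0-9a-f]+, "
    r"range=\[0x([0-9a-f]+)-0x([0-9a-f]+)\)"
)


def parse_memmap(log: str) -> List[Region]:
    """Extract (index, type, start, end) from 'efi: memNN' log lines."""
    return [
        Region(int(m[1]), int(m[2]), int(m[3], 16), int(m[4], 16))
        for m in MEM_RE.finditer(log)
    ]


def clobbered(regions: List[Region], load_start: int, load_size: int) -> List[Region]:
    """Return regions that overlap the load range and are not conventional memory."""
    load_end = load_start + load_size
    return [
        r for r in regions
        if r.type != EFI_CONVENTIONAL_MEMORY
        and r.start < load_end and load_start < r.end
    ]


# Assumed values for illustration only (not taken from the thread).
ASSUMED_LOAD_ADDR = 0x1000000        # common x86 default load address
ASSUMED_IMAGE_SIZE = 16 * 1024 * 1024

hits = clobbered(parse_memmap(LOG), ASSUMED_LOAD_ADDR, ASSUMED_IMAGE_SIZE)
for r in hits:
    print(f"mem{r.index:02d}: type={r.type} [{r.start:#x}-{r.end:#x}) would be overwritten")
```

Under these assumptions the only hit is mem06, the EFI_LOADER_DATA region that starts exactly at the assumed load address. A relocatable kernel avoids the problem because the stub can place the image in conventional memory instead of a fixed address.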
Thread overview: 13+ messages
2014-08-04  9:34 3.12 to 3.13 boot regression bisected - still applies to 3.16 Bruno Prémont
2014-08-04 12:27 ` Matt Fleming
2014-08-04 13:06 ` Bruno Prémont
2014-08-04 13:54 ` Matt Fleming
2014-08-05  8:02 ` Bruno Prémont
2014-08-05  8:45 ` Matt Fleming
2014-08-05  9:13 ` Bruno Prémont
2014-08-05  9:18 ` Matt Fleming
2014-08-05 11:51 ` Bruno Prémont
2014-08-05 12:11 ` Bruno Prémont
2014-08-05 12:55 ` Matt Fleming
2014-08-05 14:21 ` Bruno Prémont
2014-08-05 15:07 ` Matt Fleming