qemu-devel.nongnu.org archive mirror
* [Qemu-devel] sparc: crash when using initrd > 5M
@ 2019-01-03 15:48 Corentin Labbe
  2019-01-18 13:33 ` Mark Cave-Ayland
  0 siblings, 1 reply; 10+ messages in thread
From: Corentin Labbe @ 2019-01-03 15:48 UTC
  To: sparclinux, qemu-devel

Hello

When using an initrd > 5M, I hit the following kernel crash:
qemu-system-sparc -kernel vmlinux -initrd rootfs.cpio.gz -nographic
Configuration device id QEMU version 1 machine id 32
Probing SBus slot 0 offset 0
Probing SBus slot 1 offset 0
Probing SBus slot 2 offset 0
Probing SBus slot 3 offset 0
Probing SBus slot 4 offset 0
Probing SBus slot 5 offset 0
Invalid FCode start byte
CPUs: 1 x FMI,MB86904
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Oct 5 2018 08:20
  Type 'help' for detailed information
[sparc] Kernel already loaded
switching to new context:
PROMLIB: obio_ranges 1
[    0.000000] PROMLIB: Sun Boot Prom Version 3 Revision 2
[    0.000000] Linux version 4.20.0-next-20190102+ (compile@Red) (gcc version 7.3.0 (Gentoo 7.3.0-r3 p1.4)) #148 Thu Jan 3 16:17:08 CET 2019
[    0.000000] printk: bootconsole [earlyprom0] enabled
[    0.000000] ARCH: SUN4M
[    0.000000] TYPE: SPARCstation 5
[    0.000000] Ethernet address: 52:54:00:12:34:56
[    0.000000] Unable to handle kernel NULL pointer dereference
[    0.000000] tsk->{mm,active_mm}->context = ffffffff
[    0.000000] tsk->{mm,active_mm}->pgd = 00000000
[    0.000000]               \|/ ____ \|/
[    0.000000]               "@'/ ,. \`@"
[    0.000000]               /_| \__/ |_\
[    0.000000]                  \__U_/
[    0.000000] swapper(0): Oops [#1]
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.20.0-next-20190102+ #148
[    0.000000] PSR: 04001fc0 PC: f0010ef0 NPC: f0010ef4 Y: 00000000    Not tainted
[    0.000000] PC: <do_sparc_fault+0x158/0x404>
[    0.000000] %G: 0000000a 000003c4  f05ece08 f05ecc00  00000000 00e00000  f05d4000 00000001
[    0.000000] %O: 00000000 00e00000  00800000 00e00000  00000000 00000002  f05d5bb8 f00bba58
[    0.000000] RPC: <memblock_reserve+0x38/0x68>
[    0.000000] %L: 00000040 f05dfaf8  f05d5c68 00000001  0003ffff 006951e0  f05ed014 f0674ab4
[    0.000000] %I: f05d5c80 00000000  00000002 f1000000  ffffffff 00000000  f05d5c20 f0007fd8
[    0.000000] Disabling lock debugging due to kernel taint
[    0.000000] Caller[f0007fd8]: srmmu_fault+0x58/0x68
[    0.000000] Caller[f0618598]: memblock_alloc_try_nid+0xb8/0xc8
[    0.000000] Caller[f0611094]: srmmu_paging_init+0x174/0xaf8
[    0.000000] Caller[f06106a8]: paging_init+0x4/0x24
[    0.000000] Caller[f060e4f0]: setup_arch+0x3e8/0x480
[    0.000000] Caller[f060ab50]: start_kernel+0x48/0x460
[    0.000000] Caller[f060a43c]: continue_boot+0x324/0x334
[    0.000000] Caller[00000000]:   (null)
[    0.000000] Instruction DUMP:
[    0.000000]  c800a024 
[    0.000000]  83286002 
[    0.000000]  073c17b3 
[    0.000000] <c4010001>
[    0.000000]  c600e22c 
[    0.000000]  8a08a003 
[    0.000000]  80a16001 
[    0.000000]  0280003b 
[    0.000000]  c600c001 
[    0.000000] 
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] Press Stop-A (L1-A) from sun keyboard or send break
[    0.000000] twice on console to return to the boot prom
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
qemu-system-sparc: terminating on signal 15 from pid 13043 (killall)

The NULL ptr dereference is done by memset() in srmmu_nocache_init() and memblock_alloc_try_nid().
If I comment out both memset() calls, the boot succeeds.

But since nothing explains the NULL ptr deref in memset(), I suspect something is overwritten by the initrd.

Regards


* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-01-03 15:48 [Qemu-devel] sparc: crash when using initrd > 5M Corentin Labbe
@ 2019-01-18 13:33 ` Mark Cave-Ayland
  2019-02-01 14:15   ` Mark Cave-Ayland
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Cave-Ayland @ 2019-01-18 13:33 UTC
  To: Corentin Labbe, sparclinux, qemu-devel

On 03/01/2019 15:48, Corentin Labbe wrote:

> Hello
> 
> When using an initrd > 5M, I hit the following kernel crash:
> qemu-system-sparc -kernel vmlinux -initrd rootfs.cpio.gz -nographic
> [...]
> 
> The NULL ptr dereference is done by memset() in srmmu_nocache_init() and memblock_alloc_try_nid().
> If I comment out both memset() calls, the boot succeeds.
> 
> But since nothing explains the NULL ptr deref in memset(), I suspect something is overwritten by the initrd.

Sorry about the delay in replying to this, I haven't been too well recently.

Looking at the code I suspect the problem is that when loading a kernel directly,
OpenBIOS isn't adding the kernel/initrd memory ranges to the DT properties, and so
the kernel doesn't recreate its own mapping on boot.

It shouldn't be too hard to make this happen, let me take a look and see how
difficult this would be.


ATB,

Mark.


* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-01-18 13:33 ` Mark Cave-Ayland
@ 2019-02-01 14:15   ` Mark Cave-Ayland
  2019-02-05  9:11     ` Corentin Labbe
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Cave-Ayland @ 2019-02-01 14:15 UTC
  To: Corentin Labbe, sparclinux, qemu-devel

On 18/01/2019 13:33, Mark Cave-Ayland wrote:

> Sorry about the delay in replying to this, I haven't been too well recently.
> 
> Looking at the code I suspect the problem is that when loading a kernel directly,
> OpenBIOS isn't adding the kernel/initrd memory ranges to the DT properties, and so
> the kernel doesn't recreate its own mapping on boot.
> 
> It shouldn't be too hard to make this happen, let me take a look and see how
> difficult this would be.

I think I now have a fix for this, with changes needed in both QEMU and OpenBIOS.

Firstly you'll need to apply the QEMU patch from
https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06635.html and then you'll
need an updated OpenBIOS.

I've uploaded a pre-compiled openbios-sparc32 with the patches from
https://mail.coreboot.org/hyperkitty/list/openbios@openbios.org/thread/E6IMJNUFRF7W6ALWSYBOOCEYLBFXXQEN/
to https://www.ilande.co.uk/tmp/qemu/openbios-sparc32-initrdfix for testing.

Please can you test and let me know if this solves the issue? If so, I'll see if I
can get them merged in time for the upcoming QEMU 4.0 release.
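
For reference, a test run with the uploaded firmware might look like the sketch
below. It assumes the file was saved locally as ./openbios-sparc32-initrdfix and
that qemu-system-sparc is the build with the QEMU patch applied; both are
assumptions, as is the run() helper, which only prints the command unless
DRY_RUN=0 is set. The kernel and initrd names are the ones from the original
report.

```shell
#!/bin/sh
# Sketch only. OPENBIOS is an assumed local path for the downloaded test
# firmware; the kernel and initrd names come from the original report.
OPENBIOS=./openbios-sparc32-initrdfix
KERNEL=vmlinux
INITRD=rootfs.cpio.gz

# Hypothetical helper: print the command by default, run it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# -bios replaces the firmware image that QEMU would otherwise load.
run qemu-system-sparc -bios "$OPENBIOS" \
    -kernel "$KERNEL" -initrd "$INITRD" -nographic
```

Running the script with DRY_RUN=0 in the environment launches QEMU instead of
printing the command line.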


ATB,

Mark.


* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-02-01 14:15   ` Mark Cave-Ayland
@ 2019-02-05  9:11     ` Corentin Labbe
  2019-02-05 16:45       ` Mark Cave-Ayland
  0 siblings, 1 reply; 10+ messages in thread
From: Corentin Labbe @ 2019-02-05  9:11 UTC
  To: Mark Cave-Ayland; +Cc: sparclinux, qemu-devel

On Fri, Feb 01, 2019 at 02:15:15PM +0000, Mark Cave-Ayland wrote:
> I think I now have a fix for this, with changes needed in both QEMU and OpenBIOS.
> 
> Firstly you'll need to apply the QEMU patch from
> https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06635.html and then you'll
> need an updated OpenBIOS.
> 
> I've uploaded a pre-compiled openbios-sparc32 with the patches from
> https://mail.coreboot.org/hyperkitty/list/openbios@openbios.org/thread/E6IMJNUFRF7W6ALWSYBOOCEYLBFXXQEN/
> to https://www.ilande.co.uk/tmp/qemu/openbios-sparc32-initrdfix for testing.
> 
> Please can you test and let me know if this solves the issue? If so, I'll see if I
> can get them merged in time for the upcoming QEMU 4.0 release.
> 

Hello

Sorry, even with the patch I still hit the issue.

I have added some debug output, and at least QEMU sets initrd_size correctly now.

I have tried to compile openbios-sparc32 for debugging, but it fails with
arch/sparc32/context.c:116:5: error: PIC register clobbered by 'l7' in 'asm'
     asm __volatile__ ("\n\tcall __switch_context"
     ^~~
make[1]: *** [rules.mak:219: target/arch/sparc32/context.o] Error 1
(gcc 7.2 and gcc 6.4 with binutils 2.30)

Regards


* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-02-05  9:11     ` Corentin Labbe
@ 2019-02-05 16:45       ` Mark Cave-Ayland
  2019-02-06  7:28         ` Corentin Labbe
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Cave-Ayland @ 2019-02-05 16:45 UTC
  To: Corentin Labbe; +Cc: sparclinux, qemu-devel

On 05/02/2019 09:11, Corentin Labbe wrote:

> Hello
> 
> Sorry, even with the patch I still hit the issue.
> 
> I have added some debug output, and at least QEMU sets initrd_size correctly now.
> 
> I have tried to compile openbios-sparc32 for debugging, but it fails with
> arch/sparc32/context.c:116:5: error: PIC register clobbered by 'l7' in 'asm'
>      asm __volatile__ ("\n\tcall __switch_context"
>      ^~~
> make[1]: *** [rules.mak:219: target/arch/sparc32/context.o] Error 1
> (gcc 7.2 and gcc 6.4 with binutils 2.30)

Hmmm. One other thing I've noticed is that newer kernels tend to need a minimum of 256M
RAM to start up - does it work if you add -m 256 to your command line?


ATB,

Mark.


* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-02-05 16:45       ` Mark Cave-Ayland
@ 2019-02-06  7:28         ` Corentin Labbe
  2019-02-06  7:37           ` Mark Cave-Ayland
  0 siblings, 1 reply; 10+ messages in thread
From: Corentin Labbe @ 2019-02-06  7:28 UTC
  To: Mark Cave-Ayland; +Cc: sparclinux, qemu-devel

On Tue, Feb 05, 2019 at 04:45:16PM +0000, Mark Cave-Ayland wrote:
> Hmmm. One other thing I've noticed is that newer kernels tend to need a minimum of 256M
> RAM to start up - does it work if you add -m 256 to your command line?
> 
> 

I have already set 256M of RAM (and tried 512).

Regards


* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-02-06  7:28         ` Corentin Labbe
@ 2019-02-06  7:37           ` Mark Cave-Ayland
  2019-02-06  9:06             ` Corentin Labbe
  2019-02-06 19:38             ` Corentin Labbe
  0 siblings, 2 replies; 10+ messages in thread
From: Mark Cave-Ayland @ 2019-02-06  7:37 UTC (permalink / raw)
  To: Corentin Labbe; +Cc: sparclinux, qemu-devel

On 06/02/2019 07:28, Corentin Labbe wrote:

>>> Hello
>>>
>>> Sorry, even with the patch I still hit the issue.
>>>
>>> I have added some debug output, and at least QEMU sets initrd_size correctly now.
>>>
>>> I have tried to compile openbios-sparc32 for debugging, but it fails with
>>> arch/sparc32/context.c:116:5: error: PIC register clobbered by 'l7' in 'asm'
>>>      asm __volatile__ ("\n\tcall __switch_context"
>>>      ^~~
>>> make[1]: *** [rules.mak:219: target/arch/sparc32/context.o] Error 1
>>> (gcc 7.2 and gcc 6.4 with binutils 2.30)
>>
>> Hmmm. One other thing I've noticed is that newer kernels tend to need a minimum of 256M
>> RAM to start up - does it work if you add -m 256 to your command line?
>>
>>
> 
> I have already set 256M of RAM. (and tried 512)

I wonder then if this is being triggered by a recent kernel change? I tend to test
using the latest Debian ports ISOs which are currently running 4.9 and that booted
fine when I was testing the patches above.

Can you try with a few older kernels to see if this is the case?


ATB,

Mark.

* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-02-06  7:37           ` Mark Cave-Ayland
@ 2019-02-06  9:06             ` Corentin Labbe
  2019-02-06 19:38             ` Corentin Labbe
  1 sibling, 0 replies; 10+ messages in thread
From: Corentin Labbe @ 2019-02-06  9:06 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: sparclinux, qemu-devel

On Wed, Feb 06, 2019 at 07:37:29AM +0000, Mark Cave-Ayland wrote:
> On 06/02/2019 07:28, Corentin Labbe wrote:
> 
> >>> Hello
> >>>
> >>> Sorry, even with the patch I still hit the issue.
> >>>
> >>> I have added some debug output, and at least QEMU sets initrd_size correctly now.
> >>>
> >>> I have tried to compile openbios-sparc32 for debugging, but it fails with
> >>> arch/sparc32/context.c:116:5: error: PIC register clobbered by 'l7' in 'asm'
> >>>      asm __volatile__ ("\n\tcall __switch_context"
> >>>      ^~~
> >>> make[1]: *** [rules.mak:219: target/arch/sparc32/context.o] Error 1
> >>> (gcc 7.2 and gcc 6.4 with binutils 2.30)
> >>
> >> Hmmm. One other thing I've noticed is that newer kernels tend to need a minimum of 256M
> >> RAM to start up - does it work if you add -m 256 to your command line?
> >>
> >>
> > 
> > I have already set 256M of RAM. (and tried 512)
> 
> I wonder then if this is being triggered by a recent kernel change? I tend to test
> using the latest Debian ports ISOs which are currently running 4.9 and that booted
> fine when I was testing the patches above.
> 
> Can you try with a few older kernels to see if this is the case?
> 
> 

Hello

Good catch, it boots fine with 4.9.99.

I am starting a long git bisect...

Regards

* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-02-06  7:37           ` Mark Cave-Ayland
  2019-02-06  9:06             ` Corentin Labbe
@ 2019-02-06 19:38             ` Corentin Labbe
  2019-02-08 16:09               ` Mark Cave-Ayland
  1 sibling, 1 reply; 10+ messages in thread
From: Corentin Labbe @ 2019-02-06 19:38 UTC (permalink / raw)
  To: Mark Cave-Ayland; +Cc: sparclinux, qemu-devel

On Wed, Feb 06, 2019 at 07:37:29AM +0000, Mark Cave-Ayland wrote:
> On 06/02/2019 07:28, Corentin Labbe wrote:
> 
> >>> Hello
> >>>
> >>> Sorry, even with the patch I still hit the issue.
> >>>
> >>> I have added some debug output, and at least QEMU sets initrd_size correctly now.
> >>>
> >>> I have tried to compile openbios-sparc32 for debugging, but it fails with
> >>> arch/sparc32/context.c:116:5: error: PIC register clobbered by 'l7' in 'asm'
> >>>      asm __volatile__ ("\n\tcall __switch_context"
> >>>      ^~~
> >>> make[1]: *** [rules.mak:219: target/arch/sparc32/context.o] Error 1
> >>> (gcc 7.2 and gcc 6.4 with binutils 2.30)
> >>
> >> Hmmm. One other thing I've noticed is that newer kernels tend to need a minimum of 256M
> >> RAM to start up - does it work if you add -m 256 to your command line?
> >>
> >>
> > 
> > I have already set 256M of RAM. (and tried 512)
> 
> I wonder then if this is being triggered by a recent kernel change? I tend to test
> using the latest Debian ports ISOs which are currently running 4.9 and that booted
> fine when I was testing the patches above.
> 
> Can you try with a few older kernels to see if this is the case?
> 

Hello

In fact the problem was due to the .config, since a defconfig works fine on next-20190205.
After a lot of diffing, I found that CONFIG_LOG_BUF_SHIFT=18 causes this behaviour.
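
The kind of .config comparison described here can be sketched as follows. This
is only an illustration, not the procedure actually used in this thread: the
helper names parse_config and diff_configs are hypothetical, and the sample
fragments are made up (only CONFIG_LOG_BUF_SHIFT=18 comes from the report).

```python
def parse_config(text):
    """Map each CONFIG_ symbol to its value; unset symbols map to None."""
    symbols = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO: None
            symbols[line[2:-len(" is not set")]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            symbols[name] = value
    return symbols


def diff_configs(good_text, bad_text):
    """Return {symbol: (good_value, bad_value)} for every symbol that differs."""
    good, bad = parse_config(good_text), parse_config(bad_text)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(good.keys() | bad.keys())
        if good.get(name) != bad.get(name)
    }


# Made-up fragments; the "good" LOG_BUF_SHIFT value is illustrative only.
good_cfg = "CONFIG_FOO=y\nCONFIG_LOG_BUF_SHIFT=14\n# CONFIG_BAR is not set\n"
bad_cfg = "CONFIG_FOO=y\nCONFIG_LOG_BUF_SHIFT=18\nCONFIG_BAR=y\n"

for name, (good_value, bad_value) in diff_configs(good_cfg, bad_cfg).items():
    print(f"{name}: {good_value} -> {bad_value}")
```

A symbol that is missing from one file and explicitly "not set" in the other
is treated as unchanged here. Each differing symbol then has to be flipped and
retested to find the one that matters.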

Note that values of 16 and 17 also cause the same problem.
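
For reference, CONFIG_LOG_BUF_SHIFT is the base-2 logarithm of the kernel log
buffer size in bytes, so the values tried here correspond to 64 KiB, 128 KiB
and 256 KiB buffers. A minimal sketch of the arithmetic (the helper name
log_buf_bytes is hypothetical); it says nothing about why these sizes trigger
the crash:

```python
def log_buf_bytes(shift):
    """Kernel log buffer size in bytes for a given CONFIG_LOG_BUF_SHIFT."""
    return 1 << shift


# Sizes for the three values reported as failing.
for shift in (16, 17, 18):
    print(f"CONFIG_LOG_BUF_SHIFT={shift}: {log_buf_bytes(shift) // 1024} KiB")
```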

Regards

* Re: [Qemu-devel] sparc: crash when using initrd > 5M
  2019-02-06 19:38             ` Corentin Labbe
@ 2019-02-08 16:09               ` Mark Cave-Ayland
  0 siblings, 0 replies; 10+ messages in thread
From: Mark Cave-Ayland @ 2019-02-08 16:09 UTC (permalink / raw)
  To: Corentin Labbe; +Cc: sparclinux, qemu-devel

On 06/02/2019 19:38, Corentin Labbe wrote:

> On Wed, Feb 06, 2019 at 07:37:29AM +0000, Mark Cave-Ayland wrote:
>> On 06/02/2019 07:28, Corentin Labbe wrote:
>>
>>>>> Hello
>>>>>
>>>>> Sorry, even with the patch I still hit the issue.
>>>>>
>>>>> I have added some debug output, and at least QEMU sets initrd_size correctly now.
>>>>>
>>>>> I have tried to compile openbios-sparc32 for debugging, but it fails with
>>>>> arch/sparc32/context.c:116:5: error: PIC register clobbered by 'l7' in 'asm'
>>>>>      asm __volatile__ ("\n\tcall __switch_context"
>>>>>      ^~~
>>>>> make[1]: *** [rules.mak:219: target/arch/sparc32/context.o] Error 1
>>>>> (gcc 7.2 and gcc 6.4 with binutils 2.30)
>>>>
>>>> Hmmm. One other thing I've noticed is that newer kernels tend to need a minimum of 256M
>>>> RAM to start up - does it work if you add -m 256 to your command line?
>>>>
>>>>
>>>
>>> I have already set 256M of RAM. (and tried 512)
>>
>> I wonder then if this is being triggered by a recent kernel change? I tend to test
>> using the latest Debian ports ISOs which are currently running 4.9 and that booted
>> fine when I was testing the patches above.
>>
>> Can you try with a few older kernels to see if this is the case?
>>
> 
> Hello
> 
> In fact the problem was due to the .config, since a defconfig works fine on next-20190205.
> After a lot of diffing, I found that CONFIG_LOG_BUF_SHIFT=18 causes this behaviour.
> 
> Note that values of 16 and 17 also cause the same problem.

Okay, thanks for the feedback. I will aim to merge the OpenBIOS/QEMU patches before
the upcoming QEMU 4.0 release since even if it happens to work now, not passing the
memory allocations/translations via the DT seems a little fragile.


ATB,

Mark.

end of thread, other threads:[~2019-02-08 16:10 UTC | newest]

Thread overview: 10+ messages
2019-01-03 15:48 [Qemu-devel] sparc: crash when using initrd > 5M Corentin Labbe
2019-01-18 13:33 ` Mark Cave-Ayland
2019-02-01 14:15   ` Mark Cave-Ayland
2019-02-05  9:11     ` Corentin Labbe
2019-02-05 16:45       ` Mark Cave-Ayland
2019-02-06  7:28         ` Corentin Labbe
2019-02-06  7:37           ` Mark Cave-Ayland
2019-02-06  9:06             ` Corentin Labbe
2019-02-06 19:38             ` Corentin Labbe
2019-02-08 16:09               ` Mark Cave-Ayland