From: "Alex Bennée" <alex.bennee@linaro.org>
To: Lukasz Majewski <l.majewski@majess.pl>
Cc: qemu-devel@nongnu.org, qemu-discuss@nongnu.org, qemu-arm@nongnu.org
Subject: Re: Problem with init debugging under QEMU ARM
Date: Fri, 17 Sep 2021 17:49:46 +0100
Message-ID: <87tuijrvpo.fsf@linaro.org>
In-Reply-To: <20210913102917.0bf933d8@ktm>
Lukasz Majewski <l.majewski@majess.pl> writes:
> Dear QEMU community,
>
> I'm now trying to fix a bug in glibc which became apparent on qemu 6.0.0.
>
> The error is caused by glibc commit:
> bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=sysdeps/arm/dl-machine.h;h=eb13cb8b57496a0ec175c54a495f7e78db978fb7;hp=ff5e09e207f7986b1506b8895ae6c2aff032a380;hb=bca0f5cbc9257c13322b99e55235c4f21ba0bd82;hpb=34b4624b04fc8f038b2c329ca7560197320615b4
>
> (reverting it causes the board to boot again)
>
> Other components:
> binutils_2.37
> gcc_11.2
> Yocto poky: SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
>
> Qemu start line (the problem is visible on 5.2.0 and 6.0.0):
> qemu-system-arm \
>   -device virtio-net-device,netdev=net0,mac=52:54:00:12:34:02 \
>   -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
>   -object rng-random,filename=/dev/urandom,id=rng0 \
>   -device virtio-rng-pci,rng=rng0 \
>   -drive id=disk0,file=y2038-image-devel-qemuarm.ext4,if=none,format=raw \
>   -device virtio-blk-device,drive=disk0 \
>   -device qemu-xhci -device usb-tablet -device usb-kbd \
>   -machine virt,highmem=off -cpu cortex-a15 -smp 4 -m 256 \
>   -serial mon:stdio -serial null -nographic \
>   -device VGA,edid=on \
>   -kernel zImage--5.10.62+git0+bce2813b16_machine-r0-qemuarm-20210910095636.bin \
>   -append 'root=/dev/vda rw mem=256M ip=192.168.7.2::192.168.7.1:255.255.255.0 console=ttyAMA0 console=hvc0 vmalloc=256'
>
>
> It has been tested with Cortex-A9 (Vexpress A9 2 core board) and
> Cortex-A15. I've also tested the v5.1, v5.10 and v5.14 kernels. The
> error is persistent.
>
> I do add -s and -S when starting qemu-system-arm. I can use gdb to
> debug the kernel without issues. Unfortunately, I'm not able to debug
> /sbin/init after switching context to user space.
Is GDB being confused by the switch to userspace? Have you stepped over
an eret into userspace?
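
One quick way to tell which side of that switch the CPU is on when it
stops is to look at the mode bits of the CPSR (for example the value
printed by "p/x $cpsr" in gdb). The helper below is only a sketch: the
function names and the two sample CPSR values are made up for
illustration and do not come from this thread. The mode encodings and
the position of the T bit are the standard AArch32 ones.

```python
# Decode the mode field and Thumb bit of an AArch32 CPSR value.
# Sample values in the demo below are illustrative, not from the thread.
CPSR_MODES = {
    0x10: "usr", 0x11: "fiq", 0x12: "irq", 0x13: "svc",
    0x16: "mon", 0x17: "abt", 0x1A: "hyp", 0x1B: "und", 0x1F: "sys",
}


def decode_cpsr(cpsr):
    """Return (mode_name, thumb) for a 32-bit CPSR value."""
    mode = CPSR_MODES.get(cpsr & 0x1F, "unknown")  # M[4:0]
    thumb = bool(cpsr & (1 << 5))                  # T bit
    return mode, thumb


def in_userspace(cpsr):
    """True when the mode field says the CPU is in User mode."""
    return (cpsr & 0x1F) == 0x10


if __name__ == "__main__":
    for value in (0x600001D3, 0x60000030):  # hypothetical register dumps
        mode, thumb = decode_cpsr(value)
        print(f"cpsr={value:#010x} mode={mode} thumb={thumb} "
              f"user={in_userspace(value)}")
```

If the stop lands in "usr" with the Thumb bit set, gdb is at least
looking at user-space Thumb code, which is what _dl_start appears to be
in the listing further down.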
> Moreover, gdbserver cannot be used as the error (and kernel Oops) is
> caused when early code from ld-linux.so.3 (the _dl_start function) is
> executed.
>
>
> Any hints on how to debug it?
Are any of the linux gdb python scripts able to give you the paddr of a
given binary/task so you can stick a breakpoint there?
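
If you can read the start of the ld-linux.so.3 mapping out of the VMA,
the address to break on is just that base plus the offset from objdump,
assuming the loader is linked at address 0 (the small addresses in the
listing suggest so, but that is an assumption). A minimal sketch, where
the load base is a made-up placeholder and the helper names are
hypothetical; note this gives a virtual address rather than a paddr,
and a hardware breakpoint avoids having to patch a page that may not be
mapped yet:

```python
# Turn an objdump (link-time) address into a gdb hardware breakpoint
# command, given the run-time load base of the object.
# ASSUMPTION: the object is linked at base 0, so run-time address is
# simply load_base + link_address.


def runtime_address(load_base, link_address):
    """Run-time virtual address on a 32-bit target."""
    return (load_base + link_address) & 0xFFFFFFFF


def hbreak_command(load_base, link_address):
    """gdb command string that sets a hardware breakpoint there."""
    return f"hbreak *{runtime_address(load_base, link_address):#x}"


if __name__ == "__main__":
    HYPOTHETICAL_LOAD_BASE = 0xB6FD7000  # placeholder, read yours from the VMA
    # 0x1172 is the first ldr.w in the quoted _dl_start snippet.
    print(hbreak_command(HYPOTHETICAL_LOAD_BASE, 0x1172))
```

Bear in mind that such a breakpoint matches on the virtual address
only, so it may also trigger for other address spaces once more
processes are running.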
>
> Inspecting assembler is one (awkward) option (some results presented
> below). I can also inspect the VMA of the code just before starting the
> /sbin/init process.
>
> Unfortunately, when I try to break on user space code it is not very
> helpful (as -S -s are supposed to be used with the kernel).
>
>
> Some info on the relevant code (_dl_start function):
> ----------------------------------------------------
>
> I think that the problem may be that a negative value is
> calculated.
>
> The relevant snippet:
>
> 116c: bf00       nop
> 116e: bf00       nop
> 1170: bf00       nop
> 1172: f8df 3508  ldr.w r3, [pc, #1288] ; 167c <_dl_start+0x520>
> 1176: f8df 1508  ldr.w r1, [pc, #1288] ; 1680 <_dl_start+0x524>
> 117a: 447b       add r3, pc
> 117c: 4479       add r1, pc
> 117e: f8c3 1598  str.w r1, [r3, #1432] ; 0x598
> 1182: bf00       nop
> 1184: bf00       nop
> 1186: bf00       nop
> 1188: bf00       nop
> 118a: bf00       nop
> 118c: bf00       nop
> 118e: f8df 24f4  ldr.w r2, [pc, #1268] ; 1684 <_dl_start+0x528>
> 1192: f8d3 5598  ldr.w r5, [r3, #1432] ; 0x598
> 1196: 447a       add r2, pc
> 1198: 442a       add r2, r5
> 119a: 1a52       subs r2, r2, r1
> 119c: f8c3 25a0  str.w r2, [r3, #1440] ; 0x5a0
> 11a0: 6813       ldr r3, [r2, #0]
>
>
> 167c: 0002be92 .word 0x0002be92
> 1680: ffffee80 .word 0xffffee80
>
> r1 is loaded with 0xffffee80 (a negative offset). It is then added to
> pc and used to calculate r2.
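
As a sanity check of that arithmetic, here is a small sketch that
replays the pc-relative steps using the link-time addresses from the
listing. It relies on the usual Thumb rules (a pc read yields the
instruction address plus 4, and literal loads align that value down to
4 bytes); the helper names are mine. It says nothing about where the
Oops actually comes from.

```python
MASK32 = 0xFFFFFFFF


def thumb_pc(insn_addr):
    """PC value read by a Thumb instruction: its own address plus 4."""
    return (insn_addr + 4) & MASK32


def literal_address(insn_addr, imm):
    """Address read by 'ldr.w rX, [pc, #imm]': Align(PC, 4) + imm."""
    return ((thumb_pc(insn_addr) & ~3) + imm) & MASK32


def add_pc(value, insn_addr):
    """Result of 'add rX, pc', with 32-bit wraparound."""
    return (value + thumb_pc(insn_addr)) & MASK32


def as_signed32(value):
    """Interpret a 32-bit word as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    # The literal pool addresses match the objdump annotations.
    assert literal_address(0x1172, 1288) == 0x167C
    assert literal_address(0x1176, 1288) == 0x1680
    assert literal_address(0x118E, 1268) == 0x1684

    r3 = add_pc(0x0002BE92, 0x117A)  # 117a: add r3, pc
    r1 = add_pc(0xFFFFEE80, 0x117C)  # 117c: add r1, pc
    print(f"0xffffee80 as signed: {as_signed32(0xFFFFEE80):#x}")
    print(f"r3 = {r3:#x}, r1 = {r1:#x}")
```

With these inputs r1 wraps around to 0, i.e. 0xffffee80 is simply
-0x1180, the distance from the pc at 0x117c back to link address 0. At
run time the same sum would presumably give the load address of the
object, so the negative literal on its own looks like ordinary
pc-relative addressing rather than a miscalculation.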
>
> In the working code (with the aforementioned patch reverted) there
> are NO such large values (like 0xffffee80). The arithmetic is done
> on:
>
> 1690: 00000020 .word 0x00000020
> 1694: 0002be7e .word 0x0002be7e
>
> which seems to work.
>
> Maybe I'm missing some flag when I start qemu-system-arm?
>
> Thanks in advance for help and hints.
--
Alex Bennée