Re: Problem with init debugging under QEMU ARM

From: "Alex Bennée" <alex.bennee@linaro.org>
To: Lukasz Majewski <l.majewski@majess.pl>
Cc: qemu-devel@nongnu.org, qemu-discuss@nongnu.org, qemu-arm@nongnu.org
Subject: Re: Problem with init debugging under QEMU ARM
Date: Fri, 17 Sep 2021 17:49:46 +0100
Message-ID: <87tuijrvpo.fsf@linaro.org>
In-Reply-To: <20210913102917.0bf933d8@ktm>


Lukasz Majewski <l.majewski@majess.pl> writes:

> Dear QEMU community,
>
> I'm now trying to fix a bug in glibc which became apparent on QEMU 6.0.0.
>
> The error is caused by glibc commit:
> bca0f5cbc9257c13322b99e55235c4f21ba0bd82
> https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=sysdeps/arm/dl-machine.h;h=eb13cb8b57496a0ec175c54a495f7e78db978fb7;hp=ff5e09e207f7986b1506b8895ae6c2aff032a380;hb=bca0f5cbc9257c13322b99e55235c4f21ba0bd82;hpb=34b4624b04fc8f038b2c329ca7560197320615b4
>
> (reverting it causes the board to boot again)
>
> Other components:
> binutils_2.37
> gcc_11.2
> Yocto poky: SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1
>
> Qemu start line (the problem is visible on 5.2.0 and 6.0.0):
> qemu-system-arm \
>   -device virtio-net-device,netdev=net0,mac=52:54:00:12:34:02 \
>   -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
>   -object rng-random,filename=/dev/urandom,id=rng0 \
>   -device virtio-rng-pci,rng=rng0 \
>   -drive id=disk0,file=y2038-image-devel-qemuarm.ext4,if=none,format=raw \
>   -device virtio-blk-device,drive=disk0 \
>   -device qemu-xhci -device usb-tablet -device usb-kbd \
>   -machine virt,highmem=off -cpu cortex-a15 -smp 4 -m 256 \
>   -serial mon:stdio -serial null -nographic \
>   -device VGA,edid=on \
>   -kernel zImage--5.10.62+git0+bce2813b16_machine-r0-qemuarm-20210910095636.bin \
>   -append 'root=/dev/vda rw  mem=256M ip=192.168.7.2::192.168.7.1:255.255.255.0 console=ttyAMA0 console=hvc0 vmalloc=256'
>
>
> It has been tested with Cortex-A9 (Vexpress A9 2 core board) and
> Cortex-A15. I've also tested the v5.1, v5.10 and v5.14 kernels. The
> error is persistent.
>
> I do add -s and -S when starting qemu-system-arm. I can use gdb to
> debug the kernel without issues. Unfortunately, I'm not able to debug
> /sbin/init after switching context to user space.

Is GDB being confused by the switch to userspace? Have you stepped over
an eret into userspace?

> Moreover, gdbserver cannot be used, as the error (and the kernel Oops)
> occurs when early code from ld-linux.so.3 (the _dl_start function) is
> executed.
>
>
> Any hints on how to debug it?

Are any of the linux gdb python scripts able to give you the paddr of a
given binary/task so you can stick a breakpoint there?

>
> Inspecting assembler is one (awkward) option (some results presented
> below). I can also inspect the VMA of the code just before starting the
> /sbin/init process.
>
> Unfortunately, when I try to break on user space code it is not very
> helpful (as -S -s are supposed to be used with the kernel).
>
>
> Some info about the relevant code (_dl_start function):
> ------------------------------------------------------
>
> I think that the problem may be with having the negative value
> calculated.
>
> The relevant snippet:
>
>     116c:       bf00            nop
>     116e:       bf00            nop
>     1170:       bf00            nop
>     1172:       f8df 3508       ldr.w   r3, [pc, #1288] ; 167c <_dl_start+0x520>
>     1176:       f8df 1508       ldr.w   r1, [pc, #1288] ; 1680 <_dl_start+0x524>
>     117a:       447b            add     r3, pc
>     117c:       4479            add     r1, pc
>     117e:       f8c3 1598       str.w   r1, [r3, #1432] ; 0x598
>     1182:       bf00            nop
>     1184:       bf00            nop
>     1186:       bf00            nop
>     1188:       bf00            nop
>     118a:       bf00            nop
>     118c:       bf00            nop
>     118e:       f8df 24f4       ldr.w   r2, [pc, #1268] ; 1684 <_dl_start+0x528>
>     1192:       f8d3 5598       ldr.w   r5, [r3, #1432] ; 0x598
>     1196:       447a            add     r2, pc
>     1198:       442a            add     r2, r5
>     119a:       1a52            subs    r2, r2, r1
>     119c:       f8c3 25a0       str.w   r2, [r3, #1440] ; 0x5a0
>     11a0:       6813            ldr     r3, [r2, #0]
>
>
>     167c:       0002be92        .word   0x0002be92
>     1680:       ffffee80        .word   0xffffee80
>
> r1 is loaded with 0xffffee80 (a negative offset). It is then added to
> pc and used to calculate r2.
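
The arithmetic in the listing can be checked by hand. Below is a
minimal Python sketch, assuming the objdump offsets are the link-time
addresses and the usual Thumb rule that reading pc yields the
instruction address plus 4 (word-aligned for literal loads). The helper
names are made up for illustration. Under those assumptions
0xffffee80 + pc wraps around to 0, which would correspond to the start
of the image before the load bias is applied, so the negative constant
is not necessarily wrong in itself.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide, so sums wrap around


def thumb_pc(insn_addr):
    """Value read from pc in Thumb state: instruction address + 4."""
    return (insn_addr + 4) & MASK32


def literal_addr(insn_addr, imm):
    """Address accessed by `ldr.w rX, [pc, #imm]`: Align(pc, 4) + imm."""
    return ((thumb_pc(insn_addr) & ~3) + imm) & MASK32


def add_pc(value, insn_addr):
    """Result of `add rX, pc` for a register holding `value`."""
    return (value + thumb_pc(insn_addr)) & MASK32


# The three literal loads resolve to the pool slots objdump annotated.
assert literal_addr(0x1172, 1288) == 0x167C
assert literal_addr(0x1176, 1288) == 0x1680
assert literal_addr(0x118E, 1268) == 0x1684

# Constants taken from the literal pool in the listing.
r3 = add_pc(0x0002BE92, 0x117A)  # add r3, pc
r1 = add_pc(0xFFFFEE80, 0x117C)  # add r1, pc

print(hex(r3), hex(r1))  # prints: 0x2d010 0x0
```

The value at 0x1684 is not shown in the listing, so r2 cannot be
reproduced the same way.
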
>
> For the working code (with the aforementioned patch reverted) there
> are no such large values (like 0xffffee80). The arithmetic is done
> on:
>
>    1690:       00000020        .word   0x00000020
>    1694:       0002be7e        .word   0x0002be7e
>
> which seems to work.
>
> Maybe I'm missing some flag when starting qemu-system-arm?
>
> Thanks in advance for help and hints.


-- 
Alex Bennée
