Problem with init debugging under QEMU ARM


From: Lukasz Majewski <l.majewski@majess.pl>
To: <qemu-arm@nongnu.org>
Cc: qemu-devel@nongnu.org, qemu-discuss@nongnu.org
Subject: Problem with init debugging under QEMU ARM
Date: Mon, 13 Sep 2021 10:29:17 +0200	[thread overview]
Message-ID: <20210913102917.0bf933d8@ktm> (raw)


Dear QEMU community,

I'm currently trying to fix a bug in glibc that became apparent on QEMU
6.0.0.

The error is caused by glibc commit:
bca0f5cbc9257c13322b99e55235c4f21ba0bd82
https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=sysdeps/arm/dl-machine.h;h=eb13cb8b57496a0ec175c54a495f7e78db978fb7;hp=ff5e09e207f7986b1506b8895ae6c2aff032a380;hb=bca0f5cbc9257c13322b99e55235c4f21ba0bd82;hpb=34b4624b04fc8f038b2c329ca7560197320615b4

(reverting it causes the board to boot again)

Other components:
binutils_2.37
gcc_11.2
Yocto poky: SHA1: 1e2e9a84d6dd81d7f6dd69c0d119d0149d10ade1

QEMU command line (the problem is visible on 5.2.0 and 6.0.0):
qemu-system-arm \
  -device virtio-net-device,netdev=net0,mac=52:54:00:12:34:02 \
  -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
  -object rng-random,filename=/dev/urandom,id=rng0 \
  -device virtio-rng-pci,rng=rng0 \
  -drive id=disk0,file=y2038-image-devel-qemuarm.ext4,if=none,format=raw \
  -device virtio-blk-device,drive=disk0 \
  -device qemu-xhci -device usb-tablet -device usb-kbd \
  -machine virt,highmem=off -cpu cortex-a15 -smp 4 -m 256 \
  -serial mon:stdio -serial null -nographic \
  -device VGA,edid=on \
  -kernel zImage--5.10.62+git0+bce2813b16_machine-r0-qemuarm-20210910095636.bin \
  -append 'root=/dev/vda rw mem=256M ip=192.168.7.2::192.168.7.1:255.255.255.0 console=ttyAMA0 console=hvc0 vmalloc=256'

It has been tested with Cortex-A9 (Vexpress A9 2 core board) and
Cortex-A15. I've also tested the v5.1, v5.10 and v5.14 kernels. The
error is persistent.

I add -s and -S when starting qemu-system-arm, and I can use gdb to
debug the kernel without issues. Unfortunately, I'm not able to debug
/sbin/init after the switch to user-space context.

Moreover, gdbserver cannot be used, as the error (and the kernel Oops)
occurs when early code from ld-linux.so.3 (the _dl_start function) is
executed.

Any hints on how to debug it?

Inspecting the disassembly is one (awkward) option (some results are
presented below). I can also inspect the VMAs of the code just before
the /sbin/init process starts.

Unfortunately, trying to break on user-space code is not very helpful
(as -S -s are supposed to be used with the kernel).

Some information about the suspect code (_dl_start function):
-------------------------------------------------------------

I think that the problem may be caused by a negative value being
calculated.

The relevant snippet:

    116c:       bf00            nop
    116e:       bf00            nop
    1170:       bf00            nop
    1172:       f8df 3508       ldr.w   r3, [pc, #1288] ; 167c <_dl_start+0x520>
    1176:       f8df 1508       ldr.w   r1, [pc, #1288] ; 1680 <_dl_start+0x524>
    117a:       447b            add     r3, pc
    117c:       4479            add     r1, pc
    117e:       f8c3 1598       str.w   r1, [r3, #1432] ; 0x598
    1182:       bf00            nop
    1184:       bf00            nop
    1186:       bf00            nop
    1188:       bf00            nop
    118a:       bf00            nop
    118c:       bf00            nop
    118e:       f8df 24f4       ldr.w   r2, [pc, #1268] ; 1684 <_dl_start+0x528>
    1192:       f8d3 5598       ldr.w   r5, [r3, #1432] ; 0x598
    1196:       447a            add     r2, pc
    1198:       442a            add     r2, r5
    119a:       1a52            subs    r2, r2, r1
    119c:       f8c3 25a0       str.w   r2, [r3, #1440] ; 0x5a0
    11a0:       6813            ldr     r3, [r2, #0]

    167c:       0002be92        .word   0x0002be92
    1680:       ffffee80        .word   0xffffee80

r1 is loaded with 0xffffee80 (a negative offset). pc is then added to
it, and the result is later subtracted when r2 is calculated.
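
The pc-relative arithmetic in the dump above can be checked by hand. The
following is a minimal Python sketch, assuming the code runs in Thumb
state (suggested by the 16-bit encodings such as bf00), where reading pc
yields the instruction address plus 4, and assuming a load bias of zero,
i.e. the link-time addresses shown by objdump. The helper names and the
load_bias parameter are illustrative only and do not come from glibc or
QEMU.

```python
MASK32 = 0xFFFFFFFF


def thumb_pc(insn_addr: int) -> int:
    """Value read from pc by a Thumb instruction: its own address + 4."""
    return (insn_addr + 4) & MASK32


def literal_address(insn_addr: int, imm: int) -> int:
    """Address read by 'ldr.w rX, [pc, #imm]': Align(pc, 4) + imm."""
    return ((thumb_pc(insn_addr) & ~3) + imm) & MASK32


def add_pc(value: int, insn_addr: int, load_bias: int = 0) -> int:
    """Result of 'add rX, pc' with 32-bit wraparound.

    load_bias is a hypothetical runtime relocation offset; 0 means the
    link-time addresses from the disassembly are used unchanged.
    """
    return (value + thumb_pc(insn_addr) + load_bias) & MASK32


def as_signed32(value: int) -> int:
    """Interpret a 32-bit word as a two's-complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


# Literal pool slots referenced by the two ldr.w instructions.
lit_r3 = literal_address(0x1172, 1288)
lit_r1 = literal_address(0x1176, 1288)

# Registers after 'add r3, pc' (0x117a) and 'add r1, pc' (0x117c).
r3 = add_pc(0x0002BE92, 0x117A)
r1 = add_pc(0xFFFFEE80, 0x117C)

print(f"literal for r3 at {lit_r3:#x}, literal for r1 at {lit_r1:#x}")
print(f"0xffffee80 as signed: {as_signed32(0xFFFFEE80):#x}")
print(f"r3 = {r3:#x}, r1 = {r1:#x}")
```

Under these assumptions 0xffffee80 is -0x1180, so after 'add r1, pc' at
0x117c the sum wraps around to 0, and r3 becomes 0x2d010. This would be
consistent with r1 holding the address that corresponds to link-time
offset 0 of the object (its load address at run time), although that is
only an inference from the dump and not a confirmed diagnosis.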

For the working code (with the aforementioned patch reverted) there are
NO such large values (like 0xffffee80). The arithmetic is done on:

   1690:       00000020        .word   0x00000020
   1694:       0002be7e        .word   0x0002be7e

which seems to work.

Maybe I'm missing some flag when starting qemu-system-arm?

Thanks in advance for help and hints.

-- 
Best regards,

Łukasz Majewski


