Re: QEMU commit 0a923be2f642 broke my or1k image.

From: Rob Landley <rob@landley.net>
To: Stafford Horne <shorne@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	"Jason A. Donenfeld" <Jason@zx2c4.com>,
	QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: QEMU commit 0a923be2f642 broke my or1k image.
Date: Sat, 23 Nov 2024 23:18:53 -0600
Message-ID: <87a6b910-5af6-47ad-ad8d-b79f11a7cbf2@landley.net>
In-Reply-To: <Z0GSETLeT5w8B2DX@antec>

On 11/23/24 02:28, Stafford Horne wrote:
>> Just a guess, but given the alignment change, I suspect it's barfing on the
>> statically linked initramfs? That seems the most likely step to go off the
>> rails given the failing patch is a symbol alignment change in the flattened
>> device tree plumbing, and I think the initramfs extractor parses device
>> trees very early on to find stuff (I forget why). Moving "where the data
>> lives" without a corresponding change to the "where to look for the data"
>> code seems a bit strange, but it's not my area...
> 
> OK, and the broken earlycon may be masking what is going on, as we should at
> least see some console output before things fail.  The earlycon fix is in 6.13
> not 6.12.
> 
> I was able to test your or1k.tgz image and figure out what is wrong.  Your
> run-qemu.sh script has 'console=FIXME'.  This command line argument is now being
> picked up, and it prevents the boot process from finding the console.
> 
> Changing it to 'console=ttyS0' allows me to see the output.

Ha, so it STARTED parsing console= and broke. Oops. (It was there so I'd 
notice...)

> I put a branch with the qemu patches I have here:
> 
>    - https://github.com/stffrdhrn/qemu/tree/or1k-9.2.0-fixes-1
> 
>> Here's the miniconfig I built 6.12 with (90% of which is generic to all the
>> architectures I'm testing, the sections are labeled. The console="FIXME" bit
>> is because I can't get qemu-system-or1k -append "blah" to go through to
>> linux, so I stuck FIXME in that field for the or1k target and it wound up in
>> the output):
> 
> The kernel command line is injected by qemu into the qemu generated
> devicetree.  I notice when I boot your kernel with the reverted FDT alignment
> fix the console prints:
> 
>      Kernel command line: earlycon
> 
> This means that the qemu devicetree is not being used, hence the command line
> args are not working.  The qemu device tree not being used is not good, but that
> is why reverting the alignment fix 'seems' to fix the issue.  To me the revert
> looks to be breaking the qemu devicetree allowing us to fall back to the kernel
> supplied devicetree.

I'm happy to do it the "right" way if I know what that is. I just 
stumbled around and got it to work.

>> Also, looking at that, I'm using a builtin DTB and you might be passing one
>> in via -dtb? Another thing the alignment change might break...
> 
> Thanks for the steps.  I was just using the or1k.tgz you provided earlier.  The
> above will help if I want to try some kernel fixes on my own.

I'm attempting to regression test as many targets as I can to get 
consistent basic behavior out of:

   https://landley.net/bin/mkroot/0.8.11/

I'm trying to get a new release out with the 6.12 kernel which is why 
I'm revisiting this now.

I've even got a test script that runs all the targets under qemu 
(booting them in parallel even) and checks that A) they boot and run 
userspace, B) they can talk to an emulated hard disk, C) they can talk 
to an emulated network, D) the clock gets set reasonably, E) it knows 
how to exit the emulator. You'd be surprised how many regressions there 
are in just that...

Speaking of which, is there a way to get or1k to exit the emulator? I 
told the kernel to reboot but it says "reboot failed, system halted" and 
hangs instead of exiting qemu. (My testroot runs qemu under "timeout -i 
10" to kill it after 10 seconds of inactivity, I.E. nothing written to 
stdout, but it still counts as a failure on one of the criteria.)
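
The inactivity check described above can be approximated in a few lines. This is a minimal Python sketch of the idea, not how the timeout command itself is implemented: run a command and kill it if nothing arrives on stdout for idle_seconds. The function name and return shape are made up for illustration, and it assumes a Unix-like host where select() works on pipes.

```python
import os
import select
import subprocess


def run_with_inactivity_timeout(cmd, idle_seconds):
    """Run cmd, killing it if stdout stays silent for idle_seconds.

    Returns (captured_stdout_bytes, timed_out). Hypothetical helper,
    not part of any existing test harness.
    """
    output = b""
    timed_out = False
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        fd = proc.stdout.fileno()
        while True:
            # Wait until stdout is readable or the idle period expires.
            ready, _, _ = select.select([fd], [], [], idle_seconds)
            if not ready:
                # Nothing was written for idle_seconds: give up on it.
                timed_out = True
                proc.kill()
                break
            chunk = os.read(fd, 4096)
            if not chunk:
                # EOF: the command closed stdout, normally by exiting.
                break
            output += chunk
    return output, timed_out
```

A harness built this way can tell "booted and printed everything expected, but never exited" (timed_out is True) apart from a clean exit, which is the distinction the exit criterion above depends on.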

>>> Note, I did find some issues with the kernel not properly handling stdout-path.
>>> It seems that if there are multiple uarts the first one will always be used as
>>> the default uart.  Only the console= command line argument can be used to
>>> override that.
>>
>> I've never managed to get console= to go through to linux in
>> qemu-system-or1k. The above tries but is ignored.
> 
> As I mentioned above this is a good clue and explains why the alignment "fix"
> fixes your issue.

Happy to do it properly. Almost all the other targets can do console=, 
the FIXME was there to highlight the fact it didn't work right. 
(Silently working for the WRONG REASON is still bad when regression 
testing.)

>> It's also doing a statically linked initramfs because -initrd didn't work on
>> this target. Happy to update if it's been fixed, the other targets are
>> almost all using -initrd to feed in an external cpio.gz
> 
> Using -initrd should work.  But also using the statically linked initramfs
> should be fine too.  The setup I use for testing uses virt with a virtio block
> driver.

Most of the other targets _don't_ use builtin initramfs, so you can swap 
them out "aftermarket" as it were. When it's separate you can examine 
and edit the contents without rebuilding the kernel...

> When using qemu with -initrd, qemu will pack the kernel, initrd and fdt one after
> the other into memory as follows:
> 
> [ kernel ] - Loads from 0x100 (based on elf layout)
> [ initrd ] - page aligned
> [  fdt   ] - page aligned devicetree (the revert moved it to 4-byte alignment)
> 
> The fdt address gets placed into r3, which the kernel uses to find the qemu FDT.
> Finding the FDT is one of the first steps of the boot process.
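
The layout described above comes down to rounding each start address up to an alignment boundary. Below is a minimal Python sketch of that arithmetic, not QEMU's actual loader code. The 8 KiB page size is an assumption about the or1k target, and the kernel end address and initrd size are invented numbers for illustration.

```python
def align_up(addr, align):
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def layout(kernel_end, initrd_size, fdt_align, page_size=0x2000):
    """Return (initrd_start, fdt_start) for blobs packed after the kernel.

    page_size=0x2000 is an assumption, not taken from the thread.
    fdt_align is a parameter so the page-aligned and 4-byte-aligned
    placements can be compared.
    """
    initrd_start = align_up(kernel_end, page_size)
    fdt_start = align_up(initrd_start + initrd_size, fdt_align)
    return initrd_start, fdt_start


# Hypothetical sizes: kernel image ends at 0x7a1234, initrd is 0x12345 bytes.
page_aligned = layout(0x7A1234, 0x12345, fdt_align=0x2000)
byte4_aligned = layout(0x7A1234, 0x12345, fdt_align=4)
print([hex(a) for a in page_aligned])   # ['0x7a2000', '0x7b6000']
print([hex(a) for a in byte4_aligned])  # ['0x7a2000', '0x7b4348']
```

With these made-up sizes the two alignments put the fdt at different addresses, so whatever address ends up in r3 has to match the placement that was actually used.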

I updated my mkroot config:

   https://github.com/landley/toybox/commit/fb3ca98e2faa

I.E. changed the FIXME to ttyS0, removed BUILTIN=1 so it's no longer 
statically linking the initramfs image, and yanked the builtin DTB, and 
the result works with v9.2.0-rc1.

Still doesn't know how to exit qemu, though. (Is there a kernel symbol I 
can add to 6.12, or does qemu still not have an exit mechanism for this 
board yet?)

(FYI: be2csv is a shell function to convert bash's brace expansion 
syntax to a comma separated value list, and then csv2cfg is another 
shell function that turns the CSV into https://lwn.net/Articles/160497/ 
. The CSV is shipped as docs/linux-microconfig in the tarball if you're 
curious. That's how a 400 line bash script can build a Linux system that 
boots to a shell prompt for a dozen architectures. The or1k config is 
now 2 lines, for example. 3 with the "if or1k" check. The variables it 
assigns to are documented around line 190.)
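
For readers who don't want to dig through the bash, here is a rough Python sketch of the same two transformations. It is not the mkroot code: the brace expansion is a simplified imitation of bash's, the rule that a bare symbol becomes CONFIG_symbol=y is an assumption about the output format, and the symbol names in the example are only illustrative.

```python
def brace_expand(word):
    """Expand {a,b,...} groups in one word, roughly like bash does."""
    start = word.find("{")
    if start == -1:
        return [word]
    depth = 0
    parts = []
    piece_start = start + 1
    for i in range(start, len(word)):
        c = word[i]
        if c == "{":
            depth += 1
        elif c == "}":
            depth -= 1
            if depth == 0:
                # Closed the outermost group: substitute each alternative
                # and recurse to expand nested or later groups.
                parts.append(word[piece_start:i])
                prefix, suffix = word[:start], word[i + 1:]
                out = []
                for part in parts:
                    out.extend(brace_expand(prefix + part + suffix))
                return out
        elif c == "," and depth == 1:
            parts.append(word[piece_start:i])
            piece_start = i + 1
    return [word]  # unbalanced brace: leave the word untouched


def be2csv(spec):
    """Expand each whitespace-separated word and join them with commas."""
    words = []
    for word in spec.split():
        words.extend(brace_expand(word))
    return ",".join(words)


def csv2cfg(csv):
    """Turn NAME or NAME=value entries into CONFIG_ lines (assumed format)."""
    lines = []
    for entry in csv.split(","):
        if not entry:
            continue
        if "=" not in entry:
            entry += "=y"
        lines.append("CONFIG_" + entry)
    return "\n".join(lines)


csv = be2csv("SERIAL_{8250,8250_CONSOLE} NR_CPUS=2")
print(csv)           # SERIAL_8250,SERIAL_8250_CONSOLE,NR_CPUS=2
print(csv2cfg(csv))  # three CONFIG_ lines, one per entry
```

The compact brace form is what lets a per-target kernel config fit on a line or two, since symbols that share a prefix collapse into one group.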

> If you provide command line args console=ttyS0 things will work.
> 
> Also console=ttyS0 is not needed at all as it should be the default in QEMU.

I specify it explicitly to be consistent across architectures.

> It looks like the root cause of the issue was the 'console=FIXME'.
> 
> I hope it helps.

Yup, I just had to remove workarounds for old qemu that are no longer 
needed. Thanks for the help. (If you do teach qemu to exit at some 
point, please let me know...)

> -Stafford

Thanks,

Rob
