qemu-devel.nongnu.org archive mirror
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: BALATON Zoltan <balaton@eik.bme.hu>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH qemu v20] spapr: Implement Open Firmware client interface
Date: Sun, 23 May 2021 13:20:11 +1000
Message-ID: <4f6ceca3-5f18-fe70-18f9-4efde8feb1ed@ozlabs.ru>
In-Reply-To: <babe39af-fd34-8c5-de99-a0f485bfbce@eik.bme.hu>



On 22/05/2021 23:01, BALATON Zoltan wrote:
> On Sat, 22 May 2021, Alexey Kardashevskiy wrote:
>> On 21/05/2021 19:05, BALATON Zoltan wrote:
>>> On Fri, 21 May 2021, Alexey Kardashevskiy wrote:
>>>> On 21/05/2021 07:59, BALATON Zoltan wrote:
>>>>> On Thu, 20 May 2021, Alexey Kardashevskiy wrote:
>>>>>> The PAPR platform describes an OS environment that's presented by
>>>>>> a combination of a hypervisor and firmware. The features it specifies
>>>>>> require collaboration between the firmware and the hypervisor.
>>>>>>
>>>>>> Since the beginning, the runtime component of the firmware (RTAS) has
>>>>>> been implemented as a 20 byte shim which simply forwards it to
>>>>>> a hypercall implemented in qemu. The boot time firmware component is
>>>>>> SLOF - but a build that's specific to qemu, and has always needed 
>>>>>> to be
>>>>>> updated in sync with it. Even though we've managed to limit the 
>>>>>> amount
>>>>>> of runtime communication we need between qemu and SLOF, there's some,
>>>>>> and it has become increasingly awkward to handle as we've implemented
>>>>>> new features.
>>>>>>
>>>>>> This implements a boot time OF client interface (CI) which is
>>>>>> enabled by a new "x-vof" pseries machine option (stands for
>>>>>> "Virtual Open Firmware"). When enabled, QEMU implements the custom
>>>>>> H_OF_CLIENT hcall
>>>>>> which implements Open Firmware Client Interface (OF CI). This allows
>>>>>> using a smaller stateless firmware which does not have to manage
>>>>>> the device tree.
>>>>>>
>>>>>> The new "vof.bin" firmware image is included with source code under
>>>>>> pc-bios/. It also includes RTAS blob.
>>>>>>
>>>>>> This implements a handful of CI methods just to get -kernel/-initrd
>>>>>> working. In particular, this implements the device tree fetching and
>>>>>> simple memory allocator - "claim" (an OF CI memory allocator) and 
>>>>>> updates
>>>>>> "/memory@0/available" to report available memory to the client.
>>>>>>
>>>>>> This implements changing some device tree properties which we know 
>>>>>> how
>>>>>> to deal with, the rest is ignored. To allow changes, this skips
>>>>>> fdt_pack() when x-vof=on as not packing the blob leaves some room for
>>>>>> appending.
>>>>>>
>>>>>> In absence of SLOF, this assigns phandles to device tree nodes to 
>>>>>> make
>>>>>> device tree traversing work.
>>>>>>
>>>>>> When x-vof=on, this adds "/chosen" every time QEMU (re)builds a tree.
>>>>>>
>>>>>> This adds basic instances support which are managed by a hash map
>>>>>> ihandle -> [phandle].
>>>>>>
>>>>>> Before the guest started, the used memory is:
>>>>>> 0..e60 - the initial firmware
>>>>>> 8000..10000 - stack
>>>>>> 400000.. - kernel
>>>>>> 3ea0000.. - initramdisk
>>>>>>
>>>>>> This OF CI does not implement "interpret".
>>>>>>
>>>>>> Unlike SLOF, this does not format uninitialized nvram. Instead, this
>>>>>> includes a disk image with pre-formatted nvram.
>>>>>>
>>>>>> With this basic support, this can only boot into kernel directly.
>>>>>> However this is just enough for the petitboot kernel and 
>>>>>> initramdisk to
>>>>>> boot from any possible source. Note this requires reasonably 
>>>>>> recent guest
>>>>>> kernel with:
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735 
>>>>>> The immediate benefit is much faster booting time, which is especially
>>>>>> crucial with fully emulated early CPU bring up environments. Also this
>>>>>> may come in handy when/if GRUB-in-the-userspace sees the light of day.
>>>>>>
>>>>>> This separates VOF and sPAPR in a hope that VOF bits may be reused by
>>>>>> other POWERPC boards which do not support pSeries.
>>>>>>
>>>>>> This is coded in assumption that later on we might be adding 
>>>>>> support for
>>>>>> booting from QEMU backends (blockdev is the first candidate) without
>>>>>> devices/drivers in between as OF1275 does not require that and
>>>>>> it is quite easy to do so.
>>>>>>
>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>>> ---
>>>>>>
>>>>>> The example command line is:
>>>>>>
>>>>>> /home/aik/pbuild/qemu-killslof-localhost-ppc64/qemu-system-ppc64 \
>>>>>> -nodefaults \
>>>>>> -chardev stdio,id=STDIO0,signal=off,mux=on \
>>>>>> -device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
>>>>>> -mon id=MON0,chardev=STDIO0,mode=readline \
>>>>>> -nographic \
>>>>>> -vga none \
>>>>>> -enable-kvm \
>>>>>> -m 8G \
>>>>>> -machine pseries,x-vof=on,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken,cap-ccf-assist=off \
>>>>>> -kernel pbuild/kernel-le-guest/vmlinux \
>>>>>> -initrd pb/rootfs.cpio.xz \
>>>>>> -drive id=DRIVE0,if=none,file=./p/qemu-killslof/pc-bios/vof-nvram.bin,format=raw \
>>>>>> -global spapr-nvram.drive=DRIVE0 \
>>>>>> -snapshot \
>>>>>> -smp 8,threads=8 \
>>>>>> -L /home/aik/t/qemu-ppc64-bios/ \
>>>>>> -trace events=qemu_trace_events \
>>>>>> -d guest_errors \
>>>>>> -chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.tmux26 \
>>>>>> -mon chardev=SOCKET0,mode=control
>>>>>>
>>>>>> ---
>>>>>> Changes:
>>>>>> v20:
>>>>>> * compile vof.bin with -mcpu=power4 for better compatibility
>>>>>> * s/std/stw/ in entry.S to make it work on ppc32
>>>>>> * fixed dt_available property to support both 32 and 64bit
>>>>>> * shuffled prom_args handling code
>>>>>> * do not enforce 32bit in MSR (again, to support 32bit platforms)
>>>>>>
>>>>>
>>>>> [...]
>>>>>
>>>>>> diff --git a/default-configs/devices/ppc64-softmmu.mak b/default-configs/devices/ppc64-softmmu.mak
>>>>>> index ae0841fa3a18..9fb201dfacfa 100644
>>>>>> --- a/default-configs/devices/ppc64-softmmu.mak
>>>>>> +++ b/default-configs/devices/ppc64-softmmu.mak
>>>>>> @@ -9,3 +9,4 @@ CONFIG_POWERNV=y
>>>>>>  # For pSeries
>>>>>>  CONFIG_PSERIES=y
>>>>>>  CONFIG_NVDIMM=y
>>>>>> +CONFIG_VOF=y
>>>>>> diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
>>>>>> index e51e0e5e5ac6..964510dfc73d 100644
>>>>>> --- a/hw/ppc/Kconfig
>>>>>> +++ b/hw/ppc/Kconfig
>>>>>> @@ -143,3 +143,6 @@ config FW_CFG_PPC
>>>>>>
>>>>>>  config FDT_PPC
>>>>>>      bool
>>>>>> +
>>>>>> +config VOF
>>>>>> +    bool
>>>>>
>>>>> I think you should just add "select VOF" to config PSERIES section 
>>>>> in Kconfig instead of adding it to 
>>>>> default-configs/devices/ppc64-softmmu.mak. 
>>>>
>>>> oh well, can do that too.
>>>
>>> I think most config options should be selected by KConfig and the 
>>> default config should only include machines, otherwise VOF would be 
>>> added also when you don't compile PSERIES or PEGASOS2. With select in 
>>> Kconfig it will be added when needed. That's why it's better to use 
>>> select in this case.
>>>
>>>>>  That should do it, it works in my updated pegasos2 patch:
>>>>>
>>>>> https://osdn.net/projects/qmiga/scm/git/qemu/commits/3c1fad08469b4d3c04def22044e52b2d27774a61 
>>>>> [...]
>>>>>> diff --git a/pc-bios/vof/entry.S b/pc-bios/vof/entry.S
>>>>>> new file mode 100644
>>>>>> index 000000000000..569688714c91
>>>>>> --- /dev/null
>>>>>> +++ b/pc-bios/vof/entry.S
>>>>>> @@ -0,0 +1,51 @@
>>>>>> +#define LOAD32(rn, name)    \
>>>>>> +    lis     rn,name##@h;    \
>>>>>> +    ori     rn,rn,name##@l
>>>>>> +
>>>>>> +#define ENTRY(func_name)    \
>>>>>> +    .text;                  \
>>>>>> +    .align  2;              \
>>>>>> +    .globl  .func_name;     \
>>>>>> +    .func_name:             \
>>>>>> +    .globl  func_name;      \
>>>>>> +    func_name:
>>>>>> +
>>>>>> +#define KVMPPC_HCALL_BASE       0xf000
>>>>>> +#define KVMPPC_H_RTAS           (KVMPPC_HCALL_BASE + 0x0)
>>>>>> +#define KVMPPC_H_VOF_CLIENT     (KVMPPC_HCALL_BASE + 0x5)
>>>>>> +
>>>>>> +    . = 0x100 /* Do exactly as SLOF does */
>>>>>> +
>>>>>> +ENTRY(_start)
>>>>>> +#    LOAD32(%r31, 0) /* Go 32bit mode */
>>>>>> +#    mtmsrd %r31,0
>>>>>> +    LOAD32(2, __toc_start)
>>>>>> +    b entry_c
>>>>>> +
>>>>>> +ENTRY(_prom_entry)
>>>>>> +    LOAD32(2, __toc_start)
>>>>>> +    stwu    %r1,-112(%r1)
>>>>>> +    stw     %r31,104(%r1)
>>>>>> +    mflr    %r31
>>>>>> +    bl prom_entry
>>>>>> +    nop
>>>>>> +    mtlr    %r31
>>>>>> +    ld      %r31,104(%r1)
>>>>>
>>>>> It's getting there, now I see the first client call from the guest 
>>>>> boot code but then it crashes on this ld opcode which apparently is 
>>>>> 64 bit only:
>>>>
>>>> Oh right.
>>>>
>>>>
>>>>> Hopefully this is the last such opcode left before I can really 
>>>>> test this.
>>>>
>>>> Make it lwz, and test it?
>>>
>>> Yes, figured that out too after sending this message. Replacing with 
>>> lwz works but I wonder: now that you have stwu and lwz, do the stack 
>>> offsets need adjusting too, or do you just waste 4 bytes now?
>>
>> Well, this assumes the 64bit client and that ABI. I think ideally the 
>> firmware is supposed to use its own stack but I did not bother here. I 
>> do not know the 32bit ABI at all so I cannot say whether the existing 
>> code should just work or not :-/
> 
> It seems to work so that's OK, just thought if the firmware is 32 bit it 
> does not need 64 bit values on stack but if that's also potentially used 
> by a 64 bit kernel then it may be better to keep it that way to avoid 
> confusion. With the 64 bit opcodes replaced it seems to work on pegasos2 
> and the guest can call CI functions and get a reply so maybe it's just a 
> few wasted bytes that's not a big deal.
> 
>>> With lwz here I found no further 64 bit opcodes and the guest boot 
>>> code could walk the device tree. It failed later but I think that's 
>>> because I'll need to fill more info about the machine in the device 
>>> tree. I'll experiment with that but it looks like it could work at 
>>> least for MorphOS. I'll have to try Linux too.
>>
>> There are plenty of tracepoints, enable them all.
> 
> I'm running with -trace enable="vof*" but it does not give me too much 
> info as a lot of calls (such as peer, child, etc.) don't log anything 
> other than there was a hypercall so only get info about opening paths 
> and querying some props. The MorphOS boot.img just walks the device tree 
> gathering some data about the machine then calls quiesce and boot into 
> the OS that later tries to use the gathered info at which point it 
> crashes without any logs if some info is not as expected. This does not 
> make it easy to debug but I think once I fill the device tree enough 
> with all needed info it should work. Currently I'm missing info about 
> PCI devices that it may need.


One thing to note about PCI is that normally I think the client expects 
the firmware to do PCI probing and SLOF does it. But VOF does not and 
Linux scans PCI bus(es) itself. Might be a problem for your kernel.


> 
>>>>> Do you have some info on how the stdout works in VOF? I think I'll 
>>>>> need that to test with Linux and get output but I'm not sure what's 
>>>>> needed on the machine side.
>>>>
>>>> VOF opens stdout and stores the ihandle (in fdt) which the client 
>>>> (==kernel) uses for writing. To make it work properly, you need to 
>>>> hook up that instance to a device backend similar to what I have for 
>>>> spapr-vty:
>>>>
>>>> https://github.com/aik/qemu/commit/a381a5b50c23c74013e2bd39cc5dad5b6385965d 
>>>>
>>>> This is not a part of this patch as I'm trying to keep things 
>>>> simpler and accessing backends from VOF is still unsettled. But 
>>>> there is a workaround which is trace_vof_write, I use this. Thanks,
>>>
>>> The above patch is about stdin but stdout seems to be added by the 
>>> current vof patch. What is spapr-vty?
>>
>> It is pseries' paravirtual serial device, pegasos does not have it.
>>
>>> I don't think I have something similar in pegasos2 where I just have 
>>> a normal serial port created by ISASuperIO in the vt8231 model.
>>
>> Correct.
>>
>>> Can I use that backend somehow or have to create some other serial 
>>> device to connect to stdout?
>>> Does trace_vof_write work for stuff output by the guest?
>>> I guess that's only for things printed by VOF itself
>>
>> VOF itself does not print anything in this patch.
> 
> However it seems to be needed for linux as the first thing it does seems 
> to be getting /chosen/stdout and calls exit if it returns nothing. So 

Right, Linux does but VOF (==vof.bin) does not.

> I'll need this at least for linux. (I think MorphOS may also query it to 
> print a banner or some messages but not sure it needs it, at least it 
> does not abort right away if not found.)

Tracepoints print this :)

>>> but to see Linux output do I need a stdout in VOF or it will just 
>>> open the serial with its own driver and use that?
>>> So I'm not sure what the stdout parts in the current vof patch do 
>>> and if I need that for anything. I'll try to experiment with it some 
>>> more but fixing the ld and Kconfig seems to be enough to get it work 
>>> for me.
>>
>> So for the client to print something, /chosen/stdout needs to have a 
>> valid ihandle.
>> The only way to get a valid ihandle is having a valid phandle which 
>> vof_client_open() can open.
>> A valid phandle is a phandle of any node in the device tree. On spapr 
>> we pick some spapr-vty, open it and store in /chosen/stdout.
>>
>> From this point output from the client can be seen via a tracepoint.
>>
>> Now if we want proper output without tracepoints - we need to hook it 
>> up with some chardev backend (not a device such as vt8231 or spapr-vty 
>> but a backend).
> 
> I don't know much about it but devices are also connected to some 
> backend so is it possible to use the same backend for VOF as used for 
> the normal serial port?

Yes but with this initial patch there is no backend support, you only 
get tracepoints.

> But I need a way to find that and connect it to 
> VOF and I'm not sure how to do that yet.

Pick some device in the machine reset code (or you can open the root - 
"/"), resolve its FW (==FDT) path, and call vof_client_open_store() on it; 
it will store the ihandle in the FDT. This will enable stdout and the output 
can be seen via the tracepoint.
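
To make the flow above concrete, here is a toy, self-contained C model of
"open a node by path, keep ihandle -> phandle, store the ihandle as stdout".
It is only a sketch: every toy_* name and the ToyVof/ToyNode types are made
up for illustration and are not the QEMU API, and the "tracepoint" is just
an fprintf() to stderr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct {
    const char *path;    /* FW (==FDT) path of the node */
    uint32_t phandle;    /* non-zero for every node in the tree */
} ToyNode;

#define TOY_MAX_INSTANCES 8

typedef struct {
    const ToyNode *nodes;
    size_t nr_nodes;
    uint32_t instances[TOY_MAX_INSTANCES]; /* ihandle -> phandle, 0 = free */
    uint32_t chosen_stdout;                /* stands in for /chosen/stdout */
} ToyVof;

/* Open a node by path: a valid phandle is required to get an ihandle. */
static uint32_t toy_open(ToyVof *vof, const char *path)
{
    assert(vof && path);
    for (size_t i = 0; i < vof->nr_nodes; i++) {
        if (strcmp(vof->nodes[i].path, path) != 0) {
            continue;
        }
        for (uint32_t ih = 1; ih < TOY_MAX_INSTANCES; ih++) {
            if (vof->instances[ih] == 0) {
                vof->instances[ih] = vof->nodes[i].phandle;
                return ih;
            }
        }
        return 0; /* no free instance slot */
    }
    return 0; /* no such node, so no phandle and no ihandle */
}

/* Open the path and remember the resulting ihandle as the console. */
static uint32_t toy_open_store_stdout(ToyVof *vof, const char *path)
{
    uint32_t ih = toy_open(vof, path);

    if (ih) {
        vof->chosen_stdout = ih;
    }
    return ih;
}

/* Client write: with no backend, the bytes only reach a "tracepoint". */
static int toy_write(const ToyVof *vof, uint32_t ih,
                     const char *buf, size_t len)
{
    assert(vof && buf);
    if (ih == 0 || ih >= TOY_MAX_INSTANCES || vof->instances[ih] == 0) {
        return -1; /* invalid ihandle */
    }
    fprintf(stderr, "toy_trace_write ih=%u \"%.*s\"\n",
            (unsigned)ih, (int)len, buf);
    return (int)len;
}
```

In this model a write through chosen_stdout fails until
toy_open_store_stdout() has been called on an existing path such as "/";
after that the client's bytes show up only in the trace output.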


> Or do I need to create a 
> separate serial backend and connect that to VOF? I'll try to look at 
> spapr-vty to see what it does.

No additional devices needed.


> 
>> https://github.com/aik/qemu/commit/a381a5b50c23c74013e2bd3 does this:
>> 1. when a phandle is open, QEMU will search for DeviceState* for the 
>> specific FDT node and get a chardev from the device.
>> 2. when write() is called, QEMU calls qemu_chr_fe_write_all() on 
>> chardev from 1.
>>
>> From this point you do not need a tracepoint and the output will 
>> appear in the console you set up for stdout.
>>
>> Now if you want input from this console, things get tricky. First, on 
>> powernv/pseries we only need this for grub as otherwise the kernel has 
>> all the drivers needed and will not use the client interface. For the 
>> grub, we need to provide a valid ihandle for /chosen/stdin which is 
>> easy but implementing read() on this is not as there is no simple 
>> device-type-independent way of reading from chardev. I hacked it for 
>> spapr-vty but other serial devices will need special handling, or 
>> we'll have to introduce some VOF_SERIAL_READ interface for those which 
>> will face opposition :)
>>
>> Makes sense?
> 
> It explains things a bit but it is still not entirely clear how I can get 
> something to add as a stdout. With the pegasos2 firmware it puts the 
> serial device there normally that it inits and opens. Without that 
> firmware we have to somehow do that from QEMU so find the serial backend 
> used by the serial device within the vt8231 model (or use a different 
> backend just for this?) then open it and put it in the device tree. If 
> that's correct or how to do it is not clear yet.

spapr looks through all spapr-vty and picks one with the lowest @reg. 
You can do a similar thing. Or add a machine option with a serial device 
id which you want to be the default console. So many options :)
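
A minimal sketch of the "lowest @reg wins" selection, as standalone C rather
than the actual spapr code. ToySerial and toy_pick_console() are
hypothetical names, and the candidate list is assumed to be already
collected into an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a serial device candidate; not a QEMU type. */
typedef struct {
    const char *id;   /* e.g. the -device id= value */
    uint32_t reg;     /* the device's @reg (unit address) */
} ToySerial;

/* Return the candidate with the lowest reg, or NULL if there are none. */
static const ToySerial *toy_pick_console(const ToySerial *devs, size_t n)
{
    const ToySerial *best = NULL;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (best == NULL || devs[i].reg < best->reg) {
            best = &devs[i];
        }
    }
    return best;
}
```

The alternative with a machine option would replace the loop with a lookup
of the configured device id and fall back to this rule only when the option
is not set.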



-- 
Alexey

