From: Stefan Hajnoczi <stefanha@redhat.com>
To: Maran Wilson <maran.wilson@oracle.com>
Cc: Stefano Garzarella <sgarzare@redhat.com>,
qemu-devel@nongnu.org, Rob Bradford <robert.bradford@intel.com>,
Samuel Ortiz <sameo@linux.intel.com>
Subject: Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
Date: Fri, 7 Dec 2018 10:02:12 +0000
Message-ID: <20181207100212.GA18699@stefanha-x1.localdomain>
In-Reply-To: <546d6559-893f-0ccf-55e2-a671597a01ae@oracle.com>
On Thu, Dec 06, 2018 at 06:47:54AM -0800, Maran Wilson wrote:
> On 12/6/2018 2:38 AM, Stefan Hajnoczi wrote:
> > On Wed, Dec 05, 2018 at 10:04:36AM -0800, Maran Wilson wrote:
> > > On 12/5/2018 5:20 AM, Stefan Hajnoczi wrote:
> > > > On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
> > > > > On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
> > > > > > On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
> > > > > > > Hi Stefano, thanks for capturing all these numbers,
> > > > > > >
> > > > > > > On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> > > > > > > > Hi Rob,
> > > > > > > > I continued to investigate the boot time, and as you suggested I
> > > > > > > > looked also at qemu-lite 2.11.2
> > > > > > > > (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> > > > > > > > did the following tests using the Kata kernel configuration
> > > > > > > > (
> > > > > > > > https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> > > > > > > > )
> > > > > > > >
> > > > > > > > To compare the results with qemu-lite direct kernel load, I added
> > > > > > > > another tracepoint:
> > > > > > > > - linux_start_kernel: first entry of the Linux kernel
> > > > > > > > (start_kernel())
> > > > > > > >
> > > > > > > > Great, do you have a set of patches available that add all these trace
> > > > > > > > points? It would be great for reproduction.
> > > > > > > For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> > > > > > > nemu/qemu/qemu-lite and linux 4.14 with the tracepoints.
> > > > > > I'm also sharing a python script that I'm using with perf to extract
> > > > > > the numbers in this way:
> > > > > >
> > > > > > > $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio \
> > > > > > >     -e sched:sched_process_exec -o /tmp/qemu_perf.data &
> > > > > > $ # start qemu/nemu multiple times
> > > > > > $ killall perf
> > > > > > $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
> > > > > >
> > > > > > > > > As you can see, NEMU is faster to jump to the kernel
> > > > > > > > > (linux_start_kernel) than qemu-lite when it uses qboot or seabios with
> > > > > > > > > virt support, but the time to user space is strangely high; maybe
> > > > > > > > > the kernel configuration that I used is not the best one.
> > > > > > > > Do you suggest another kernel configuration?
> > > > > > > >
> > > > > > > This looks very bad. This isn't the kernel configuration we normally
> > > > > > > test with in our automated test system but is definitely one we support
> > > > > > > > as part of our partnership with the Kata team. It's a high priority
> > > > > > > for me to try and investigate that. Have you saved the kernel messages
> > > > > > > as they might be helpful?
> > > > > > Yes, I'm attaching the dmesg output with nemu and qemu.
> > > > > >
> > > > > > > > > Anyway, I obtained the best boot time with qemu-lite and direct kernel
> > > > > > > > > load (vmlinux ELF image). I think this is because the kernel was not
> > > > > > > > > compressed. Indeed, looking at the other tests, the kernel
> > > > > > > > > decompression (bzImage) takes about 80 ms (linux_start_kernel -
> > > > > > > > > linux_start_boot). (I'll investigate further.)
> > > > > > > >
> > > > > > > Yup being able to load an uncompressed kernel is one of the big
> > > > > > > advantages of qemu-lite. I wonder if we could bring that feature into
> > > > > > > qemu itself to supplement the existing firmware based kernel loading.
> > > > > > I think so, I'll try to understand if we can merge the qemu-lite
> > > > > > direct kernel loading in qemu.
> > > > > An attempt was made a long time ago to push the qemu-lite stuff (from the
> > > > > Intel Clear Containers project) upstream. As I understand it, the main
> > > > > stumbling block that seemed to derail the effort was that it involved adding
> > > > > Linux OS specific code to Qemu so that Qemu could do things like create and
> > > > > populate the zero page that Linux expects when entering startup_64().
> > > > >
> > > > > > That amounts to a lot of very low-level, operating-system-specific
> > > > > > knowledge about Linux getting baked into Qemu code. And understandably, a
> > > > > > number of folks saw problems with going down a path like that.
> > > > >
> > > > > Since then, we have put together an alternative solution that would allow
> > > > > Qemu to boot an uncompressed Linux binary via the x86/HVM direct boot ABI
> > > > > (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The solution involves
> > > > > first making changes to both the ABI as well as Linux, and then updating
> > > > > Qemu to take advantage of the updated ABI which is already supported by both
> > > > > > Linux and FreeBSD for booting VMs. As such, Qemu can remain OS agnostic,
> > > > > and just be programmed to the published ABI.
> > > > >
> > > > > The canonical definition for the HVM direct boot ABI is in the Xen tree and
> > > > > we needed to make some minor changes to the ABI definition to allow KVM
> > > > > guests to also use the same structure and entry point. Those changes were
> > > > > accepted to the Xen tree already:
> > > > > https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
> > > > >
> > > > > The corresponding Linux changes that would allow KVM guests to be booted via
> > > > > this PVH entry point have already been posted and reviewed:
> > > > > https://lkml.org/lkml/2018/4/16/1002
> > > > >
> > > > > The final part is the set of Qemu changes to take advantage of the above and
> > > > > boot a KVM guest via an uncompressed kernel binary using the entry point
> > > > > defined by the ABI. Liam Merwick will be posting some RFC patches very soon
> > > > > to allow this.
> > > > Cool, thanks for doing this work!
> > > >
> > > > How do the boot times compare to qemu-lite and Firecracker's
> > > > (https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
> > > > boot?
> > > Boot times compare very favorably to qemu-lite, since the end result is
> > > basically doing a very similar thing. For now, we are going with a QEMU +
> > > qboot solution to introduce the PVH entry support in Qemu (meaning we will
> > > be posting Qemu and qboot patches and you will need both to boot an
> > > uncompressed kernel binary). As such we have numbers that Liam will include
> > > in the cover letter showing significant boot time improvement over existing
> > > QEMU + qboot approaches involving a compressed kernel binary. And as we all
> > > know, the existing qboot approach already gets boot times down pretty low.
> > The first email in this thread contains benchmark results showing that
> > optimized SeaBIOS is comparable to qboot, so it does not offer anything
> > unique with respect to boot time.
>
> To be fair, what I'm saying is that the qboot + PVH approach saves a
> significant percentage of boot time as compared to qboot only. So it does
> provide an important improvement over both existing qboot as well as
> optimized SeaBIOS from what I can tell. Please see:
>
> http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg00957.html
> and
> http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg00953.html
>
> > We're trying to focus on SeaBIOS because it's actively maintained and
> > already shipped by distros. Relying on qboot will make it harder to get
> > PVH into the hands of users because distros have to package and ship
> > qboot first. This might also require users to change their QEMU
> > command-line syntax to benefit from fast kernel booting.
>
> But you do make a good point here about distribution and usability. Using
> qboot is just one way to take advantage of the PVH entry -- and the quickest
> way for us to get something usable out there for the community to look at
> and play with.
>
> There are other ways to take advantage of the PVH entry for KVM guests, once
> the Linux changes are in place. So qboot is definitely not a hard
> requirement in the long run.
Great, good to hear!
Stefan