From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: pkrempa@redhat.com, marcel.a@redhat.com, libvir-list@redhat.com,
hutao@cn.fujitsu.com, qemu-devel@nongnu.org, armbru@redhat.com,
rhod@redhat.com, kraxel@redhat.com, anthony@codemonkey.ws,
Paolo Bonzini <pbonzini@redhat.com>,
lcapitulino@redhat.com, lersek@redhat.com, afaerber@suse.de
Subject: Re: [Qemu-devel] [RFC PATCH v2 0/3] Start fixing the pvpanic mess
Date: Wed, 21 Aug 2013 20:26:18 +0300
Message-ID: <20130821172618.GB12410@redhat.com>
In-Reply-To: <5214F2C0.8050203@redhat.com>

On Wed, Aug 21, 2013 at 11:02:56AM -0600, Eric Blake wrote:
> On 08/21/2013 10:51 AM, Paolo Bonzini wrote:
> > Il 21/08/2013 18:48, Daniel P. Berrange ha scritto:
> >> No, <on_crash> is the right thing to be using for this from
> >> libvirt's pov & I don't think we should invent something new.
> >> The <on_crash> element has always been intended to represent
> >> handling of guest panics, not qemu internal errors.
> >
> > Actually for Xen HVM guests, it mostly traps things such as failed
> > vmentries. The Xen PV-on-HVM drivers do not register a panic notifier
> > that moves the guest to the "crashed" state.
> >
> > <on_crash> cannot be salvaged, in my opinion, because all domain XMLs in
> > the wild will have a setting that causes libvirt to add "-device
> > isa-pvpanic". Thus changing libvirt versions will change guest
> > hardware, which is _very_ bad.
>
> Let's expand on that statement:
>
> Libvirt's default for <on_crash> is 'destroy'. But virt-install (and
> thus virt-manager) have been setting explicit 'restart' for AGES now.
>
> Arguably, this is YET ANOTHER reason why virt-manager should be using
> libosinfo to make sane choices about new guest XML, based on known
> capabilities of the guest it will be installing. But that only affects
> newly created guests after we fix the virt stack.
>
> In the meantime, you have a point that we have a back-compat mess - we
> promise ABI stability (guests shall not see hardware changes when
> upgrading versions of libvirtd but leaving the XML unchanged - the only
> way to change hardware seen by an existing guest is to explicitly modify
> XML).
>
> >
> > In addition, Windows XP and 2003 will show the annoying device wizard
> > upon a libvirt upgrade, and fixing this is what surfaced all the mess.
>
> Yes, so we need the back-compat code to leave pvpanic out of
> pre-existing guests, if we can find a way to sensibly do that.
>
> So, this boils down to a question of what SHOULD the valid states for
> <on_crash> be? Generically, we want <on_crash>destroy</on_crash> to not
> invalidate a guest, but also to not instantiate a pvpanic device; since
> that covers the libvirt defaults. We also want
> <on_crash>restart</on_crash> to not invalidate a guest, but also to not
> instantiate a pvpanic device, since so many existing guests have that
> setting thanks to virt-install.
>
> Maybe that means we add attributes/sub-elements to <on_crash> that
> express whether pvpanic device is permitted; and the absence of that
> attribute means the status quo (the <on_crash> tag is effectively
> ignored because without pvpanic device, there is no way for libvirt to
> learn if a guest panicked). Or does it mean we expose a new sub-element
> of <devices>, similar to how we have a <memballoon> subelement that
> controls whether the memballoon device is shown to the guest, and just
> document that for qemu, <on_crash> is a no-op without the <pvpanic>
> subelement?

This is a QEMU bug that you happened to be Cc'd on,
so you started to worry about supporting a buggy QEMU.
This is generally futile.

There are countless bugs that we have silently fixed.
They are often much more serious than this silly reversibility bug.

Some BIOS versions have racy hotplug support, so
a hotplug event can be missed.
Should libvirt warn the user that the BIOS is broken
and suggest restarting the guest to see the device?

Some QEMU versions had a racy implementation of virtio
that would corrupt guest memory.
Should libvirt warn the user that virtio is broken
and suggest switching to e1000 or upgrading QEMU?

Some QEMU versions have a buggy qcow2 implementation that would corrupt the disk.
Should libvirt warn the user that qcow2 is broken
and suggest switching to raw?

Some kernels have buggy vhost drivers that would crash the host.
Should libvirt detect these and tell the user to upgrade the kernel
or switch to userspace virtio?

Some kernels have NIC drivers that brick hardware.
Should libvirt detect these and tell the user to upgrade the kernel
or switch to a different NIC?

There are libc bugs, glib bugs ...

Let's fix the bug in QEMU and move on.
Working around such bugs in libvirt is unnecessary.

--
MST