From: Alex Williamson <alex.williamson@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: cam@cs.ualberta.ca, qemu-devel@nongnu.org, kvm@vger.kernel.org,
quintela@redhat.com
Subject: [Qemu-devel] Re: [PATCH 0/6] Save state error handling (kill off no_migrate)
Date: Mon, 08 Nov 2010 14:23:37 -0700
Message-ID: <1289251417.28165.37.camel@x201>
In-Reply-To: <20101108205901.GB10777@redhat.com>
On Mon, 2010-11-08 at 22:59 +0200, Michael S. Tsirkin wrote:
> On Mon, Nov 08, 2010 at 10:20:46AM -0700, Alex Williamson wrote:
> > On Mon, 2010-11-08 at 18:54 +0200, Michael S. Tsirkin wrote:
> > > On Mon, Nov 08, 2010 at 07:59:57AM -0700, Alex Williamson wrote:
> > > > On Mon, 2010-11-08 at 13:40 +0200, Michael S. Tsirkin wrote:
> > > > > On Wed, Oct 06, 2010 at 02:58:57PM -0600, Alex Williamson wrote:
> > > > > > Our code paths for saving or migrating a VM are full of functions that
> > > > > > return void, leaving no opportunity for a device to cancel a migration,
> > > > > > either from error or incompatibility. The ivshmem driver attempted to
> > > > > > solve this with a no_migrate flag on the save state entry. I think the
> > > > > > more generic and flexible way to solve this is to allow driver save
> > > > > > functions to fail. This series implements that and converts ivshmem
> > > > > > to use a set_params function to NAK migration much earlier in the
> > > > > > process. This touches a lot of files, but the bulk of those changes are
> > > > > > simply s/void/int/ and tacking a "return 0" to the end of functions.
> > > > > > Thanks,
> > > > > >
> > > > > > Alex
> > > > >
> > > > > Well error handling is always tricky: it seems easier to
> > > > > require save handlers to never fail.
> > > >
> > > > Sure it's easier, but does that make it robust?
> > >
> > > More robust in the face of what kind of failure?
> >
> > I really don't understand why we're having a discussion about whether
> > providing a means to return an error is a good thing or not. These
> > patches touch a lot of files, but the change is dead simple.
>
> I just don't see the motivation. Presumably your patches are
> there to achieve some kind of goal, right? I am trying to
> figure out what that goal is.
My goal is that I want to be able to NAK a migration when devices are
assigned, and I think we can do it more generically than the no_migrate
flag so that it supports this application and any other reason that
saves might fail in the future.
> Currently savevm callbacks never fail. So they
> return void. Why is returning 0 and adding a bunch of code to test the
> condition that never happens a good idea? It just seems to create more
> ways for devices to shoot themselves in the foot.
And more ways to indicate something bad happened and keep running. We
already have far too many abort() calls in the code.
> > > > > So there's a bunch of code here but what exactly is the benefit?
> > > > > Since save handlers have no idea what the remote does,
> > > > > what is the compatibility you mention?
> > > >
> > > > There are two users I currently have in mind. ivshmem currently makes
> > > > use of the register_device_unmigratable() because it makes use of host
> > > > specific resources and connections (aiui). This sets the no_migrate
> > > > flag, which is not dynamic and a bit of a band-aid.
> > > > The other is
> > > > device assignment, which needs a way to NAK a migration since physical
> > > > devices are never migratable.
> > >
> > > Well since all these can't be migrated ever, a fixed property actually seems
> > > a good match. Sure it's not dynamic but all the easier to debug.
> > >
> > > > I imagine we could at some point have
> > > > devices with state tied to other features that can't always be detached
> > > > from the host, this tries to provide the infrastructure for that to
> > > > happen.
> > > >
> > > > Alex
> > >
> > > Let guest control whether you can migrate?
> > > Sounds like something that is more likely to be abused
> > > than used constructively.
> >
> > s/guest/device/ So you would rather the migration failed on the
> > incoming side where it may not be detected
>
> And incoming migration handlers *must* validate the input, anyway.
> We should not plaster over this with checks on outgoing side.
I'm not in any way suggesting incoming shouldn't do validation.
> > or it may be detected too
> > late to stop the migration?
> >
> > Alex
>
> So there's a bug and device is in an unexpected state.
> What can we do? Assert, print an error, notify guest - all these
> come to mind. But stop migration? Seems arbitrary.
Perhaps the problem is that either an assert or an fprintf are the first
things that come to mind. We shouldn't have guests randomly blowing up
or telling users to go scan through their log files to find errors.
It's not very hard to allow simple error handling, so why shouldn't our
first plan of attack be to return an error so that the human/QMP monitor
can detect it and inform the user? For the current candidates for this
interface, there's no point notifying the guest, it's the interface
attempting to do the migration that needs to know there's something
blocking it.
Alex
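To make the s/void/int/ idea concrete, here is a minimal, self-contained
sketch of save handlers that can fail and a caller that stops at the first
error. It is not the code from the series: save_fn, struct save_entry,
save_all() and the two example handlers are hypothetical names chosen for
illustration, and the real SaveStateHandler signature takes different
arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical save callback: 0 on success, negative value on failure. */
typedef int (*save_fn)(void *opaque);

/* Hypothetical registration record for one device's save handler. */
struct save_entry {
    const char *name;   /* device name, used only for error reporting */
    save_fn save;       /* handler that may now refuse to save */
    void *opaque;       /* per-device state passed back to the handler */
};

/* An emulated device whose state can always be serialized. */
static int emulated_dev_save(void *opaque)
{
    (void)opaque;
    return 0;
}

/* An assigned physical device: its state lives in hardware, so it NAKs. */
static int assigned_dev_save(void *opaque)
{
    (void)opaque;
    return -1;
}

/*
 * Call each handler in order and stop at the first failure.
 * On failure, *failed (if non-NULL) is set to the name of the device
 * that refused, so the caller can report it instead of aborting.
 */
static int save_all(const struct save_entry *entries, size_t n,
                    const char **failed)
{
    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        int ret = entries[i].save(entries[i].opaque);
        if (ret < 0) {
            if (failed != NULL) {
                *failed = entries[i].name;
            }
            return ret;
        }
    }
    return 0;
}
```

With something along these lines, the code that initiated the migration
could turn a negative return from save_all() into a monitor error naming
the blocking device, rather than relying on an assert or a log message.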
Thread overview: 26+ messages
2010-10-06 20:58 [Qemu-devel] [PATCH 0/6] Save state error handling (kill off no_migrate) Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 1/6] savevm: Allow SaveStateHandler() to return error Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 2/6] savevm: Allow vmsd->pre_save " Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 3/6] pci: Allow pci_device_save() " Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 4/6] virtio: Allow virtio_save() errors Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 5/6] savevm: Allow set_params and save_live_state to error Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 6/6] savevm: Remove register_device_unmigratable() Alex Williamson
2010-10-07 16:55 ` [Qemu-devel] Re: [PATCH 0/6] Save state error handling (kill off no_migrate) Alex Williamson
2010-11-08 11:40 ` Michael S. Tsirkin
2010-11-08 14:59 ` Alex Williamson
2010-11-08 16:54 ` Michael S. Tsirkin
2010-11-08 17:20 ` Alex Williamson
2010-11-08 20:59 ` Michael S. Tsirkin
2010-11-08 21:23 ` Alex Williamson [this message]
2010-11-09 12:00 ` Michael S. Tsirkin
2010-11-09 14:58 ` Alex Williamson
2010-11-09 15:07 ` Michael S. Tsirkin
2010-11-09 15:34 ` Alex Williamson
2010-11-09 15:42 ` Michael S. Tsirkin
2010-11-09 15:47 ` Alex Williamson
2010-11-09 16:15 ` Michael S. Tsirkin
2010-11-09 16:30 ` Alex Williamson
2010-11-09 16:49 ` Michael S. Tsirkin
2010-11-09 17:44 ` Alex Williamson
2010-11-09 19:35 ` Alex Williamson
2010-11-16 10:23 ` Juan Quintela