[Qemu-devel] Re: [PATCH 0/6] Save state error handling (kill off no_migrate)

From: Alex Williamson <alex.williamson@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: cam@cs.ualberta.ca, qemu-devel@nongnu.org, kvm@vger.kernel.org,
	quintela@redhat.com
Subject: [Qemu-devel] Re: [PATCH 0/6] Save state error handling (kill off no_migrate)
Date: Tue, 09 Nov 2010 09:30:45 -0700
Message-ID: <1289320245.14321.28.camel@x201>
In-Reply-To: <20101109161525.GA26897@redhat.com>

On Tue, 2010-11-09 at 18:15 +0200, Michael S. Tsirkin wrote:
> On Tue, Nov 09, 2010 at 08:47:00AM -0700, Alex Williamson wrote:
> > > > But it could.  What if ivshmem is acting in a peer role, but has no
> > > > clients, could it migrate?  What if ivshmem is migratable when the
> > > > migration begins, but while the migration continues, a connection is
> > > > set up and it becomes unmigratable?
> > > 
> > > Sounds like something we should work to prevent, not support :)
> > 
> > s/:)/:(/  why?
> 
> It will just confuse everyone. Also if it happens after sending
> all of memory, it's pretty painful.

With no_migrate, the failure already happens after all of memory has been
sent, and I think pushing the check earlier might introduce races around
when register_device_unmigratable() can be called.

> > > >  Using this series, ivshmem would
> > > > have multiple options for how to support this.  It could a) NAK the
> > > > migration, b) drop connections and prevent new connections until the
> > > > migration finishes, c) detect that new connections have happened since
> > > > the migration started and cancel.  And probably more.  no_migrate can
> > > > only do a).  And in fact, we can only test no_migrate after the VM is
> > > > stopped (after all memory is migrated) because otherwise it could race
> > > > with devices setting no_migrate during migration.
> > > 
> > > We really want no_migrate to be static. Changing it is abusing
> > > the infrastructure.
> > 
> > You call it abusing, I call it making use of the infrastructure.  Why
> > unnecessarily restrict ourselves?  Is return 0/-1 really that scary,
> > unmaintainable, undebuggable?  I don't understand the resistance.
> > 
> > Alex
> 
> Management really does not know how to handle unexpected
> migration failures. They must be avoided.
> 
> There are some very special cases that fail migration. They are
> currently easy to find with grep register_device_unmigratable.
> I prefer to keep it that way.

How can management tools be improved to better handle unexpected
migration failures when the only way for qemu to fail is an abort?  We
need the infrastructure to at least return an error first.  Do we just
need to add some fprintfs to the save core to print the id string of the
device that failed to save?  I just can't buy the "code is easier to
grep" as an argument against adding better error handling to the save
code path.  Anyone else want to chime in?
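
To make the idea concrete, here is a minimal sketch of a save loop that
propagates a handler's 0/-1 return and prints the id string of the device
that failed.  It is not the actual savevm code: SaveEntry, save_all(),
peer_save(), ok_save() and PeerDev are hypothetical names made up for
illustration, and the "peer with active connections" check only stands in
for whatever condition a device such as ivshmem might choose to test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real savevm types. */
typedef int (*SaveFn)(void *opaque);    /* returns 0 on success, -1 on error */

typedef struct SaveEntry {
    const char *idstr;                  /* device id string, used for reporting */
    SaveFn save;                        /* per-device save handler */
    void *opaque;                       /* device state passed to the handler */
} SaveEntry;

/* Hypothetical device that cannot be saved while it has active connections. */
typedef struct PeerDev {
    int connections;
} PeerDev;

/* The decision is made at save time, so it can change while the VM runs. */
static int peer_save(void *opaque)
{
    PeerDev *dev = opaque;

    return dev->connections > 0 ? -1 : 0;
}

/* A device that can always be saved. */
static int ok_save(void *opaque)
{
    (void)opaque;
    return 0;
}

/*
 * Call each handler in order and stop at the first failure.  On failure,
 * print the id string of the failing device, store it in *failed (if the
 * caller passed a non-NULL pointer) and return -1 instead of aborting.
 */
static int save_all(const SaveEntry *entries, size_t n, const char **failed)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i].save(entries[i].opaque) < 0) {
            fprintf(stderr, "save: device '%s' failed to save state\n",
                    entries[i].idstr);
            if (failed) {
                *failed = entries[i].idstr;
            }
            return -1;
        }
    }
    return 0;
}
```

With something along these lines the caller gets an error it can report
and act on, and the failing device is named in the output rather than
only being discoverable by grepping the source.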

Alex

Thread overview: 26+ messages
2010-10-06 20:58 [Qemu-devel] [PATCH 0/6] Save state error handling (kill off no_migrate) Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 1/6] savevm: Allow SaveStateHandler() to return error Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 2/6] savevm: Allow vmsd->pre_save " Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 3/6] pci: Allow pci_device_save() " Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 4/6] virtio: Allow virtio_save() errors Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 5/6] savevm: Allow set_params and save_live_state to error Alex Williamson
2010-10-06 20:59 ` [Qemu-devel] [PATCH 6/6] savevm: Remove register_device_unmigratable() Alex Williamson
2010-10-07 16:55 ` [Qemu-devel] Re: [PATCH 0/6] Save state error handling (kill off no_migrate) Alex Williamson
2010-11-08 11:40 ` Michael S. Tsirkin
2010-11-08 14:59   ` Alex Williamson
2010-11-08 16:54     ` Michael S. Tsirkin
2010-11-08 17:20       ` Alex Williamson
2010-11-08 20:59         ` Michael S. Tsirkin
2010-11-08 21:23           ` Alex Williamson
2010-11-09 12:00             ` Michael S. Tsirkin
2010-11-09 14:58               ` Alex Williamson
2010-11-09 15:07                 ` Michael S. Tsirkin
2010-11-09 15:34                   ` Alex Williamson
2010-11-09 15:42                     ` Michael S. Tsirkin
2010-11-09 15:47                       ` Alex Williamson
2010-11-09 16:15                         ` Michael S. Tsirkin
2010-11-09 16:30                           ` Alex Williamson [this message]
2010-11-09 16:49                             ` Michael S. Tsirkin
2010-11-09 17:44                               ` Alex Williamson
2010-11-09 19:35                                 ` Alex Williamson
2010-11-16 10:23 ` Juan Quintela
