qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>,
	quintela@redhat.com, qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] 3.1: second invocation of migrate crashes qemu
Date: Thu, 24 Jan 2019 20:04:28 +0000	[thread overview]
Message-ID: <20190124200428.GG2101@work-vm> (raw)
In-Reply-To: <20190121164550.GE5638@linux.fritz.box>

* Kevin Wolf (kwolf@redhat.com) wrote:
> Am 21.01.2019 um 17:05 hat Dr. David Alan Gilbert geschrieben:
> > * Kevin Wolf (kwolf@redhat.com) wrote:
> > > Am 18.01.2019 um 16:57 hat Dr. David Alan Gilbert geschrieben:
> > > > * Kevin Wolf (kwolf@redhat.com) wrote:
> > > > > Am 14.01.2019 um 11:51 hat Dr. David Alan Gilbert geschrieben:
> > > > > > * Michael Tokarev (mjt@tls.msk.ru) wrote:
> > > > > > > $ qemu-system-x86_64 -monitor stdio -hda foo.img
> > > > > > > QEMU 3.1.0 monitor - type 'help' for more information
> > > > > > > (qemu) stop
> > > > > > > (qemu) migrate "exec:cat >/dev/null"
> > > > > > > (qemu) migrate "exec:cat >/dev/null"
> > > > > > > qemu-system-x86_64: /build/qemu/qemu-3.1/block.c:4647: bdrv_inactivate_recurse: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
> > > > > > > Aborted
> > > > > > 
> > > > > > And on head as well;  it only happens if the 1st migrate is succesful;
> > > > > > if it got cancelled the 2nd one works, so it's not too bad.
> > > > > > 
> > > > > > I suspect the problem here is all around locking/ownership - the block
> > > > > > devices get shutdown at the end of migration since the assumption is
> > > > > > that the other end has them open now and we had better release them.
> > > > > 
> > > > > Yes, only "cont" gets control back to the source VM.
> > > > > 
> > > > > I think we really should limit the possible monitor commands in the
> > > > > postmigrate status, and possibly provide a way to get back to the
> > > > > regular paused state (which means getting back control of the resources)
> > > > > without resuming the VM first.
> > > > 
> > > > This error is a little interesting if you'd done something like:
> > > > 
> > > > 
> > > >      src:
> > > >          stop
> > > >          migrate
> > > > 
> > > >      dst:
> > > >          <kill qemu for some reason>
> > > >          start a new qemu
> > > > 
> > > >      src:
> > > >          migrate
> > > > 
> > > > Now that used to work (safely) - note we've not started
> > > > a VM succesfully anywhere else.
> > > > 
> > > > Now the source refuses to let that happen - with a rather
> > > > nasty abort.
> > > 
> > > Essentially it's another effect of the problem that migration has always
> > > lacked a proper model of ownership transfer. And it's still treating
> > > this as a block layer problem rather than making it a core concept of
> > > migration as it should.
> > > 
> > > We can stack another one-off fix on top, and get back control of the
> > > block devices automatically on a second 'migrate'. But it feels like a
> > > hack and not like VMs had a properly designed and respected state
> > > machine.
> > 
> > Hmm; I don't like to get back to this argument because I think
> > we've got a perfectly servicable model that's implemented at higher
> > levels outside qemu, and the real problem is the block layer added
> > new assumptions about the semantics without checking they were really
> > true.
> > qemu only has the view from a single host; it takes the higher level
> > view from something like libvirt to have the view across multiple hosts
> > to understand who has the ownership when.
> 
> Obviously the upper layer is not handling this without the help of QEMU
> or we wouldn't have had bugs that images were accessed by two QEMU
> processes at the same time. We didn't change the assumptions, but we
> only started to actually check the preconditions that have always been
> necessary to perform live migration correctly.

In this case there is a behaviour that was perfectly legal before that
fails now; further the case is safe - the source hasn't accessed the
disks after the first migration and isn't trying to access it again
either.

> But if you like to think the upper layer should handle all of this,

I don't really want the upper layer to handle all of this; but I don't
think we can handle it all either - we've not got the higher level
view of screwups that happen outside qemu.

>then
> it's on libvirt to handle the ownership transfer manually. If we really
> want, we can add explicit QMP commands to activate and inactivate block
> nodes. This can be done and requiring that the management layer does
> all of this would be a consistent interface, too.
> 
> I just don't like this design much for two reasons: The first is that
> you can't migrate a VM that has disks with a simple 'migrate' command
> any more. The second is that if you implement it consistently, this has
> an impact on compatibility. I think it's a design that could be
> considered if we were adding live migration as a new feature, but it's
> probably hard to switch to it now.
> 
> In any case, I do think we should finally make a decision how ownership
> of resources should work in the context of migration, and then implement
> that.

I think we're mostly OK, but what I'd like would be:
  a) I'd like things to fail gently rather than abort; so I'd either
     like the current functions to fail cleanly so I can fail the
     migration or add a check at the start of migration to tell the user
    they did something wrong.

  b) I'd like commands that can tell me the current state and a command
     to move it to the other state explicitly; so we've got a way
     to recover in weirder cases.

  c) I'd like to document what the states should be before/after/in
     various  middle states of migration.

I think the normal case is fine, and hence as you say, I wouldn't
want to break a normal 'migrate' - I just want cleaner failures
and ways to do the more unusual things.

Dave

> Kevin
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

      reply	other threads:[~2019-01-24 20:14 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-12 17:11 [Qemu-devel] 3.1: second invocation of migrate crashes qemu Michael Tokarev
2019-01-14 10:51 ` Dr. David Alan Gilbert
2019-01-14 11:52   ` Kevin Wolf
2019-01-18 15:57     ` Dr. David Alan Gilbert
2019-01-21 15:55       ` Kevin Wolf
2019-01-21 16:05         ` Dr. David Alan Gilbert
2019-01-21 16:45           ` Kevin Wolf
2019-01-24 20:04             ` Dr. David Alan Gilbert [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190124200428.GG2101@work-vm \
    --to=dgilbert@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mjt@tls.msk.ru \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).