From: Jeff Cody <jcody@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Luiz Capitulino <lcapitulino@redhat.com>,
qemu-devel@nongnu.org, Markus Armbruster <armbru@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 3/3] qapi: Introduce blockdev-query-group-snapshot-failure
Date: Tue, 21 Feb 2012 09:11:28 -0500
Message-ID: <4F43A610.5070203@redhat.com>
In-Reply-To: <4F428778.8040409@redhat.com>
On 02/20/2012 12:48 PM, Eric Blake wrote:
> On 02/20/2012 10:31 AM, Jeff Cody wrote:
>> In the case of a failure in a group snapshot, it is possible for
>> multiple file image failures to occur - for instance, failure of
>> an original snapshot, and then failure of one or more of the
>> attempted reopens of the original.
>>
>> Knowing all of the file images which failed could be useful or
>> critical information, so this command returns a list of strings
>> containing the filenames of all failures from the last
>> invocation of blockdev-group-snapshot-sync.
>
> Meta-question:
>
> Suppose that the guest is running when we issue
> blockdev-group-snapshot-sync - in that case, qemu is responsible for
> pausing and then resuming the guest. On success, this makes sense. But
> what happens on failure?
The guest is not paused in blockdev-group-snapshot-sync; I don't think
that qemu should enforce pause/resume in the live snapshot commands.
>
> If we only fail at creating one snapshot, but successfully roll back the
> rest of the set, should the guest be resumed (as if the command had
> never been attempted), or should the guest be left paused?
>
> On the other hand, if we fail at creating one snapshot, as well as fail
> at rolling back, then that argues that we _cannot_ resume the guest,
> because we no longer have a block device open.
Is that really true, though? Depending on which drive failed, the guest
may still be runnable. To the guest, it would be roughly equivalent to a
drive failure: a bad event, but not always fatal.

But I think v2 of the patch may make this moot. I was talking with
Kevin, and he had some good ideas on how to do this without requiring a
close & reopen in the case of a snapshot failure, which means that we
shouldn't have to worry about the second scenario. I am going to
incorporate those changes into v2.
>
> This policy needs to be documented in one (or both) of the two new
> monitor commands, and we probably ought to make sure that if the guest
> is left paused where it had originally started as running, then an
> appropriate event is also emitted.
I agree, the documentation should make it clear what is going on - I
will add that to v2.
>
> For blockdev-snapshot-sync, libvirt was always pausing qemu before
> issuing the snapshot, then resuming afterwards; but now that we have the
> ability to make the set atomic, I'm debating about whether libvirt still
> needs to pause qemu, or whether it can now rely on qemu doing the right
> things about pausing and resuming as part of the snapshot command.
>
Again, the command doesn't pause the guest automatically, so that is up
to libvirt. The guest agent is also available to freeze the filesystem,
if libvirt wants to trust it (and it is running); if not, then libvirt
can still issue a pause/resume around the snapshot command (and libvirt
may be in a better position to decide what to do in case of failure, if
it has some knowledge of the drives that failed and how they are used).
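
As a rough illustration of the pause/resume option, here is a minimal
Python sketch of what a management application might do around the
snapshot command. It is only a sketch under assumptions: `send` is a
hypothetical callable that delivers one QMP command and returns the
parsed reply, `resume_on_failure` is a hypothetical policy hook, and the
arguments of blockdev-group-snapshot-sync are deliberately left to the
caller since their layout is defined in patch 2/3. Only the QMP `stop`
and `cont` commands and the "error" key of a QMP error reply are taken
as given.

```python
def snapshot_with_pause(send, snapshot_cmd, resume_on_failure=lambda reply: False):
    """Pause the guest, run the snapshot command, then decide whether to resume.

    send: hypothetical transport; takes a QMP command dict, returns the reply dict.
    snapshot_cmd: the full QMP command dict, built by the caller.
    resume_on_failure: hypothetical policy hook; receives the error reply and
        returns True if the guest should be resumed despite the failure.
    Returns True if the snapshot command succeeded, False otherwise.
    """
    send({"execute": "stop"})        # QMP 'stop' pauses the guest
    reply = send(snapshot_cmd)
    failed = "error" in reply        # QMP error replies carry an "error" key
    if not failed or resume_on_failure(reply):
        send({"execute": "cont"})    # QMP 'cont' resumes the guest
    return not failed
```

With the default policy the guest stays paused after a failed snapshot,
so the caller can inspect which drives failed before choosing to resume;
passing a different `resume_on_failure` changes that decision without
touching the command sequence.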