From: Anthony Liguori <anthony@codemonkey.ws>
To: Jes Sorensen <Jes.Sorensen@redhat.com>
Cc: kwolf@redhat.com, qemu-devel@nongnu.org,
Markus Armbruster <armbru@redhat.com>,
Avi Kivity <avi@redhat.com>,
Luiz Capitulino <lcapitulino@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 1/1] Add QMP bits for blockdev-snapshot-sync.
Date: Thu, 28 Apr 2011 10:10:58 -0500 [thread overview]
Message-ID: <4DB98382.6040105@codemonkey.ws> (raw)
In-Reply-To: <4DB98073.6050701@redhat.com>
On 04/28/2011 09:57 AM, Jes Sorensen wrote:
> On 04/28/11 16:46, Anthony Liguori wrote:
>> On 04/28/2011 09:38 AM, Jes Sorensen wrote:
>>>
>>> Sorry but this is utterly bogus.
>>>
>>> The snapshot support as is works fine, and the command is in the
>>> monitor. We should expose it in QMP as well.
>>
>> It went in for the monitor because it was considered an imperfect
>> command so we held up the QMP side because we wanted a better interface.
>
> I am not sure who is included in the 'we' here.....
>
>> The current command does:
>> 1) Create new image backing to current image
>> 2) Flush outstanding I/O to old image
>> 3) Close current image
>> 4) Reopen newly created image
>> 5) Go
>
>> Operations (1) and (2) are very synchronous operations. (4) can be too.
>> We really should have a bdrv_aio_snapshot() function that implements
>> the logic for at least (2) in an asynchronous fashion.
>>
>> That sort of interface is going to affect how we expose things in QMP.
>> As from a QMP perspective, we're going to do something like:
>>
>> a) start snapshot
>> b) query snapshot progress
>> c) receive notification of snapshot completion
>> d) flip over image
>
> Sorry this is inherently broken. The management tool should not be
> keeping state in this process. I agree an async interface would be nice,
> but the above process is plain wrong.
>
> The async snapshot process needs to be doing the exact same as in the
> current implementation, the main difference is that it would be running
> asynchronously and that QMP would be able to query the state of it.
No, the command does too many things and as such, makes it impossible
for a management tool to gracefully recover.
>> And of course, this needs to be carefully thought through for race
>> conditions. In the current command, what happens if you get a crash
>> between (2) and (3)? There's no way for the management tools to know
>> that we didn't finish flushing writes. How does the management tool
>> know that (1) didn't fail mid way through resulting in a corrupted image?
>
> There is no issue here, you have the exact same problem if you get a
> crash during d) in your example. It is the same with the existing
> command, the crash is only an issue if it happens right in the middle of
> the switch over. Until then, only the original image remains valid.
But the new image is always valid once it's been created as pending
writes are lost no matter what in a hard power failure. That suggests
the following flow:
1) Create new image with a backing file
2) Completion ACK
At this stage, the management tool should update it's internal state to
point to the new image.
3) Begin switch over to new image
4) Switch over image.
5) Receive notification when it's complete
If (4) is happening asynchronously, things get kind of complicated. You
can either wait for things to quiesce on their own or you can queue
pending requests. It's unclear to me what the right strategy is or if
there's a mixed strategy that's needed. I think it makes sense to treat
3/4 as a command with (5) being an event notification.
But combining 1-5 in a single interface creates a command that while
convenient on the command line, is not usable for a robust management tool.
Regards,
Anthony Liguori
> Cheers,
> Jes
>
next prev parent reply other threads:[~2011-04-28 15:11 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-18 14:27 [Qemu-devel] [PATCH v2 0/1] Add QMP bits for blockdev-snapshot-sync Jes.Sorensen
2011-04-18 14:27 ` [Qemu-devel] [PATCH v2 1/1] " Jes.Sorensen
2011-04-27 15:05 ` Luiz Capitulino
2011-04-27 15:05 ` Jes Sorensen
2011-04-28 12:45 ` Jes Sorensen
2011-04-28 13:14 ` Jes Sorensen
2011-04-28 13:21 ` Jes Sorensen
2011-04-28 13:41 ` Kevin Wolf
2011-04-28 13:46 ` Jes Sorensen
2011-04-28 14:42 ` Kevin Wolf
2011-04-28 14:41 ` Jes Sorensen
2011-04-28 14:21 ` Luiz Capitulino
2011-04-28 14:30 ` Jes Sorensen
2011-04-28 14:38 ` Anthony Liguori
2011-04-28 14:36 ` Anthony Liguori
2011-04-28 14:38 ` Jes Sorensen
2011-04-28 14:46 ` Anthony Liguori
2011-04-28 14:57 ` Jes Sorensen
2011-04-28 15:10 ` Anthony Liguori [this message]
2011-04-29 13:38 ` Jes Sorensen
2011-04-29 13:45 ` Anthony Liguori
2011-05-03 11:44 ` Jes Sorensen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4DB98382.6040105@codemonkey.ws \
--to=anthony@codemonkey.ws \
--cc=Jes.Sorensen@redhat.com \
--cc=armbru@redhat.com \
--cc=avi@redhat.com \
--cc=kwolf@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).