From: Kevin Wolf <kwolf@redhat.com>
To: Federico Simoncelli <fsimonce@redhat.com>
Cc: stefanha@linux.vnet.ibm.com,
Markus Armbruster <armbru@redhat.com>,
qemu-devel@nongnu.org, lcapitulino@redhat.com,
Paolo Bonzini <pbonzini@redhat.com>,
Eric Blake <eblake@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command
Date: Wed, 14 Mar 2012 10:34:08 +0100 [thread overview]
Message-ID: <4F606610.3030809@redhat.com> (raw)
In-Reply-To: <6a335e17-f79d-4caf-81fa-6d8b0130a0a6@zmail16.collab.prod.int.phx2.redhat.com>
Am 14.03.2012 01:14, schrieb Federico Simoncelli:
> ----- Original Message -----
>> From: "Eric Blake" <eblake@redhat.com>
>> To: "Paolo Bonzini" <pbonzini@redhat.com>
>> Cc: kwolf@redhat.com, fsimonce@redhat.com, qemu-devel@nongnu.org, stefanha@linux.vnet.ibm.com, lcapitulino@redhat.com
>> Sent: Tuesday, March 13, 2012 9:48:10 PM
>> Subject: Re: [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command
>>
>> On 03/06/2012 10:56 AM, Paolo Bonzini wrote:
>>> From: Federico Simoncelli <fsimonce@redhat.com>
>>>
>>> Signed-off-by: Federico Simoncelli <fsimonce@redhat.com>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>
>>> ##
>>> +# @drive-reopen
>>> +#
>>> +# Assigns a new image file to a device.
>>> +#
>>> +# @device: the name of the device for which we are changing the
>>> image file.
>>> +#
>>> +# @new-image-file: the target of the new image. If the file
>>> doesn't exists the
>>> +# command will fail.
>>> +#
>>> +# @format: #optional the format of the new image, default is
>>> 'qcow2'.
>>> +#
>>> +# Returns: nothing on success
>>> +# If @device is not a valid block device, DeviceNotFound
>>> +# If @new-image-file can't be opened, OpenFileFailed
>>> +# If @format is invalid, InvalidBlockFormat
>>> +#
>>> +# Since 1.1
>>> +##
>>> +{ 'command': 'drive-reopen',
>>> + 'data': { 'device': 'str', 'new-image-file': 'str', '*format':
>>> 'str' } }
>>
>> I still think we need a 'drive-reopen' action included in
>> 'transaction',
>> as an 11/10 on this series. For disk migration, it is true that you
>> can
>> migrate one disk at a time, and therefore only need to reopen one
>> disk
>> at a time, to get the guarantee that for a single disk image, the
>> current state of that image will be guaranteed to be consistent using
>> only one storage domain.
>
> I'm not sure if this was already addressed on this mailing list but
> the main problem is that as general rule a qcow file cannot be opened
> in r/w mode twice. I believe there was only one exception to that
> with the live migration and it generated several issues.
In fact the same is true for any image. There are just some special
cases that happen to work anyway, but using them is not more than a hack.
> That said, reopen is really hard to be implemented as a transaction
> without breaking that rule. For example in the blkmirror case you'd
> need to open the destination image in r/w while the mirroring is in
> action (already having the same image in r/w mode).
>
> There are several solutions here but they are either really hard to
> implement or non definitive. For example:
>
> * We could try to implement the reopen command for each special case,
> eg: blkmirror, reopening the same image, etc... and in such cases
> reusing the same bs that we already have. The downside is that this
> command will be coupled with all this special cases.
The problem we're trying to solve is that we have a graph of open
BlockDriverStates (connected by bs->file, backing file relations etc.)
and we want to transform it into a different graph of open BDSs
atomically, reusing zero, one or many of the existing BDSs, and possibly
with changed properties (cache mode, read-only, etc.)
What we can have reasonably easily (there are patches floating around,
they just need to be completed), is a bdrv_reopen() that changes flags
on one given BDS, without changing the file it's backed by. This is
already broken up into prepare/commit/abort stages as we need it to
reopen VMDK's split images safely.
In theory this should be enough to build the new graph by opening yet
unused BDSs, preparing the reopen of reused ones and only if all of that
was successful, committing the bdrv_reopen and changing the relations
between the nodes. I hope that at the same time it's clear that this
isn't exactly trivial to implement.
> * We could use the transaction APIs without actually making it
> transaction (if we fail in the middle we can't rollback). The only
> advantage of this is that we'd provide a consistent API to libvirt
> and we would postpone the problem to the future. Anyway I strongly
> discourage this as it's completely unsafe and it's going to break
> the transaction semantic. Moreover it's a solution that relies too
> much on the hope of finding something appropriate in the future.
This is not an option. Advertising transactional behaviour and not
implementing it is just plain wrong.
> * We could leave it as it is, a distinct command that is not part of
> the transaction and that it's closing the old image before opening
> the new one.
Yes, this would be the short-term preliminary solution. I would tend to
leave it to downstreams to implement it as an extension, though.
> This is not completely correct, the main intent was to not spread one
> image chain across two storage domains (making it incomplete if one of
> them was missing). In the next oVirt release a VM can have different
> disks on different storage domains, so this wouldn't be a special case
> but just a normal situation.
The problem with this kind of argument is that we're not developing only
for oVirt, but need to look for what makes sense for any management tool
(or even just direct users of qemu).
Kevin
next prev parent reply other threads:[~2012-03-14 9:31 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-06 17:55 [Qemu-devel] [PATCH v4 00/10] Mirrored block writes Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 01/10] use QSIMPLEQ_FOREACH_SAFE when freeing list elements Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 02/10] fix format name for backing file Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 03/10] qapi: complete implementation of unions Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 04/10] rename blockdev-group-snapshot-sync Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 05/10] add mode field to blockdev-snapshot-sync transaction item Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 06/10] qmp: convert blockdev-snapshot-sync to a wrapper around transactions Paolo Bonzini
2012-03-06 17:56 ` [Qemu-devel] [PATCH v4 07/10] Add blkmirror block driver Paolo Bonzini
2012-03-06 17:56 ` [Qemu-devel] [PATCH v4 08/10] add mirroring to transaction Paolo Bonzini
2012-03-06 17:56 ` [Qemu-devel] [PATCH v4 09/10] add drive-mirror command and HMP equivalent Paolo Bonzini
2012-03-06 17:56 ` [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command Paolo Bonzini
2012-03-13 20:48 ` Eric Blake
2012-03-14 0:14 ` Federico Simoncelli
2012-03-14 9:34 ` Kevin Wolf [this message]
2012-03-14 13:11 ` Eric Blake
2012-03-14 14:30 ` Kevin Wolf
2012-03-14 13:29 ` Federico Simoncelli
2012-03-14 9:17 ` Kevin Wolf
2012-03-14 9:19 ` Paolo Bonzini
2012-03-14 9:35 ` Kevin Wolf
2012-03-09 15:36 ` [Qemu-devel] [PATCH v4 00/10] Mirrored block writes Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F606610.3030809@redhat.com \
--to=kwolf@redhat.com \
--cc=armbru@redhat.com \
--cc=eblake@redhat.com \
--cc=fsimonce@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).