All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: stefanha@linux.vnet.ibm.com,
	Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org, lcapitulino@redhat.com,
	Federico Simoncelli <fsimonce@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command
Date: Wed, 14 Mar 2012 07:11:01 -0600	[thread overview]
Message-ID: <4F6098E5.4010709@redhat.com> (raw)
In-Reply-To: <4F606610.3030809@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 4591 bytes --]

On 03/14/2012 03:34 AM, Kevin Wolf wrote:

>> That said, reopen is really hard to be implemented as a transaction
>> without breaking that rule. For example in the blkmirror case you'd
>> need to open the destination image in r/w while the mirroring is in
>> action (already having the same image in r/w mode).
>>
>> There are several solutions here but they are either really hard to
>> implement or non definitive. For example:
>>
>> * We could try to implement the reopen command for each special case,
>>   eg: blkmirror, reopening the same image, etc... and in such cases
>>   reusing the same bs that we already have. The downside is that this
>>   command will be coupled with all this special cases.
> 
> The problem we're trying to solve is that we have a graph of open
> BlockDriverStates (connected by bs->file, backing file relations etc.)
> and we want to transform it into a different graph of open BDSs
> atomically, reusing zero, one or many of the existing BDSs, and possibly
> with changed properties (cache mode, read-only, etc.)
> 
> What we can have reasonably easily (there are patches floating around,
> they just need to be completed), is a bdrv_reopen() that changes flags
> on one given BDS, without changing the file it's backed by. This is
> already broken up into prepare/commit/abort stages as we need it to
> reopen VMDK's split images safely.
> 
> In theory this should be enough to build the new graph by opening yet
> unused BDSs, preparing the reopen of reused ones and only if all of that
> was successful, committing the bdrv_reopen and changing the relations
> between the nodes. I hope that at the same time it's clear that this
> isn't exactly trivial to implement.

Agreed that 1) it looks implementable, and 2) it does not look trivial.

> 
>> * We could use the transaction APIs without actually making it
>>   transaction (if we fail in the middle we can't rollback). The only
>>   advantage of this is that we'd provide a consistent API to libvirt
>>   and we would postpone the problem to the future. Anyway I strongly
>>   discourage this as it's completely unsafe and it's going to break
>>   the transaction semantic. Moreover it's a solution that relies too
>>   much on the hope of finding something appropriate in the future.
> 
> This is not an option. Advertising transactional behaviour and not
> implementing it is just plain wrong.

Absolutely concur.  From libvirt's perspective, I will be assuming that:

if 'transaction' exists, then it supports 'blockdev-snapshot-sync' and
'drive-mirror' (assuming that these are the only two commands that are
in 'transaction' in qemu 1.1), but nothing else is safe to use without a
further probe.  Then, if down the road, we go through the pain of making
'drive-reopen' transactionable, then there will be some new
introspection command (one idea was an optional argument to
query-commands which limits output to just the commands that can also
occur in a 'transaction'), and libvirt will have to use that
introspection before using the additional features.

> 
>> * We could leave it as it is, a distinct command that is not part of
>>   the transaction and that it's closing the old image before opening
>>   the new one.
> 
> Yes, this would be the short-term preliminary solution. I would tend to
> leave it to downstreams to implement it as an extension, though.

Correct - I'm fine with libvirt using the direct, non-atomic,
'drive-reopen' for the immediate needs of oVirt while we work on a more
complete solution for down the road.

> 
>> This is not completely correct, the main intent was to not spread one
>> image chain across two storage domains (making it incomplete if one of
>> them was missing). In the next oVirt release a VM can have different
>> disks on different storage domains, so this wouldn't be a special case
>> but just a normal situation.
> 
> The problem with this kind of argument is that we're not developing only
> for oVirt, but need to look for what makes sense for any management tool
> (or even just direct users of qemu).

Indeed - that's why I still think it's dangerous to not have an atomic
'drive-reopen', even if oVirt can be coded to work in spite of an
initial implementation being non-atomic.  But there's a difference
between wish-lists (atomic reopen) and practicality (what can we
implement now), and I'm not going to insist on the impossible :)

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 620 bytes --]

  reply	other threads:[~2012-03-14 13:12 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-06 17:55 [Qemu-devel] [PATCH v4 00/10] Mirrored block writes Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 01/10] use QSIMPLEQ_FOREACH_SAFE when freeing list elements Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 02/10] fix format name for backing file Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 03/10] qapi: complete implementation of unions Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 04/10] rename blockdev-group-snapshot-sync Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 05/10] add mode field to blockdev-snapshot-sync transaction item Paolo Bonzini
2012-03-06 17:55 ` [Qemu-devel] [PATCH v4 06/10] qmp: convert blockdev-snapshot-sync to a wrapper around transactions Paolo Bonzini
2012-03-06 17:56 ` [Qemu-devel] [PATCH v4 07/10] Add blkmirror block driver Paolo Bonzini
2012-03-06 17:56 ` [Qemu-devel] [PATCH v4 08/10] add mirroring to transaction Paolo Bonzini
2012-03-06 17:56 ` [Qemu-devel] [PATCH v4 09/10] add drive-mirror command and HMP equivalent Paolo Bonzini
2012-03-06 17:56 ` [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command Paolo Bonzini
2012-03-13 20:48   ` Eric Blake
2012-03-14  0:14     ` Federico Simoncelli
2012-03-14  9:34       ` Kevin Wolf
2012-03-14 13:11         ` Eric Blake [this message]
2012-03-14 14:30           ` Kevin Wolf
2012-03-14 13:29         ` Federico Simoncelli
2012-03-14  9:17     ` Kevin Wolf
2012-03-14  9:19       ` Paolo Bonzini
2012-03-14  9:35         ` Kevin Wolf
2012-03-09 15:36 ` [Qemu-devel] [PATCH v4 00/10] Mirrored block writes Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F6098E5.4010709@redhat.com \
    --to=eblake@redhat.com \
    --cc=armbru@redhat.com \
    --cc=fsimonce@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=lcapitulino@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.