From: Eric Blake <eblake@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Luiz Capitulino <lcapitulino@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Ori Mamluk <omamluk@zerto.com>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] Block job commands in QEMU 1.2 [v2, including support for replication]
Date: Thu, 24 May 2012 10:57:05 -0600
Message-ID: <4FBE6861.2040503@redhat.com>
In-Reply-To: <4FBE3A89.8020702@redhat.com>

On 05/24/2012 07:41 AM, Paolo Bonzini wrote:
> changes from v1:
> - added per-job iostatus
> - added description of persistent dirty bitmap
> 
> The same content is also at
> http://wiki.qemu.org/Features/LiveBlockMigration/1.2
> 

> * query-block-jobs: BlockJobInfo gets two new fields, paused and
> io-status.  The job-specific iostatus is completely separate from the
> block device iostatus.

Is it still true that, for mirror jobs, whether we are in the mirroring
phase is determined by whether 'len'=='offset'?
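
For reference, the check I have in mind on the management side is roughly
the following.  It is only a sketch: it assumes one entry from
query-block-jobs has already been parsed into a dict with 'type', 'len'
and 'offset' keys, that the job type is reported as "mirror", and the
helper name and sample values are made up for illustration.

```python
def is_mirroring(job):
    """Return True if a mirror job has reached the steady (mirroring) phase.

    'job' is assumed to be one parsed entry from query-block-jobs.
    """
    return job["type"] == "mirror" and job["len"] == job["offset"]


# Sample entries with made-up numbers for a 10 GiB disk.
copying = {"type": "mirror", "device": "virtio0",
           "len": 10737418240, "offset": 4294967296}
steady = {"type": "mirror", "device": "virtio0",
          "len": 10737418240, "offset": 10737418240}

print(is_mirroring(copying), is_mirroring(steady))
```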

> * drive-mirror: activates mirroring to a second block device (optionally
> creating the image on that second block device).  Compared to the
> earlier versions, the "full" argument is replaced by an enum option
> "sync" with three values:
> 
> - top: copies data in the topmost image to the destination
> 
> - full: copies data from all images to the destination
> 
> - dirty: copies clusters that are marked in the dirty bitmap to the
> destination (see below)

That differs from what RHEL shipped, but at least RHEL used the name
__com.redhat_drive-mirror, so libvirt can cope with the difference.

> 
> 
> * block-job-complete: force completion of mirroring and switching of the
> device to the target, not related to the rest of the proposal.
> Synchronously opens backing files if needed, asynchronously completes
> the job.

Can this be made part of a 'transaction'?  Likewise, can
'block-job-cancel' be made part of a 'transaction'?  Having those two
commands transactionable means that you could copy multiple disks at the
same point in time (block-job-cancel) or pivot multiple disks leaving
the former files consistent at the same point in time
(block-job-complete).  It doesn't have to be done in the first round,
but we should make sure we are not precluding this for future growth.

Also, for the purposes of copying but not pivoting, you only have a safe
copy if 'len'=='offset' at the time of the cancel.  But now that you are
adding the possibility of mirroring reverting to copying, there is a
race: I can probe and see that we are in mirroring, then issue a
'block-job-cancel' to effect a copy operation, but in the meantime
things reverted, and the cancel ends up leaving me with an incomplete
copy.  Maybe 'block-job-complete' should be given an optional boolean
parameter.  By default, or if the parameter is true, we pivot.  If it is
false, we do the same as 'block-job-cancel' to effect a safe copy if we
are in mirroring, and error out if we are not.  That leaves
'block-job-cancel' as a way to always cancel a job, but no longer a safe
way to guarantee a copy operation.
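
To make the suggested semantics concrete, here is a toy model in Python.
Nothing in it is existing QEMU code: the 'pivot' parameter name, the
exception, and the returned strings are all hypothetical, and a job is
just a dict with 'len' and 'offset'.  The point is that the in-sync check
and the action happen in one step, so there is no window for the race.

```python
class NotMirroringError(Exception):
    """Raised when a safe copy is requested while the job is not in sync."""


def block_job_complete(job, pivot=True):
    """Toy model of the proposal; 'pivot' is the suggested optional flag.

    Returns a string naming the outcome instead of touching a real device.
    """
    if pivot:
        # Default behaviour, unchanged from the proposal: switch to the target.
        return "pivot"
    if job["len"] != job["offset"]:
        # Not (or no longer) mirroring: refuse rather than leave a partial copy.
        raise NotMirroringError("job is not in the mirroring phase")
    return "safe-copy"


def block_job_cancel(job):
    """Toy model of cancel: always succeeds, but gives no copy guarantee."""
    return "safe-copy" if job["len"] == job["offset"] else "incomplete-copy"
```

With this split, a caller that wants a point-in-time copy uses
block_job_complete(job, pivot=False) and treats the error as "try again
later", instead of probing first and hoping nothing changed.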


> Persistent dirty bitmap
> =======================
> 
> A persistent dirty bitmap can be used by management for two reasons.
> When mirroring is used for continuous replication of storage, to record
> I/O operations that happened while the replication server is not
> connected or unavailable.  When mirroring is used for storage migration,
> to check after a management crash whether the VM must be restarted with
> the source or the destination.

Is there a particular file format for the dirty bitmap?  Is there a
header, or is it just a straight bitmap, where the size of the bitmap
file is an exact function of the size of the image that it maps?
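
If it is the latter, I would expect the relationship to look something
like this sketch.  The granularity (bytes of image covered by one bit)
and the optional header size are my assumptions; the proposal does not
state either, and the function name is made up.

```python
def bitmap_file_size(image_size, granularity, header_size=0):
    """Bytes needed for one bit per 'granularity'-byte chunk, plus a header.

    Both divisions round up, so a partial trailing chunk still gets a bit
    and a partial trailing byte is still stored.
    """
    chunks = (image_size + granularity - 1) // granularity
    return header_size + (chunks + 7) // 8


# A 10 GiB image tracked at 64 KiB per bit needs 163840 bits.
print(bitmap_file_size(10 * 1024**3, 64 * 1024))  # 20480
```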

> 
> If management crashes between (6) and (7), it can examine the dirty
> bitmap on disk.  If it is all-zeros,

Obviously, this would be all-zeros in the map portion of the file; any
header portion would not affect this.

> management can restart the virtual
> machine with /mnt/dest/diskname.img.  If it has even a single zero bit,

s/zero/non-zero/

> management can restart the virtual machine with the persistent dirty
> bitmap enabled, and later issue again a drive-mirror command to restart
> from step 4.
> 
> Paolo
> 
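
Putting the crash-recovery rule together with my header remark, the
management-side decision could be sketched as below.  This assumes the
bitmap file has been read into a bytes object and that any header has a
known fixed size; both the header_size parameter and the function names
are hypothetical.

```python
def bitmap_is_clean(data, header_size=0):
    """True if every bit in the map portion (after the header) is zero."""
    return not any(data[header_size:])


def restart_image(data, header_size=0):
    """Pick which side to restart the VM from after a management crash."""
    if bitmap_is_clean(data, header_size):
        return "destination"
    # At least one dirty bit: restart on the source with the persistent
    # dirty bitmap enabled, then issue drive-mirror again.
    return "source"
```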

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



