Message-ID: <4F4B95A5.3000804@redhat.com>
Date: Mon, 27 Feb 2012 15:39:33 +0100
From: Paolo Bonzini
In-Reply-To: <20120227100645.3f36d52d@doriath.home>
Subject: [Qemu-devel] drive transactions (was Re: [PATCH 2/2 v2] Add the blockdev-reopen and blockdev-migrate commands)
To: Luiz Capitulino
Cc: kwolf@redhat.com, Jeff Cody, mtosatti@redhat.com, qemu-devel@nongnu.org, armbru@redhat.com, Federico Simoncelli

On 02/27/2012 02:06 PM, Luiz Capitulino wrote:
> IMHO, this is asking for two commands, where cases 1 & 2 is one of them
> and cases 3 & 4 is the other one. Note how 'incremental' goes away and
> 'new_image_file' really becomes an optional.

I really would have no idea on naming except perhaps "drive_migrate" and
"funny_drive_migrate_for_ovirt". But actually I must admit that it's a
rat's nest.

First, there's no reason why live-snapshotting to new_image_file
shouldn't be handled within QEMU. That would make the above table much
more orthogonal. However, oVirt likes to create its own snapshots.

Perhaps we do need to rethink this together with group snapshots. There
are four kinds of operations that we do on block devices:

(a) create an image. This is part of what blockdev-snapshot does.

(b) switch a block device to a new image. drive-reopen does this.

(c) add mirroring to a new destination.

(d) activate streaming from a base image.

drive-migrate does (b) and (c), will do (a) and (d) when we add
pre-copy, and should do (a) right now if Federico weren't an oVirt
developer. :)

Thinking more about it, the commands we _do_ need are:

- start a transaction
- switch to a new image
- add mirroring to a new destination
- commit a transaction
- roll back a transaction
- query transaction errors

Creating an image can be done outside a transaction for now because we
only support live external snapshots. Streaming can also be started
outside a transaction, because it doesn't need to be started atomically.

If we have the above elementary commands, blockdev-snapshot-sync becomes
sugar for this:

    (create image)
    blockdev-begin-transaction if no active transaction
    drive-reopen
    blockdev-commit-transaction if no active transaction

Jeff's group snapshot can be realized as this:

    blockdev-begin-transaction
    blockdev-snapshot-sync
    ...
    blockdev-snapshot-sync
    blockdev-commit-transaction

And for drive-migrate, let's look at the above 3 cases:

> > 1) incremental=false, new_image_file not passed:
> >    right now fail; in the future:
> >    - create an image on dest with no backing file;
> >    - writes will be mirrored to current_image_file and dest
> >    - start streaming from current_image_file to dest

This is a new command "drive-mirror device dest", which does:

    (create image for dest)
    blockdev-begin-transaction if no active transaction
    drive-mirror device dest
    blockdev-commit-transaction if no active transaction

The command does this:

- mirror writes to current_image_file and dest
- start streaming from current_image_file to dest

The second part can be suppressed with a boolean argument.

> > 2) incremental=false, new_image_file passed:
> >    right now fail; in the future:
> >    - create an image on dest with no backing file;
> >    - live-snapshot based on current_image_file to new_image_file;
> >    - writes will be mirrored to new_image_file and dest
> >    - start streaming from current_image_file to dest

Atomicity is not needed here, so the user can simply issue:

    blockdev-snapshot-sync device new-image-file
    drive-mirror device dest

> > 4) incremental=true, new_image_file passed:
> >    - no images will be created
> >    - writes will be mirrored to new_image_file and dest

No need to provide this from within QEMU, because libvirt/oVirt can do
the dance using elementary operations:

    blockdev-begin-transaction
    drive-reopen device new-image-file
    drive-mirror streaming=false device dest
    blockdev-commit-transaction

No strange optional arguments, no proliferation of commands, etc. The
only downside is that if someone tries to do (4) without transactions
(or without stopping the VM) they'll get corruption because atomicity is
required.

Paolo
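
To make the begin/commit/rollback semantics proposed above concrete, here
is a toy Python model. It is only a sketch under stated assumptions: the
class BlockTransaction, its methods, and the dict-based drive state are
all hypothetical and are not QEMU code or any existing QMP interface. The
point it illustrates is that reopen and mirror requests are queued and
applied together at commit, or not at all if any request recorded an
error.

```python
class BlockTransaction:
    """Toy model of the proposed transaction commands (hypothetical, not QEMU code)."""

    def __init__(self, drives):
        self.drives = drives   # device name -> {"image": ..., "mirror": ...}
        self.pending = None    # None means no transaction is active
        self.errors = []

    def begin(self):
        if self.pending is not None:
            raise RuntimeError("a transaction is already active")
        self.pending, self.errors = [], []

    def _queue(self, device, field, value):
        # Nothing touches self.drives here; changes are only recorded.
        if self.pending is None:
            raise RuntimeError("no active transaction")
        if device not in self.drives:
            self.errors.append("unknown device: " + device)
        else:
            self.pending.append((device, field, value))

    def reopen(self, device, new_image):
        """Models 'switch to a new image'."""
        self._queue(device, "image", new_image)

    def mirror(self, device, dest):
        """Models 'add mirroring to a new destination'."""
        self._queue(device, "mirror", dest)

    def rollback(self):
        self.pending = None    # discard queued changes, keep errors for querying

    def query_errors(self):
        return list(self.errors)

    def commit(self):
        if self.pending is None:
            raise RuntimeError("no active transaction")
        if self.errors:
            self.rollback()
            return False
        # All queued changes become visible in one step.
        for device, field, value in self.pending:
            self.drives[device][field] = value
        self.pending = None
        return True


def reopen_and_mirror(txn, device, new_image, dest):
    """Case (4): reopen plus mirror, applied atomically."""
    txn.begin()
    txn.reopen(device, new_image)
    txn.mirror(device, dest)
    return txn.commit()
```

In this model a failed request leaves the drive state untouched, which is
the property case (4) depends on: the image switch and the mirror must
take effect together.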