From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:37197) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UNI0X-0005F6-Go for qemu-devel@nongnu.org; Wed, 03 Apr 2013 03:21:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UNI0T-0008FV-Kp for qemu-devel@nongnu.org; Wed, 03 Apr 2013 03:21:21 -0400 Received: from mx4-phx2.redhat.com ([209.132.183.25]:41440) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UNI0T-0008FL-8X for qemu-devel@nongnu.org; Wed, 03 Apr 2013 03:21:17 -0400 Date: Wed, 3 Apr 2013 03:21:12 -0400 (EDT) From: Paolo Bonzini Message-ID: <1466473402.1066275.1364973672613.JavaMail.root@redhat.com> In-Reply-To: <515BC36F.7080307@linux.vnet.ibm.com> References: <1364810491-21404-1-git-send-email-xiawenc@linux.vnet.ibm.com> <1364810491-21404-3-git-send-email-xiawenc@linux.vnet.ibm.com> <20130402135520.GJ2341@dhcp-200-207.str.redhat.com> <515BC36F.7080307@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH 2/3] block: adjust qmp_transaction to be extendable List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Wenchao Xia Cc: Kevin Wolf , stefanha@gmail.com, qemu-devel@nongnu.org, dietmar@proxmox.com ----- Messaggio originale ----- > Da: "Wenchao Xia" > A: "Kevin Wolf" > Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, dietmar@proxmox.com, stef= anha@gmail.com > Inviato: Mercoled=C3=AC, 3 aprile 2013 7:51:43 > Oggetto: Re: [Qemu-devel] [PATCH 2/3] block: adjust qmp_transaction to be= extendable >=20 > =E4=BA=8E 2013-4-2 21:55, Kevin Wolf =E5=86=99=E9=81=93: > > Am 01.04.2013 um 12:01 hat Wenchao Xia geschrieben: > >> Now code for external snapshot are packaged as one case > >> in qmp_transaction, so later other operation could be added. > >> The logic in qmp_transaction is changed a bit: Original code > >> tries to create all images first and then update all *bdrv > >> once together, new code create and update one *bdrv one time, > >> and use bdrv_deappend() to rollback on fail. This allows mixing > >> different kind of requests in qmp_transaction() later. > >> > >> Signed-off-by: Wenchao Xia > >> --- > >> blockdev.c | 250 > >> +++++++++++++++++++++++++++++++++++++----------------------- > >> 1 files changed, 153 insertions(+), 97 deletions(-) > >> > >> diff --git a/blockdev.c b/blockdev.c > >> index 8cdc9ce..75416fb 100644 > >> --- a/blockdev.c > >> +++ b/blockdev.c > >> @@ -779,9 +779,155 @@ void qmp_blockdev_snapshot_sync(const char *devi= ce, > >> const char *snapshot_file, > >> > >> > >> /* New and old BlockDriverState structs for group snapshots */ > >> -typedef struct BlkTransactionStates { > >> +typedef struct BdrvActionOps { > >> + int (*commit)(BlockdevAction *action, void **p_opaque, Error **er= rp); > >> + void (*rollback)(BlockdevAction *action, void *opaque); > >> + void (*clean)(BlockdevAction *action, void *opaque); > >> +} BdrvActionOps; > > > > You don't really implement the transactional pattern that was used by > > the original code (and is used elsewhere). It should work like this: > > > > 1. Prepare: This stage performs all operations that can fail. They are > > not made active yet. For example with snapshotting, open a new > > BlockDriver state, but don't change the backing file chain yet. > > > > 2. Commit: Activate the change. This operation can never fail. For this > > reason, you never have to undo anything done here. > > > > 3. Rollback: Basically just free everything that prepare did up to the > > error. > > > > If you do it your way, you get into serious trouble if rollback involve= s > > operations that can fail. > > > > In the original code, here start the prepare: > > That is a clear comment, thanks. I changed the model since expecting > commit operation may need rollback, if not I am confident to keep > original model. > Since bdrv_snapshot_delete() can fail, I guess assertion of its > success should be used in the rollback later. No, if bdrv_snapshot_delete() can fail, you need to split it in two parts: one that can fail, and one that cannot. If you cannot, then there are two possibilities: - if the failures are minor and could be repaired with "qemu-img check -r" (e.g. lost clusters), then this is not optimal but can still be done; - otherwise, the operation simply cannot be made transactionable. In the case of qcow2_snapshot_delete, everything except ret =3D bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, nb_snapshots), &header_data, sizeof(header_data)); if (ret < 0) { goto fail; } must be in the prepare phase. Everything after "fail" (which right now is nothing, but it should at least undo the qcow2_alloc_clusters operation) must be in the rollback phase. Everything in the middle is the commit phase. Paolo =20 >=20 > >> @@ -806,125 +950,37 @@ void qmp_transaction(BlockdevActionList *dev_li= st, > >> Error **errp) > >> /* We don't do anything in this loop that commits us to the snap= shot > >> */ > >> while (NULL !=3D dev_entry) { > >> BlockdevAction *dev_info =3D NULL; > >> - BlockDriver *proto_drv; > >> - BlockDriver *drv; > >> - int flags; > >> - enum NewImageMode mode; > >> - const char *new_image_file; > >> - const char *device; > >> - const char *format =3D "qcow2"; > >> - > >> dev_info =3D dev_entry->value; > >> dev_entry =3D dev_entry->next; > >> > >> states =3D g_malloc0(sizeof(BlkTransactionStates)); > >> QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, states, entry); > >> > >> + states->action =3D dev_info; > >> switch (dev_info->kind) { > >> case BLOCKDEV_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC: > >> - device =3D dev_info->blockdev_snapshot_sync->device; > >> - if (!dev_info->blockdev_snapshot_sync->has_mode) { > >> - dev_info->blockdev_snapshot_sync->mode =3D > >> NEW_IMAGE_MODE_ABSOLUTE_PATHS; > >> - } > >> - new_image_file =3D > >> dev_info->blockdev_snapshot_sync->snapshot_file; > >> - if (dev_info->blockdev_snapshot_sync->has_format) { > >> - format =3D dev_info->blockdev_snapshot_sync->format; > >> - } > >> - mode =3D dev_info->blockdev_snapshot_sync->mode; > >> + states->ops =3D &external_snapshot_ops; > >> break; > >> default: > >> abort(); > >> } > >> > >> - drv =3D bdrv_find_format(format); > >> - if (!drv) { > >> - error_set(errp, QERR_INVALID_BLOCK_FORMAT, format); > >> - goto delete_and_fail; > >> - } > >> - > >> - states->old_bs =3D bdrv_find(device); > >> - if (!states->old_bs) { > >> - error_set(errp, QERR_DEVICE_NOT_FOUND, device); > >> - goto delete_and_fail; > >> - } > >> - > >> - if (!bdrv_is_inserted(states->old_bs)) { > >> - error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); > >> - goto delete_and_fail; > >> - } > >> - > >> - if (bdrv_in_use(states->old_bs)) { > >> - error_set(errp, QERR_DEVICE_IN_USE, device); > >> - goto delete_and_fail; > >> - } > >> - > >> - if (!bdrv_is_read_only(states->old_bs)) { > >> - if (bdrv_flush(states->old_bs)) { > >> - error_set(errp, QERR_IO_ERROR); > >> - goto delete_and_fail; > >> - } > >> - } > >> - > >> - flags =3D states->old_bs->open_flags; > >> - > >> - proto_drv =3D bdrv_find_protocol(new_image_file); > >> - if (!proto_drv) { > >> - error_set(errp, QERR_INVALID_BLOCK_FORMAT, format); > >> - goto delete_and_fail; > >> - } > >> - > >> - /* create new image w/backing file */ > >> - if (mode !=3D NEW_IMAGE_MODE_EXISTING) { > >> - bdrv_img_create(new_image_file, format, > >> - states->old_bs->filename, > >> - states->old_bs->drv->format_name, > >> - NULL, -1, flags, &local_err, false); > >> - if (error_is_set(&local_err)) { > >> - error_propagate(errp, local_err); > >> - goto delete_and_fail; > >> - } > >> - } > >> - > >> - /* We will manually add the backing_hd field to the bs later = */ > >> - states->new_bs =3D bdrv_new(""); > >> - /* TODO Inherit bs->options or only take explicit options wit= h an > >> - * extended QMP command? */ > >> - ret =3D bdrv_open(states->new_bs, new_image_file, NULL, > >> - flags | BDRV_O_NO_BACKING, drv); > >> - if (ret !=3D 0) { > >> - error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file); > >> + if (states->ops->commit(states->action, &states->opaque, errp= )) { > >> goto delete_and_fail; > >> } > >> } > > > > The following block is the commit: > > > >> - > >> - /* Now we are going to do the actual pivot. Everything up to thi= s > >> point > >> - * is reversible, but we are committed at this point */ > >> - QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) { > >> - /* This removes our old bs from the bdrv_states, and adds the= new > >> bs */ > >> - bdrv_append(states->new_bs, states->old_bs); > >> - /* We don't need (or want) to use the transactional > >> - * bdrv_reopen_multiple() across all the entries at once, bec= ause > >> we > >> - * don't want to abort all of them if one of them fails the > >> reopen */ > >> - bdrv_reopen(states->new_bs, states->new_bs->open_flags & > >> ~BDRV_O_RDWR, > >> - NULL); > >> - } > >> - > >> /* success */ > >> goto exit; > > > > And this is rollback: > > > >> delete_and_fail: > >> - /* > >> - * failure, and it is all-or-none; abandon each new bs, and keep u= sing > >> - * the original bs for all images > >> - */ > >> QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) { > >> - if (states->new_bs) { > >> - bdrv_delete(states->new_bs); > >> - } > >> + states->ops->rollback(states->action, states->opaque); > >> } > > > > Kevin > > >=20 >=20 > -- > Best Regards >=20 > Wenchao Xia >=20 >=20