* [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn
@ 2015-10-23 23:56 John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 01/14] qapi: Add transaction support to block-dirty-bitmap operations John Snow
` (15 more replies)
0 siblings, 16 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
Welcome to V10!
Where'd 8 and 9 go? Private off-list missives from Fam.
Now you, I, and everyone on qemu-devel are staring at V10.
What's new in V10?
I replaced the per-action "transactional-cancel" parameter with
a per-transaction paremeter named "err-cancel" which is implemented
as an enum in case we want to add new behaviors in the future, such
as a "jobs only" cancel mode.
For now, it's "all" or "none", and if you use it with actions that
do not support the latent transactional cancel, you will receive
an error for your troubles.
This version primarily changed patches 10,11 and replaced it with
patches 10-12 that are cut a little differently.
This is based on top of the work by Stefan Hajnoczi and Fam Zheng.
Recap: motivation for block job transactions
--------------------------------------------
If an incremental backup block job fails then we reclaim the bitmap so
the job can be retried. The problem comes when multiple jobs are started as
part of a qmp 'transaction' command. We need to group these jobs in a
transaction so that either all jobs complete successfully or all bitmaps are
reclaimed.
Without transactions, there is a case where some jobs complete successfully and
throw away their bitmaps, making it impossible to retry the backup by rerunning
the command if one of the jobs fails.
How does this implementation work?
----------------------------------
These patches add a BlockJobTxn object with the following API:
txn = block_job_txn_new();
block_job_txn_add_job(txn, job1);
block_job_txn_add_job(txn, job2);
The jobs either both complete successfully or they both fail/cancel. If the
user cancels job1 then job2 will also be cancelled and vice versa.
Jobs objects stay alive waiting for other jobs to complete, even if the
coroutines have returned. They can be cancelled by the user during this time.
Job blockers are still in effect and no other block job can run on this device
in the meantime (since QEMU currently only allows 1 job per device). This is
the main drawback to this approach but reasonable since you probably don't want
to run other jobs/operations until you're sure the backup was successful (you
won't be able to retry a failed backup if there's a new job running).
[History]
v10: Replaced per-action parameter with per-transaction properties.
Patches 10,11 were split into 10-12.
v9: this version fixes a reference count problem with job->bs,
in patch 05.
v8: Rebase on to master.
Minor fixes addressing John Snow's comments.
v7: Add Eric's rev-by in 1, 11.
Add Max's rev-by in 4, 5, 9, 10, 11.
Add John's rev-by in 5, 6, 8.
Fix wording for 6. [John]
Fix comment of block_job_txn_add_job() in 9. [Max]
Remove superfluous hunks, and document default value in 11. [Eric]
Update Makefile dep in 14. [Max]
________________________________________________________________________________
For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch block-transpop
https://github.com/jnsnow/qemu/tree/block-transpop
This version is tagged block-transpop-v10:
https://github.com/jnsnow/qemu/releases/tag/block-transpop-v10
Fam Zheng (6):
backup: Extract dirty bitmap handling as a separate function
blockjob: Introduce reference count and fix reference to job->bs
blockjob: Add .commit and .abort block job actions
blockjob: Add "completed" and "ret" in BlockJob
blockjob: Simplify block_job_finish_sync
block: Add block job transactions
John Snow (7):
qapi: Add transaction support to block-dirty-bitmap operations
iotests: add transactional incremental backup test
block: rename BlkTransactionState and BdrvActionOps
block/backup: Rely on commit/abort for cleanup
block: Add BlockJobTxn support to backup_run
block: add transactional properties
iotests: 124 - transactional failure test
Stefan Hajnoczi (1):
tests: add BlockJobTxn unit test
block.c | 19 +-
block/backup.c | 50 ++++--
block/mirror.c | 2 +-
blockdev.c | 430 ++++++++++++++++++++++++++++++++++-----------
blockjob.c | 188 ++++++++++++++++----
docs/bitmaps.md | 6 +-
include/block/block.h | 2 +-
include/block/block_int.h | 6 +-
include/block/blockjob.h | 85 ++++++++-
qapi-schema.json | 54 +++++-
qemu-img.c | 3 -
qmp-commands.hx | 2 +-
tests/Makefile | 3 +
tests/qemu-iotests/124 | 182 ++++++++++++++++++-
tests/qemu-iotests/124.out | 4 +-
tests/test-blockjob-txn.c | 244 +++++++++++++++++++++++++
16 files changed, 1108 insertions(+), 172 deletions(-)
create mode 100644 tests/test-blockjob-txn.c
--
2.4.3
^ permalink raw reply [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 01/14] qapi: Add transaction support to block-dirty-bitmap operations
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 02/14] iotests: add transactional incremental backup test John Snow
` (14 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
This adds two qmp commands to transactions.
block-dirty-bitmap-add allows you to create a bitmap simultaneously
alongside a new full backup to accomplish a clean synchronization
point.
block-dirty-bitmap-clear allows you to reset a bitmap back to as-if
it were new, which can also be used alongside a full backup to
accomplish a clean synchronization point.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block.c | 19 +++++++-
blockdev.c | 114 +++++++++++++++++++++++++++++++++++++++++++++-
docs/bitmaps.md | 6 +--
include/block/block.h | 1 -
include/block/block_int.h | 3 ++
qapi-schema.json | 6 ++-
6 files changed, 139 insertions(+), 10 deletions(-)
diff --git a/block.c b/block.c
index 6771c3a..fb015cd 100644
--- a/block.c
+++ b/block.c
@@ -3474,10 +3474,25 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
}
-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
- hbitmap_reset_all(bitmap->bitmap);
+ if (!out) {
+ hbitmap_reset_all(bitmap->bitmap);
+ } else {
+ HBitmap *backup = bitmap->bitmap;
+ bitmap->bitmap = hbitmap_alloc(bitmap->size,
+ hbitmap_granularity(backup));
+ *out = backup;
+ }
+}
+
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
+{
+ HBitmap *tmp = bitmap->bitmap;
+ assert(bdrv_dirty_bitmap_enabled(bitmap));
+ bitmap->bitmap = in;
+ hbitmap_free(tmp);
}
void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
diff --git a/blockdev.c b/blockdev.c
index 8141b6b..ef5a46e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1706,6 +1706,106 @@ static void blockdev_backup_clean(BlkTransactionState *common)
}
}
+typedef struct BlockDirtyBitmapState {
+ BlkTransactionState common;
+ BdrvDirtyBitmap *bitmap;
+ BlockDriverState *bs;
+ AioContext *aio_context;
+ HBitmap *backup;
+ bool prepared;
+} BlockDirtyBitmapState;
+
+static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+ Error **errp)
+{
+ Error *local_err = NULL;
+ BlockDirtyBitmapAdd *action;
+ BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+ action = common->action->block_dirty_bitmap_add;
+ /* AIO context taken and released within qmp_block_dirty_bitmap_add */
+ qmp_block_dirty_bitmap_add(action->node, action->name,
+ action->has_granularity, action->granularity,
+ &local_err);
+
+ if (!local_err) {
+ state->prepared = true;
+ } else {
+ error_propagate(errp, local_err);
+ }
+}
+
+static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+{
+ BlockDirtyBitmapAdd *action;
+ BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+ action = common->action->block_dirty_bitmap_add;
+ /* Should not be able to fail: IF the bitmap was added via .prepare(),
+ * then the node reference and bitmap name must have been valid.
+ */
+ if (state->prepared) {
+ qmp_block_dirty_bitmap_remove(action->node, action->name, &error_abort);
+ }
+}
+
+static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+ Error **errp)
+{
+ BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+ BlockDirtyBitmap *action;
+
+ action = common->action->block_dirty_bitmap_clear;
+ state->bitmap = block_dirty_bitmap_lookup(action->node,
+ action->name,
+ &state->bs,
+ &state->aio_context,
+ errp);
+ if (!state->bitmap) {
+ return;
+ }
+
+ if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
+ error_setg(errp, "Cannot modify a frozen bitmap");
+ return;
+ } else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
+ error_setg(errp, "Cannot clear a disabled bitmap");
+ return;
+ }
+
+ bdrv_clear_dirty_bitmap(state->bitmap, &state->backup);
+ /* AioContext is released in .clean() */
+}
+
+static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
+{
+ BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+ bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
+}
+
+static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
+{
+ BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+ hbitmap_free(state->backup);
+}
+
+static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
+{
+ BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+ if (state->aio_context) {
+ aio_context_release(state->aio_context);
+ }
+}
+
static void abort_prepare(BlkTransactionState *common, Error **errp)
{
error_setg(errp, "Transaction aborted using Abort action");
@@ -1746,6 +1846,18 @@ static const BdrvActionOps actions[] = {
.abort = internal_snapshot_abort,
.clean = internal_snapshot_clean,
},
+ [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD] = {
+ .instance_size = sizeof(BlockDirtyBitmapState),
+ .prepare = block_dirty_bitmap_add_prepare,
+ .abort = block_dirty_bitmap_add_abort,
+ },
+ [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR] = {
+ .instance_size = sizeof(BlockDirtyBitmapState),
+ .prepare = block_dirty_bitmap_clear_prepare,
+ .commit = block_dirty_bitmap_clear_commit,
+ .abort = block_dirty_bitmap_clear_abort,
+ .clean = block_dirty_bitmap_clear_clean,
+ }
};
/*
@@ -2126,7 +2238,7 @@ void qmp_block_dirty_bitmap_clear(const char *node, const char *name,
goto out;
}
- bdrv_clear_dirty_bitmap(bitmap);
+ bdrv_clear_dirty_bitmap(bitmap, NULL);
out:
aio_context_release(aio_context);
diff --git a/docs/bitmaps.md b/docs/bitmaps.md
index fa87f07..9fd8ea6 100644
--- a/docs/bitmaps.md
+++ b/docs/bitmaps.md
@@ -97,11 +97,7 @@ which is included at the end of this document.
}
```
-## Transactions (Not yet implemented)
-
-* Transactional commands are forthcoming in a future version,
- and are not yet available for use. This section serves as
- documentation of intent for their design and usage.
+## Transactions
### Justification
diff --git a/include/block/block.h b/include/block/block.h
index 84f05ad..7e8a6f6 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -505,7 +505,6 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int nr_sectors);
void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int nr_sectors);
-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap);
void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, struct HBitmapIter *hbi);
void bdrv_set_dirty_iter(struct HBitmapIter *hbi, int64_t offset);
int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index a480f94..025aefb 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -672,4 +672,7 @@ void blk_dev_resize_cb(BlockBackend *blk);
void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
bool bdrv_requests_pending(BlockDriverState *bs);
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in);
+
#endif /* BLOCK_INT_H */
diff --git a/qapi-schema.json b/qapi-schema.json
index f60be29..a73c2c9 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1531,6 +1531,8 @@
# abort since 1.6
# blockdev-snapshot-internal-sync since 1.7
# blockdev-backup since 2.3
+# block-dirty-bitmap-add since 2.5
+# block-dirty-bitmap-clear since 2.5
##
{ 'union': 'TransactionAction',
'data': {
@@ -1538,7 +1540,9 @@
'drive-backup': 'DriveBackup',
'blockdev-backup': 'BlockdevBackup',
'abort': 'Abort',
- 'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal'
+ 'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
+ 'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
+ 'block-dirty-bitmap-clear': 'BlockDirtyBitmap'
} }
##
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 02/14] iotests: add transactional incremental backup test
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 01/14] qapi: Add transaction support to block-dirty-bitmap operations John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 03/14] block: rename BlkTransactionState and BdrvActionOps John Snow
` (13 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
Test simple usage cases for using transactions to create
and synchronize incremental backups.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
tests/qemu-iotests/124 | 54 ++++++++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/124.out | 4 ++--
2 files changed, 56 insertions(+), 2 deletions(-)
diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 9ccd118..9c1977e 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -36,6 +36,23 @@ def try_remove(img):
pass
+def transaction_action(action, **kwargs):
+ return {
+ 'type': action,
+ 'data': kwargs
+ }
+
+
+def transaction_bitmap_clear(node, name, **kwargs):
+ return transaction_action('block-dirty-bitmap-clear',
+ node=node, name=name, **kwargs)
+
+
+def transaction_drive_backup(device, target, **kwargs):
+ return transaction_action('drive-backup', device=device, target=target,
+ **kwargs)
+
+
class Bitmap:
def __init__(self, name, drive):
self.name = name
@@ -264,6 +281,43 @@ class TestIncrementalBackup(iotests.QMPTestCase):
return self.do_incremental_simple(granularity=131072)
+ def test_incremental_transaction(self):
+ '''Test: Verify backups made from transactionally created bitmaps.
+
+ Create a bitmap "before" VM execution begins, then create a second
+ bitmap AFTER writes have already occurred. Use transactions to create
+ a full backup and synchronize both bitmaps to this backup.
+ Create an incremental backup through both bitmaps and verify that
+ both backups match the current drive0 image.
+ '''
+
+ drive0 = self.drives[0]
+ bitmap0 = self.add_bitmap('bitmap0', drive0)
+ self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+ ('0xfe', '16M', '256k'),
+ ('0x64', '32736k', '64k')))
+ bitmap1 = self.add_bitmap('bitmap1', drive0)
+
+ result = self.vm.qmp('transaction', actions=[
+ transaction_bitmap_clear(bitmap0.drive['id'], bitmap0.name),
+ transaction_bitmap_clear(bitmap1.drive['id'], bitmap1.name),
+ transaction_drive_backup(drive0['id'], drive0['backup'],
+ sync='full', format=drive0['fmt'])
+ ])
+ self.assert_qmp(result, 'return', {})
+ self.wait_until_completed(drive0['id'])
+ self.files.append(drive0['backup'])
+
+ self.hmp_io_writes(drive0['id'], (('0x9a', 0, 512),
+ ('0x55', '8M', '352k'),
+ ('0x78', '15872k', '1M')))
+ # Both bitmaps should be correctly in sync.
+ self.create_incremental(bitmap0)
+ self.create_incremental(bitmap1)
+ self.vm.shutdown()
+ self.check_backups()
+
+
def test_incremental_failure(self):
'''Test: Verify backups made after a failure are correct.
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 2f7d390..594c16f 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-.......
+........
----------------------------------------------------------------------
-Ran 7 tests
+Ran 8 tests
OK
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 03/14] block: rename BlkTransactionState and BdrvActionOps
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 01/14] qapi: Add transaction support to block-dirty-bitmap operations John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 02/14] iotests: add transactional incremental backup test John Snow
@ 2015-10-23 23:56 ` John Snow
2015-11-03 16:33 ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 04/14] backup: Extract dirty bitmap handling as a separate function John Snow
` (12 subsequent siblings)
15 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
These structures are misnomers, somewhat.
(1) BlockTransactionState is not state for a transaction,
but is rather state for a single transaction action.
Rename it "BlkActionState" to be more accurate.
(2) The BdrvActionOps describes operations for the BlkActionState,
above. This name might imply a 'BdrvAction' or a 'BdrvActionState',
which there isn't.
Rename this to 'BlkActionOps' to match 'BlkActionState'.
Lastly, update the surrounding in-line documentation and comments
to reflect the current nature of how Transactions operate.
This patch changes only comments and names, and should not affect
behavior in any way.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
blockdev.c | 116 ++++++++++++++++++++++++++++++++++---------------------------
1 file changed, 65 insertions(+), 51 deletions(-)
diff --git a/blockdev.c b/blockdev.c
index ef5a46e..74360ef 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1244,43 +1244,57 @@ static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const char *node,
/* New and old BlockDriverState structs for atomic group operations */
-typedef struct BlkTransactionState BlkTransactionState;
+typedef struct BlkActionState BlkActionState;
-/* Only prepare() may fail. In a single transaction, only one of commit() or
- abort() will be called, clean() will always be called if it present. */
-typedef struct BdrvActionOps {
- /* Size of state struct, in bytes. */
+/**
+ * BlkActionOps:
+ * Table of operations that define an Action.
+ *
+ * @instance_size: Size of state struct, in bytes.
+ * @prepare: Prepare the work, must NOT be NULL.
+ * @commit: Commit the changes, can be NULL.
+ * @abort: Abort the changes on fail, can be NULL.
+ * @clean: Clean up resources after all transaction actions have called
+ * commit() or abort(). Can be NULL.
+ *
+ * Only prepare() may fail. In a single transaction, only one of commit() or
+ * abort() will be called. clean() will always be called if it is present.
+ */
+typedef struct BlkActionOps {
size_t instance_size;
- /* Prepare the work, must NOT be NULL. */
- void (*prepare)(BlkTransactionState *common, Error **errp);
- /* Commit the changes, can be NULL. */
- void (*commit)(BlkTransactionState *common);
- /* Abort the changes on fail, can be NULL. */
- void (*abort)(BlkTransactionState *common);
- /* Clean up resource in the end, can be NULL. */
- void (*clean)(BlkTransactionState *common);
-} BdrvActionOps;
+ void (*prepare)(BlkActionState *common, Error **errp);
+ void (*commit)(BlkActionState *common);
+ void (*abort)(BlkActionState *common);
+ void (*clean)(BlkActionState *common);
+} BlkActionOps;
-/*
- * This structure must be arranged as first member in child type, assuming
- * that compiler will also arrange it to the same address with parent instance.
- * Later it will be used in free().
+/**
+ * BlkActionState:
+ * Describes one Action's state within a Transaction.
+ *
+ * @action: QAPI-defined enum identifying which Action to perform.
+ * @ops: Table of ActionOps this Action can perform.
+ * @entry: List membership for all Actions in this Transaction.
+ *
+ * This structure must be arranged as first member in a subclassed type,
+ * assuming that the compiler will also arrange it to the same offsets as the
+ * base class.
*/
-struct BlkTransactionState {
+struct BlkActionState {
TransactionAction *action;
- const BdrvActionOps *ops;
- QSIMPLEQ_ENTRY(BlkTransactionState) entry;
+ const BlkActionOps *ops;
+ QSIMPLEQ_ENTRY(BlkActionState) entry;
};
/* internal snapshot private data */
typedef struct InternalSnapshotState {
- BlkTransactionState common;
+ BlkActionState common;
BlockDriverState *bs;
AioContext *aio_context;
QEMUSnapshotInfo sn;
} InternalSnapshotState;
-static void internal_snapshot_prepare(BlkTransactionState *common,
+static void internal_snapshot_prepare(BlkActionState *common,
Error **errp)
{
Error *local_err = NULL;
@@ -1376,7 +1390,7 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
state->bs = bs;
}
-static void internal_snapshot_abort(BlkTransactionState *common)
+static void internal_snapshot_abort(BlkActionState *common)
{
InternalSnapshotState *state =
DO_UPCAST(InternalSnapshotState, common, common);
@@ -1399,7 +1413,7 @@ static void internal_snapshot_abort(BlkTransactionState *common)
}
}
-static void internal_snapshot_clean(BlkTransactionState *common)
+static void internal_snapshot_clean(BlkActionState *common)
{
InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
common, common);
@@ -1411,13 +1425,13 @@ static void internal_snapshot_clean(BlkTransactionState *common)
/* external snapshot private data */
typedef struct ExternalSnapshotState {
- BlkTransactionState common;
+ BlkActionState common;
BlockDriverState *old_bs;
BlockDriverState *new_bs;
AioContext *aio_context;
} ExternalSnapshotState;
-static void external_snapshot_prepare(BlkTransactionState *common,
+static void external_snapshot_prepare(BlkActionState *common,
Error **errp)
{
int flags, ret;
@@ -1532,7 +1546,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
}
}
-static void external_snapshot_commit(BlkTransactionState *common)
+static void external_snapshot_commit(BlkActionState *common)
{
ExternalSnapshotState *state =
DO_UPCAST(ExternalSnapshotState, common, common);
@@ -1550,7 +1564,7 @@ static void external_snapshot_commit(BlkTransactionState *common)
aio_context_release(state->aio_context);
}
-static void external_snapshot_abort(BlkTransactionState *common)
+static void external_snapshot_abort(BlkActionState *common)
{
ExternalSnapshotState *state =
DO_UPCAST(ExternalSnapshotState, common, common);
@@ -1563,13 +1577,13 @@ static void external_snapshot_abort(BlkTransactionState *common)
}
typedef struct DriveBackupState {
- BlkTransactionState common;
+ BlkActionState common;
BlockDriverState *bs;
AioContext *aio_context;
BlockJob *job;
} DriveBackupState;
-static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
+static void drive_backup_prepare(BlkActionState *common, Error **errp)
{
DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
BlockDriverState *bs;
@@ -1610,7 +1624,7 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
state->job = state->bs->job;
}
-static void drive_backup_abort(BlkTransactionState *common)
+static void drive_backup_abort(BlkActionState *common)
{
DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
BlockDriverState *bs = state->bs;
@@ -1621,7 +1635,7 @@ static void drive_backup_abort(BlkTransactionState *common)
}
}
-static void drive_backup_clean(BlkTransactionState *common)
+static void drive_backup_clean(BlkActionState *common)
{
DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
@@ -1631,13 +1645,13 @@ static void drive_backup_clean(BlkTransactionState *common)
}
typedef struct BlockdevBackupState {
- BlkTransactionState common;
+ BlkActionState common;
BlockDriverState *bs;
BlockJob *job;
AioContext *aio_context;
} BlockdevBackupState;
-static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
+static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
{
BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
BlockdevBackup *backup;
@@ -1686,7 +1700,7 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
state->job = state->bs->job;
}
-static void blockdev_backup_abort(BlkTransactionState *common)
+static void blockdev_backup_abort(BlkActionState *common)
{
BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
BlockDriverState *bs = state->bs;
@@ -1697,7 +1711,7 @@ static void blockdev_backup_abort(BlkTransactionState *common)
}
}
-static void blockdev_backup_clean(BlkTransactionState *common)
+static void blockdev_backup_clean(BlkActionState *common)
{
BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
@@ -1707,7 +1721,7 @@ static void blockdev_backup_clean(BlkTransactionState *common)
}
typedef struct BlockDirtyBitmapState {
- BlkTransactionState common;
+ BlkActionState common;
BdrvDirtyBitmap *bitmap;
BlockDriverState *bs;
AioContext *aio_context;
@@ -1715,7 +1729,7 @@ typedef struct BlockDirtyBitmapState {
bool prepared;
} BlockDirtyBitmapState;
-static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+static void block_dirty_bitmap_add_prepare(BlkActionState *common,
Error **errp)
{
Error *local_err = NULL;
@@ -1736,7 +1750,7 @@ static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
}
}
-static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+static void block_dirty_bitmap_add_abort(BlkActionState *common)
{
BlockDirtyBitmapAdd *action;
BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
@@ -1751,7 +1765,7 @@ static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
}
}
-static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
Error **errp)
{
BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
@@ -1780,7 +1794,7 @@ static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
/* AioContext is released in .clean() */
}
-static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_abort(BlkActionState *common)
{
BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
common, common);
@@ -1788,7 +1802,7 @@ static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
}
-static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_commit(BlkActionState *common)
{
BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
common, common);
@@ -1796,7 +1810,7 @@ static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
hbitmap_free(state->backup);
}
-static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_clean(BlkActionState *common)
{
BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
common, common);
@@ -1806,17 +1820,17 @@ static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
}
}
-static void abort_prepare(BlkTransactionState *common, Error **errp)
+static void abort_prepare(BlkActionState *common, Error **errp)
{
error_setg(errp, "Transaction aborted using Abort action");
}
-static void abort_commit(BlkTransactionState *common)
+static void abort_commit(BlkActionState *common)
{
g_assert_not_reached(); /* this action never succeeds */
}
-static const BdrvActionOps actions[] = {
+static const BlkActionOps actions[] = {
[TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = {
.instance_size = sizeof(ExternalSnapshotState),
.prepare = external_snapshot_prepare,
@@ -1836,7 +1850,7 @@ static const BdrvActionOps actions[] = {
.clean = blockdev_backup_clean,
},
[TRANSACTION_ACTION_KIND_ABORT] = {
- .instance_size = sizeof(BlkTransactionState),
+ .instance_size = sizeof(BlkActionState),
.prepare = abort_prepare,
.commit = abort_commit,
},
@@ -1867,10 +1881,10 @@ static const BdrvActionOps actions[] = {
void qmp_transaction(TransactionActionList *dev_list, Error **errp)
{
TransactionActionList *dev_entry = dev_list;
- BlkTransactionState *state, *next;
+ BlkActionState *state, *next;
Error *local_err = NULL;
- QSIMPLEQ_HEAD(snap_bdrv_states, BlkTransactionState) snap_bdrv_states;
+ QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
QSIMPLEQ_INIT(&snap_bdrv_states);
/* drain all i/o before any operations */
@@ -1879,7 +1893,7 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
/* We don't do anything in this loop that commits us to the operations */
while (NULL != dev_entry) {
TransactionAction *dev_info = NULL;
- const BdrvActionOps *ops;
+ const BlkActionOps *ops;
dev_info = dev_entry->value;
dev_entry = dev_entry->next;
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 04/14] backup: Extract dirty bitmap handling as a separate function
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (2 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 03/14] block: rename BlkTransactionState and BdrvActionOps John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 05/14] blockjob: Introduce reference count and fix reference to job->bs John Snow
` (11 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
From: Fam Zheng <famz@redhat.com>
This will be reused by the coming new transactional completion code.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
block/backup.c | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 5696431..8b92ed1 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -218,6 +218,22 @@ static void backup_iostatus_reset(BlockJob *job)
bdrv_iostatus_reset(s->target);
}
+static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
+{
+ BdrvDirtyBitmap *bm;
+ BlockDriverState *bs = job->common.bs;
+
+ if (ret < 0 || block_job_is_cancelled(&job->common)) {
+ /* Merge the successor back into the parent, delete nothing. */
+ bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
+ assert(bm);
+ } else {
+ /* Everything is fine, delete this bitmap and install the backup. */
+ bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
+ assert(bm);
+ }
+}
+
static const BlockJobDriver backup_job_driver = {
.instance_size = sizeof(BackupBlockJob),
.job_type = BLOCK_JOB_TYPE_BACKUP,
@@ -438,16 +454,7 @@ static void coroutine_fn backup_run(void *opaque)
qemu_co_rwlock_unlock(&job->flush_rwlock);
if (job->sync_bitmap) {
- BdrvDirtyBitmap *bm;
- if (ret < 0 || block_job_is_cancelled(&job->common)) {
- /* Merge the successor back into the parent, delete nothing. */
- bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
- assert(bm);
- } else {
- /* Everything is fine, delete this bitmap and install the backup. */
- bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
- assert(bm);
- }
+ backup_cleanup_sync_bitmap(job, ret);
}
hbitmap_free(job->bitmap);
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 05/14] blockjob: Introduce reference count and fix reference to job->bs
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (3 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 04/14] backup: Extract dirty bitmap handling as a separate function John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 06/14] blockjob: Add .commit and .abort block job actions John Snow
` (10 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
From: Fam Zheng <famz@redhat.com>
Add reference count to block job, meanwhile move the ownership of the
reference to job->bs from the caller (which is released in two
completion callbacks) to the block job itself. It is necessary for
block_job_complete_sync to work, because block job shouldn't live longer
than its bs, as asserted in bdrv_delete.
Now block_job_complete_sync can be simplified.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
block/mirror.c | 2 +-
blockdev.c | 28 ----------------------------
blockjob.c | 25 ++++++++++++++++---------
include/block/blockjob.h | 18 +++++++++++++++---
qemu-img.c | 3 ---
5 files changed, 32 insertions(+), 44 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 7e43511..9e28ed5 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -736,7 +736,7 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!s->dirty_bitmap) {
g_free(s->replaces);
- block_job_release(bs);
+ block_job_unref(&s->common);
return;
}
bdrv_set_enable_write_cache(s->target, true);
diff --git a/blockdev.c b/blockdev.c
index 74360ef..810347f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -281,32 +281,6 @@ typedef struct {
BlockDriverState *bs;
} BDRVPutRefBH;
-static void bdrv_put_ref_bh(void *opaque)
-{
- BDRVPutRefBH *s = opaque;
-
- bdrv_unref(s->bs);
- qemu_bh_delete(s->bh);
- g_free(s);
-}
-
-/*
- * Release a BDS reference in a BH
- *
- * It is not safe to use bdrv_unref() from a callback function when the callers
- * still need the BlockDriverState. In such cases we schedule a BH to release
- * the reference.
- */
-static void bdrv_put_ref_bh_schedule(BlockDriverState *bs)
-{
- BDRVPutRefBH *s;
-
- s = g_new(BDRVPutRefBH, 1);
- s->bh = qemu_bh_new(bdrv_put_ref_bh, s);
- s->bs = bs;
- qemu_bh_schedule(s->bh);
-}
-
static int parse_block_error_action(const char *buf, bool is_read, Error **errp)
{
if (!strcmp(buf, "ignore")) {
@@ -2393,8 +2367,6 @@ static void block_job_cb(void *opaque, int ret)
} else {
block_job_event_completed(bs->job, msg);
}
-
- bdrv_put_ref_bh_schedule(bs);
}
void qmp_block_stream(const char *device,
diff --git a/blockjob.c b/blockjob.c
index 1da5491..015441b 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -59,6 +59,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
job->cb = cb;
job->opaque = opaque;
job->busy = true;
+ job->refcnt = 1;
bs->job = job;
/* Only set speed when necessary to avoid NotSupported error */
@@ -67,7 +68,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
block_job_set_speed(job, speed, &local_err);
if (local_err) {
- block_job_release(bs);
+ block_job_unref(job);
error_propagate(errp, local_err);
return NULL;
}
@@ -75,15 +76,21 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
return job;
}
-void block_job_release(BlockDriverState *bs)
+void block_job_ref(BlockJob *job)
{
- BlockJob *job = bs->job;
+ ++job->refcnt;
+}
- bs->job = NULL;
- bdrv_op_unblock_all(bs, job->blocker);
- error_free(job->blocker);
- g_free(job->id);
- g_free(job);
+void block_job_unref(BlockJob *job)
+{
+ if (--job->refcnt == 0) {
+ job->bs->job = NULL;
+ bdrv_op_unblock_all(job->bs, job->blocker);
+ bdrv_unref(job->bs);
+ error_free(job->blocker);
+ g_free(job->id);
+ g_free(job);
+ }
}
void block_job_completed(BlockJob *job, int ret)
@@ -92,7 +99,7 @@ void block_job_completed(BlockJob *job, int ret)
assert(bs->job == job);
job->cb(job->opaque, ret);
- block_job_release(bs);
+ block_job_unref(job);
}
void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 289b13f..b649a40 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -130,6 +130,9 @@ struct BlockJob {
/** The opaque value that is passed to the completion function. */
void *opaque;
+
+ /** Reference count of the block job */
+ int refcnt;
};
/**
@@ -174,12 +177,21 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);
void block_job_yield(BlockJob *job);
/**
- * block_job_release:
+ * block_job_ref:
* @bs: The block device.
*
- * Release job resources when an error occurred or job completed.
+ * Grab a reference to the block job. Should be paired with block_job_unref.
*/
-void block_job_release(BlockDriverState *bs);
+void block_job_ref(BlockJob *job);
+
+/**
+ * block_job_unref:
+ * @bs: The block device.
+ *
+ * Release reference to the block job and release resources if it is the last
+ * reference.
+ */
+void block_job_unref(BlockJob *job);
/**
* block_job_completed:
diff --git a/qemu-img.c b/qemu-img.c
index 3025776..510fdbd 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -645,9 +645,6 @@ static void common_block_job_cb(void *opaque, int ret)
if (ret < 0) {
error_setg_errno(cbi->errp, -ret, "Block job failed");
}
-
- /* Drop this block job's reference */
- bdrv_unref(cbi->bs);
}
static void run_block_job(BlockJob *job, Error **errp)
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 06/14] blockjob: Add .commit and .abort block job actions
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (4 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 05/14] blockjob: Introduce reference count and fix reference to job->bs John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 07/14] blockjob: Add "completed" and "ret" in BlockJob John Snow
` (9 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
From: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
include/block/blockjob.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index b649a40..ed856d7 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -50,6 +50,26 @@ typedef struct BlockJobDriver {
* manually.
*/
void (*complete)(BlockJob *job, Error **errp);
+
+ /**
+ * If the callback is not NULL, it will be invoked when all the jobs
+ * belonging to the same transaction complete; or upon this job's
+ * completion if it is not in a transaction. Skipped if NULL.
+ *
+ * All jobs will complete with a call to either .commit() or .abort() but
+ * never both.
+ */
+ void (*commit)(BlockJob *job);
+
+ /**
+ * If the callback is not NULL, it will be invoked when any job in the
+ * same transaction fails; or upon this job's failure (due to error or
+ * cancellation) if it is not in a transaction. Skipped if NULL.
+ *
+ * All jobs will complete with a call to either .commit() or .abort() but
+ * never both.
+ */
+ void (*abort)(BlockJob *job);
} BlockJobDriver;
/**
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 07/14] blockjob: Add "completed" and "ret" in BlockJob
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (5 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 06/14] blockjob: Add .commit and .abort block job actions John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 08/14] blockjob: Simplify block_job_finish_sync John Snow
` (8 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
From: Fam Zheng <famz@redhat.com>
They are set when block_job_completed is called.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
blockjob.c | 3 +++
include/block/blockjob.h | 9 +++++++++
2 files changed, 12 insertions(+)
diff --git a/blockjob.c b/blockjob.c
index 015441b..0cf6b95 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -98,6 +98,9 @@ void block_job_completed(BlockJob *job, int ret)
BlockDriverState *bs = job->bs;
assert(bs->job == job);
+ assert(!job->completed);
+ job->completed = true;
+ job->ret = ret;
job->cb(job->opaque, ret);
block_job_unref(job);
}
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index ed856d7..c70d55a 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -153,6 +153,15 @@ struct BlockJob {
/** Reference count of the block job */
int refcnt;
+
+ /* True if this job has reported completion by calling block_job_completed.
+ */
+ bool completed;
+
+ /* ret code passed to block_job_completed.
+ */
+ int ret;
+
};
/**
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 08/14] blockjob: Simplify block_job_finish_sync
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (6 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 07/14] blockjob: Add "completed" and "ret" in BlockJob John Snow
@ 2015-10-23 23:56 ` John Snow
2015-11-03 13:52 ` Stefan Hajnoczi
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 09/14] block: Add block job transactions John Snow
` (7 subsequent siblings)
15 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
From: Fam Zheng <famz@redhat.com>
With job->completed and job->ret to replace BlockFinishData.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
blockjob.c | 27 ++++++---------------------
1 file changed, 6 insertions(+), 21 deletions(-)
diff --git a/blockjob.c b/blockjob.c
index 0cf6b95..2909d61 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -187,43 +187,28 @@ struct BlockFinishData {
int ret;
};
-static void block_job_finish_cb(void *opaque, int ret)
-{
- struct BlockFinishData *data = opaque;
-
- data->cancelled = block_job_is_cancelled(data->job);
- data->ret = ret;
- data->cb(data->opaque, ret);
-}
-
static int block_job_finish_sync(BlockJob *job,
void (*finish)(BlockJob *, Error **errp),
Error **errp)
{
- struct BlockFinishData data;
BlockDriverState *bs = job->bs;
Error *local_err = NULL;
+ int ret;
assert(bs->job == job);
- /* Set up our own callback to store the result and chain to
- * the original callback.
- */
- data.job = job;
- data.cb = job->cb;
- data.opaque = job->opaque;
- data.ret = -EINPROGRESS;
- job->cb = block_job_finish_cb;
- job->opaque = &data;
+ block_job_ref(job);
finish(job, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return -EBUSY;
}
- while (data.ret == -EINPROGRESS) {
+ while (!job->completed) {
aio_poll(bdrv_get_aio_context(bs), true);
}
- return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
+ ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
+ block_job_unref(job);
+ return ret;
}
/* A wrapper around block_job_cancel() taking an Error ** parameter so it may be
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 09/14] block: Add block job transactions
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (7 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 08/14] blockjob: Simplify block_job_finish_sync John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 10/14] block/backup: Rely on commit/abort for cleanup John Snow
` (6 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
From: Fam Zheng <famz@redhat.com>
Sometimes block jobs must execute as a transaction group. Finishing
jobs wait until all other jobs are ready to complete successfully.
Failure or cancellation of one job cancels the other jobs in the group.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[Rewrite the implementation which is now contained in block_job_completed.
--Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
blockjob.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++-
include/block/block.h | 1 +
include/block/blockjob.h | 38 +++++++++++++
3 files changed, 172 insertions(+), 2 deletions(-)
diff --git a/blockjob.c b/blockjob.c
index 2909d61..f4cf305 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -36,6 +36,19 @@
#include "qemu/timer.h"
#include "qapi-event.h"
+/* Transactional group of block jobs */
+struct BlockJobTxn {
+
+ /* Is this txn being cancelled? */
+ bool aborting;
+
+ /* List of jobs */
+ QLIST_HEAD(, BlockJob) jobs;
+
+ /* Reference count */
+ int refcnt;
+};
+
void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
int64_t speed, BlockCompletionFunc *cb,
void *opaque, Error **errp)
@@ -93,6 +106,86 @@ void block_job_unref(BlockJob *job)
}
}
+static void block_job_completed_single(BlockJob *job)
+{
+ if (!job->ret) {
+ if (job->driver->commit) {
+ job->driver->commit(job);
+ }
+ } else {
+ if (job->driver->abort) {
+ job->driver->abort(job);
+ }
+ }
+ job->cb(job->opaque, job->ret);
+ if (job->txn) {
+ block_job_txn_unref(job->txn);
+ }
+ block_job_unref(job);
+}
+
+static void block_job_completed_txn_abort(BlockJob *job)
+{
+ AioContext *ctx;
+ BlockJobTxn *txn = job->txn;
+ BlockJob *other_job, *next;
+
+ if (txn->aborting) {
+ /*
+ * We are cancelled by another job, which will handle everything.
+ */
+ return;
+ }
+ txn->aborting = true;
+ /* We are the first failed job. Cancel other jobs. */
+ QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
+ ctx = bdrv_get_aio_context(other_job->bs);
+ aio_context_acquire(ctx);
+ }
+ QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
+ if (other_job == job || other_job->completed) {
+ /* Other jobs are "effectively" cancelled by us, set the status for
+ * them; this job, however, may or may not be cancelled, depending
+ * on the caller, so leave it. */
+ if (other_job != job) {
+ other_job->cancelled = true;
+ }
+ continue;
+ }
+ block_job_cancel_sync(other_job);
+ assert(other_job->completed);
+ }
+ QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+ ctx = bdrv_get_aio_context(other_job->bs);
+ block_job_completed_single(other_job);
+ aio_context_release(ctx);
+ }
+}
+
+static void block_job_completed_txn_success(BlockJob *job)
+{
+ AioContext *ctx;
+ BlockJobTxn *txn = job->txn;
+ BlockJob *other_job, *next;
+ /*
+ * Successful completion, see if there are other running jobs in this
+ * txn.
+ */
+ QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
+ if (!other_job->completed) {
+ return;
+ }
+ }
+ /* We are the last completed job, commit the transaction. */
+ QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+ ctx = bdrv_get_aio_context(other_job->bs);
+ aio_context_acquire(ctx);
+ assert(other_job->ret == 0);
+ block_job_completed_single(other_job);
+ aio_context_release(ctx);
+ }
+}
+
void block_job_completed(BlockJob *job, int ret)
{
BlockDriverState *bs = job->bs;
@@ -101,8 +194,13 @@ void block_job_completed(BlockJob *job, int ret)
assert(!job->completed);
job->completed = true;
job->ret = ret;
- job->cb(job->opaque, ret);
- block_job_unref(job);
+ if (!job->txn) {
+ block_job_completed_single(job);
+ } else if (ret < 0 || block_job_is_cancelled(job)) {
+ block_job_completed_txn_abort(job);
+ } else {
+ block_job_completed_txn_success(job);
+ }
}
void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -400,3 +498,36 @@ void block_job_defer_to_main_loop(BlockJob *job,
qemu_bh_schedule(data->bh);
}
+
+BlockJobTxn *block_job_txn_new(void)
+{
+ BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
+ QLIST_INIT(&txn->jobs);
+ txn->refcnt = 1;
+ return txn;
+}
+
+static void block_job_txn_ref(BlockJobTxn *txn)
+{
+ txn->refcnt++;
+}
+
+void block_job_txn_unref(BlockJobTxn *txn)
+{
+ if (txn && --txn->refcnt == 0) {
+ g_free(txn);
+ }
+}
+
+void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
+{
+ if (!txn) {
+ return;
+ }
+
+ assert(!job->txn);
+ job->txn = txn;
+
+ QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
+ block_job_txn_ref(txn);
+}
diff --git a/include/block/block.h b/include/block/block.h
index 7e8a6f6..9142884 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -14,6 +14,7 @@ typedef struct BlockDriver BlockDriver;
typedef struct BlockJob BlockJob;
typedef struct BdrvChild BdrvChild;
typedef struct BdrvChildRole BdrvChildRole;
+typedef struct BlockJobTxn BlockJobTxn;
typedef struct BlockDriverInfo {
/* in bytes, 0 if irrelevant */
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index c70d55a..d84ccd8 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -162,6 +162,9 @@ struct BlockJob {
*/
int ret;
+ /** Non-NULL if this job is part of a transaction */
+ BlockJobTxn *txn;
+ QLIST_ENTRY(BlockJob) txn_list;
};
/**
@@ -405,4 +408,39 @@ void block_job_defer_to_main_loop(BlockJob *job,
BlockJobDeferToMainLoopFn *fn,
void *opaque);
+/**
+ * block_job_txn_new:
+ *
+ * Allocate and return a new block job transaction. Jobs can be added to the
+ * transaction using block_job_txn_add_job().
+ *
+ * The transaction is automatically freed when the last job completes or is
+ * cancelled.
+ *
+ * All jobs in the transaction either complete successfully or fail/cancel as a
+ * group. Jobs wait for each other before completing. Cancelling one job
+ * cancels all jobs in the transaction.
+ */
+BlockJobTxn *block_job_txn_new(void);
+
+/**
+ * block_job_txn_unref:
+ *
+ * Release a reference that was previously acquired with block_job_txn_add_job
+ * or block_job_txn_new. If it's the last reference to the object, it will be
+ * freed.
+ */
+void block_job_txn_unref(BlockJobTxn *txn);
+
+/**
+ * block_job_txn_add_job:
+ * @txn: The transaction (may be NULL)
+ * @job: Job to add to the transaction
+ *
+ * Add @job to the transaction. The @job must not already be in a transaction.
+ * The caller must call either block_job_txn_unref() or block_job_completed()
+ * to release the reference that is automatically grabbed here.
+ */
+void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
+
#endif
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 10/14] block/backup: Rely on commit/abort for cleanup
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (8 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 09/14] block: Add block job transactions John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 11/14] block: Add BlockJobTxn support to backup_run John Snow
` (5 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
Switch over to the new .commit/.abort handlers for
cleaning up incremental bitmaps.
[split up from a patch originally by Stefan and Fam. --js]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
block/backup.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index 8b92ed1..b4534a0 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -234,11 +234,29 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
}
}
+static void backup_commit(BlockJob *job)
+{
+ BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+ if (s->sync_bitmap) {
+ backup_cleanup_sync_bitmap(s, 0);
+ }
+}
+
+static void backup_abort(BlockJob *job)
+{
+ BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+ if (s->sync_bitmap) {
+ backup_cleanup_sync_bitmap(s, -1);
+ }
+}
+
static const BlockJobDriver backup_job_driver = {
.instance_size = sizeof(BackupBlockJob),
.job_type = BLOCK_JOB_TYPE_BACKUP,
.set_speed = backup_set_speed,
.iostatus_reset = backup_iostatus_reset,
+ .commit = backup_commit,
+ .abort = backup_abort,
};
static BlockErrorAction backup_error_action(BackupBlockJob *job,
@@ -452,10 +470,6 @@ static void coroutine_fn backup_run(void *opaque)
/* wait until pending backup_do_cow() calls have completed */
qemu_co_rwlock_wrlock(&job->flush_rwlock);
qemu_co_rwlock_unlock(&job->flush_rwlock);
-
- if (job->sync_bitmap) {
- backup_cleanup_sync_bitmap(job, ret);
- }
hbitmap_free(job->bitmap);
bdrv_iostatus_disable(target);
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 11/14] block: Add BlockJobTxn support to backup_run
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (9 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 10/14] block/backup: Rely on commit/abort for cleanup John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 12/14] block: add transactional properties John Snow
` (4 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
Allow a BlockJobTxn to be passed into backup_run, which
will allow the job to join a transactional group if present.
Propagate this new parameter outward into new QMP helper
functions in blockdev.c to allow transaction commands to
pass forward their BlockJobTxn object in a forthcoming patch.
[split up from a patch originally by Stefan and Fam. --js]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
block/backup.c | 3 +-
blockdev.c | 112 ++++++++++++++++++++++++++++++++++------------
include/block/block_int.h | 3 +-
3 files changed, 88 insertions(+), 30 deletions(-)
diff --git a/block/backup.c b/block/backup.c
index b4534a0..f306573 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -486,7 +486,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb, void *opaque,
- Error **errp)
+ BlockJobTxn *txn, Error **errp)
{
int64_t len;
@@ -568,6 +568,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
sync_bitmap : NULL;
job->common.len = len;
job->common.co = qemu_coroutine_create(backup_run);
+ block_job_txn_add_job(txn, &job->common);
qemu_coroutine_enter(job->common.co, job);
return;
diff --git a/blockdev.c b/blockdev.c
index 810347f..d1dcf68 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1557,6 +1557,18 @@ typedef struct DriveBackupState {
BlockJob *job;
} DriveBackupState;
+static void do_drive_backup(const char *device, const char *target,
+ bool has_format, const char *format,
+ enum MirrorSyncMode sync,
+ bool has_mode, enum NewImageMode mode,
+ bool has_speed, int64_t speed,
+ bool has_bitmap, const char *bitmap,
+ bool has_on_source_error,
+ BlockdevOnError on_source_error,
+ bool has_on_target_error,
+ BlockdevOnError on_target_error,
+ BlockJobTxn *txn, Error **errp);
+
static void drive_backup_prepare(BlkActionState *common, Error **errp)
{
DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
@@ -1580,15 +1592,15 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
state->aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(state->aio_context);
- qmp_drive_backup(backup->device, backup->target,
- backup->has_format, backup->format,
- backup->sync,
- backup->has_mode, backup->mode,
- backup->has_speed, backup->speed,
- backup->has_bitmap, backup->bitmap,
- backup->has_on_source_error, backup->on_source_error,
- backup->has_on_target_error, backup->on_target_error,
- &local_err);
+ do_drive_backup(backup->device, backup->target,
+ backup->has_format, backup->format,
+ backup->sync,
+ backup->has_mode, backup->mode,
+ backup->has_speed, backup->speed,
+ backup->has_bitmap, backup->bitmap,
+ backup->has_on_source_error, backup->on_source_error,
+ backup->has_on_target_error, backup->on_target_error,
+ NULL, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -1625,6 +1637,15 @@ typedef struct BlockdevBackupState {
AioContext *aio_context;
} BlockdevBackupState;
+static void do_blockdev_backup(const char *device, const char *target,
+ enum MirrorSyncMode sync,
+ bool has_speed, int64_t speed,
+ bool has_on_source_error,
+ BlockdevOnError on_source_error,
+ bool has_on_target_error,
+ BlockdevOnError on_target_error,
+ BlockJobTxn *txn, Error **errp);
+
static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
{
BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
@@ -1659,12 +1680,12 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
}
aio_context_acquire(state->aio_context);
- qmp_blockdev_backup(backup->device, backup->target,
- backup->sync,
- backup->has_speed, backup->speed,
- backup->has_on_source_error, backup->on_source_error,
- backup->has_on_target_error, backup->on_target_error,
- &local_err);
+ do_blockdev_backup(backup->device, backup->target,
+ backup->sync,
+ backup->has_speed, backup->speed,
+ backup->has_on_source_error, backup->on_source_error,
+ backup->has_on_target_error, backup->on_target_error,
+ NULL, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -2537,15 +2558,17 @@ out:
aio_context_release(aio_context);
}
-void qmp_drive_backup(const char *device, const char *target,
- bool has_format, const char *format,
- enum MirrorSyncMode sync,
- bool has_mode, enum NewImageMode mode,
- bool has_speed, int64_t speed,
- bool has_bitmap, const char *bitmap,
- bool has_on_source_error, BlockdevOnError on_source_error,
- bool has_on_target_error, BlockdevOnError on_target_error,
- Error **errp)
+static void do_drive_backup(const char *device, const char *target,
+ bool has_format, const char *format,
+ enum MirrorSyncMode sync,
+ bool has_mode, enum NewImageMode mode,
+ bool has_speed, int64_t speed,
+ bool has_bitmap, const char *bitmap,
+ bool has_on_source_error,
+ BlockdevOnError on_source_error,
+ bool has_on_target_error,
+ BlockdevOnError on_target_error,
+ BlockJobTxn *txn, Error **errp)
{
BlockBackend *blk;
BlockDriverState *bs;
@@ -2660,7 +2683,7 @@ void qmp_drive_backup(const char *device, const char *target,
backup_start(bs, target_bs, speed, sync, bmap,
on_source_error, on_target_error,
- block_job_cb, bs, &local_err);
+ block_job_cb, bs, txn, &local_err);
if (local_err != NULL) {
bdrv_unref(target_bs);
error_propagate(errp, local_err);
@@ -2671,19 +2694,37 @@ out:
aio_context_release(aio_context);
}
+void qmp_drive_backup(const char *device, const char *target,
+ bool has_format, const char *format,
+ enum MirrorSyncMode sync,
+ bool has_mode, enum NewImageMode mode,
+ bool has_speed, int64_t speed,
+ bool has_bitmap, const char *bitmap,
+ bool has_on_source_error, BlockdevOnError on_source_error,
+ bool has_on_target_error, BlockdevOnError on_target_error,
+ Error **errp)
+{
+ return do_drive_backup(device, target, has_format, format, sync,
+ has_mode, mode, has_speed, speed,
+ has_bitmap, bitmap,
+ has_on_source_error, on_source_error,
+ has_on_target_error, on_target_error,
+ NULL, errp);
+}
+
BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
{
return bdrv_named_nodes_list(errp);
}
-void qmp_blockdev_backup(const char *device, const char *target,
+void do_blockdev_backup(const char *device, const char *target,
enum MirrorSyncMode sync,
bool has_speed, int64_t speed,
bool has_on_source_error,
BlockdevOnError on_source_error,
bool has_on_target_error,
BlockdevOnError on_target_error,
- Error **errp)
+ BlockJobTxn *txn, Error **errp)
{
BlockBackend *blk;
BlockDriverState *bs;
@@ -2721,7 +2762,7 @@ void qmp_blockdev_backup(const char *device, const char *target,
bdrv_ref(target_bs);
bdrv_set_aio_context(target_bs, aio_context);
backup_start(bs, target_bs, speed, sync, NULL, on_source_error,
- on_target_error, block_job_cb, bs, &local_err);
+ on_target_error, block_job_cb, bs, txn, &local_err);
if (local_err != NULL) {
bdrv_unref(target_bs);
error_propagate(errp, local_err);
@@ -2730,6 +2771,21 @@ out:
aio_context_release(aio_context);
}
+void qmp_blockdev_backup(const char *device, const char *target,
+ enum MirrorSyncMode sync,
+ bool has_speed, int64_t speed,
+ bool has_on_source_error,
+ BlockdevOnError on_source_error,
+ bool has_on_target_error,
+ BlockdevOnError on_target_error,
+ Error **errp)
+{
+ do_blockdev_backup(device, target, sync, has_speed, speed,
+ has_on_source_error, on_source_error,
+ has_on_target_error, on_target_error,
+ NULL, errp);
+}
+
void qmp_drive_mirror(const char *device, const char *target,
bool has_format, const char *format,
bool has_node_name, const char *node_name,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 025aefb..668e792 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -648,6 +648,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
* @on_target_error: The action to take upon error writing to the target.
* @cb: Completion function for the job.
* @opaque: Opaque pointer value passed to @cb.
+ * @txn: Transaction that this job is part of (may be NULL).
*
* Start a backup operation on @bs. Clusters in @bs are written to @target
* until the job is cancelled or manually completed.
@@ -658,7 +659,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
BlockCompletionFunc *cb, void *opaque,
- Error **errp);
+ BlockJobTxn *txn, Error **errp);
void blk_set_bs(BlockBackend *blk, BlockDriverState *bs);
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (10 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 11/14] block: Add BlockJobTxn support to backup_run John Snow
@ 2015-10-23 23:56 ` John Snow
2015-11-03 15:17 ` Stefan Hajnoczi
2015-11-03 15:23 ` [Qemu-devel] " Eric Blake
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 13/14] iotests: 124 - transactional failure test John Snow
` (3 subsequent siblings)
15 siblings, 2 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
Add both transactional properties to the QMP transactional interface,
and add the BlockJobTxn that we create as a result of the err-cancel
property to the BlkActionState structure.
[split up from a patch originally by Stefan and Fam. --js]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
blockdev.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
qapi-schema.json | 48 +++++++++++++++++++++++++++++++---
qmp-commands.hx | 2 +-
3 files changed, 120 insertions(+), 8 deletions(-)
diff --git a/blockdev.c b/blockdev.c
index d1dcf68..9b5e2fa 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1033,7 +1033,7 @@ static void blockdev_do_action(int kind, void *data, Error **errp)
action.data = data;
list.value = &action;
list.next = NULL;
- qmp_transaction(&list, errp);
+ qmp_transaction(&list, false, NULL, errp);
}
void qmp_blockdev_snapshot_sync(bool has_device, const char *device,
@@ -1248,6 +1248,7 @@ typedef struct BlkActionOps {
*
* @action: QAPI-defined enum identifying which Action to perform.
* @ops: Table of ActionOps this Action can perform.
+ * @block_job_txn: Transaction which this action belongs to.
* @entry: List membership for all Actions in this Transaction.
*
* This structure must be arranged as first member in a subclassed type,
@@ -1257,6 +1258,8 @@ typedef struct BlkActionOps {
struct BlkActionState {
TransactionAction *action;
const BlkActionOps *ops;
+ BlockJobTxn *block_job_txn;
+ TransactionProperties *txn_props;
QSIMPLEQ_ENTRY(BlkActionState) entry;
};
@@ -1268,6 +1271,20 @@ typedef struct InternalSnapshotState {
QEMUSnapshotInfo sn;
} InternalSnapshotState;
+
+static int action_check_cancel_mode(BlkActionState *s, Error **errp)
+{
+ if (s->txn_props->err_cancel != ACTION_CANCEL_MODE_NONE) {
+ error_setg(errp,
+ "Action '%s' does not support Transaction property "
+ "err-cancel = %s",
+ TransactionActionKind_lookup[s->action->kind],
+ ActionCancelMode_lookup[s->txn_props->err_cancel]);
+ return -1;
+ }
+ return 0;
+}
+
static void internal_snapshot_prepare(BlkActionState *common,
Error **errp)
{
@@ -1293,6 +1310,10 @@ static void internal_snapshot_prepare(BlkActionState *common,
name = internal->name;
/* 2. check for validation */
+ if (action_check_cancel_mode(common, errp) < 0) {
+ return;
+ }
+
blk = blk_by_name(device);
if (!blk) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
@@ -1444,6 +1465,10 @@ static void external_snapshot_prepare(BlkActionState *common,
}
/* start processing */
+ if (action_check_cancel_mode(common, errp) < 0) {
+ return;
+ }
+
state->old_bs = bdrv_lookup_bs(has_device ? device : NULL,
has_node_name ? node_name : NULL,
&local_err);
@@ -1600,7 +1625,7 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
backup->has_bitmap, backup->bitmap,
backup->has_on_source_error, backup->on_source_error,
backup->has_on_target_error, backup->on_target_error,
- NULL, &local_err);
+ common->block_job_txn, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -1685,7 +1710,7 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
backup->has_speed, backup->speed,
backup->has_on_source_error, backup->on_source_error,
backup->has_on_target_error, backup->on_target_error,
- NULL, &local_err);
+ common->block_job_txn, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
common, common);
+ if (action_check_cancel_mode(common, errp) < 0) {
+ return;
+ }
+
action = common->action->block_dirty_bitmap_add;
/* AIO context taken and released within qmp_block_dirty_bitmap_add */
qmp_block_dirty_bitmap_add(action->node, action->name,
@@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
common, common);
BlockDirtyBitmap *action;
+ if (action_check_cancel_mode(common, errp) < 0) {
+ return;
+ }
+
action = common->action->block_dirty_bitmap_clear;
state->bitmap = block_dirty_bitmap_lookup(action->node,
action->name,
@@ -1869,19 +1902,50 @@ static const BlkActionOps actions[] = {
}
};
+/**
+ * Allocate a TransactionProperties structure if necessary, and fill
+ * that structure with desired defaults if they are unset.
+ */
+static TransactionProperties *get_transaction_properties(
+ TransactionProperties *props)
+{
+ if (!props) {
+ props = g_new0(TransactionProperties, 1);
+ }
+
+ if (!props->has_err_cancel) {
+ props->has_err_cancel = true;
+ props->err_cancel = ACTION_CANCEL_MODE_NONE;
+ }
+
+ return props;
+}
+
/*
* 'Atomic' group operations. The operations are performed as a set, and if
* any fail then we roll back all operations in the group.
*/
-void qmp_transaction(TransactionActionList *dev_list, Error **errp)
+void qmp_transaction(TransactionActionList *dev_list,
+ bool has_props,
+ struct TransactionProperties *props,
+ Error **errp)
{
TransactionActionList *dev_entry = dev_list;
+ BlockJobTxn *block_job_txn = NULL;
BlkActionState *state, *next;
Error *local_err = NULL;
QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
QSIMPLEQ_INIT(&snap_bdrv_states);
+ /* Does this transaction get canceled as a group on failure?
+ * If not, we don't really need to make a BlockJobTxn.
+ */
+ props = get_transaction_properties(props);
+ if (props->err_cancel != ACTION_CANCEL_MODE_NONE) {
+ block_job_txn = block_job_txn_new();
+ }
+
/* drain all i/o before any operations */
bdrv_drain_all();
@@ -1901,6 +1965,8 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
state = g_malloc0(ops->instance_size);
state->ops = ops;
state->action = dev_info;
+ state->block_job_txn = block_job_txn;
+ state->txn_props = props;
QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, state, entry);
state->ops->prepare(state, &local_err);
@@ -1933,6 +1999,10 @@ exit:
}
g_free(state);
}
+ if (!has_props) {
+ qapi_free_TransactionProperties(props);
+ }
+ block_job_txn_unref(block_job_txn);
}
diff --git a/qapi-schema.json b/qapi-schema.json
index a73c2c9..86323e1 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1520,6 +1520,24 @@
'data': { } }
##
+# @ActionCancelMode
+#
+# An enumeration of Transactional cancellation modes.
+#
+# @none: Do not attempt to cancel any other Actions if any
+# actions fail after the Transaction request succeeds.
+# This is the default.
+#
+# @all: If any action fails after the Transaction succeeds,
+# cancel all actions. May be rejected by Actions that
+# do not support this cancellation mode.
+#
+# Since: 2.5
+##
+{ 'enum': 'ActionCancelMode',
+ 'data': [ 'none', 'all' ] }
+
+##
# @TransactionAction
#
# A discriminated record of operations that can be performed with
@@ -1546,14 +1564,35 @@
} }
##
+# @TransactionProperties
+#
+# Optional arguments to modify the behavior of a Transaction.
+#
+# @err-cancel: How failures that occur in jobs launched asynchronously
+# by a Transaction that has already completed should be handled.
+# See @ActionCancelMode for details.
+#
+# Since: 2.5
+##
+{ 'struct': 'TransactionProperties',
+ 'data': {
+ '*err-cancel': 'ActionCancelMode'
+ }
+}
+
+##
# @transaction
#
# Executes a number of transactionable QMP commands atomically. If any
# operation fails, then the entire set of actions will be abandoned and the
# appropriate error returned.
#
-# List of:
-# @TransactionAction: information needed for the respective operation
+# @actions: List of @TransactionAction;
+# information needed for the respective operations.
+#
+# @properties: Optional structure of additional options to control the
+# execution of the transaction. See @TransactionProperties
+# for additional detail.
#
# Returns: nothing on success
# Errors depend on the operations of the transaction
@@ -1565,7 +1604,10 @@
# Since 1.1
##
{ 'command': 'transaction',
- 'data': { 'actions': [ 'TransactionAction' ] } }
+ 'data': { 'actions': [ 'TransactionAction' ],
+ '*properties': 'TransactionProperties'
+ }
+}
##
# @human-monitor-command:
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 2b52980..4a2cd15 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1262,7 +1262,7 @@ EQMP
},
{
.name = "transaction",
- .args_type = "actions:q",
+ .args_type = "actions:q,properties:q?",
.mhandler.cmd_new = qmp_marshal_transaction,
},
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 13/14] iotests: 124 - transactional failure test
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (11 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 12/14] block: add transactional properties John Snow
@ 2015-10-23 23:56 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 14/14] tests: add BlockJobTxn unit test John Snow
` (2 subsequent siblings)
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.
Verify that no bitmap data was lost due to the partial transaction
failure.
To support the 'err-cancel' QMP argument name it's necessary for
transaction_action() to convert underscores in Python argument names
to hyphens for QMP argument names.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
tests/qemu-iotests/124 | 130 ++++++++++++++++++++++++++++++++++++++++++++-
tests/qemu-iotests/124.out | 4 +-
2 files changed, 130 insertions(+), 4 deletions(-)
diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 9c1977e..009e436 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -39,7 +39,7 @@ def try_remove(img):
def transaction_action(action, **kwargs):
return {
'type': action,
- 'data': kwargs
+ 'data': dict((k.replace('_', '-'), v) for k, v in kwargs.iteritems())
}
@@ -139,9 +139,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
def do_qmp_backup(self, error='Input/output error', **kwargs):
res = self.vm.qmp('drive-backup', **kwargs)
self.assert_qmp(res, 'return', {})
+ return self.wait_qmp_backup(kwargs['device'], error)
+
+ def wait_qmp_backup(self, device, error='Input/output error'):
event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
- match={'data': {'device': kwargs['device']}})
+ match={'data': {'device': device}})
self.assertNotEqual(event, None)
try:
@@ -156,6 +159,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
return False
+ def wait_qmp_backup_cancelled(self, device):
+ event = self.vm.event_wait(name='BLOCK_JOB_CANCELLED',
+ match={'data': {'device': device}})
+ self.assertNotEqual(event, None)
+
+
def create_anchor_backup(self, drive=None):
if drive is None:
drive = self.drives[-1]
@@ -375,6 +384,123 @@ class TestIncrementalBackup(iotests.QMPTestCase):
self.check_backups()
+ def test_transaction_failure(self):
+ '''Test: Verify backups made from a transaction that partially fails.
+
+ Add a second drive with its own unique pattern, and add a bitmap to each
+ drive. Use blkdebug to interfere with the backup on just one drive and
+ attempt to create a coherent incremental backup across both drives.
+
+ verify a failure in one but not both, then delete the failed stubs and
+ re-run the same transaction.
+
+ verify that both incrementals are created successfully.
+ '''
+
+ # Create a second drive, with pattern:
+ drive1 = self.add_node('drive1')
+ self.img_create(drive1['file'], drive1['fmt'])
+ io_write_patterns(drive1['file'], (('0x14', 0, 512),
+ ('0x5d', '1M', '32k'),
+ ('0xcd', '32M', '124k')))
+
+ # Create a blkdebug interface to this img as 'drive1'
+ result = self.vm.qmp('blockdev-add', options={
+ 'id': drive1['id'],
+ 'driver': drive1['fmt'],
+ 'file': {
+ 'driver': 'blkdebug',
+ 'image': {
+ 'driver': 'file',
+ 'filename': drive1['file']
+ },
+ 'set-state': [{
+ 'event': 'flush_to_disk',
+ 'state': 1,
+ 'new_state': 2
+ }],
+ 'inject-error': [{
+ 'event': 'read_aio',
+ 'errno': 5,
+ 'state': 2,
+ 'immediately': False,
+ 'once': True
+ }],
+ }
+ })
+ self.assert_qmp(result, 'return', {})
+
+ # Create bitmaps and full backups for both drives
+ drive0 = self.drives[0]
+ dr0bm0 = self.add_bitmap('bitmap0', drive0)
+ dr1bm0 = self.add_bitmap('bitmap0', drive1)
+ self.create_anchor_backup(drive0)
+ self.create_anchor_backup(drive1)
+ self.assert_no_active_block_jobs()
+ self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+ # Emulate some writes
+ self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+ ('0xfe', '16M', '256k'),
+ ('0x64', '32736k', '64k')))
+ self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+ ('0xef', '16M', '256k'),
+ ('0x46', '32736k', '64k')))
+
+ # Create incremental backup targets
+ target0 = self.prepare_backup(dr0bm0)
+ target1 = self.prepare_backup(dr1bm0)
+
+ # Ask for a new incremental backup per-each drive,
+ # expecting drive1's backup to fail:
+ transaction = [
+ transaction_drive_backup(drive0['id'], target0, sync='incremental',
+ format=drive0['fmt'], mode='existing',
+ bitmap=dr0bm0.name),
+ transaction_drive_backup(drive1['id'], target1, sync='incremental',
+ format=drive1['fmt'], mode='existing',
+ bitmap=dr1bm0.name)
+ ]
+ result = self.vm.qmp('transaction', actions=transaction,
+ properties={'err-cancel': 'all'} )
+ self.assert_qmp(result, 'return', {})
+
+ # Observe that drive0's backup is cancelled and drive1 completes with
+ # an error.
+ self.wait_qmp_backup_cancelled(drive0['id'])
+ self.assertFalse(self.wait_qmp_backup(drive1['id']))
+ error = self.vm.event_wait('BLOCK_JOB_ERROR')
+ self.assert_qmp(error, 'data', {'device': drive1['id'],
+ 'action': 'report',
+ 'operation': 'read'})
+ self.assertFalse(self.vm.get_qmp_events(wait=False))
+ self.assert_no_active_block_jobs()
+
+ # Delete drive0's successful target and eliminate our record of the
+ # unsuccessful drive1 target. Then re-run the same transaction.
+ dr0bm0.del_target()
+ dr1bm0.del_target()
+ target0 = self.prepare_backup(dr0bm0)
+ target1 = self.prepare_backup(dr1bm0)
+
+ # Re-run the exact same transaction.
+ result = self.vm.qmp('transaction', actions=transaction,
+ properties={'err-cancel':'all'})
+ self.assert_qmp(result, 'return', {})
+
+ # Both should complete successfully this time.
+ self.assertTrue(self.wait_qmp_backup(drive0['id']))
+ self.assertTrue(self.wait_qmp_backup(drive1['id']))
+ self.make_reference_backup(dr0bm0)
+ self.make_reference_backup(dr1bm0)
+ self.assertFalse(self.vm.get_qmp_events(wait=False))
+ self.assert_no_active_block_jobs()
+
+ # And the images should of course validate.
+ self.vm.shutdown()
+ self.check_backups()
+
+
def test_sync_dirty_bitmap_missing(self):
self.assert_no_active_block_jobs()
self.files.append(self.err_img)
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 594c16f..dae404e 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-........
+.........
----------------------------------------------------------------------
-Ran 8 tests
+Ran 9 tests
OK
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [Qemu-devel] [PATCH v10 14/14] tests: add BlockJobTxn unit test
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (12 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 13/14] iotests: 124 - transactional failure test John Snow
@ 2015-10-23 23:56 ` John Snow
2015-11-02 21:06 ` [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
2015-11-03 15:22 ` Stefan Hajnoczi
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-10-23 23:56 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, John Snow, armbru, qemu-devel, stefanha
From: Stefan Hajnoczi <stefanha@redhat.com>
The BlockJobTxn unit test verifies that both single jobs and pairs of
jobs behave as a transaction group. Either all jobs complete
successfully or the group is cancelled.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
tests/Makefile | 3 +
tests/test-blockjob-txn.c | 244 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 247 insertions(+)
create mode 100644 tests/test-blockjob-txn.c
diff --git a/tests/Makefile b/tests/Makefile
index 9341498..e9cae36 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -47,6 +47,8 @@ check-unit-y += tests/test-thread-pool$(EXESUF)
gcov-files-test-thread-pool-y = thread-pool.c
gcov-files-test-hbitmap-y = util/hbitmap.c
check-unit-y += tests/test-hbitmap$(EXESUF)
+gcov-files-test-hbitmap-y = blockjob.c
+check-unit-y += tests/test-blockjob-txn$(EXESUF)
check-unit-y += tests/test-x86-cpuid$(EXESUF)
# all code tested by test-x86-cpuid is inside topology.h
gcov-files-test-x86-cpuid-y =
@@ -384,6 +386,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(test-block-obj-y)
tests/test-aio$(EXESUF): tests/test-aio.o $(test-block-obj-y)
tests/test-rfifolock$(EXESUF): tests/test-rfifolock.o $(test-util-obj-y)
tests/test-throttle$(EXESUF): tests/test-throttle.o $(test-block-obj-y)
+tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(test-block-obj-y) $(test-util-obj-y)
tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o $(test-util-obj-y)
diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
new file mode 100644
index 0000000..000f70e
--- /dev/null
+++ b/tests/test-blockjob-txn.c
@@ -0,0 +1,244 @@
+/*
+ * Blockjob transactions tests
+ *
+ * Copyright Red Hat, Inc. 2015
+ *
+ * Authors:
+ * Stefan Hajnoczi <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include <glib.h>
+#include "qapi/error.h"
+#include "qemu/main-loop.h"
+#include "block/blockjob.h"
+
+typedef struct {
+ BlockJob common;
+ unsigned int iterations;
+ bool use_timer;
+ int rc;
+ int *result;
+} TestBlockJob;
+
+static const BlockJobDriver test_block_job_driver = {
+ .instance_size = sizeof(TestBlockJob),
+};
+
+static void test_block_job_complete(BlockJob *job, void *opaque)
+{
+ BlockDriverState *bs = job->bs;
+ int rc = (intptr_t)opaque;
+
+ if (block_job_is_cancelled(job)) {
+ rc = -ECANCELED;
+ }
+
+ block_job_completed(job, rc);
+ bdrv_unref(bs);
+}
+
+static void coroutine_fn test_block_job_run(void *opaque)
+{
+ TestBlockJob *s = opaque;
+ BlockJob *job = &s->common;
+
+ while (s->iterations--) {
+ if (s->use_timer) {
+ block_job_sleep_ns(job, QEMU_CLOCK_REALTIME, 0);
+ } else {
+ block_job_yield(job);
+ }
+
+ if (block_job_is_cancelled(job)) {
+ break;
+ }
+ }
+
+ block_job_defer_to_main_loop(job, test_block_job_complete,
+ (void *)(intptr_t)s->rc);
+}
+
+typedef struct {
+ TestBlockJob *job;
+ int *result;
+} TestBlockJobCBData;
+
+static void test_block_job_cb(void *opaque, int ret)
+{
+ TestBlockJobCBData *data = opaque;
+ if (!ret && block_job_is_cancelled(&data->job->common)) {
+ ret = -ECANCELED;
+ }
+ *data->result = ret;
+ g_free(data);
+}
+
+/* Create a block job that completes with a given return code after a given
+ * number of event loop iterations. The return code is stored in the given
+ * result pointer.
+ *
+ * The event loop iterations can either be handled automatically with a 0 delay
+ * timer, or they can be stepped manually by entering the coroutine.
+ */
+static BlockJob *test_block_job_start(unsigned int iterations,
+ bool use_timer,
+ int rc, int *result)
+{
+ BlockDriverState *bs;
+ TestBlockJob *s;
+ TestBlockJobCBData *data;
+
+ data = g_new0(TestBlockJobCBData, 1);
+ bs = bdrv_new();
+ s = block_job_create(&test_block_job_driver, bs, 0, test_block_job_cb,
+ data, &error_abort);
+ s->iterations = iterations;
+ s->use_timer = use_timer;
+ s->rc = rc;
+ s->result = result;
+ s->common.co = qemu_coroutine_create(test_block_job_run);
+ data->job = s;
+ data->result = result;
+ qemu_coroutine_enter(s->common.co, s);
+ return &s->common;
+}
+
+static void test_single_job(int expected)
+{
+ BlockJob *job;
+ BlockJobTxn *txn;
+ int result = -EINPROGRESS;
+
+ txn = block_job_txn_new();
+ job = test_block_job_start(1, true, expected, &result);
+ block_job_txn_add_job(txn, job);
+
+ if (expected == -ECANCELED) {
+ block_job_cancel(job);
+ }
+
+ while (result == -EINPROGRESS) {
+ aio_poll(qemu_get_aio_context(), true);
+ }
+ g_assert_cmpint(result, ==, expected);
+}
+
+static void test_single_job_success(void)
+{
+ test_single_job(0);
+}
+
+static void test_single_job_failure(void)
+{
+ test_single_job(-EIO);
+}
+
+static void test_single_job_cancel(void)
+{
+ test_single_job(-ECANCELED);
+}
+
+static void test_pair_jobs(int expected1, int expected2)
+{
+ BlockJob *job1;
+ BlockJob *job2;
+ BlockJobTxn *txn;
+ int result1 = -EINPROGRESS;
+ int result2 = -EINPROGRESS;
+
+ txn = block_job_txn_new();
+ job1 = test_block_job_start(1, true, expected1, &result1);
+ block_job_txn_add_job(txn, job1);
+ job2 = test_block_job_start(2, true, expected2, &result2);
+ block_job_txn_add_job(txn, job2);
+
+ if (expected1 == -ECANCELED) {
+ block_job_cancel(job1);
+ }
+ if (expected2 == -ECANCELED) {
+ block_job_cancel(job2);
+ }
+
+ while (result1 == -EINPROGRESS || result2 == -EINPROGRESS) {
+ aio_poll(qemu_get_aio_context(), true);
+ }
+
+ /* Failure or cancellation of one job cancels the other job */
+ if (expected1 != 0) {
+ expected2 = -ECANCELED;
+ } else if (expected2 != 0) {
+ expected1 = -ECANCELED;
+ }
+
+ g_assert_cmpint(result1, ==, expected1);
+ g_assert_cmpint(result2, ==, expected2);
+}
+
+static void test_pair_jobs_success(void)
+{
+ test_pair_jobs(0, 0);
+}
+
+static void test_pair_jobs_failure(void)
+{
+ /* Test both orderings. The two jobs run for a different number of
+ * iterations so the code path is different depending on which job fails
+ * first.
+ */
+ test_pair_jobs(-EIO, 0);
+ test_pair_jobs(0, -EIO);
+}
+
+static void test_pair_jobs_cancel(void)
+{
+ test_pair_jobs(-ECANCELED, 0);
+ test_pair_jobs(0, -ECANCELED);
+}
+
+static void test_pair_jobs_fail_cancel_race(void)
+{
+ BlockJob *job1;
+ BlockJob *job2;
+ BlockJobTxn *txn;
+ int result1 = -EINPROGRESS;
+ int result2 = -EINPROGRESS;
+
+ txn = block_job_txn_new();
+ job1 = test_block_job_start(1, true, -ECANCELED, &result1);
+ block_job_txn_add_job(txn, job1);
+ job2 = test_block_job_start(2, false, 0, &result2);
+ block_job_txn_add_job(txn, job2);
+
+ block_job_cancel(job1);
+
+ /* Now make job2 finish before the main loop kicks jobs. This simulates
+ * the race between a pending kick and another job completing.
+ */
+ block_job_enter(job2);
+ block_job_enter(job2);
+
+ while (result1 == -EINPROGRESS || result2 == -EINPROGRESS) {
+ aio_poll(qemu_get_aio_context(), true);
+ }
+
+ g_assert_cmpint(result1, ==, -ECANCELED);
+ g_assert_cmpint(result2, ==, -ECANCELED);
+}
+
+int main(int argc, char **argv)
+{
+ qemu_init_main_loop(&error_abort);
+
+ g_test_init(&argc, &argv, NULL);
+ g_test_add_func("/single/success", test_single_job_success);
+ g_test_add_func("/single/failure", test_single_job_failure);
+ g_test_add_func("/single/cancel", test_single_job_cancel);
+ g_test_add_func("/pair/success", test_pair_jobs_success);
+ g_test_add_func("/pair/failure", test_pair_jobs_failure);
+ g_test_add_func("/pair/cancel", test_pair_jobs_cancel);
+ g_test_add_func("/pair/fail-cancel-race", test_pair_jobs_fail_cancel_race);
+ return g_test_run();
+}
--
2.4.3
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (13 preceding siblings ...)
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 14/14] tests: add BlockJobTxn unit test John Snow
@ 2015-11-02 21:06 ` John Snow
2015-11-03 15:22 ` Stefan Hajnoczi
15 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-11-02 21:06 UTC (permalink / raw)
To: qemu-block; +Cc: vsementsov, famz, armbru, qemu-devel, stefanha
Ping!
To clarify, I *do* intend this series to be for-2.5, and I think it does
qualify as such.
If this *doesn't* go in, we have to revise our documentation just a
pinch to undo some of Kashyap's changes that have already been merged in.
Patches 5, 10, 11 and 12 are awaiting review.
--js
On 10/23/2015 07:56 PM, John Snow wrote:
> Welcome to V10!
>
> Where'd 8 and 9 go? Private off-list missives from Fam.
> Now you, I, and everyone on qemu-devel are staring at V10.
>
> What's new in V10?
>
> I replaced the per-action "transactional-cancel" parameter with
> a per-transaction paremeter named "err-cancel" which is implemented
> as an enum in case we want to add new behaviors in the future, such
> as a "jobs only" cancel mode.
>
> For now, it's "all" or "none", and if you use it with actions that
> do not support the latent transactional cancel, you will receive
> an error for your troubles.
>
> This version primarily changed patches 10,11 and replaced it with
> patches 10-12 that are cut a little differently.
>
> This is based on top of the work by Stefan Hajnoczi and Fam Zheng.
>
> Recap: motivation for block job transactions
> --------------------------------------------
> If an incremental backup block job fails then we reclaim the bitmap so
> the job can be retried. The problem comes when multiple jobs are started as
> part of a qmp 'transaction' command. We need to group these jobs in a
> transaction so that either all jobs complete successfully or all bitmaps are
> reclaimed.
>
> Without transactions, there is a case where some jobs complete successfully and
> throw away their bitmaps, making it impossible to retry the backup by rerunning
> the command if one of the jobs fails.
>
> How does this implementation work?
> ----------------------------------
> These patches add a BlockJobTxn object with the following API:
>
> txn = block_job_txn_new();
> block_job_txn_add_job(txn, job1);
> block_job_txn_add_job(txn, job2);
>
> The jobs either both complete successfully or they both fail/cancel. If the
> user cancels job1 then job2 will also be cancelled and vice versa.
>
> Jobs objects stay alive waiting for other jobs to complete, even if the
> coroutines have returned. They can be cancelled by the user during this time.
> Job blockers are still in effect and no other block job can run on this device
> in the meantime (since QEMU currently only allows 1 job per device). This is
> the main drawback to this approach but reasonable since you probably don't want
> to run other jobs/operations until you're sure the backup was successful (you
> won't be able to retry a failed backup if there's a new job running).
>
> [History]
>
> v10: Replaced per-action parameter with per-transaction properties.
> Patches 10,11 were split into 10-12.
>
> v9: this version fixes a reference count problem with job->bs,
> in patch 05.
>
> v8: Rebase on to master.
> Minor fixes addressing John Snow's comments.
>
> v7: Add Eric's rev-by in 1, 11.
> Add Max's rev-by in 4, 5, 9, 10, 11.
> Add John's rev-by in 5, 6, 8.
> Fix wording for 6. [John]
> Fix comment of block_job_txn_add_job() in 9. [Max]
> Remove superfluous hunks, and document default value in 11. [Eric]
> Update Makefile dep in 14. [Max]
>
> ________________________________________________________________________________
>
> For convenience, this branch is available at:
> https://github.com/jnsnow/qemu.git branch block-transpop
> https://github.com/jnsnow/qemu/tree/block-transpop
>
> This version is tagged block-transpop-v10:
> https://github.com/jnsnow/qemu/releases/tag/block-transpop-v10
>
> Fam Zheng (6):
> backup: Extract dirty bitmap handling as a separate function
> blockjob: Introduce reference count and fix reference to job->bs
> blockjob: Add .commit and .abort block job actions
> blockjob: Add "completed" and "ret" in BlockJob
> blockjob: Simplify block_job_finish_sync
> block: Add block job transactions
>
> John Snow (7):
> qapi: Add transaction support to block-dirty-bitmap operations
> iotests: add transactional incremental backup test
> block: rename BlkTransactionState and BdrvActionOps
> block/backup: Rely on commit/abort for cleanup
> block: Add BlockJobTxn support to backup_run
> block: add transactional properties
> iotests: 124 - transactional failure test
>
> Stefan Hajnoczi (1):
> tests: add BlockJobTxn unit test
>
> block.c | 19 +-
> block/backup.c | 50 ++++--
> block/mirror.c | 2 +-
> blockdev.c | 430 ++++++++++++++++++++++++++++++++++-----------
> blockjob.c | 188 ++++++++++++++++----
> docs/bitmaps.md | 6 +-
> include/block/block.h | 2 +-
> include/block/block_int.h | 6 +-
> include/block/blockjob.h | 85 ++++++++-
> qapi-schema.json | 54 +++++-
> qemu-img.c | 3 -
> qmp-commands.hx | 2 +-
> tests/Makefile | 3 +
> tests/qemu-iotests/124 | 182 ++++++++++++++++++-
> tests/qemu-iotests/124.out | 4 +-
> tests/test-blockjob-txn.c | 244 +++++++++++++++++++++++++
> 16 files changed, 1108 insertions(+), 172 deletions(-)
> create mode 100644 tests/test-blockjob-txn.c
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 08/14] blockjob: Simplify block_job_finish_sync
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 08/14] blockjob: Simplify block_job_finish_sync John Snow
@ 2015-11-03 13:52 ` Stefan Hajnoczi
0 siblings, 0 replies; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-11-03 13:52 UTC (permalink / raw)
To: John Snow; +Cc: vsementsov, famz, qemu-block, armbru, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 259 bytes --]
On Fri, Oct 23, 2015 at 07:56:46PM -0400, John Snow wrote:
> + block_job_ref(job);
> finish(job, &local_err);
> if (local_err) {
> error_propagate(errp, local_err);
> return -EBUSY;
Refcount leak, missing block_job_unref(job).
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 12/14] block: add transactional properties John Snow
@ 2015-11-03 15:17 ` Stefan Hajnoczi
2015-11-03 17:27 ` John Snow
2015-11-03 15:23 ` [Qemu-devel] " Eric Blake
1 sibling, 1 reply; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-11-03 15:17 UTC (permalink / raw)
To: John Snow; +Cc: vsementsov, famz, qemu-block, armbru, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1250 bytes --]
On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
> common, common);
>
> + if (action_check_cancel_mode(common, errp) < 0) {
> + return;
> + }
> +
> action = common->action->block_dirty_bitmap_add;
> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
> qmp_block_dirty_bitmap_add(action->node, action->name,
> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
> common, common);
> BlockDirtyBitmap *action;
>
> + if (action_check_cancel_mode(common, errp) < 0) {
> + return;
> + }
> +
> action = common->action->block_dirty_bitmap_clear;
> state->bitmap = block_dirty_bitmap_lookup(action->node,
> action->name,
Why do the bitmap add/clear actions not support err-cancel=all?
I understand why other block jobs don't support it, but it's not clear
why these non-block job actions cannot.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
` (14 preceding siblings ...)
2015-11-02 21:06 ` [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
@ 2015-11-03 15:22 ` Stefan Hajnoczi
2015-11-03 17:46 ` John Snow
15 siblings, 1 reply; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-11-03 15:22 UTC (permalink / raw)
To: John Snow; +Cc: vsementsov, famz, qemu-block, armbru, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1072 bytes --]
On Fri, Oct 23, 2015 at 07:56:38PM -0400, John Snow wrote:
> Welcome to V10!
>
> Where'd 8 and 9 go? Private off-list missives from Fam.
> Now you, I, and everyone on qemu-devel are staring at V10.
>
> What's new in V10?
>
> I replaced the per-action "transactional-cancel" parameter with
> a per-transaction paremeter named "err-cancel" which is implemented
> as an enum in case we want to add new behaviors in the future, such
> as a "jobs only" cancel mode.
>
> For now, it's "all" or "none", and if you use it with actions that
> do not support the latent transactional cancel, you will receive
> an error for your troubles.
I left comments on a few patches.
The "err-cancel" and "ActionCancelMode" naming does not describe the
concept fully, since successful block jobs will also behave differently
(waiting for each other to finish before fully completing).
"blockjob-transactions" is the best name I can think of that describes
the full concept rather than focus on just cancellation.
Besides that I'm happy with the QMP interface.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 12/14] block: add transactional properties John Snow
2015-11-03 15:17 ` Stefan Hajnoczi
@ 2015-11-03 15:23 ` Eric Blake
2015-11-03 17:31 ` John Snow
1 sibling, 1 reply; 35+ messages in thread
From: Eric Blake @ 2015-11-03 15:23 UTC (permalink / raw)
To: John Snow, qemu-block; +Cc: vsementsov, famz, armbru, stefanha, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 1568 bytes --]
On 10/23/2015 05:56 PM, John Snow wrote:
> Add both transactional properties to the QMP transactional interface,
> and add the BlockJobTxn that we create as a result of the err-cancel
> property to the BlkActionState structure.
>
> [split up from a patch originally by Stefan and Fam. --js]
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> Signed-off-by: John Snow <jsnow@redhat.com>
>
> Signed-off-by: John Snow <jsnow@redhat.com>
Double S-o-b looks odd.
> ---
> blockdev.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
> qapi-schema.json | 48 +++++++++++++++++++++++++++++++---
> qmp-commands.hx | 2 +-
> 3 files changed, 120 insertions(+), 8 deletions(-)
>
> +##
> # @transaction
> #
> # Executes a number of transactionable QMP commands atomically. If any
> # operation fails, then the entire set of actions will be abandoned and the
> # appropriate error returned.
> #
> -# List of:
> -# @TransactionAction: information needed for the respective operation
> +# @actions: List of @TransactionAction;
> +# information needed for the respective operations.
> +#
> +# @properties: Optional structure of additional options to control the
Elsewhere, we've spelled it '#optional'; Marc-Andre has patches that
rely on that spelling to turn it .json into documentation.
But otherwise, looks good on first glance.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH v10 03/14] block: rename BlkTransactionState and BdrvActionOps
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 03/14] block: rename BlkTransactionState and BdrvActionOps John Snow
@ 2015-11-03 16:33 ` Jeff Cody
2015-11-03 17:32 ` John Snow
0 siblings, 1 reply; 35+ messages in thread
From: Jeff Cody @ 2015-11-03 16:33 UTC (permalink / raw)
To: John Snow; +Cc: vsementsov, famz, qemu-block, armbru, qemu-devel, stefanha
On Fri, Oct 23, 2015 at 07:56:41PM -0400, John Snow wrote:
> These structures are misnomers, somewhat.
>
> (1) BlockTransactionState is not state for a transaction,
> but is rather state for a single transaction action.
> Rename it "BlkActionState" to be more accurate.
>
> (2) The BdrvActionOps describes operations for the BlkActionState,
> above. This name might imply a 'BdrvAction' or a 'BdrvActionState',
> which there isn't.
> Rename this to 'BlkActionOps' to match 'BlkActionState'.
>
> Lastly, update the surrounding in-line documentation and comments
> to reflect the current nature of how Transactions operate.
>
> This patch changes only comments and names, and should not affect
> behavior in any way.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Fam Zheng <famz@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> blockdev.c | 116 ++++++++++++++++++++++++++++++++++---------------------------
> 1 file changed, 65 insertions(+), 51 deletions(-)
>
just a heads up - this patch fails to apply on qemu master, so a
rebase is probably in order (this series is based on qemu master,
right?)
(I'll review from your github branch for now)
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-11-03 15:17 ` Stefan Hajnoczi
@ 2015-11-03 17:27 ` John Snow
2015-11-05 10:47 ` Stefan Hajnoczi
0 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2015-11-03 17:27 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: vsementsov, famz, armbru, qemu-block, qemu-devel
On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>> common, common);
>>
>> + if (action_check_cancel_mode(common, errp) < 0) {
>> + return;
>> + }
>> +
>> action = common->action->block_dirty_bitmap_add;
>> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>> qmp_block_dirty_bitmap_add(action->node, action->name,
>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
>> common, common);
>> BlockDirtyBitmap *action;
>>
>> + if (action_check_cancel_mode(common, errp) < 0) {
>> + return;
>> + }
>> +
>> action = common->action->block_dirty_bitmap_clear;
>> state->bitmap = block_dirty_bitmap_lookup(action->node,
>> action->name,
>
> Why do the bitmap add/clear actions not support err-cancel=all?
>
> I understand why other block jobs don't support it, but it's not clear
> why these non-block job actions cannot.
>
Because they don't have a callback to invoke if the rest of the job fails.
I could create a BlockJob for them complete with a callback to invoke,
but basically it's just because there's no interface to unwind them, or
an interface to join them with the transaction.
They're small, synchronous non-job actions. Which makes them weird.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-11-03 15:23 ` [Qemu-devel] " Eric Blake
@ 2015-11-03 17:31 ` John Snow
2015-11-03 17:35 ` Eric Blake
0 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2015-11-03 17:31 UTC (permalink / raw)
To: Eric Blake, qemu-block; +Cc: vsementsov, famz, armbru, stefanha, qemu-devel
On 11/03/2015 10:23 AM, Eric Blake wrote:
> On 10/23/2015 05:56 PM, John Snow wrote:
>> Add both transactional properties to the QMP transactional interface,
>> and add the BlockJobTxn that we create as a result of the err-cancel
>> property to the BlkActionState structure.
>>
>> [split up from a patch originally by Stefan and Fam. --js]
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> Signed-off-by: Fam Zheng <famz@redhat.com>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>
> Double S-o-b looks odd.
>
>> ---
>> blockdev.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
>> qapi-schema.json | 48 +++++++++++++++++++++++++++++++---
>> qmp-commands.hx | 2 +-
>> 3 files changed, 120 insertions(+), 8 deletions(-)
>>
>
>
>> +##
>> # @transaction
>> #
>> # Executes a number of transactionable QMP commands atomically. If any
>> # operation fails, then the entire set of actions will be abandoned and the
>> # appropriate error returned.
>> #
>> -# List of:
>> -# @TransactionAction: information needed for the respective operation
>> +# @actions: List of @TransactionAction;
>> +# information needed for the respective operations.
>> +#
>> +# @properties: Optional structure of additional options to control the
>
> Elsewhere, we've spelled it '#optional'; Marc-Andre has patches that
> rely on that spelling to turn it .json into documentation.
>
> But otherwise, looks good on first glance.
>
Do you mean:
"# @properties: #optional structure of "... ?
--js
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH v10 03/14] block: rename BlkTransactionState and BdrvActionOps
2015-11-03 16:33 ` [Qemu-devel] [Qemu-block] " Jeff Cody
@ 2015-11-03 17:32 ` John Snow
0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-11-03 17:32 UTC (permalink / raw)
To: Jeff Cody; +Cc: vsementsov, famz, qemu-block, armbru, qemu-devel, stefanha
On 11/03/2015 11:33 AM, Jeff Cody wrote:
> On Fri, Oct 23, 2015 at 07:56:41PM -0400, John Snow wrote:
>> These structures are misnomers, somewhat.
>>
>> (1) BlockTransactionState is not state for a transaction,
>> but is rather state for a single transaction action.
>> Rename it "BlkActionState" to be more accurate.
>>
>> (2) The BdrvActionOps describes operations for the BlkActionState,
>> above. This name might imply a 'BdrvAction' or a 'BdrvActionState',
>> which there isn't.
>> Rename this to 'BlkActionOps' to match 'BlkActionState'.
>>
>> Lastly, update the surrounding in-line documentation and comments
>> to reflect the current nature of how Transactions operate.
>>
>> This patch changes only comments and names, and should not affect
>> behavior in any way.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> Reviewed-by: Fam Zheng <famz@redhat.com>
>> Signed-off-by: Fam Zheng <famz@redhat.com>
>> ---
>> blockdev.c | 116 ++++++++++++++++++++++++++++++++++---------------------------
>> 1 file changed, 65 insertions(+), 51 deletions(-)
>>
>
> just a heads up - this patch fails to apply on qemu master, so a
> rebase is probably in order (this series is based on qemu master,
> right?)
>
Yes, but is a week old. I'll rebase and re-upload to github. V11 will
follow.
> (I'll review from your github branch for now)
>
Grazie.
--js
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-11-03 17:31 ` John Snow
@ 2015-11-03 17:35 ` Eric Blake
0 siblings, 0 replies; 35+ messages in thread
From: Eric Blake @ 2015-11-03 17:35 UTC (permalink / raw)
To: John Snow, qemu-block; +Cc: vsementsov, famz, armbru, stefanha, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 931 bytes --]
On 11/03/2015 10:31 AM, John Snow wrote:
>
>
> On 11/03/2015 10:23 AM, Eric Blake wrote:
>> On 10/23/2015 05:56 PM, John Snow wrote:
>>> Add both transactional properties to the QMP transactional interface,
>>> and add the BlockJobTxn that we create as a result of the err-cancel
>>> property to the BlkActionState structure.
>>>
>>> +# @actions: List of @TransactionAction;
>>> +# information needed for the respective operations.
>>> +#
>>> +# @properties: Optional structure of additional options to control the
>>
>> Elsewhere, we've spelled it '#optional'; Marc-Andre has patches that
>> rely on that spelling to turn it .json into documentation.
>>
>> But otherwise, looks good on first glance.
>>
>
> Do you mean:
>
> "# @properties: #optional structure of "... ?
Yes.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn
2015-11-03 15:22 ` Stefan Hajnoczi
@ 2015-11-03 17:46 ` John Snow
2015-11-03 22:07 ` John Snow
0 siblings, 1 reply; 35+ messages in thread
From: John Snow @ 2015-11-03 17:46 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: vsementsov, famz, armbru, qemu-block, qemu-devel
On 11/03/2015 10:22 AM, Stefan Hajnoczi wrote:
> On Fri, Oct 23, 2015 at 07:56:38PM -0400, John Snow wrote:
>> Welcome to V10!
>>
>> Where'd 8 and 9 go? Private off-list missives from Fam.
>> Now you, I, and everyone on qemu-devel are staring at V10.
>>
>> What's new in V10?
>>
>> I replaced the per-action "transactional-cancel" parameter with
>> a per-transaction paremeter named "err-cancel" which is implemented
>> as an enum in case we want to add new behaviors in the future, such
>> as a "jobs only" cancel mode.
>>
>> For now, it's "all" or "none", and if you use it with actions that
>> do not support the latent transactional cancel, you will receive
>> an error for your troubles.
>
> I left comments on a few patches.
>
> The "err-cancel" and "ActionCancelMode" naming does not describe the
> concept fully, since successful block jobs will also behave differently
> (waiting for each other to finish before fully completing).
> "blockjob-transactions" is the best name I can think of that describes
> the full concept rather than focus on just cancellation.
>
There's two hard problems in Computer Science...
I mean, cancellation certainly is the biggest change in net behavior,
though we are fiddling with the timings of the completion notices.
OK, so continuing on my inability to name this feature ...
blockjob-transactions = { individual, grouped, jobs-only }
where we'd support "individual" as the default,
"grouped" is the forced cancellation mode, and
jobs-only is a hypothetical future mode that adds jobs to the group, and
ignoring non-jobs' inability to join the transaction.
Still, it seems weird to have:
transaction 'properties': {'blockjob-transactions': 'grouped'},
it feels redundant to me.
How about:
completion-mode = { individual, grouped }
Avoids repeating "Transaction" and does not hyperfocus on cancellation.
> Besides that I'm happy with the QMP interface.
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn
2015-11-03 17:46 ` John Snow
@ 2015-11-03 22:07 ` John Snow
0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-11-03 22:07 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: vsementsov, famz, armbru, qemu-block, qemu-devel
On 11/03/2015 12:46 PM, John Snow wrote:
>
>
> On 11/03/2015 10:22 AM, Stefan Hajnoczi wrote:
>> On Fri, Oct 23, 2015 at 07:56:38PM -0400, John Snow wrote:
>>> Welcome to V10!
>>>
>>> Where'd 8 and 9 go? Private off-list missives from Fam.
>>> Now you, I, and everyone on qemu-devel are staring at V10.
>>>
>>> What's new in V10?
>>>
>>> I replaced the per-action "transactional-cancel" parameter with
>>> a per-transaction paremeter named "err-cancel" which is implemented
>>> as an enum in case we want to add new behaviors in the future, such
>>> as a "jobs only" cancel mode.
>>>
>>> For now, it's "all" or "none", and if you use it with actions that
>>> do not support the latent transactional cancel, you will receive
>>> an error for your troubles.
>>
>> I left comments on a few patches.
>>
>> The "err-cancel" and "ActionCancelMode" naming does not describe the
>> concept fully, since successful block jobs will also behave differently
>> (waiting for each other to finish before fully completing).
>> "blockjob-transactions" is the best name I can think of that describes
>> the full concept rather than focus on just cancellation.
>>
>
> There's two hard problems in Computer Science...
>
> I mean, cancellation certainly is the biggest change in net behavior,
> though we are fiddling with the timings of the completion notices.
>
> OK, so continuing on my inability to name this feature ...
>
> blockjob-transactions = { individual, grouped, jobs-only }
>
> where we'd support "individual" as the default,
> "grouped" is the forced cancellation mode, and
> jobs-only is a hypothetical future mode that adds jobs to the group, and
> ignoring non-jobs' inability to join the transaction.
>
> Still, it seems weird to have:
> transaction 'properties': {'blockjob-transactions': 'grouped'},
> it feels redundant to me.
>
>
> How about:
>
> completion-mode = { individual, grouped }
>
> Avoids repeating "Transaction" and does not hyperfocus on cancellation.
>
Version "10.5" is on https://github.com/jnsnow/qemu.git
- Rebased on master to play nice with the new QAPI checkins (#3, #12)
- Fixed error path leak (#3)
- Renamed the Transaction Property and enum values (#12, #13) as above
("completion-mode", "individual", "grouped")
- Fixed a leak in the block_job_txn qtest (#14)
If the names and QMP interface look good I'll publish as v11.
--js
>> Besides that I'm happy with the QMP interface.
>>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-11-03 17:27 ` John Snow
@ 2015-11-05 10:47 ` Stefan Hajnoczi
2015-11-05 18:52 ` John Snow
0 siblings, 1 reply; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-11-05 10:47 UTC (permalink / raw)
To: John Snow; +Cc: vsementsov, famz, armbru, qemu-block, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 3099 bytes --]
On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
>
>
> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
> > On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
> >> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
> >> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
> >> common, common);
> >>
> >> + if (action_check_cancel_mode(common, errp) < 0) {
> >> + return;
> >> + }
> >> +
> >> action = common->action->block_dirty_bitmap_add;
> >> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
> >> qmp_block_dirty_bitmap_add(action->node, action->name,
> >> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
> >> common, common);
> >> BlockDirtyBitmap *action;
> >>
> >> + if (action_check_cancel_mode(common, errp) < 0) {
> >> + return;
> >> + }
> >> +
> >> action = common->action->block_dirty_bitmap_clear;
> >> state->bitmap = block_dirty_bitmap_lookup(action->node,
> >> action->name,
> >
> > Why do the bitmap add/clear actions not support err-cancel=all?
> >
> > I understand why other block jobs don't support it, but it's not clear
> > why these non-block job actions cannot.
> >
>
> Because they don't have a callback to invoke if the rest of the job fails.
>
> I could create a BlockJob for them complete with a callback to invoke,
> but basically it's just because there's no interface to unwind them, or
> an interface to join them with the transaction.
>
> They're small, synchronous non-job actions. Which makes them weird.
Funny, we've been looking at the same picture while seeing different
things:
https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
I think I understand your idea: the transaction should include both
immediate actions as well as block jobs.
My mental model was different: immediate actions commit/abort along with
the 'transaction' command. Block jobs are separate and complete/cancel
together in a group.
In practice I think the two end up being similar because we won't be
able to implement immediate action commit/abort together with
long-running block jobs because the immediate actions rely on
quiescing/pausing the guest for atomic commit/abort.
So with your mental model the QMP client has to submit 2 'transaction'
commands: 1 for the immediate actions, 1 for the block jobs.
In my mental model the QMP client submits 1 command but the immediate
actions and block jobs are two separate transaction scopes. This means
if the block jobs fail, the client needs to be aware of the immediate
actions that have committed. Because of this, it becomes just as much
client effort as submitting two separate 'transaction' commands in your
model.
Can anyone see a practical difference? I think I'm happy with John's
model.
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-11-05 10:47 ` Stefan Hajnoczi
@ 2015-11-05 18:52 ` John Snow
2015-11-05 19:35 ` Markus Armbruster
2015-11-06 8:32 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
0 siblings, 2 replies; 35+ messages in thread
From: John Snow @ 2015-11-05 18:52 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: vsementsov, famz, armbru, qemu-block, qemu-devel
On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
> On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
>>
>>
>> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
>>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
>>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>>>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>>>> common, common);
>>>>
>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>> + return;
>>>> + }
>>>> +
>>>> action = common->action->block_dirty_bitmap_add;
>>>> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>>>> qmp_block_dirty_bitmap_add(action->node, action->name,
>>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
>>>> common, common);
>>>> BlockDirtyBitmap *action;
>>>>
>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>> + return;
>>>> + }
>>>> +
>>>> action = common->action->block_dirty_bitmap_clear;
>>>> state->bitmap = block_dirty_bitmap_lookup(action->node,
>>>> action->name,
>>>
>>> Why do the bitmap add/clear actions not support err-cancel=all?
>>>
>>> I understand why other block jobs don't support it, but it's not clear
>>> why these non-block job actions cannot.
>>>
>>
>> Because they don't have a callback to invoke if the rest of the job fails.
>>
>> I could create a BlockJob for them complete with a callback to invoke,
>> but basically it's just because there's no interface to unwind them, or
>> an interface to join them with the transaction.
>>
>> They're small, synchronous non-job actions. Which makes them weird.
>
> Funny, we've been looking at the same picture while seeing different
> things:
> https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
>
> I think I understand your idea: the transaction should include both
> immediate actions as well as block jobs.
>
> My mental model was different: immediate actions commit/abort along with
> the 'transaction' command. Block jobs are separate and complete/cancel
> together in a group.
>
> In practice I think the two end up being similar because we won't be
> able to implement immediate action commit/abort together with
> long-running block jobs because the immediate actions rely on
> quiescing/pausing the guest for atomic commit/abort.
>
> So with your mental model the QMP client has to submit 2 'transaction'
> commands: 1 for the immediate actions, 1 for the block jobs.
>
> In my mental model the QMP client submits 1 command but the immediate
> actions and block jobs are two separate transaction scopes. This means
> if the block jobs fail, the client needs to be aware of the immediate
> actions that have committed. Because of this, it becomes just as much
> client effort as submitting two separate 'transaction' commands in your
> model.
>
> Can anyone see a practical difference? I think I'm happy with John's
> model.
>
> Stefan
>
We discussed this off-list, but for the sake of the archive:
== How it is now ==
Currently, transactions have two implicit phases: the first is the
synchronous phase. If this phase completes successfully, we consider the
transaction a success. The second phase is the asynchronous phase where
jobs launched by the synchronous phase run to completion.
all synchronous commands must complete for the transaction to "succeed."
There are currently (pre-patch) no guarantees about asynchronous command
completion. As long as all synchronous actions complete, asynchronous
actions are free to succeed or fail individually.
== My Model ==
The current behavior is my "err-cancel = none" scenario: we offer no
guarantee about the success or failure of the transaction as a whole
after the synchronous portion has completed.
What I was proposing is "err-cancel = all," which to me means that _ALL_
commands in this transaction are to succeed (synchronous or not) before
_any_ actions are irrevocably committed. This means that for a
hypothetical mixed synchronous-asynchronous transaction, that even after
the transaction succeeded (it passed the synchronous phase), if an
asynchronous action later fails, all actions both synchronous and non
are rolled-back -- a kind of retroactive failure of the transaction.
This is clearly not possible in all cases, so commands that cannot
support these semantics will refuse "err-cancel = all" during the
synchronous phase.
In practice, only asynchronous actions can tolerate these semantics, but
from a user perspective, it's clear that any transaction successfully
launched with "err-cancel = all" applies to *all* actions, regardless.
== Stefan's Model ==
Stefan's model was to imply that the "err-cancel" parameter applied only
to the *asynchronous* phase, because the synchronous phase has already
reported back success to the user as a return from the qmp-transaction
command. This would mean that to Stefan, "err-cancel = all" was implying
that only the asynchronous actions had to participate in the "all or
none" behavior of the transaction -- synchronous portions were exempt.
== Equivalence ==
Both models wind up being equivalent:
In Stefan's model, you need no foreknowledge of which actions are
synchronous or not. Upon failure during the asynchronous phase you will
need to understand which actions rolled back and which ones didn't, however.
In my model, you need foreknowledge of which actions are synchronous and
which ones are not, because synchronous actions will refuse the
"err-cancel = all" parameter. There is no sifting through failure states
when the command fails.
It's mostly a matter of when you need to know the difference between the
two classes of actions. In one model, it's before. In the other, it's
after a failure.
My model also allows for an emulation of Stefan's model, using the
hypothetical "err-cancel = jobs-only" mode, which would only enforce the
transaction semantics in the asynchronous phase.
For this reason, I think the two approaches to thinking about the
problem wind up having the same effect. I would perhaps argue that my
model is more explicit -- but I'm biased. I wrote it :)
--js
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-11-05 18:52 ` John Snow
@ 2015-11-05 19:35 ` Markus Armbruster
2015-11-05 19:43 ` John Snow
2015-11-06 8:32 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
1 sibling, 1 reply; 35+ messages in thread
From: Markus Armbruster @ 2015-11-05 19:35 UTC (permalink / raw)
To: John Snow; +Cc: vsementsov, famz, qemu-block, Stefan Hajnoczi, qemu-devel
John Snow <jsnow@redhat.com> writes:
> On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
>> On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
>>>
>>>
>>> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
>>>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
>>>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>>>>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>>>>> common, common);
>>>>>
>>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> action = common->action->block_dirty_bitmap_add;
>>>>> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>>>>> qmp_block_dirty_bitmap_add(action->node, action->name,
>>>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
>>>>> common, common);
>>>>> BlockDirtyBitmap *action;
>>>>>
>>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>>> + return;
>>>>> + }
>>>>> +
>>>>> action = common->action->block_dirty_bitmap_clear;
>>>>> state->bitmap = block_dirty_bitmap_lookup(action->node,
>>>>> action->name,
>>>>
>>>> Why do the bitmap add/clear actions not support err-cancel=all?
>>>>
>>>> I understand why other block jobs don't support it, but it's not clear
>>>> why these non-block job actions cannot.
>>>>
>>>
>>> Because they don't have a callback to invoke if the rest of the job fails.
>>>
>>> I could create a BlockJob for them complete with a callback to invoke,
>>> but basically it's just because there's no interface to unwind them, or
>>> an interface to join them with the transaction.
>>>
>>> They're small, synchronous non-job actions. Which makes them weird.
>>
>> Funny, we've been looking at the same picture while seeing different
>> things:
>> https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
>>
>> I think I understand your idea: the transaction should include both
>> immediate actions as well as block jobs.
>>
>> My mental model was different: immediate actions commit/abort along with
>> the 'transaction' command. Block jobs are separate and complete/cancel
>> together in a group.
>>
>> In practice I think the two end up being similar because we won't be
>> able to implement immediate action commit/abort together with
>> long-running block jobs because the immediate actions rely on
>> quiescing/pausing the guest for atomic commit/abort.
>>
>> So with your mental model the QMP client has to submit 2 'transaction'
>> commands: 1 for the immediate actions, 1 for the block jobs.
>>
>> In my mental model the QMP client submits 1 command but the immediate
>> actions and block jobs are two separate transaction scopes. This means
>> if the block jobs fail, the client needs to be aware of the immediate
>> actions that have committed. Because of this, it becomes just as much
>> client effort as submitting two separate 'transaction' commands in your
>> model.
>>
>> Can anyone see a practical difference? I think I'm happy with John's
>> model.
>>
>> Stefan
>>
>
> We discussed this off-list, but for the sake of the archive:
>
> == How it is now ==
>
> Currently, transactions have two implicit phases: the first is the
> synchronous phase. If this phase completes successfully, we consider the
> transaction a success. The second phase is the asynchronous phase where
> jobs launched by the synchronous phase run to completion.
>
> all synchronous commands must complete for the transaction to "succeed."
> There are currently (pre-patch) no guarantees about asynchronous command
> completion. As long as all synchronous actions complete, asynchronous
> actions are free to succeed or fail individually.
>
> == My Model ==
>
> The current behavior is my "err-cancel = none" scenario: we offer no
> guarantee about the success or failure of the transaction as a whole
> after the synchronous portion has completed.
>
> What I was proposing is "err-cancel = all," which to me means that _ALL_
> commands in this transaction are to succeed (synchronous or not) before
> _any_ actions are irrevocably committed. This means that for a
> hypothetical mixed synchronous-asynchronous transaction, that even after
> the transaction succeeded (it passed the synchronous phase), if an
> asynchronous action later fails, all actions both synchronous and non
> are rolled-back -- a kind of retroactive failure of the transaction.
> This is clearly not possible in all cases, so commands that cannot
> support these semantics will refuse "err-cancel = all" during the
> synchronous phase.
>
> In practice, only asynchronous actions can tolerate these semantics, but
> from a user perspective, it's clear that any transaction successfully
> launched with "err-cancel = all" applies to *all* actions, regardless.
>
> == Stefan's Model ==
>
> Stefan's model was to imply that the "err-cancel" parameter applied only
> to the *asynchronous* phase, because the synchronous phase has already
> reported back success to the user as a return from the qmp-transaction
> command. This would mean that to Stefan, "err-cancel = all" was implying
> that only the asynchronous actions had to participate in the "all or
> none" behavior of the transaction -- synchronous portions were exempt.
>
> == Equivalence ==
>
> Both models wind up being equivalent:
>
> In Stefan's model, you need no foreknowledge of which actions are
> synchronous or not. Upon failure during the asynchronous phase you will
> need to understand which actions rolled back and which ones didn't, however.
>
> In my model, you need foreknowledge of which actions are synchronous and
> which ones are not, because synchronous actions will refuse the
> "err-cancel = all" parameter. There is no sifting through failure states
> when the command fails.
>
> It's mostly a matter of when you need to know the difference between the
> two classes of actions. In one model, it's before. In the other, it's
> after a failure.
>
> My model also allows for an emulation of Stefan's model, using the
> hypothetical "err-cancel = jobs-only" mode, which would only enforce the
> transaction semantics in the asynchronous phase.
>
>
> For this reason, I think the two approaches to thinking about the
> problem wind up having the same effect. I would perhaps argue that my
> model is more explicit -- but I'm biased. I wrote it :)
I haven't followed this topic, and my opinions are therefore quite
uninformed. Here goes anway.
In John's model, you need to know whether an action is synchronous. If
you get it wrong, your attempt to err-cancel=all will fail. Sounds like
a nice early failure to me.
In Stefan's model, you "need to understand which actions rolled back" to
make sense of a failure. How? Are exactly the asynchronous ones rolled
back? What happens if you get it wrong?
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [PATCH v10 12/14] block: add transactional properties
2015-11-05 19:35 ` Markus Armbruster
@ 2015-11-05 19:43 ` John Snow
0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-11-05 19:43 UTC (permalink / raw)
To: Markus Armbruster
Cc: vsementsov, famz, qemu-block, Stefan Hajnoczi, qemu-devel
On 11/05/2015 02:35 PM, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
>
>> On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
>>> On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
>>>>
>>>>
>>>> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
>>>>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
>>>>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>>>>>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>>>>>> common, common);
>>>>>>
>>>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> action = common->action->block_dirty_bitmap_add;
>>>>>> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>>>>>> qmp_block_dirty_bitmap_add(action->node, action->name,
>>>>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
>>>>>> common, common);
>>>>>> BlockDirtyBitmap *action;
>>>>>>
>>>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> action = common->action->block_dirty_bitmap_clear;
>>>>>> state->bitmap = block_dirty_bitmap_lookup(action->node,
>>>>>> action->name,
>>>>>
>>>>> Why do the bitmap add/clear actions not support err-cancel=all?
>>>>>
>>>>> I understand why other block jobs don't support it, but it's not clear
>>>>> why these non-block job actions cannot.
>>>>>
>>>>
>>>> Because they don't have a callback to invoke if the rest of the job fails.
>>>>
>>>> I could create a BlockJob for them complete with a callback to invoke,
>>>> but basically it's just because there's no interface to unwind them, or
>>>> an interface to join them with the transaction.
>>>>
>>>> They're small, synchronous non-job actions. Which makes them weird.
>>>
>>> Funny, we've been looking at the same picture while seeing different
>>> things:
>>> https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
>>>
>>> I think I understand your idea: the transaction should include both
>>> immediate actions as well as block jobs.
>>>
>>> My mental model was different: immediate actions commit/abort along with
>>> the 'transaction' command. Block jobs are separate and complete/cancel
>>> together in a group.
>>>
>>> In practice I think the two end up being similar because we won't be
>>> able to implement immediate action commit/abort together with
>>> long-running block jobs because the immediate actions rely on
>>> quiescing/pausing the guest for atomic commit/abort.
>>>
>>> So with your mental model the QMP client has to submit 2 'transaction'
>>> commands: 1 for the immediate actions, 1 for the block jobs.
>>>
>>> In my mental model the QMP client submits 1 command but the immediate
>>> actions and block jobs are two separate transaction scopes. This means
>>> if the block jobs fail, the client needs to be aware of the immediate
>>> actions that have committed. Because of this, it becomes just as much
>>> client effort as submitting two separate 'transaction' commands in your
>>> model.
>>>
>>> Can anyone see a practical difference? I think I'm happy with John's
>>> model.
>>>
>>> Stefan
>>>
>>
>> We discussed this off-list, but for the sake of the archive:
>>
>> == How it is now ==
>>
>> Currently, transactions have two implicit phases: the first is the
>> synchronous phase. If this phase completes successfully, we consider the
>> transaction a success. The second phase is the asynchronous phase where
>> jobs launched by the synchronous phase run to completion.
>>
>> all synchronous commands must complete for the transaction to "succeed."
>> There are currently (pre-patch) no guarantees about asynchronous command
>> completion. As long as all synchronous actions complete, asynchronous
>> actions are free to succeed or fail individually.
>>
>> == My Model ==
>>
>> The current behavior is my "err-cancel = none" scenario: we offer no
>> guarantee about the success or failure of the transaction as a whole
>> after the synchronous portion has completed.
>>
>> What I was proposing is "err-cancel = all," which to me means that _ALL_
>> commands in this transaction are to succeed (synchronous or not) before
>> _any_ actions are irrevocably committed. This means that for a
>> hypothetical mixed synchronous-asynchronous transaction, that even after
>> the transaction succeeded (it passed the synchronous phase), if an
>> asynchronous action later fails, all actions both synchronous and non
>> are rolled-back -- a kind of retroactive failure of the transaction.
>> This is clearly not possible in all cases, so commands that cannot
>> support these semantics will refuse "err-cancel = all" during the
>> synchronous phase.
>>
>> In practice, only asynchronous actions can tolerate these semantics, but
>> from a user perspective, it's clear that any transaction successfully
>> launched with "err-cancel = all" applies to *all* actions, regardless.
>>
>> == Stefan's Model ==
>>
>> Stefan's model was to imply that the "err-cancel" parameter applied only
>> to the *asynchronous* phase, because the synchronous phase has already
>> reported back success to the user as a return from the qmp-transaction
>> command. This would mean that to Stefan, "err-cancel = all" was implying
>> that only the asynchronous actions had to participate in the "all or
>> none" behavior of the transaction -- synchronous portions were exempt.
>>
>> == Equivalence ==
>>
>> Both models wind up being equivalent:
>>
>> In Stefan's model, you need no foreknowledge of which actions are
>> synchronous or not. Upon failure during the asynchronous phase you will
>> need to understand which actions rolled back and which ones didn't, however.
>>
>> In my model, you need foreknowledge of which actions are synchronous and
>> which ones are not, because synchronous actions will refuse the
>> "err-cancel = all" parameter. There is no sifting through failure states
>> when the command fails.
>>
>> It's mostly a matter of when you need to know the difference between the
>> two classes of actions. In one model, it's before. In the other, it's
>> after a failure.
>>
>> My model also allows for an emulation of Stefan's model, using the
>> hypothetical "err-cancel = jobs-only" mode, which would only enforce the
>> transaction semantics in the asynchronous phase.
>>
>>
>> For this reason, I think the two approaches to thinking about the
>> problem wind up having the same effect. I would perhaps argue that my
>> model is more explicit -- but I'm biased. I wrote it :)
>
> I haven't followed this topic, and my opinions are therefore quite
> uninformed. Here goes anway.
>
> In John's model, you need to know whether an action is synchronous. If
> you get it wrong, your attempt to err-cancel=all will fail. Sounds like
> a nice early failure to me.
>
Yes.
> In Stefan's model, you "need to understand which actions rolled back" to
> make sense of a failure. How? Are exactly the asynchronous ones rolled
> back? What happens if you get it wrong?
>
You need to understand which actions were synchronous (i.e. not jobs)
and which ones were jobs. Only jobs will be rolled back, the synchronous
actions will not be -- so to retry, you need to have a sense of /which/
actions to retry.
If you get it wrong, you get weird stuff you didn't expect, of course.
Either model implies you need to understand which actions are which
kind, but the patchset as it is right now will yell at you if you got
that wrong from the get-go.
To be fair, in Stefan's model, the QMP events will hint to you which
actions actually failed, because you'll be able to count the block job
failure events. The qmp-transaction return of "{return: {}}" is the
implicit "All non-jobs finished OK."
--js
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH v10 12/14] block: add transactional properties
2015-11-05 18:52 ` John Snow
2015-11-05 19:35 ` Markus Armbruster
@ 2015-11-06 8:32 ` Kevin Wolf
2015-11-06 16:36 ` Stefan Hajnoczi
2015-11-06 18:46 ` John Snow
1 sibling, 2 replies; 35+ messages in thread
From: Kevin Wolf @ 2015-11-06 8:32 UTC (permalink / raw)
To: John Snow
Cc: vsementsov, famz, qemu-block, armbru, qemu-devel, Stefan Hajnoczi
Am 05.11.2015 um 19:52 hat John Snow geschrieben:
>
>
> On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
> > On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
> >>
> >>
> >> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
> >>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
> >>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
> >>>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
> >>>> common, common);
> >>>>
> >>>> + if (action_check_cancel_mode(common, errp) < 0) {
> >>>> + return;
> >>>> + }
> >>>> +
> >>>> action = common->action->block_dirty_bitmap_add;
> >>>> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
> >>>> qmp_block_dirty_bitmap_add(action->node, action->name,
> >>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
> >>>> common, common);
> >>>> BlockDirtyBitmap *action;
> >>>>
> >>>> + if (action_check_cancel_mode(common, errp) < 0) {
> >>>> + return;
> >>>> + }
> >>>> +
> >>>> action = common->action->block_dirty_bitmap_clear;
> >>>> state->bitmap = block_dirty_bitmap_lookup(action->node,
> >>>> action->name,
> >>>
> >>> Why do the bitmap add/clear actions not support err-cancel=all?
> >>>
> >>> I understand why other block jobs don't support it, but it's not clear
> >>> why these non-block job actions cannot.
> >>>
> >>
> >> Because they don't have a callback to invoke if the rest of the job fails.
> >>
> >> I could create a BlockJob for them complete with a callback to invoke,
> >> but basically it's just because there's no interface to unwind them, or
> >> an interface to join them with the transaction.
> >>
> >> They're small, synchronous non-job actions. Which makes them weird.
> >
> > Funny, we've been looking at the same picture while seeing different
> > things:
> > https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
> >
> > I think I understand your idea: the transaction should include both
> > immediate actions as well as block jobs.
> >
> > My mental model was different: immediate actions commit/abort along with
> > the 'transaction' command. Block jobs are separate and complete/cancel
> > together in a group.
> >
> > In practice I think the two end up being similar because we won't be
> > able to implement immediate action commit/abort together with
> > long-running block jobs because the immediate actions rely on
> > quiescing/pausing the guest for atomic commit/abort.
> >
> > So with your mental model the QMP client has to submit 2 'transaction'
> > commands: 1 for the immediate actions, 1 for the block jobs.
> >
> > In my mental model the QMP client submits 1 command but the immediate
> > actions and block jobs are two separate transaction scopes. This means
> > if the block jobs fail, the client needs to be aware of the immediate
> > actions that have committed. Because of this, it becomes just as much
> > client effort as submitting two separate 'transaction' commands in your
> > model.
> >
> > Can anyone see a practical difference? I think I'm happy with John's
> > model.
> >
> > Stefan
> >
>
> We discussed this off-list, but for the sake of the archive:
>
> == How it is now ==
>
> Currently, transactions have two implicit phases: the first is the
> synchronous phase. If this phase completes successfully, we consider the
> transaction a success. The second phase is the asynchronous phase where
> jobs launched by the synchronous phase run to completion.
>
> all synchronous commands must complete for the transaction to "succeed."
> There are currently (pre-patch) no guarantees about asynchronous command
> completion. As long as all synchronous actions complete, asynchronous
> actions are free to succeed or fail individually.
You're overcomplicating this. All actions are currently synchronous and
what you consider asynchronous transaction actions aren't actually part
of the transaction at all. The action is "start block job X", not "run
block job X".
> == My Model ==
>
> The current behavior is my "err-cancel = none" scenario: we offer no
> guarantee about the success or failure of the transaction as a whole
> after the synchronous portion has completed.
>
> What I was proposing is "err-cancel = all," which to me means that _ALL_
> commands in this transaction are to succeed (synchronous or not) before
> _any_ actions are irrevocably committed. This means that for a
> hypothetical mixed synchronous-asynchronous transaction, that even after
> the transaction succeeded (it passed the synchronous phase), if an
> asynchronous action later fails, all actions both synchronous and non
> are rolled-back -- a kind of retroactive failure of the transaction.
> This is clearly not possible in all cases, so commands that cannot
> support these semantics will refuse "err-cancel = all" during the
> synchronous phase.
Is this possible in any case? You're losing transaction semantics the
lastest when you drop the BQL that the monitor holds. At least atomicity
and isolation aren't given any more.
You can try to undo some parts of what you did later one, but if any
involved BDS was used in the meantime by anything other than the block
job, you don't have transactional behaviour any more.
And isn't the management tool perfectly capable of cleaning up all the
block jobs without magic happening in qemu if one of them fails? Do we
actually need atomic failure later on? And if so, do we need atomic
failure only of block jobs started in the same transaction? Why?
Kevin
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH v10 12/14] block: add transactional properties
2015-11-06 8:32 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
@ 2015-11-06 16:36 ` Stefan Hajnoczi
2015-11-06 18:46 ` John Snow
1 sibling, 0 replies; 35+ messages in thread
From: Stefan Hajnoczi @ 2015-11-06 16:36 UTC (permalink / raw)
To: Kevin Wolf
Cc: vsementsov, famz, qemu-block, armbru, qemu-devel, Stefan Hajnoczi,
John Snow
[-- Attachment #1: Type: text/plain, Size: 6860 bytes --]
On Fri, Nov 06, 2015 at 09:32:19AM +0100, Kevin Wolf wrote:
> Am 05.11.2015 um 19:52 hat John Snow geschrieben:
> >
> >
> > On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
> > > On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
> > >>
> > >>
> > >> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
> > >>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
> > >>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
> > >>>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
> > >>>> common, common);
> > >>>>
> > >>>> + if (action_check_cancel_mode(common, errp) < 0) {
> > >>>> + return;
> > >>>> + }
> > >>>> +
> > >>>> action = common->action->block_dirty_bitmap_add;
> > >>>> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
> > >>>> qmp_block_dirty_bitmap_add(action->node, action->name,
> > >>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
> > >>>> common, common);
> > >>>> BlockDirtyBitmap *action;
> > >>>>
> > >>>> + if (action_check_cancel_mode(common, errp) < 0) {
> > >>>> + return;
> > >>>> + }
> > >>>> +
> > >>>> action = common->action->block_dirty_bitmap_clear;
> > >>>> state->bitmap = block_dirty_bitmap_lookup(action->node,
> > >>>> action->name,
> > >>>
> > >>> Why do the bitmap add/clear actions not support err-cancel=all?
> > >>>
> > >>> I understand why other block jobs don't support it, but it's not clear
> > >>> why these non-block job actions cannot.
> > >>>
> > >>
> > >> Because they don't have a callback to invoke if the rest of the job fails.
> > >>
> > >> I could create a BlockJob for them complete with a callback to invoke,
> > >> but basically it's just because there's no interface to unwind them, or
> > >> an interface to join them with the transaction.
> > >>
> > >> They're small, synchronous non-job actions. Which makes them weird.
> > >
> > > Funny, we've been looking at the same picture while seeing different
> > > things:
> > > https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
> > >
> > > I think I understand your idea: the transaction should include both
> > > immediate actions as well as block jobs.
> > >
> > > My mental model was different: immediate actions commit/abort along with
> > > the 'transaction' command. Block jobs are separate and complete/cancel
> > > together in a group.
> > >
> > > In practice I think the two end up being similar because we won't be
> > > able to implement immediate action commit/abort together with
> > > long-running block jobs because the immediate actions rely on
> > > quiescing/pausing the guest for atomic commit/abort.
> > >
> > > So with your mental model the QMP client has to submit 2 'transaction'
> > > commands: 1 for the immediate actions, 1 for the block jobs.
> > >
> > > In my mental model the QMP client submits 1 command but the immediate
> > > actions and block jobs are two separate transaction scopes. This means
> > > if the block jobs fail, the client needs to be aware of the immediate
> > > actions that have committed. Because of this, it becomes just as much
> > > client effort as submitting two separate 'transaction' commands in your
> > > model.
> > >
> > > Can anyone see a practical difference? I think I'm happy with John's
> > > model.
> > >
> > > Stefan
> > >
> >
> > We discussed this off-list, but for the sake of the archive:
> >
> > == How it is now ==
> >
> > Currently, transactions have two implicit phases: the first is the
> > synchronous phase. If this phase completes successfully, we consider the
> > transaction a success. The second phase is the asynchronous phase where
> > jobs launched by the synchronous phase run to completion.
> >
> > all synchronous commands must complete for the transaction to "succeed."
> > There are currently (pre-patch) no guarantees about asynchronous command
> > completion. As long as all synchronous actions complete, asynchronous
> > actions are free to succeed or fail individually.
>
> You're overcomplicating this. All actions are currently synchronous and
> what you consider asynchronous transaction actions aren't actually part
> of the transaction at all. The action is "start block job X", not "run
> block job X".
Yes, this is how I've thought of it too.
> > == My Model ==
> >
> > The current behavior is my "err-cancel = none" scenario: we offer no
> > guarantee about the success or failure of the transaction as a whole
> > after the synchronous portion has completed.
> >
> > What I was proposing is "err-cancel = all," which to me means that _ALL_
> > commands in this transaction are to succeed (synchronous or not) before
> > _any_ actions are irrevocably committed. This means that for a
> > hypothetical mixed synchronous-asynchronous transaction, that even after
> > the transaction succeeded (it passed the synchronous phase), if an
> > asynchronous action later fails, all actions both synchronous and non
> > are rolled-back -- a kind of retroactive failure of the transaction.
> > This is clearly not possible in all cases, so commands that cannot
> > support these semantics will refuse "err-cancel = all" during the
> > synchronous phase.
>
> Is this possible in any case? You're losing transaction semantics the
> lastest when you drop the BQL that the monitor holds. At least atomicity
> and isolation aren't given any more.
>
> You can try to undo some parts of what you did later one, but if any
> involved BDS was used in the meantime by anything other than the block
> job, you don't have transactional behaviour any more.
>
> And isn't the management tool perfectly capable of cleaning up all the
> block jobs without magic happening in qemu if one of them fails? Do we
> actually need atomic failure later on? And if so, do we need atomic
> failure only of block jobs started in the same transaction? Why?
I think we do because the backup block job, when given a dirty bitmap to
copy out, will either discard that dirty bitmap upon completion or merge
the dirty bitmap back into the currently active dirty bitmap for the
device on failure (that way your incremental backup can be retried).
Now when you take an incremental backup of multiple drives at the same
instant in time, it would be a pain to have one or more jobs complete
(and discard the bitmap) but others fail. Then you would no longer have
a single point in time when the incremental backup was taken...
That is the motivation for this feature.
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH v10 12/14] block: add transactional properties
2015-11-06 8:32 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2015-11-06 16:36 ` Stefan Hajnoczi
@ 2015-11-06 18:46 ` John Snow
2015-11-06 20:35 ` John Snow
1 sibling, 1 reply; 35+ messages in thread
From: John Snow @ 2015-11-06 18:46 UTC (permalink / raw)
To: Kevin Wolf
Cc: vsementsov, famz, qemu-block, armbru, qemu-devel, Stefan Hajnoczi
On 11/06/2015 03:32 AM, Kevin Wolf wrote:
> Am 05.11.2015 um 19:52 hat John Snow geschrieben:
>>
>>
>> On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
>>> On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
>>>>
>>>>
>>>> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
>>>>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
>>>>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>>>>>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>>>>>> common, common);
>>>>>>
>>>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> action = common->action->block_dirty_bitmap_add;
>>>>>> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>>>>>> qmp_block_dirty_bitmap_add(action->node, action->name,
>>>>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
>>>>>> common, common);
>>>>>> BlockDirtyBitmap *action;
>>>>>>
>>>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>>>> + return;
>>>>>> + }
>>>>>> +
>>>>>> action = common->action->block_dirty_bitmap_clear;
>>>>>> state->bitmap = block_dirty_bitmap_lookup(action->node,
>>>>>> action->name,
>>>>>
>>>>> Why do the bitmap add/clear actions not support err-cancel=all?
>>>>>
>>>>> I understand why other block jobs don't support it, but it's not clear
>>>>> why these non-block job actions cannot.
>>>>>
>>>>
>>>> Because they don't have a callback to invoke if the rest of the job fails.
>>>>
>>>> I could create a BlockJob for them complete with a callback to invoke,
>>>> but basically it's just because there's no interface to unwind them, or
>>>> an interface to join them with the transaction.
>>>>
>>>> They're small, synchronous non-job actions. Which makes them weird.
>>>
>>> Funny, we've been looking at the same picture while seeing different
>>> things:
>>> https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
>>>
>>> I think I understand your idea: the transaction should include both
>>> immediate actions as well as block jobs.
>>>
>>> My mental model was different: immediate actions commit/abort along with
>>> the 'transaction' command. Block jobs are separate and complete/cancel
>>> together in a group.
>>>
>>> In practice I think the two end up being similar because we won't be
>>> able to implement immediate action commit/abort together with
>>> long-running block jobs because the immediate actions rely on
>>> quiescing/pausing the guest for atomic commit/abort.
>>>
>>> So with your mental model the QMP client has to submit 2 'transaction'
>>> commands: 1 for the immediate actions, 1 for the block jobs.
>>>
>>> In my mental model the QMP client submits 1 command but the immediate
>>> actions and block jobs are two separate transaction scopes. This means
>>> if the block jobs fail, the client needs to be aware of the immediate
>>> actions that have committed. Because of this, it becomes just as much
>>> client effort as submitting two separate 'transaction' commands in your
>>> model.
>>>
>>> Can anyone see a practical difference? I think I'm happy with John's
>>> model.
>>>
>>> Stefan
>>>
>>
>> We discussed this off-list, but for the sake of the archive:
>>
>> == How it is now ==
>>
>> Currently, transactions have two implicit phases: the first is the
>> synchronous phase. If this phase completes successfully, we consider the
>> transaction a success. The second phase is the asynchronous phase where
>> jobs launched by the synchronous phase run to completion.
>>
>> all synchronous commands must complete for the transaction to "succeed."
>> There are currently (pre-patch) no guarantees about asynchronous command
>> completion. As long as all synchronous actions complete, asynchronous
>> actions are free to succeed or fail individually.
>
> You're overcomplicating this. All actions are currently synchronous and
> what you consider asynchronous transaction actions aren't actually part
> of the transaction at all. The action is "start block job X", not "run
> block job X".
>
"OK," except the entire goal of the series was to allow the "aren't
actually part of the transaction at all" parts to live and die together
based on the transaction they were launched from. This really implies
they belong to a second asynchronous phase of the transaction.
Otherwise, why would totally unrelated jobs fail because a different one
did?
I realize this is an /extension/ of the existing semantics vs a "fix,"
and part of the confusion is how I and everyone else was looking at it
differently. How could this happen, though? Let's look at our
transaction documentation:
"Operations that are currently supported:" [...] "- drive-backup" [...]
"If there is any failure
performing any of the operations, all operations for the group are
abandoned."
Where do we say "Except drive-backup, though, because technically the
drive-backup action only starts the job, and we don't really consider
this to be part of the transaction" ? We've never actually defined this
behavior as part of our API as far as I can see.
Regardless: the net effect is still the same. We want jobs launched by
transactions to (optionally!) cancel themselves on failure. The
complications only arise because people want the exact semantic meaning
to be precise for the QMP API.
The concept is simple, the language to describe it has been muddy.
>> == My Model ==
>>
>> The current behavior is my "err-cancel = none" scenario: we offer no
>> guarantee about the success or failure of the transaction as a whole
>> after the synchronous portion has completed.
>>
>> What I was proposing is "err-cancel = all," which to me means that _ALL_
>> commands in this transaction are to succeed (synchronous or not) before
>> _any_ actions are irrevocably committed. This means that for a
>> hypothetical mixed synchronous-asynchronous transaction, that even after
>> the transaction succeeded (it passed the synchronous phase), if an
>> asynchronous action later fails, all actions both synchronous and non
>> are rolled-back -- a kind of retroactive failure of the transaction.
>> This is clearly not possible in all cases, so commands that cannot
>> support these semantics will refuse "err-cancel = all" during the
>> synchronous phase.
>
> Is this possible in any case? You're losing transaction semantics the
> lastest when you drop the BQL that the monitor holds. At least atomicity
> and isolation aren't given any more.
>
It might be possible in at least *some* cases. I currently don't even
attempt it, though. This is why some transaction actions just refuse
this parameter if it shows up.
It'd be pretty easy to undo a bitmap-add or a bitmap-clear, as long as I
gave these actions a conditional success callback.
The "undo" semantics of the jobs are somewhat different. I am not
suggesting we can teleport back in time to before we tried to do the
backup, but we can at least abort the backup and make it like you never
issued the command -- which is important for incremental backups.
The rollback behavior for each action needs to be spelled out in our
documentation ... as well as categorizing which actions complete before
qmp_transactions return and which may have lingering effects (like
drive-backup.)
> You can try to undo some parts of what you did later one, but if any
> involved BDS was used in the meantime by anything other than the block
> job, you don't have transactional behaviour any more.
>
> And isn't the management tool perfectly capable of cleaning up all the
> block jobs without magic happening in qemu if one of them fails? Do we
> actually need atomic failure later on? And if so, do we need atomic
> failure only of block jobs started in the same transaction? Why?
>
There's always a debate about who is responsible for cleaning things up
on failures: QEMU or the management tool? Luckily: it's an optional
parameter, so the tool can decide at run-time about what semantics it wants.
IMO: It's a little late in this series to question if we need this or
not, but I'll oblige.
The original objection from Stefan to the incremental backup
https://soundcloud.com/tothejazz/gyms-flappy-the-happy-sealtransaction
semantics was that if two incremental backup jobs launched by a
transaction had only partial success, the management tool would have to
take on extra state and possibly issue some corrective actions to QEMU
in order to undo the damage.
QEMU, however, could just unravel the failure fairly trivially and allow
the backup commands to maintain a simple binary success/failure state.
This was agreed to be the better, less complicated management scenario.
In the case that incremental backups partially complete, you'll have one
bitmap deleted and a different bitmap merged back onto the BDS. The
management tool can at this point absolutely not delete the one
incremental that got made and needs to leave it in place before
re-issuing the incremental backup. It's then responsible for either
squashing the two incremental backups it made, or otherwise managing the
disparity in the number of actual backup files.
For QEMU's trouble, we don't delete incremental backup data until after
the backup operation, so it's trivial to hold onto the data until we
know everything's OK.
We decided it was nicest for QEMU to just roll back all of the jobs in
this case. If the management tool disagrees, it can just roll with the
original semantics.
I still believe strongly that this is the right way to go. It's
flexible, allows for either party to manage the failure, the parameter
is completely non-ambiguous, provides for early failure if the expected
semantics are not possible, and the code is already here and mostly
reviewed... except for the API names.
--js
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [Qemu-devel] [Qemu-block] [PATCH v10 12/14] block: add transactional properties
2015-11-06 18:46 ` John Snow
@ 2015-11-06 20:35 ` John Snow
0 siblings, 0 replies; 35+ messages in thread
From: John Snow @ 2015-11-06 20:35 UTC (permalink / raw)
To: Kevin Wolf
Cc: vsementsov, famz, qemu-block, armbru, qemu-devel, Stefan Hajnoczi
On 11/06/2015 01:46 PM, John Snow wrote:
>
>
> On 11/06/2015 03:32 AM, Kevin Wolf wrote:
>> Am 05.11.2015 um 19:52 hat John Snow geschrieben:
>>>
>>>
>>> On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
>>>> On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
>>>>>
>>>>>
>>>>> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
>>>>>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
>>>>>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>>>>>>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>>>>>>> common, common);
>>>>>>>
>>>>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>>>>> + return;
>>>>>>> + }
>>>>>>> +
>>>>>>> action = common->action->block_dirty_bitmap_add;
>>>>>>> /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>>>>>>> qmp_block_dirty_bitmap_add(action->node, action->name,
>>>>>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
>>>>>>> common, common);
>>>>>>> BlockDirtyBitmap *action;
>>>>>>>
>>>>>>> + if (action_check_cancel_mode(common, errp) < 0) {
>>>>>>> + return;
>>>>>>> + }
>>>>>>> +
>>>>>>> action = common->action->block_dirty_bitmap_clear;
>>>>>>> state->bitmap = block_dirty_bitmap_lookup(action->node,
>>>>>>> action->name,
>>>>>>
>>>>>> Why do the bitmap add/clear actions not support err-cancel=all?
>>>>>>
>>>>>> I understand why other block jobs don't support it, but it's not clear
>>>>>> why these non-block job actions cannot.
>>>>>>
>>>>>
>>>>> Because they don't have a callback to invoke if the rest of the job fails.
>>>>>
>>>>> I could create a BlockJob for them complete with a callback to invoke,
>>>>> but basically it's just because there's no interface to unwind them, or
>>>>> an interface to join them with the transaction.
>>>>>
>>>>> They're small, synchronous non-job actions. Which makes them weird.
>>>>
>>>> Funny, we've been looking at the same picture while seeing different
>>>> things:
>>>> https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
>>>>
>>>> I think I understand your idea: the transaction should include both
>>>> immediate actions as well as block jobs.
>>>>
>>>> My mental model was different: immediate actions commit/abort along with
>>>> the 'transaction' command. Block jobs are separate and complete/cancel
>>>> together in a group.
>>>>
>>>> In practice I think the two end up being similar because we won't be
>>>> able to implement immediate action commit/abort together with
>>>> long-running block jobs because the immediate actions rely on
>>>> quiescing/pausing the guest for atomic commit/abort.
>>>>
>>>> So with your mental model the QMP client has to submit 2 'transaction'
>>>> commands: 1 for the immediate actions, 1 for the block jobs.
>>>>
>>>> In my mental model the QMP client submits 1 command but the immediate
>>>> actions and block jobs are two separate transaction scopes. This means
>>>> if the block jobs fail, the client needs to be aware of the immediate
>>>> actions that have committed. Because of this, it becomes just as much
>>>> client effort as submitting two separate 'transaction' commands in your
>>>> model.
>>>>
>>>> Can anyone see a practical difference? I think I'm happy with John's
>>>> model.
>>>>
>>>> Stefan
>>>>
>>>
>>> We discussed this off-list, but for the sake of the archive:
>>>
>>> == How it is now ==
>>>
>>> Currently, transactions have two implicit phases: the first is the
>>> synchronous phase. If this phase completes successfully, we consider the
>>> transaction a success. The second phase is the asynchronous phase where
>>> jobs launched by the synchronous phase run to completion.
>>>
>>> all synchronous commands must complete for the transaction to "succeed."
>>> There are currently (pre-patch) no guarantees about asynchronous command
>>> completion. As long as all synchronous actions complete, asynchronous
>>> actions are free to succeed or fail individually.
>>
>> You're overcomplicating this. All actions are currently synchronous and
>> what you consider asynchronous transaction actions aren't actually part
>> of the transaction at all. The action is "start block job X", not "run
>> block job X".
>>
>
> "OK," except the entire goal of the series was to allow the "aren't
> actually part of the transaction at all" parts to live and die together
> based on the transaction they were launched from. This really implies
> they belong to a second asynchronous phase of the transaction.
>
> Otherwise, why would totally unrelated jobs fail because a different one
> did?
>
> I realize this is an /extension/ of the existing semantics vs a "fix,"
> and part of the confusion is how I and everyone else was looking at it
> differently. How could this happen, though? Let's look at our
> transaction documentation:
>
> "Operations that are currently supported:" [...] "- drive-backup" [...]
> "If there is any failure
> performing any of the operations, all operations for the group are
> abandoned."
>
> Where do we say "Except drive-backup, though, because technically the
> drive-backup action only starts the job, and we don't really consider
> this to be part of the transaction" ? We've never actually defined this
> behavior as part of our API as far as I can see.
>
> Regardless: the net effect is still the same. We want jobs launched by
> transactions to (optionally!) cancel themselves on failure. The
> complications only arise because people want the exact semantic meaning
> to be precise for the QMP API.
>
> The concept is simple, the language to describe it has been muddy.
>
>>> == My Model ==
>>>
>>> The current behavior is my "err-cancel = none" scenario: we offer no
>>> guarantee about the success or failure of the transaction as a whole
>>> after the synchronous portion has completed.
>>>
>>> What I was proposing is "err-cancel = all," which to me means that _ALL_
>>> commands in this transaction are to succeed (synchronous or not) before
>>> _any_ actions are irrevocably committed. This means that for a
>>> hypothetical mixed synchronous-asynchronous transaction, that even after
>>> the transaction succeeded (it passed the synchronous phase), if an
>>> asynchronous action later fails, all actions both synchronous and non
>>> are rolled-back -- a kind of retroactive failure of the transaction.
>>> This is clearly not possible in all cases, so commands that cannot
>>> support these semantics will refuse "err-cancel = all" during the
>>> synchronous phase.
>>
>> Is this possible in any case? You're losing transaction semantics the
>> lastest when you drop the BQL that the monitor holds. At least atomicity
>> and isolation aren't given any more.
>>
>
> It might be possible in at least *some* cases. I currently don't even
> attempt it, though. This is why some transaction actions just refuse
> this parameter if it shows up.
>
> It'd be pretty easy to undo a bitmap-add or a bitmap-clear, as long as I
> gave these actions a conditional success callback.
>
> The "undo" semantics of the jobs are somewhat different. I am not
> suggesting we can teleport back in time to before we tried to do the
> backup, but we can at least abort the backup and make it like you never
> issued the command -- which is important for incremental backups.
>
> The rollback behavior for each action needs to be spelled out in our
> documentation ... as well as categorizing which actions complete before
> qmp_transactions return and which may have lingering effects (like
> drive-backup.)
>
>> You can try to undo some parts of what you did later one, but if any
>> involved BDS was used in the meantime by anything other than the block
>> job, you don't have transactional behaviour any more.
>>
>> And isn't the management tool perfectly capable of cleaning up all the
>> block jobs without magic happening in qemu if one of them fails? Do we
>> actually need atomic failure later on? And if so, do we need atomic
>> failure only of block jobs started in the same transaction? Why?
>>
>
> There's always a debate about who is responsible for cleaning things up
> on failures: QEMU or the management tool? Luckily: it's an optional
> parameter, so the tool can decide at run-time about what semantics it wants.
>
> IMO: It's a little late in this series to question if we need this or
> not, but I'll oblige.
>
> The original objection from Stefan to the incremental backup
> https://soundcloud.com/tothejazz/gyms-flappy-the-happy-sealtransaction
> semantics was that if two incremental backup jobs launched by a
LOL. I mispasted a soundcloud link in here. Well... Enjoy this stupid
song that made me laugh that I was trying to link to someone else.
Sigh!!! (So embarrassed.)
> transaction had only partial success, the management tool would have to
> take on extra state and possibly issue some corrective actions to QEMU
> in order to undo the damage.
>
> QEMU, however, could just unravel the failure fairly trivially and allow
> the backup commands to maintain a simple binary success/failure state.
> This was agreed to be the better, less complicated management scenario.
>
> In the case that incremental backups partially complete, you'll have one
> bitmap deleted and a different bitmap merged back onto the BDS. The
> management tool can at this point absolutely not delete the one
> incremental that got made and needs to leave it in place before
> re-issuing the incremental backup. It's then responsible for either
> squashing the two incremental backups it made, or otherwise managing the
> disparity in the number of actual backup files.
>
> For QEMU's trouble, we don't delete incremental backup data until after
> the backup operation, so it's trivial to hold onto the data until we
> know everything's OK.
>
> We decided it was nicest for QEMU to just roll back all of the jobs in
> this case. If the management tool disagrees, it can just roll with the
> original semantics.
>
>
> I still believe strongly that this is the right way to go. It's
> flexible, allows for either party to manage the failure, the parameter
> is completely non-ambiguous, provides for early failure if the expected
> semantics are not possible, and the code is already here and mostly
> reviewed... except for the API names.
>
> --js
>
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2015-11-06 20:35 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-23 23:56 [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 01/14] qapi: Add transaction support to block-dirty-bitmap operations John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 02/14] iotests: add transactional incremental backup test John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 03/14] block: rename BlkTransactionState and BdrvActionOps John Snow
2015-11-03 16:33 ` [Qemu-devel] [Qemu-block] " Jeff Cody
2015-11-03 17:32 ` John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 04/14] backup: Extract dirty bitmap handling as a separate function John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 05/14] blockjob: Introduce reference count and fix reference to job->bs John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 06/14] blockjob: Add .commit and .abort block job actions John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 07/14] blockjob: Add "completed" and "ret" in BlockJob John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 08/14] blockjob: Simplify block_job_finish_sync John Snow
2015-11-03 13:52 ` Stefan Hajnoczi
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 09/14] block: Add block job transactions John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 10/14] block/backup: Rely on commit/abort for cleanup John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 11/14] block: Add BlockJobTxn support to backup_run John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 12/14] block: add transactional properties John Snow
2015-11-03 15:17 ` Stefan Hajnoczi
2015-11-03 17:27 ` John Snow
2015-11-05 10:47 ` Stefan Hajnoczi
2015-11-05 18:52 ` John Snow
2015-11-05 19:35 ` Markus Armbruster
2015-11-05 19:43 ` John Snow
2015-11-06 8:32 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2015-11-06 16:36 ` Stefan Hajnoczi
2015-11-06 18:46 ` John Snow
2015-11-06 20:35 ` John Snow
2015-11-03 15:23 ` [Qemu-devel] " Eric Blake
2015-11-03 17:31 ` John Snow
2015-11-03 17:35 ` Eric Blake
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 13/14] iotests: 124 - transactional failure test John Snow
2015-10-23 23:56 ` [Qemu-devel] [PATCH v10 14/14] tests: add BlockJobTxn unit test John Snow
2015-11-02 21:06 ` [Qemu-devel] [PATCH v10 00/14] block: incremental backup transactions using BlockJobTxn John Snow
2015-11-03 15:22 ` Stefan Hajnoczi
2015-11-03 17:46 ` John Snow
2015-11-03 22:07 ` John Snow
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).