qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/7] Live block commit
@ 2012-09-25 16:29 Jeff Cody
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 1/7] block: add support functions for live commit, to find and delete images Jeff Cody
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 16:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake

This series adds the basic case, of a live commit between two
images below the active layer, e.g.:

[base] <--- [snp-1] <--- [snp-2] <--- [snp-3] <--- [active]

can be collapsed down via commit, into:

[base] <--- [active]

or,

[base] <--- [snp-1] <--- [active],

[base] <--- [snp-3] <--- [active],

etc..

TODO: * 'stage-2' of live commit functionality, to be able to push down the
        active layer. 

These changes are all reflected in my github repo:

    git://github.com/codyprime/qemu-kvm-jtc.git  branch: jtc-live-commit-1.3-v8

Changes from v1 -> v2:
===================================

Patch 1/7: [Eric] - Commit message cleanup
                  - Make comments clearer
                  - fixed whitespace removal

Dropped old Patch 4/8 (new qerror message) - Eric

Patch 2/7:        - Changed error_set() to error_setg()

Patch 5/7: [Eric] - Changed error_set() to error_setg()
                  - Updated comments/docs to be clearer
                  - fixed whitespace removal


Changes from RFC to v1:
===================================

* Feedback incorporated (see below)
* qemu-iotests added
* live snapshots have active->backing_hd reopened read-only.

Patch 1/8: [Kevin] -  Rename bdrv_delete_intermediate() to
                      bdrv_drop_intermediate()
                   -  Change terminology from 'child' to 'overlay'
                   -  Do not allow top == active, and return NULL for error,
                      simplified some logic.

Patch 2/8: [Kevin]       - Preserve errors in commit_populate()
           [Kevin, Eric] - Check that there is enough room in base, and resize
                           with bdrv_truncate() if not.
           [Jeff]        - Switch from 'child' to 'overlay' terminology
                         - Fixed up more error checking
                         - Minor cleanup
           [Kevin]       - Perform the bdrv_reopen() inside commit_start(),
                           instead of in blockdev.c

Patch 3/8: [Paolo]  - Rearranged patch, so the forward declaration is gone
            
Patch 4/8: None
Patch 5/8: None

Patch 6/8: [Eric]   - Updated QMP/qmp-events.txt with the new 'type' field
           [Kevin]  - If base cannot be found, return attempted base name
                    - Moved reopen() logic to commit_start
                    - Fixed comments in JSON command header
           [Jeff]   - If 'top' is not specified, default to active->backing_hd

Patch 7/8: New - simple qemu-iotest for commit.

Patch 8/8: New - after live snapshot, use bdrv_reopen() to reopen
                 active->backing_hd read-only.



Changes from the RFC v1 -> RFC v2:
===================================

* This patch series is not on top of Paolo's blk mirror series yet, to make it
  easier to apply independently if desired.  This means some of what was in the
  previous RFC series is not in this one (BlockdevOnError, for instance), but
  that can be easily added in once Paolo's series are in.

* This patches series is dependent on the reopen() series with transactional
  reopen.

* The target release for this series is 1.3

* Found some mistakes in the reopen calls

* Dropped the BlockdevOnError argument (for now), will add in if rebasing on
  top of Paolo's series.

* Used the new qerror system

Jeff Cody (7):
  block: add support functions for live commit, to find and delete
    images.
  block: add live block commit functionality
  blockdev: rename block_stream_cb to a generic block_job_cb
  block: helper function, to find the base image of a chain
  QAPI: add command for live block commit, 'block-commit'
  qemu-iotests: add initial tests for live block commit
  block: after creating a live snapshot, make old image read-only

 QMP/qmp-events.txt         |   6 +-
 block.c                    | 166 ++++++++++++++++++++++++++++++
 block.h                    |   5 +
 block/Makefile.objs        |   1 +
 block/commit.c             | 246 +++++++++++++++++++++++++++++++++++++++++++++
 block_int.h                |  16 +++
 blockdev.c                 |  75 +++++++++++++-
 qapi-schema.json           |  35 +++++++
 qmp-commands.hx            |   6 ++
 tests/qemu-iotests/040     | 142 ++++++++++++++++++++++++++
 tests/qemu-iotests/040.out |   5 +
 tests/qemu-iotests/group   |   1 +
 trace-events               |   4 +-
 13 files changed, 702 insertions(+), 6 deletions(-)
 create mode 100644 block/commit.c
 create mode 100755 tests/qemu-iotests/040
 create mode 100644 tests/qemu-iotests/040.out

-- 
1.7.11.4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 1/7] block: add support functions for live commit, to find and delete images.
  2012-09-25 16:29 [Qemu-devel] [PATCH v2 0/7] Live block commit Jeff Cody
@ 2012-09-25 16:29 ` Jeff Cody
  2012-09-26 13:53   ` Kevin Wolf
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality Jeff Cody
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 16:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake

Add bdrv_find_overlay(), and bdrv_drop_intermediate().

bdrv_find_overlay():  given 'bs' and the active (topmost) BDS of an image chain,
                    find the image that is the immediate top of 'bs'

bdrv_drop_intermediate():
                    Given 3 BDS (active, top, base), drop images above
                    base up to and including top, and set base to be the
                    backing file of top's overlay node.

                    E.g., this converts:

                    bottom <- base <- intermediate <- top <- active

                    to

                    bottom <- base <- active

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 block.h |   4 ++
 2 files changed, 154 insertions(+)

diff --git a/block.c b/block.c
index 751ebdc..d044529 100644
--- a/block.c
+++ b/block.c
@@ -1724,6 +1724,156 @@ int bdrv_change_backing_file(BlockDriverState *bs,
     return ret;
 }
 
+/*
+ * Finds the image layer in the chain that has 'bs' as its backing file.
+ *
+ * active is the current topmost image.
+ *
+ * Returns NULL if bs is not found in active's image chain,
+ * or if active == bs.
+ */
+BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
+                                    BlockDriverState *bs)
+{
+    BlockDriverState *overlay = NULL;
+    BlockDriverState *intermediate;
+
+    assert(active != NULL);
+    assert(bs != NULL);
+
+    /* if bs is the same as active, then by definition it has no overlay
+     */
+    if (active == bs) {
+        return NULL;
+    }
+
+    intermediate = active;
+    while (intermediate->backing_hd) {
+        if (intermediate->backing_hd == bs) {
+            overlay = intermediate;
+            break;
+        }
+        intermediate = intermediate->backing_hd;
+    }
+
+    return overlay;
+}
+
+typedef struct BlkIntermediateStates {
+    BlockDriverState *bs;
+    QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
+} BlkIntermediateStates;
+
+
+/*
+ * Drops images above 'base' up to and including 'top', and sets the image
+ * above 'top' to have base as its backing file.
+ *
+ * Requires that the overlay to 'top' is opened r/w, so that the backing file
+ * information in 'bs' can be properly updated.
+ *
+ * E.g., this will convert the following chain:
+ * bottom <- base <- intermediate <- top <- active
+ *
+ * to
+ *
+ * bottom <- base <- active
+ *
+ * It is allowed for bottom==base, in which case it converts:
+ *
+ * base <- intermediate <- top <- active
+ *
+ * to
+ *
+ * base <- active
+ *
+ * base <- intermediate <- top
+ *
+ * becomes
+ *
+ * base <- top
+ *
+ * Error conditions:
+ *  if active == top, that is considered an error
+ *
+ */
+int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
+                           BlockDriverState *base)
+{
+    BlockDriverState *intermediate;
+    BlockDriverState *base_bs = NULL;
+    BlockDriverState *new_top_bs = NULL;
+    BlkIntermediateStates *intermediate_state, *next;
+    int ret = -EIO;
+
+    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
+    QSIMPLEQ_INIT(&states_to_delete);
+
+    if (!top->drv || !base->drv) {
+        goto exit;
+    }
+
+    new_top_bs = bdrv_find_overlay(active, top);
+
+    if (new_top_bs == NULL) {
+        /* we could not find the image above 'top', this is an error */
+        goto exit;
+    }
+
+    /* special case of new_top_bs->backing_hd already pointing to base - nothing
+     * to do, no intermediate images */
+    if (new_top_bs->backing_hd == base) {
+        ret = 0;
+        goto exit;
+    }
+
+    intermediate = new_top_bs->backing_hd;  /* this should be the same
+                                               as 'top' */
+
+    /* now we will go down through the list, and add each BDS we find
+     * into our deletion queue, until we hit the 'base'
+     */
+    while (intermediate) {
+        intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
+        intermediate_state->bs = intermediate;
+        QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
+
+        if (intermediate->backing_hd == base) {
+            base_bs = intermediate->backing_hd;
+            break;
+        }
+        intermediate = intermediate->backing_hd;
+    }
+    if (base_bs == NULL) {
+        /* something went wrong, we did not end at the base. safely
+         * unravel everything, and exit with error */
+        goto exit;
+    }
+
+    /* success - we can delete the intermediate states, and link top->base */
+    ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
+                                   base_bs->drv ? base_bs->drv->format_name : "");
+    if (ret) {
+        goto exit;
+    }
+    new_top_bs->backing_hd = base_bs;
+
+
+    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
+        /* so that bdrv_close() does not recursively close the chain */
+        intermediate_state->bs->backing_hd = NULL;
+        bdrv_delete(intermediate_state->bs);
+    }
+    ret = 0;
+
+exit:
+    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
+        g_free(intermediate_state);
+    }
+    return ret;
+}
+
+
 static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
                                    size_t size)
 {
diff --git a/block.h b/block.h
index b1095d8..8c9b424 100644
--- a/block.h
+++ b/block.h
@@ -203,6 +203,10 @@ int bdrv_commit_all(void);
 int bdrv_change_backing_file(BlockDriverState *bs,
     const char *backing_file, const char *backing_fmt);
 void bdrv_register(BlockDriver *bdrv);
+int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
+                           BlockDriverState *base);
+BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
+                                    BlockDriverState *bs);
 
 
 typedef struct BdrvCheckResult {
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality
  2012-09-25 16:29 [Qemu-devel] [PATCH v2 0/7] Live block commit Jeff Cody
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 1/7] block: add support functions for live commit, to find and delete images Jeff Cody
@ 2012-09-25 16:29 ` Jeff Cody
  2012-09-25 18:12   ` Eric Blake
  2012-09-26 14:03   ` Kevin Wolf
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 3/7] blockdev: rename block_stream_cb to a generic block_job_cb Jeff Cody
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 16:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake

This adds the live commit coroutine.  This iteration focuses on the
commit only below the active layer, and not the active layer itself.

The behaviour is similar to block streaming; the sectors are walked
through, and anything that exists above 'base' is committed back down
into base.  At the end, intermediate images are deleted, and the
chain stitched together.  Images are restored to their original open
flags upon completion.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/Makefile.objs |   1 +
 block/commit.c      | 246 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 block_int.h         |  16 ++++
 trace-events        |   2 +
 4 files changed, 265 insertions(+)
 create mode 100644 block/commit.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index b5754d3..4a136b8 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -4,6 +4,7 @@ block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
 block-obj-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
 block-obj-y += stream.o
+block-obj-y += commit.o
 block-obj-$(CONFIG_WIN32) += raw-win32.o
 block-obj-$(CONFIG_POSIX) += raw-posix.o
 block-obj-$(CONFIG_LIBISCSI) += iscsi.o
diff --git a/block/commit.c b/block/commit.c
new file mode 100644
index 0000000..67fd9df
--- /dev/null
+++ b/block/commit.c
@@ -0,0 +1,246 @@
+/*
+ * Live block commit
+ *
+ * Copyright Red Hat, Inc. 2012
+ *
+ * Authors:
+ *  Jeff Cody   <jcody@redhat.com>
+ *  Based on stream.c by Stefan Hajnoczi
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include "trace.h"
+#include "block_int.h"
+#include "qemu/ratelimit.h"
+
+enum {
+    /*
+     * Size of data buffer for populating the image file.  This should be large
+     * enough to process multiple clusters in a single call, so that populating
+     * contiguous regions of the image is efficient.
+     */
+    COMMIT_BUFFER_SIZE = 512 * 1024, /* in bytes */
+};
+
+#define SLICE_TIME 100000000ULL /* ns */
+
+typedef struct CommitBlockJob {
+    BlockJob common;
+    RateLimit limit;
+    BlockDriverState *active;
+    BlockDriverState *top;
+    BlockDriverState *base;
+    BlockErrorAction on_error;
+    int base_flags;
+    int orig_overlay_flags;
+} CommitBlockJob;
+
+static int coroutine_fn commit_populate(BlockDriverState *bs,
+                                        BlockDriverState *base,
+                                        int64_t sector_num, int nb_sectors,
+                                        void *buf)
+{
+    int ret = 0;
+
+    ret = bdrv_read(bs, sector_num, buf, nb_sectors);
+    if (ret) {
+        return ret;
+    }
+
+    ret = bdrv_write(base, sector_num, buf, nb_sectors);
+    if (ret) {
+        return ret;
+    }
+
+    return 0;
+}
+
+static void coroutine_fn commit_run(void *opaque)
+{
+    CommitBlockJob *s = opaque;
+    BlockDriverState *active = s->active;
+    BlockDriverState *top = s->top;
+    BlockDriverState *base = s->base;
+    BlockDriverState *overlay_bs = NULL;
+    int64_t sector_num, end;
+    int ret = 0;
+    int n = 0;
+    void *buf;
+    int bytes_written = 0;
+    int64_t base_len;
+
+    ret = s->common.len = bdrv_getlength(top);
+
+
+    if (s->common.len < 0) {
+        goto exit_restore_reopen;
+    }
+
+    ret = base_len = bdrv_getlength(base);
+    if (base_len < 0) {
+        goto exit_restore_reopen;
+    }
+
+    if (base_len < s->common.len) {
+        ret = bdrv_truncate(base, s->common.len);
+        if (ret) {
+            goto exit_restore_reopen;
+        }
+    }
+
+    overlay_bs = bdrv_find_overlay(active, top);
+
+    end = s->common.len >> BDRV_SECTOR_BITS;
+    buf = qemu_blockalign(top, COMMIT_BUFFER_SIZE);
+
+    for (sector_num = 0; sector_num < end; sector_num += n) {
+        uint64_t delay_ns = 0;
+        bool copy;
+
+wait:
+        /* Note that even when no rate limit is applied we need to yield
+         * with no pending I/O here so that qemu_aio_flush() returns.
+         */
+        block_job_sleep_ns(&s->common, rt_clock, delay_ns);
+        if (block_job_is_cancelled(&s->common)) {
+            break;
+        }
+        /* Copy if allocated above the base */
+        ret = bdrv_co_is_allocated_above(top, base, sector_num,
+                                         COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
+                                         &n);
+        copy = (ret == 1);
+        trace_commit_one_iteration(s, sector_num, n, ret);
+        if (copy) {
+            if (s->common.speed) {
+                delay_ns = ratelimit_calculate_delay(&s->limit, n);
+                if (delay_ns > 0) {
+                    goto wait;
+                }
+            }
+            ret = commit_populate(top, base, sector_num, n, buf);
+            bytes_written += n * BDRV_SECTOR_SIZE;
+        }
+        if (ret < 0) {
+            if (s->on_error == BLOCK_ERR_STOP_ANY    ||
+                s->on_error == BLOCK_ERR_REPORT      ||
+                (s->on_error == BLOCK_ERR_STOP_ENOSPC && ret == -ENOSPC)) {
+                goto exit_free_buf;
+            } else {
+                n = 0;
+                continue;
+            }
+        }
+        /* Publish progress */
+        s->common.offset += n * BDRV_SECTOR_SIZE;
+    }
+
+    ret = 0;
+
+    if (!block_job_is_cancelled(&s->common) && sector_num == end) {
+        /* success */
+        ret = bdrv_drop_intermediate(active, top, base);
+    }
+
+exit_free_buf:
+    qemu_vfree(buf);
+
+exit_restore_reopen:
+    /* restore base open flags here if appropriate (e.g., change the base back
+     * to r/o). These reopens do not need to be atomic, since we won't abort
+     * even on failure here */
+    if (s->base_flags != bdrv_get_flags(base)) {
+        bdrv_reopen(base, s->base_flags, NULL);
+    }
+    if (s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
+        bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
+    }
+
+    block_job_complete(&s->common, ret);
+}
+
+static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
+{
+    CommitBlockJob *s = container_of(job, CommitBlockJob, common);
+
+    if (speed < 0) {
+        error_set(errp, QERR_INVALID_PARAMETER, "speed");
+        return;
+    }
+    ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
+}
+
+static BlockJobType commit_job_type = {
+    .instance_size = sizeof(CommitBlockJob),
+    .job_type      = "commit",
+    .set_speed     = commit_set_speed,
+};
+
+void commit_start(BlockDriverState *bs, BlockDriverState *base,
+                 BlockDriverState *top, int64_t speed,
+                 BlockErrorAction on_error, BlockDriverCompletionFunc *cb,
+                 void *opaque, Error **errp)
+{
+    CommitBlockJob *s;
+    BlockReopenQueue *reopen_queue = NULL;
+    int orig_overlay_flags;
+    int orig_base_flags;
+    BlockDriverState *overlay_bs;
+    Error *local_err = NULL;
+
+    if ((on_error == BLOCK_ERR_STOP_ANY ||
+         on_error == BLOCK_ERR_STOP_ENOSPC) &&
+        !bdrv_iostatus_is_enabled(bs)) {
+        error_set(errp, QERR_INVALID_PARAMETER_COMBINATION);
+        return;
+    }
+
+    overlay_bs = bdrv_find_overlay(bs, top);
+
+    if (overlay_bs == NULL) {
+        error_setg(errp, "Could not find overlay image for %s:", top->filename);
+        return;
+    }
+
+    orig_base_flags    = bdrv_get_flags(base);
+    orig_overlay_flags = bdrv_get_flags(overlay_bs);
+
+    /* convert base_bs & overlay_bs to r/w, if necessary */
+    if (!(orig_base_flags & BDRV_O_RDWR)) {
+        reopen_queue = bdrv_reopen_queue(reopen_queue, base,
+                                         orig_base_flags | BDRV_O_RDWR);
+    }
+    if (!(orig_overlay_flags & BDRV_O_RDWR)) {
+        reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs,
+                                         orig_overlay_flags | BDRV_O_RDWR);
+    }
+    if (reopen_queue) {
+        bdrv_reopen_multiple(reopen_queue, &local_err);
+        if (local_err != NULL) {
+            error_propagate(errp, local_err);
+            return;
+        }
+    }
+
+
+    s = block_job_create(&commit_job_type, bs, speed, cb, opaque, errp);
+    if (!s) {
+        return;
+    }
+
+    s->base   = base;
+    s->top    = top;
+    s->active = bs;
+
+    s->base_flags          = orig_base_flags;
+    s->orig_overlay_flags  = orig_overlay_flags;
+
+    s->on_error = on_error;
+    s->common.co = qemu_coroutine_create(commit_run);
+
+    trace_commit_start(bs, base, top, s, s->common.co, opaque);
+    qemu_coroutine_enter(s->common.co, s);
+}
diff --git a/block_int.h b/block_int.h
index ac4245c..56164a7 100644
--- a/block_int.h
+++ b/block_int.h
@@ -463,4 +463,20 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base,
                   BlockDriverCompletionFunc *cb,
                   void *opaque, Error **errp);
 
+/**
+ * commit_start:
+ * @bs: Top Block device
+ * @base: Block device that will be written into, and become the new top
+ * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
+ * @on_error: The action to take upon error.
+ * @cb: Completion function for the job.
+ * @opaque: Opaque pointer value passed to @cb.
+ * @errp: Error object.
+ *
+ */
+void commit_start(BlockDriverState *bs, BlockDriverState *base,
+                 BlockDriverState *top, int64_t speed,
+                 BlockErrorAction on_error, BlockDriverCompletionFunc *cb,
+                 void *opaque, Error **errp);
+
 #endif /* BLOCK_INT_H */
diff --git a/trace-events b/trace-events
index f5b5097..dbc3007 100644
--- a/trace-events
+++ b/trace-events
@@ -74,6 +74,8 @@ bdrv_co_do_copy_on_readv(void *bs, int64_t sector_num, int nb_sectors, int64_t c
 # block/stream.c
 stream_one_iteration(void *s, int64_t sector_num, int nb_sectors, int is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"
 stream_start(void *bs, void *base, void *s, void *co, void *opaque) "bs %p base %p s %p co %p opaque %p"
+commit_one_iteration(void *s, int64_t sector_num, int nb_sectors, int is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"
+commit_start(void *bs, void *base, void *top, void *s, void *co, void *opaque) "bs %p base %p top %p s %p co %p opaque %p"
 
 # blockdev.c
 qmp_block_job_cancel(void *job) "job %p"
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 3/7] blockdev: rename block_stream_cb to a generic block_job_cb
  2012-09-25 16:29 [Qemu-devel] [PATCH v2 0/7] Live block commit Jeff Cody
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 1/7] block: add support functions for live commit, to find and delete images Jeff Cody
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality Jeff Cody
@ 2012-09-25 16:29 ` Jeff Cody
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 4/7] block: helper function, to find the base image of a chain Jeff Cody
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 16:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake


Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 blockdev.c   | 6 +++---
 trace-events | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 7c83baa..9caa16f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1063,12 +1063,12 @@ static QObject *qobject_from_block_job(BlockJob *job)
                               job->speed);
 }
 
-static void block_stream_cb(void *opaque, int ret)
+static void block_job_cb(void *opaque, int ret)
 {
     BlockDriverState *bs = opaque;
     QObject *obj;
 
-    trace_block_stream_cb(bs, bs->job, ret);
+    trace_block_job_cb(bs, bs->job, ret);
 
     assert(bs->job);
     obj = qobject_from_block_job(bs->job);
@@ -1110,7 +1110,7 @@ void qmp_block_stream(const char *device, bool has_base,
     }
 
     stream_start(bs, base_bs, base, has_speed ? speed : 0,
-                 block_stream_cb, bs, &local_err);
+                 block_job_cb, bs, &local_err);
     if (error_is_set(&local_err)) {
         error_propagate(errp, local_err);
         return;
diff --git a/trace-events b/trace-events
index dbc3007..29771a7 100644
--- a/trace-events
+++ b/trace-events
@@ -79,7 +79,7 @@ commit_start(void *bs, void *base, void *top, void *s, void *co, void *opaque) "
 
 # blockdev.c
 qmp_block_job_cancel(void *job) "job %p"
-block_stream_cb(void *bs, void *job, int ret) "bs %p job %p ret %d"
+block_job_cb(void *bs, void *job, int ret) "bs %p job %p ret %d"
 qmp_block_stream(void *bs, void *job) "bs %p job %p"
 
 # hw/virtio-blk.c
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 4/7] block: helper function, to find the base image of a chain
  2012-09-25 16:29 [Qemu-devel] [PATCH v2 0/7] Live block commit Jeff Cody
                   ` (2 preceding siblings ...)
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 3/7] blockdev: rename block_stream_cb to a generic block_job_cb Jeff Cody
@ 2012-09-25 16:29 ` Jeff Cody
  2012-09-25 19:13   ` Eric Blake
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit' Jeff Cody
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 16:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake

This is a simple helper function, that will return the base image
of a given image chain.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block.c | 16 ++++++++++++++++
 block.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/block.c b/block.c
index d044529..4a9bdc1 100644
--- a/block.c
+++ b/block.c
@@ -3124,6 +3124,22 @@ int bdrv_get_backing_file_depth(BlockDriverState *bs)
     return 1 + bdrv_get_backing_file_depth(bs->backing_hd);
 }
 
+BlockDriverState *bdrv_find_base(BlockDriverState *bs)
+{
+    BlockDriverState *curr_bs = NULL;
+
+    if (!bs) {
+        return NULL;
+    }
+
+    curr_bs = bs;
+
+    while (curr_bs->backing_hd) {
+        curr_bs = curr_bs->backing_hd;
+    }
+    return curr_bs;
+}
+
 #define NB_SUFFIXES 4
 
 char *get_human_readable_size(char *buf, int buf_size, int64_t size)
diff --git a/block.h b/block.h
index 8c9b424..e9249c4 100644
--- a/block.h
+++ b/block.h
@@ -207,6 +207,7 @@ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
                            BlockDriverState *base);
 BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
                                     BlockDriverState *bs);
+BlockDriverState *bdrv_find_base(BlockDriverState *bs);
 
 
 typedef struct BdrvCheckResult {
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'
  2012-09-25 16:29 [Qemu-devel] [PATCH v2 0/7] Live block commit Jeff Cody
                   ` (3 preceding siblings ...)
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 4/7] block: helper function, to find the base image of a chain Jeff Cody
@ 2012-09-25 16:29 ` Jeff Cody
  2012-09-25 19:42   ` Eric Blake
  2012-09-26 14:13   ` Kevin Wolf
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 6/7] qemu-iotests: add initial tests for live block commit Jeff Cody
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only Jeff Cody
  6 siblings, 2 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 16:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake

The command for live block commit is added, which has the following
arguments:

device: the block device to perform the commit on (mandatory)
base:   the base image to commit into; optional (if not specified,
        it is the underlying original image)
top:    the top image of the commit - all data from inside top down
        to base will be committed into base. optional (if not specified,
        it is one below the active image) - see note below
speed:  maximum speed, in bytes/sec

note: eventually this will support merging down the active layer,
      but that code is not yet complete.  If the active layer is passed
      in currently as top, or top is left to the default, then an error
      will be returned.

The is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
be emitted.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 QMP/qmp-events.txt |  6 +++--
 blockdev.c         | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json   | 35 +++++++++++++++++++++++++++++
 qmp-commands.hx    |  6 +++++
 4 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt
index 2878058..4491020 100644
--- a/QMP/qmp-events.txt
+++ b/QMP/qmp-events.txt
@@ -50,7 +50,8 @@ Emitted when a block job has been cancelled.
 
 Data:
 
-- "type":     Job type ("stream" for image streaming, json-string)
+- "type":     Job type (json-string; "stream" for image streaming
+                                     "commit" for block commit)
 - "device":   Device name (json-string)
 - "len":      Maximum progress value (json-int)
 - "offset":   Current progress value (json-int)
@@ -73,7 +74,8 @@ Emitted when a block job has completed.
 
 Data:
 
-- "type":     Job type ("stream" for image streaming, json-string)
+- "type":     Job type (json-string; "stream" for image streaming
+                                     "commit" for block commit)
 - "device":   Device name (json-string)
 - "len":      Maximum progress value (json-int)
 - "offset":   Current progress value (json-int)
diff --git a/blockdev.c b/blockdev.c
index 9caa16f..6f1080c 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1124,6 +1124,70 @@ void qmp_block_stream(const char *device, bool has_base,
     trace_qmp_block_stream(bs, bs->job);
 }
 
+void qmp_block_commit(const char *device,
+                      bool has_base, const char *base,
+                      bool has_top, const char *top,
+                      bool has_speed, int64_t speed,
+                      Error **errp)
+{
+    BlockDriverState *bs;
+    BlockDriverState *base_bs, *top_bs;
+    Error *local_err = NULL;
+    /* This will be part of the QMP command, if/when the
+     * BlockdevOnError change for blkmirror makes it in
+     */
+    BlockErrorAction on_error = BLOCK_ERR_REPORT;
+
+    /* drain all i/o before commits */
+    bdrv_drain_all();
+
+    bs = bdrv_find(device);
+    if (!bs) {
+        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+        return;
+    }
+    if (base && has_base) {
+        base_bs = bdrv_find_backing_image(bs, base);
+    } else {
+        base_bs = bdrv_find_base(bs);
+    }
+
+    if (base_bs == NULL) {
+        error_set(errp, QERR_BASE_NOT_FOUND, base ? base : "NULL");
+        return;
+    }
+
+    if (top && has_top) {
+        /* if we want to allow the active layer,
+         * use 'bdrv_find_image()' here */
+        top_bs = bdrv_find_backing_image(bs, top);
+    } else {
+        /* default is one below the active layer, unless that is
+         * the base */
+        top_bs = bs->backing_hd;
+        if (top_bs == base_bs) {
+            error_setg(errp,
+                       "Invalid files for merge: top and base are the same");
+            return;
+        }
+    }
+    if (top_bs == NULL) {
+        error_setg(errp, "Top image file %s not found", top ? top : "NULL");
+        return;
+    }
+
+    commit_start(bs, base_bs, top_bs, speed, on_error, block_job_cb, bs,
+                &local_err);
+    if (local_err != NULL) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    /* Grab a reference so hotplug does not delete the BlockDriverState from
+     * underneath us.
+     */
+    drive_get_ref(drive_get_by_blockdev(bs));
+}
+
 static BlockJob *find_block_job(const char *device)
 {
     BlockDriverState *bs;
diff --git a/qapi-schema.json b/qapi-schema.json
index 14e4419..e614453 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1468,6 +1468,41 @@
   'returns': 'str' }
 
 ##
+# @block-commit
+#
+# Live commit of data from overlay image nodes into backing nodes - i.e.,
+# writes data between 'top' and 'base' into 'base'.
+#
+# @device:  the name of the device
+#
+# @base:   #optional The file name of the backing image to write data into.
+#                    If not specified, this is the deepest backing image
+#
+# @top:    #optional The file name of the backing image within the image chain,
+#                    which contains the topmost data to be committed down.
+#                    If not specified, this is one layer below the active
+#                    layer (i.e. active->backing_hd).
+#
+#                    If top == base, that is an error.
+#
+#
+# @speed:  #optional the maximum speed, in bytes per second
+#
+# Returns: Nothing on success
+#          If commit or stream is already active on this device, DeviceInUse
+#          If @device does not exist, DeviceNotFound
+#          If image commit is not supported by this device, NotSupported
+#          If @base does not exist, a generic error is returned
+#          If @top does not exist, a generic error is returned
+#          If @speed is invalid, InvalidParameter
+#
+# Since: 1.3
+#
+##
+{ 'command': 'block-commit',
+  'data': { 'device': 'str', '*base': 'str', '*top': 'str',
+            '*speed': 'int' } }
+
 # @migrate_cancel
 #
 # Cancel the current executing migration process.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 6e21ddb..e244763 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -792,6 +792,12 @@ EQMP
     },
 
     {
+        .name       = "block-commit",
+        .args_type  = "device:B,base:s?,top:s?,speed:o?",
+        .mhandler.cmd_new = qmp_marshal_input_block_commit,
+    },
+
+    {
         .name       = "block-job-set-speed",
         .args_type  = "device:B,speed:o",
         .mhandler.cmd_new = qmp_marshal_input_block_job_set_speed,
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 6/7] qemu-iotests: add initial tests for live block commit
  2012-09-25 16:29 [Qemu-devel] [PATCH v2 0/7] Live block commit Jeff Cody
                   ` (4 preceding siblings ...)
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit' Jeff Cody
@ 2012-09-25 16:29 ` Jeff Cody
  2012-09-25 18:02   ` Eric Blake
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only Jeff Cody
  6 siblings, 1 reply; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 16:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake

Derived from the streaming test cases (030), this adds the
following tests:

1. For the following image chain, commit [mid] into [backing],
   and use qemu-io to verify [backing] has its original data, as
   well as the data from [mid]

           [backing] <-- [mid] <-- [test]

2. Verifies that 'block-commit' with the 'speed' parameter sets the
   speed parameter, as reported by 'query-block-jobs'

3. Verifies that a bogus 'device' parameter to 'block-commit'
   results in error

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 tests/qemu-iotests/040     | 142 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/040.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 148 insertions(+)
 create mode 100755 tests/qemu-iotests/040
 create mode 100644 tests/qemu-iotests/040.out

diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
new file mode 100755
index 0000000..6b89a41
--- /dev/null
+++ b/tests/qemu-iotests/040
@@ -0,0 +1,142 @@
+#!/usr/bin/env python
+#
+# Tests for image block commit.
+#
+# Copyright (C) 2012 IBM, Corp.
+# Copyright (C) 2012 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+# Test for live block commit
+# Derived from Image Streaming Test 030
+
+import time
+import os
+import iotests
+from iotests import qemu_img, qemu_io
+import struct
+
+backing_img = os.path.join(iotests.test_dir, 'backing.img')
+mid_img = os.path.join(iotests.test_dir, 'mid.img')
+test_img = os.path.join(iotests.test_dir, 'test.img')
+
+class ImageCommitTestCase(iotests.QMPTestCase):
+    '''Abstract base class for image commit test cases'''
+
+    def assert_no_active_commit(self):
+        result = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(result, 'return', [])
+
+    def cancel_and_wait(self, drive='drive0'):
+        '''Cancel a block job and wait for it to finish'''
+        result = self.vm.qmp('block-job-cancel', device=drive)
+        self.assert_qmp(result, 'return', {})
+
+        cancelled = False
+        while not cancelled:
+            for event in self.vm.get_qmp_events(wait=True):
+                if event['event'] == 'BLOCK_JOB_CANCELLED':
+                    self.assert_qmp(event, 'data/type', 'commit')
+                    self.assert_qmp(event, 'data/device', drive)
+                    cancelled = True
+
+        self.assert_no_active_commit()
+
+    def create_image(self, name, size):
+        file = open(name, 'w')
+        i = 0
+        while i < size:
+            sector = struct.pack('>l504xl', i / 512, i / 512)
+            file.write(sector)
+            i = i + 512
+        file.close()
+
+
+class TestSingleDrive(ImageCommitTestCase):
+    image_len = 1 * 1024 * 1024
+    test_len = 1 * 1024 * 256
+
+    def setUp(self):
+        self.create_image(backing_img, TestSingleDrive.image_len)
+        qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % backing_img, mid_img)
+        qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % mid_img, test_img)
+        qemu_io('-c', 'write -P 0xab 0 524288', backing_img)
+        qemu_io('-c', 'write -P 0xef 524288 524288', mid_img)
+        self.vm = iotests.VM().add_drive(test_img)
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        os.remove(mid_img)
+        os.remove(backing_img)
+
+    def test_commit(self):
+        self.assert_no_active_commit()
+        result = self.vm.qmp('block-commit', device='drive0')
+        self.assert_qmp(result, 'return', {})
+
+        completed = False
+        while not completed:
+            for event in self.vm.get_qmp_events(wait=True):
+                if event['event'] == 'BLOCK_JOB_COMPLETED':
+                    self.assert_qmp(event, 'data/type', 'commit')
+                    self.assert_qmp(event, 'data/device', 'drive0')
+                    self.assert_qmp(event, 'data/offset', self.image_len)
+                    self.assert_qmp(event, 'data/len', self.image_len)
+                    completed = True
+
+        self.assert_no_active_commit()
+        self.vm.shutdown()
+
+        self.assertEqual(-1, qemu_io('-c', 'read -P 0xab 0 524288', backing_img).find("verification failed"))
+        self.assertEqual(-1, qemu_io('-c', 'read -P 0xef 524288 524288', backing_img).find("verification failed"))
+
+    def test_device_not_found(self):
+        result = self.vm.qmp('block-commit', device='nonexistent')
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+
+class TestSetSpeed(ImageCommitTestCase):
+    image_len = 80 * 1024 * 1024 # MB
+
+    def setUp(self):
+        qemu_img('create', backing_img, str(TestSetSpeed.image_len))
+        qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % backing_img, mid_img)
+        qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % mid_img, test_img)
+        self.vm = iotests.VM().add_drive(test_img)
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        os.remove(mid_img)
+        os.remove(backing_img)
+
+    def test_set_speed(self):
+        self.assert_no_active_commit()
+
+        result = self.vm.qmp('block-commit', device='drive0', top=mid_img, speed=1024 * 1024)
+        self.assert_qmp(result, 'return', {})
+
+        # Ensure the speed we set was accepted
+        result = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(result, 'return[0]/device', 'drive0')
+        self.assert_qmp(result, 'return[0]/speed', 1024 * 1024)
+
+        self.cancel_and_wait()
+
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['qcow2', 'qed'])
diff --git a/tests/qemu-iotests/040.out b/tests/qemu-iotests/040.out
new file mode 100644
index 0000000..8d7e996
--- /dev/null
+++ b/tests/qemu-iotests/040.out
@@ -0,0 +1,5 @@
+...
+----------------------------------------------------------------------
+Ran 3 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index ebb5ca4..4b54fa6 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -46,3 +46,4 @@
 037 rw auto backing
 038 rw auto backing
 039 rw auto
+040 rw auto
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only
  2012-09-25 16:29 [Qemu-devel] [PATCH v2 0/7] Live block commit Jeff Cody
                   ` (5 preceding siblings ...)
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 6/7] qemu-iotests: add initial tests for live block commit Jeff Cody
@ 2012-09-25 16:29 ` Jeff Cody
  2012-09-26 14:20   ` Kevin Wolf
  6 siblings, 1 reply; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 16:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, pbonzini, eblake

Currently, after a live snapshot of a drive, the image that has
been 'demoted' to be below the new active layer remains r/w.
This patch reopens it read-only.

Note that we do not check for error on the reopen(), because we
will not abort the snapshots if the reopen fails.

This patch depends on the bdrv_reopen() series.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 blockdev.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 6f1080c..fa7efae 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -803,6 +803,11 @@ void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
     QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) {
         /* This removes our old bs from the bdrv_states, and adds the new bs */
         bdrv_append(states->new_bs, states->old_bs);
+        /* We don't need (or want) to use the transactional
+         * bdrv_reopen_multiple() across all the entries at once, because we
+         * don't want to abort all of them if one of them fails the reopen */
+        bdrv_reopen(states->new_bs, states->new_bs->open_flags & ~BDRV_O_RDWR,
+                    NULL);
     }
 
     /* success */
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 6/7] qemu-iotests: add initial tests for live block commit
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 6/7] qemu-iotests: add initial tests for live block commit Jeff Cody
@ 2012-09-25 18:02   ` Eric Blake
  2012-09-25 18:53     ` Jeff Cody
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2012-09-25 18:02 UTC (permalink / raw)
  To: Jeff Cody; +Cc: kwolf, pbonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1187 bytes --]

On 09/25/2012 10:29 AM, Jeff Cody wrote:
> Derived from the streaming test cases (030), this adds the
> following tests:
> 
> 1. For the following image chain, commit [mid] into [backing],
>    and use qemu-io to verify [backing] has its original data, as
>    well as the data from [mid]
> 
>            [backing] <-- [mid] <-- [test]
> 
> 2. Verifies that 'block-commit' with the 'speed' parameter sets the
>    speed parameter, as reported by 'query-block-jobs'
> 
> 3. Verifies that a bogus 'device' parameter to 'block-commit'
>    results in error

I think you are missing a test; you should also verify that:

{ "command":"block-commit", "arguments":{
  "device":"drive0", "base":"mid", "top":"backing" } }

properly fails, since 'mid' is not a backing file of 'backing'.  I saw
code in patch 1/7 that bdrv_drop_intermediate() should detect the
situation, but I'm not confident enough in my reading of patch 2/7 to
know if that detection point was early enough, or whether the coroutine
stuff in 2/7 ends up corrupting 'mid' prior to failure.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality Jeff Cody
@ 2012-09-25 18:12   ` Eric Blake
  2012-09-25 18:58     ` Jeff Cody
  2012-09-26 14:03   ` Kevin Wolf
  1 sibling, 1 reply; 25+ messages in thread
From: Eric Blake @ 2012-09-25 18:12 UTC (permalink / raw)
  To: Jeff Cody; +Cc: kwolf, pbonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2834 bytes --]

On 09/25/2012 10:29 AM, Jeff Cody wrote:
> This adds the live commit coroutine.  This iteration focuses on the
> commit only below the active layer, and not the active layer itself.
> 
> The behaviour is similar to block streaming; the sectors are walked
> through, and anything that exists above 'base' is committed back down
> into base.  At the end, intermediate images are deleted, and the
> chain stitched together.  Images are restored to their original open
> flags upon completion.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>

> +void commit_start(BlockDriverState *bs, BlockDriverState *base,
> +                 BlockDriverState *top, int64_t speed,
> +                 BlockErrorAction on_error, BlockDriverCompletionFunc *cb,
> +                 void *opaque, Error **errp)
> +{
> +    CommitBlockJob *s;
> +    BlockReopenQueue *reopen_queue = NULL;
> +    int orig_overlay_flags;
> +    int orig_base_flags;
> +    BlockDriverState *overlay_bs;
> +    Error *local_err = NULL;
> +
> +    if ((on_error == BLOCK_ERR_STOP_ANY ||
> +         on_error == BLOCK_ERR_STOP_ENOSPC) &&
> +        !bdrv_iostatus_is_enabled(bs)) {
> +        error_set(errp, QERR_INVALID_PARAMETER_COMBINATION);
> +        return;
> +    }
> +
> +    overlay_bs = bdrv_find_overlay(bs, top);
> +
> +    if (overlay_bs == NULL) {
> +        error_setg(errp, "Could not find overlay image for %s:", top->filename);
> +        return;
> +    }
> +
> +    orig_base_flags    = bdrv_get_flags(base);
> +    orig_overlay_flags = bdrv_get_flags(overlay_bs);

I think you are missing a check here that base is on the backing chain
of top.  See also my comments to 5/7.

> +
> +    /* convert base_bs & overlay_bs to r/w, if necessary */
> +    if (!(orig_base_flags & BDRV_O_RDWR)) {
> +        reopen_queue = bdrv_reopen_queue(reopen_queue, base,
> +                                         orig_base_flags | BDRV_O_RDWR);
> +    }
> +    if (!(orig_overlay_flags & BDRV_O_RDWR)) {
> +        reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs,
> +                                         orig_overlay_flags | BDRV_O_RDWR);
> +    }
> +    if (reopen_queue) {
> +        bdrv_reopen_multiple(reopen_queue, &local_err);

Is it valid to make a no-op call, such as:

{ "execute":"block-commit", "arguments":{
  "device":"drive0", "top":"base", "base":"base" }}

If so, should we do an early exit here, rather than temporarily changing
base to R/W just to change it back to R/O?

If not, should we be rejecting it up front (again, back to the question
of failing if 'base' is not a backing file of 'top', even if both 'top'
and 'base' are backing files of 'device').

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 6/7] qemu-iotests: add initial tests for live block commit
  2012-09-25 18:02   ` Eric Blake
@ 2012-09-25 18:53     ` Jeff Cody
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 18:53 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, pbonzini, qemu-devel

On 09/25/2012 02:02 PM, Eric Blake wrote:
> On 09/25/2012 10:29 AM, Jeff Cody wrote:
>> Derived from the streaming test cases (030), this adds the
>> following tests:
>>
>> 1. For the following image chain, commit [mid] into [backing],
>>    and use qemu-io to verify [backing] has its original data, as
>>    well as the data from [mid]
>>
>>            [backing] <-- [mid] <-- [test]
>>
>> 2. Verifies that 'block-commit' with the 'speed' parameter sets the
>>    speed parameter, as reported by 'query-block-jobs'
>>
>> 3. Verifies that a bogus 'device' parameter to 'block-commit'
>>    results in error
> 
> I think you are missing a test; you should also verify that:
> 
> { "command":"block-commit", "arguments":{
>   "device":"drive0", "base":"mid", "top":"backing" } }
> 
> properly fails, since 'mid' is not a backing file of 'backing'.  I saw
> code in patch 1/7 that bdrv_drop_intermediate() should detect the
> situation, but I'm not confident enough in my reading of patch 2/7 to
> know if that detection point was early enough, or whether the coroutine
> stuff in 2/7 ends up corrupting 'mid' prior to failure.
> 

Good idea.  This seems like a good test to have in place.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality
  2012-09-25 18:12   ` Eric Blake
@ 2012-09-25 18:58     ` Jeff Cody
  2012-09-25 19:05       ` Eric Blake
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 18:58 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, pbonzini, qemu-devel

On 09/25/2012 02:12 PM, Eric Blake wrote:
> On 09/25/2012 10:29 AM, Jeff Cody wrote:
>> This adds the live commit coroutine.  This iteration focuses on the
>> commit only below the active layer, and not the active layer itself.
>>
>> The behaviour is similar to block streaming; the sectors are walked
>> through, and anything that exists above 'base' is committed back down
>> into base.  At the end, intermediate images are deleted, and the
>> chain stitched together.  Images are restored to their original open
>> flags upon completion.
>>
>> Signed-off-by: Jeff Cody <jcody@redhat.com>
> 
>> +void commit_start(BlockDriverState *bs, BlockDriverState *base,
>> +                 BlockDriverState *top, int64_t speed,
>> +                 BlockErrorAction on_error, BlockDriverCompletionFunc *cb,
>> +                 void *opaque, Error **errp)
>> +{
>> +    CommitBlockJob *s;
>> +    BlockReopenQueue *reopen_queue = NULL;
>> +    int orig_overlay_flags;
>> +    int orig_base_flags;
>> +    BlockDriverState *overlay_bs;
>> +    Error *local_err = NULL;
>> +
>> +    if ((on_error == BLOCK_ERR_STOP_ANY ||
>> +         on_error == BLOCK_ERR_STOP_ENOSPC) &&
>> +        !bdrv_iostatus_is_enabled(bs)) {
>> +        error_set(errp, QERR_INVALID_PARAMETER_COMBINATION);
>> +        return;
>> +    }
>> +
>> +    overlay_bs = bdrv_find_overlay(bs, top);
>> +
>> +    if (overlay_bs == NULL) {
>> +        error_setg(errp, "Could not find overlay image for %s:", top->filename);
>> +        return;
>> +    }
>> +
>> +    orig_base_flags    = bdrv_get_flags(base);
>> +    orig_overlay_flags = bdrv_get_flags(overlay_bs);
> 
> I think you are missing a check here that base is on the backing chain
> of top.  See also my comments to 5/7.
> 

Did you mean your comments on 6/7 (or am I missing an email)?

This does get partially validated in patch 5/7, in the
qmp_block_commit() handler - both base and top are verified to be in the
chain 'bs'.  What is not validated, however, is that you did not swap
your 'top' and 'base' arguments.  I'll add a check here for that, to
make sure that base is reachable from overlay_bs.


>> +
>> +    /* convert base_bs & overlay_bs to r/w, if necessary */
>> +    if (!(orig_base_flags & BDRV_O_RDWR)) {
>> +        reopen_queue = bdrv_reopen_queue(reopen_queue, base,
>> +                                         orig_base_flags | BDRV_O_RDWR);
>> +    }
>> +    if (!(orig_overlay_flags & BDRV_O_RDWR)) {
>> +        reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs,
>> +                                         orig_overlay_flags | BDRV_O_RDWR);
>> +    }
>> +    if (reopen_queue) {
>> +        bdrv_reopen_multiple(reopen_queue, &local_err);
> 
> Is it valid to make a no-op call, such as:
> 
> { "execute":"block-commit", "arguments":{
>   "device":"drive0", "top":"base", "base":"base" }}
> 
> If so, should we do an early exit here, rather than temporarily changing
> base to R/W just to change it back to R/O?
> 
> If not, should we be rejecting it up front (again, back to the question
> of failing if 'base' is not a backing file of 'top', even if both 'top'
> and 'base' are backing files of 'device').
> 

I'll add a quick check for that as well.  Patch 5 checks for it in one
scenario (default 'top'), and returns a generic error of:
"Invalid files for merge: top and base are the same"

I'll make sure to check for that in all scenarios (or just make the test
mentioned earlier incorporate top == base as well)

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality
  2012-09-25 18:58     ` Jeff Cody
@ 2012-09-25 19:05       ` Eric Blake
  0 siblings, 0 replies; 25+ messages in thread
From: Eric Blake @ 2012-09-25 19:05 UTC (permalink / raw)
  To: jcody; +Cc: kwolf, pbonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1076 bytes --]

On 09/25/2012 12:58 PM, Jeff Cody wrote:
> On 09/25/2012 02:12 PM, Eric Blake wrote:
>> On 09/25/2012 10:29 AM, Jeff Cody wrote:
>>> This adds the live commit coroutine.  This iteration focuses on the
>>> commit only below the active layer, and not the active layer itself.
>>>

>> I think you are missing a check here that base is on the backing chain
>> of top.  See also my comments to 5/7.
>>
> 
> Did you mean your comments on 6/7 (or am I missing an email)?

Shoot.  You're right, and I messed myself up by reviewing out of order :)

> 
> This does get partially validated in patch 5/7, in the
> qmp_block_commit() handler - both base and top are verified to be in the
> chain 'bs'.  What is not validated, however, is that you did not swap
> your 'top' and 'base' arguments.  I'll add a check here for that, to
> make sure that base is reachable from overlay_bs.

At any rate, it sounds like I got my point across, in spite of myself.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/7] block: helper function, to find the base image of a chain
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 4/7] block: helper function, to find the base image of a chain Jeff Cody
@ 2012-09-25 19:13   ` Eric Blake
  2012-09-25 19:45     ` Jeff Cody
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Blake @ 2012-09-25 19:13 UTC (permalink / raw)
  To: Jeff Cody; +Cc: kwolf, pbonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 708 bytes --]

On 09/25/2012 10:29 AM, Jeff Cody wrote:
> This is a simple helper function, that will return the base image
> of a given image chain.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  block.c | 16 ++++++++++++++++
>  block.h |  1 +
>  2 files changed, 17 insertions(+)

Bikeshed question: Any reason patch 1/7 adds two helper functions, and
patch 4/7 adds a third?  Perhaps it would be worth squashing all three
into one commit, or else having three commits, one per helper?  But
there's no point in repainting things now, so:

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit' Jeff Cody
@ 2012-09-25 19:42   ` Eric Blake
  2012-09-25 19:57     ` Jeff Cody
  2012-09-26 14:13   ` Kevin Wolf
  1 sibling, 1 reply; 25+ messages in thread
From: Eric Blake @ 2012-09-25 19:42 UTC (permalink / raw)
  To: Jeff Cody; +Cc: kwolf, pbonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 4038 bytes --]

On 09/25/2012 10:29 AM, Jeff Cody wrote:
> The command for live block commit is added, which has the following
> arguments:
> 
> device: the block device to perform the commit on (mandatory)
> base:   the base image to commit into; optional (if not specified,
>         it is the underlying original image)
> top:    the top image of the commit - all data from inside top down
>         to base will be committed into base. optional (if not specified,
>         it is one below the active image) - see note below

This says that for 'base' <- 'mid' <- 'active', omitting 'top' is the
same as specifying 'top':'mid'.

> speed:  maximum speed, in bytes/sec
> 
> note: eventually this will support merging down the active layer,
>       but that code is not yet complete.  If the active layer is passed
>       in currently as top, or top is left to the default, then an error
>       will be returned.

This says that for 'base' <- 'mid' <- 'active', omitting 'top' is an
error (because it would be the same as an explicit 'top:'active').

Let's check the code...

> +    if (top && has_top) {
> +        /* if we want to allow the active layer,
> +         * use 'bdrv_find_image()' here */
> +        top_bs = bdrv_find_backing_image(bs, top);
> +    } else {
> +        /* default is one below the active layer, unless that is
> +         * the base */
> +        top_bs = bs->backing_hd;

Aha - the former is correct as coded (you default to 'top':'mid' in my
example), so the 'note' in your commit message needs editing.

On the other hand, when we ever DO get around to adding live commit,
which is the saner default?  That is, am I more likely to want to do
live commit from 'active', or more likely to do commit of the layer
below 'active'?  If live commit is the more desirable case, then that
argues that THIS commit should always error out if !has_top, and that
the future patch will default top_bs = bs, and the rest of your commit
message and documentation would need tweaking accordingly.

I don't have a preference either way (libvirt can arrange to always pass
the top argument, and thus avoid the confusion when top is omitted), but
if anyone else cares, now is the time to get the API right before we are
locked in to it.

> +++ b/qapi-schema.json
> @@ -1468,6 +1468,41 @@
>    'returns': 'str' }
>  
>  ##
> +# @block-commit
> +#
> +# Live commit of data from overlay image nodes into backing nodes - i.e.,
> +# writes data between 'top' and 'base' into 'base'.
> +#
> +# @device:  the name of the device
> +#
> +# @base:   #optional The file name of the backing image to write data into.
> +#                    If not specified, this is the deepest backing image
> +#
> +# @top:    #optional The file name of the backing image within the image chain,
> +#                    which contains the topmost data to be committed down.
> +#                    If not specified, this is one layer below the active
> +#                    layer (i.e. active->backing_hd).
> +#
> +#                    If top == base, that is an error.
> +#

This documentation of @top matches the first paragraph of your commit
message.

> +#
> +# @speed:  #optional the maximum speed, in bytes per second
> +#
> +# Returns: Nothing on success
> +#          If commit or stream is already active on this device, DeviceInUse
> +#          If @device does not exist, DeviceNotFound
> +#          If image commit is not supported by this device, NotSupported
> +#          If @base does not exist, a generic error is returned
> +#          If @top does not exist, a generic error is returned

These two lines could be merged, or even made more generic (for example,
based on the rest of this thread, you will also be erroring out if base
and top exist, but appear as swapped arguments):

If @base or @top is invalid, a generic error is returned

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/7] block: helper function, to find the base image of a chain
  2012-09-25 19:13   ` Eric Blake
@ 2012-09-25 19:45     ` Jeff Cody
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 19:45 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, pbonzini, qemu-devel

On 09/25/2012 03:13 PM, Eric Blake wrote:
> On 09/25/2012 10:29 AM, Jeff Cody wrote:
>> This is a simple helper function, that will return the base image
>> of a given image chain.
>>
>> Signed-off-by: Jeff Cody <jcody@redhat.com>
>> ---
>>  block.c | 16 ++++++++++++++++
>>  block.h |  1 +
>>  2 files changed, 17 insertions(+)
> 
> Bikeshed question: Any reason patch 1/7 adds two helper functions, and
> patch 4/7 adds a third?  Perhaps it would be worth squashing all three
> into one commit, or else having three commits, one per helper?  But
> there's no point in repainting things now, so:
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>
> 

The reasoning was the helper functions in 1/7 were to support block
commit (patch 2/7), and the helper function in 4/7 was to support the
block commit QMP command (patch 5/7), so it seemed logical to split
them.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'
  2012-09-25 19:42   ` Eric Blake
@ 2012-09-25 19:57     ` Jeff Cody
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-25 19:57 UTC (permalink / raw)
  To: Eric Blake; +Cc: kwolf, pbonzini, qemu-devel

On 09/25/2012 03:42 PM, Eric Blake wrote:
> On 09/25/2012 10:29 AM, Jeff Cody wrote:
>> The command for live block commit is added, which has the following
>> arguments:
>>
>> device: the block device to perform the commit on (mandatory)
>> base:   the base image to commit into; optional (if not specified,
>>         it is the underlying original image)
>> top:    the top image of the commit - all data from inside top down
>>         to base will be committed into base. optional (if not specified,
>>         it is one below the active image) - see note below
> 
> This says that for 'base' <- 'mid' <- 'active', omitting 'top' is the
> same as specifying 'top':'mid'.
> 

That is the intended default.

>> speed:  maximum speed, in bytes/sec
>>
>> note: eventually this will support merging down the active layer,
>>       but that code is not yet complete.  If the active layer is passed
>>       in currently as top, or top is left to the default, then an error
>>       will be returned.
> 
> This says that for 'base' <- 'mid' <- 'active', omitting 'top' is an
> error (because it would be the same as an explicit 'top:'active').

This is how I originally had it, and then I changed it to default to
top->backing_hd.  I apparently did not remember to catch the note,
however.

> 
> Let's check the code...
> 
>> +    if (top && has_top) {
>> +        /* if we want to allow the active layer,
>> +         * use 'bdrv_find_image()' here */
>> +        top_bs = bdrv_find_backing_image(bs, top);
>> +    } else {
>> +        /* default is one below the active layer, unless that is
>> +         * the base */
>> +        top_bs = bs->backing_hd;
> 
> Aha - the former is correct as coded (you default to 'top':'mid' in my
> example), so the 'note' in your commit message needs editing.
> 
> On the other hand, when we ever DO get around to adding live commit,
> which is the saner default?  That is, am I more likely to want to do
> live commit from 'active', or more likely to do commit of the layer
> below 'active'?  If live commit is the more desirable case, then that
> argues that THIS commit should always error out if !has_top, and that
> the future patch will default top_bs = bs, and the rest of your commit
> message and documentation would need tweaking accordingly.
> 
> I don't have a preference either way (libvirt can arrange to always pass
> the top argument, and thus avoid the confusion when top is omitted), but
> if anyone else cares, now is the time to get the API right before we are
> locked in to it.
>

I guess I don't have a strong preference either - I originally had it
the other way, but then that meant the default in the current
implementation was actually an error.

Also, I assumed (danger!) that the most common use of commit would be a
snapshot, followed by a commit of active->backing_hd. With that
assumption, it seemed like a sane default.

>> +++ b/qapi-schema.json
>> @@ -1468,6 +1468,41 @@
>>    'returns': 'str' }
>>  
>>  ##
>> +# @block-commit
>> +#
>> +# Live commit of data from overlay image nodes into backing nodes - i.e.,
>> +# writes data between 'top' and 'base' into 'base'.
>> +#
>> +# @device:  the name of the device
>> +#
>> +# @base:   #optional The file name of the backing image to write data into.
>> +#                    If not specified, this is the deepest backing image
>> +#
>> +# @top:    #optional The file name of the backing image within the image chain,
>> +#                    which contains the topmost data to be committed down.
>> +#                    If not specified, this is one layer below the active
>> +#                    layer (i.e. active->backing_hd).
>> +#
>> +#                    If top == base, that is an error.
>> +#
> 
> This documentation of @top matches the first paragraph of your commit
> message.
> 
>> +#
>> +# @speed:  #optional the maximum speed, in bytes per second
>> +#
>> +# Returns: Nothing on success
>> +#          If commit or stream is already active on this device, DeviceInUse
>> +#          If @device does not exist, DeviceNotFound
>> +#          If image commit is not supported by this device, NotSupported
>> +#          If @base does not exist, a generic error is returned
>> +#          If @top does not exist, a generic error is returned
> 
> These two lines could be merged, or even made more generic (for example,
> based on the rest of this thread, you will also be erroring out if base
> and top exist, but appear as swapped arguments):
> 
> If @base or @top is invalid, a generic error is returned
> 

OK

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/7] block: add support functions for live commit, to find and delete images.
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 1/7] block: add support functions for live commit, to find and delete images Jeff Cody
@ 2012-09-26 13:53   ` Kevin Wolf
  0 siblings, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2012-09-26 13:53 UTC (permalink / raw)
  To: Jeff Cody; +Cc: pbonzini, eblake, qemu-devel

Am 25.09.2012 18:29, schrieb Jeff Cody:
> Add bdrv_find_overlay(), and bdrv_drop_intermediate().
> 
> bdrv_find_overlay():  given 'bs' and the active (topmost) BDS of an image chain,
>                     find the image that is the immediate top of 'bs'
> 
> bdrv_drop_intermediate():
>                     Given 3 BDS (active, top, base), drop images above
>                     base up to and including top, and set base to be the
>                     backing file of top's overlay node.
> 
>                     E.g., this converts:
> 
>                     bottom <- base <- intermediate <- top <- active
> 
>                     to
> 
>                     bottom <- base <- active
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>

> +/*
> + * Drops images above 'base' up to and including 'top', and sets the image
> + * above 'top' to have base as its backing file.
> + *
> + * Requires that the overlay to 'top' is opened r/w, so that the backing file
> + * information in 'bs' can be properly updated.
> + *
> + * E.g., this will convert the following chain:
> + * bottom <- base <- intermediate <- top <- active
> + *
> + * to
> + *
> + * bottom <- base <- active
> + *
> + * It is allowed for bottom==base, in which case it converts:
> + *
> + * base <- intermediate <- top <- active
> + *
> + * to
> + *
> + * base <- active

Compared to RFC v2 you deleted this part of the comment:

  + *
  + * It is also allowed for top==active, except in that case active is not
  + * deleted:

You describe the condition now as an error (which I think is what our
conclusion on IRC was), but now I think the following example must be
removed as well:

> + *
> + * base <- intermediate <- top
> + *
> + * becomes
> + *
> + * base <- top
> + *
> + * Error conditions:
> + *  if active == top, that is considered an error
> + *
> + */
> +int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
> +                           BlockDriverState *base)
> +{
> +    BlockDriverState *intermediate;
> +    BlockDriverState *base_bs = NULL;
> +    BlockDriverState *new_top_bs = NULL;
> +    BlkIntermediateStates *intermediate_state, *next;
> +    int ret = -EIO;
> +
> +    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
> +    QSIMPLEQ_INIT(&states_to_delete);
> +
> +    if (!top->drv || !base->drv) {
> +        goto exit;
> +    }
> +
> +    new_top_bs = bdrv_find_overlay(active, top);
> +
> +    if (new_top_bs == NULL) {
> +        /* we could not find the image above 'top', this is an error */
> +        goto exit;
> +    }
> +
> +    /* special case of new_top_bs->backing_hd already pointing to base - nothing
> +     * to do, no intermediate images */
> +    if (new_top_bs->backing_hd == base) {
> +        ret = 0;
> +        goto exit;
> +    }
> +
> +    intermediate = new_top_bs->backing_hd;  /* this should be the same
> +                                               as 'top' */

Should? Is it the same or not? If yes, you can write it as an assert();
it no, the comment is only confusing.

(In fact it's obvious enough that I'd just go with intermediate = top;
without any comment)

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality Jeff Cody
  2012-09-25 18:12   ` Eric Blake
@ 2012-09-26 14:03   ` Kevin Wolf
  1 sibling, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2012-09-26 14:03 UTC (permalink / raw)
  To: Jeff Cody; +Cc: pbonzini, eblake, qemu-devel

Am 25.09.2012 18:29, schrieb Jeff Cody:
> This adds the live commit coroutine.  This iteration focuses on the
> commit only below the active layer, and not the active layer itself.
> 
> The behaviour is similar to block streaming; the sectors are walked
> through, and anything that exists above 'base' is committed back down
> into base.  At the end, intermediate images are deleted, and the
> chain stitched together.  Images are restored to their original open
> flags upon completion.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit' Jeff Cody
  2012-09-25 19:42   ` Eric Blake
@ 2012-09-26 14:13   ` Kevin Wolf
  2012-09-26 14:25     ` Jeff Cody
  1 sibling, 1 reply; 25+ messages in thread
From: Kevin Wolf @ 2012-09-26 14:13 UTC (permalink / raw)
  To: Jeff Cody; +Cc: pbonzini, eblake, qemu-devel

Am 25.09.2012 18:29, schrieb Jeff Cody:
> The command for live block commit is added, which has the following
> arguments:
> 
> device: the block device to perform the commit on (mandatory)
> base:   the base image to commit into; optional (if not specified,
>         it is the underlying original image)
> top:    the top image of the commit - all data from inside top down
>         to base will be committed into base. optional (if not specified,
>         it is one below the active image) - see note below
> speed:  maximum speed, in bytes/sec
> 
> note: eventually this will support merging down the active layer,
>       but that code is not yet complete.  If the active layer is passed
>       in currently as top, or top is left to the default, then an error
>       will be returned.
> 
> The is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
> be emitted.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>

> diff --git a/qapi-schema.json b/qapi-schema.json
> index 14e4419..e614453 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -1468,6 +1468,41 @@
>    'returns': 'str' }
>  
>  ##
> +# @block-commit
> +#
> +# Live commit of data from overlay image nodes into backing nodes - i.e.,
> +# writes data between 'top' and 'base' into 'base'.
> +#
> +# @device:  the name of the device
> +#
> +# @base:   #optional The file name of the backing image to write data into.
> +#                    If not specified, this is the deepest backing image
> +#
> +# @top:    #optional The file name of the backing image within the image chain,
> +#                    which contains the topmost data to be committed down.
> +#                    If not specified, this is one layer below the active
> +#                    layer (i.e. active->backing_hd).

Why isn't active the default any more? I know, we don't support it yet,
but long term this is what makes most sense as a default.

> +#
> +#                    If top == base, that is an error.
> +#
> +#
> +# @speed:  #optional the maximum speed, in bytes per second
> +#
> +# Returns: Nothing on success
> +#          If commit or stream is already active on this device, DeviceInUse
> +#          If @device does not exist, DeviceNotFound
> +#          If image commit is not supported by this device, NotSupported
> +#          If @base does not exist, a generic error is returned
> +#          If @top does not exist, a generic error is returned
> +#          If @speed is invalid, InvalidParameter
> +#
> +# Since: 1.3
> +#
> +##
> +{ 'command': 'block-commit',
> +  'data': { 'device': 'str', '*base': 'str', '*top': 'str',
> +            '*speed': 'int' } }

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only
  2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only Jeff Cody
@ 2012-09-26 14:20   ` Kevin Wolf
  2012-09-26 14:21     ` Jeff Cody
  0 siblings, 1 reply; 25+ messages in thread
From: Kevin Wolf @ 2012-09-26 14:20 UTC (permalink / raw)
  To: Jeff Cody; +Cc: pbonzini, eblake, qemu-devel

Am 25.09.2012 18:29, schrieb Jeff Cody:
> Currently, after a live snapshot of a drive, the image that has
> been 'demoted' to be below the new active layer remains r/w.
> This patch reopens it read-only.
> 
> Note that we do not check for error on the reopen(), because we
> will not abort the snapshots if the reopen fails.
> 
> This patch depends on the bdrv_reopen() series.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>

This should be independent from the live commit patches, so I already
applied this one to the block branch.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only
  2012-09-26 14:20   ` Kevin Wolf
@ 2012-09-26 14:21     ` Jeff Cody
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-26 14:21 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: pbonzini, eblake, qemu-devel

On 09/26/2012 10:20 AM, Kevin Wolf wrote:
> Am 25.09.2012 18:29, schrieb Jeff Cody:
>> Currently, after a live snapshot of a drive, the image that has
>> been 'demoted' to be below the new active layer remains r/w.
>> This patch reopens it read-only.
>>
>> Note that we do not check for error on the reopen(), because we
>> will not abort the snapshots if the reopen fails.
>>
>> This patch depends on the bdrv_reopen() series.
>>
>> Signed-off-by: Jeff Cody <jcody@redhat.com>
> 
> This should be independent from the live commit patches, so I already
> applied this one to the block branch.
> 
> Kevin
> 

Thanks

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'
  2012-09-26 14:13   ` Kevin Wolf
@ 2012-09-26 14:25     ` Jeff Cody
  2012-09-26 14:33       ` Kevin Wolf
  2012-09-26 14:34       ` Eric Blake
  0 siblings, 2 replies; 25+ messages in thread
From: Jeff Cody @ 2012-09-26 14:25 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: pbonzini, eblake, qemu-devel

On 09/26/2012 10:13 AM, Kevin Wolf wrote:
> Am 25.09.2012 18:29, schrieb Jeff Cody:
>> The command for live block commit is added, which has the following
>> arguments:
>>
>> device: the block device to perform the commit on (mandatory)
>> base:   the base image to commit into; optional (if not specified,
>>         it is the underlying original image)
>> top:    the top image of the commit - all data from inside top down
>>         to base will be committed into base. optional (if not specified,
>>         it is one below the active image) - see note below
>> speed:  maximum speed, in bytes/sec
>>
>> note: eventually this will support merging down the active layer,
>>       but that code is not yet complete.  If the active layer is passed
>>       in currently as top, or top is left to the default, then an error
>>       will be returned.
>>
>> The is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
>> be emitted.
>>
>> Signed-off-by: Jeff Cody <jcody@redhat.com>
> 
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 14e4419..e614453 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -1468,6 +1468,41 @@
>>    'returns': 'str' }
>>  
>>  ##
>> +# @block-commit
>> +#
>> +# Live commit of data from overlay image nodes into backing nodes - i.e.,
>> +# writes data between 'top' and 'base' into 'base'.
>> +#
>> +# @device:  the name of the device
>> +#
>> +# @base:   #optional The file name of the backing image to write data into.
>> +#                    If not specified, this is the deepest backing image
>> +#
>> +# @top:    #optional The file name of the backing image within the image chain,
>> +#                    which contains the topmost data to be committed down.
>> +#                    If not specified, this is one layer below the active
>> +#                    layer (i.e. active->backing_hd).
> 
> Why isn't active the default any more? I know, we don't support it yet,
> but long term this is what makes most sense as a default.
> 

Eric had a similar question, and asked if anyone had any preference -
this was my response:

---

I guess I don't have a strong preference either - I originally had it
the other way, but then that meant the default in the current
implementation was actually an error.

Also, I assumed (danger!) that the most common use of commit would be a
snapshot, followed by a commit of active->backing_hd. With that
assumption, it seemed like a sane default.

---

I can certainly revert back to having the active layer be the top, if
that is the preference.

>> +#
>> +#                    If top == base, that is an error.
>> +#
>> +#
>> +# @speed:  #optional the maximum speed, in bytes per second
>> +#
>> +# Returns: Nothing on success
>> +#          If commit or stream is already active on this device, DeviceInUse
>> +#          If @device does not exist, DeviceNotFound
>> +#          If image commit is not supported by this device, NotSupported
>> +#          If @base does not exist, a generic error is returned
>> +#          If @top does not exist, a generic error is returned
>> +#          If @speed is invalid, InvalidParameter
>> +#
>> +# Since: 1.3
>> +#
>> +##
>> +{ 'command': 'block-commit',
>> +  'data': { 'device': 'str', '*base': 'str', '*top': 'str',
>> +            '*speed': 'int' } }
> 
> Kevin
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'
  2012-09-26 14:25     ` Jeff Cody
@ 2012-09-26 14:33       ` Kevin Wolf
  2012-09-26 14:34       ` Eric Blake
  1 sibling, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2012-09-26 14:33 UTC (permalink / raw)
  To: jcody; +Cc: pbonzini, eblake, qemu-devel

Am 26.09.2012 16:25, schrieb Jeff Cody:
> On 09/26/2012 10:13 AM, Kevin Wolf wrote:
>> Am 25.09.2012 18:29, schrieb Jeff Cody:
>>> The command for live block commit is added, which has the following
>>> arguments:
>>>
>>> device: the block device to perform the commit on (mandatory)
>>> base:   the base image to commit into; optional (if not specified,
>>>         it is the underlying original image)
>>> top:    the top image of the commit - all data from inside top down
>>>         to base will be committed into base. optional (if not specified,
>>>         it is one below the active image) - see note below
>>> speed:  maximum speed, in bytes/sec
>>>
>>> note: eventually this will support merging down the active layer,
>>>       but that code is not yet complete.  If the active layer is passed
>>>       in currently as top, or top is left to the default, then an error
>>>       will be returned.
>>>
>>> The is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
>>> be emitted.
>>>
>>> Signed-off-by: Jeff Cody <jcody@redhat.com>
>>
>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>> index 14e4419..e614453 100644
>>> --- a/qapi-schema.json
>>> +++ b/qapi-schema.json
>>> @@ -1468,6 +1468,41 @@
>>>    'returns': 'str' }
>>>  
>>>  ##
>>> +# @block-commit
>>> +#
>>> +# Live commit of data from overlay image nodes into backing nodes - i.e.,
>>> +# writes data between 'top' and 'base' into 'base'.
>>> +#
>>> +# @device:  the name of the device
>>> +#
>>> +# @base:   #optional The file name of the backing image to write data into.
>>> +#                    If not specified, this is the deepest backing image
>>> +#
>>> +# @top:    #optional The file name of the backing image within the image chain,
>>> +#                    which contains the topmost data to be committed down.
>>> +#                    If not specified, this is one layer below the active
>>> +#                    layer (i.e. active->backing_hd).
>>
>> Why isn't active the default any more? I know, we don't support it yet,
>> but long term this is what makes most sense as a default.
>>
> 
> Eric had a similar question, and asked if anyone had any preference -
> this was my response:
> 
> ---
> 
> I guess I don't have a strong preference either - I originally had it
> the other way, but then that meant the default in the current
> implementation was actually an error.

We can make it non-optional for now and use active as the default once
we introduce support for committing the active layer.

> Also, I assumed (danger!) that the most common use of commit would be a
> snapshot, followed by a commit of active->backing_hd. With that
> assumption, it seemed like a sane default.
> 
> ---
> 
> I can certainly revert back to having the active layer be the top, if
> that is the preference.

I think it is, if nothing else for consistency with the existing
synchronous 'commit' command.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit'
  2012-09-26 14:25     ` Jeff Cody
  2012-09-26 14:33       ` Kevin Wolf
@ 2012-09-26 14:34       ` Eric Blake
  1 sibling, 0 replies; 25+ messages in thread
From: Eric Blake @ 2012-09-26 14:34 UTC (permalink / raw)
  To: jcody; +Cc: Kevin Wolf, pbonzini, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2210 bytes --]

On 09/26/2012 08:25 AM, Jeff Cody wrote:

>>> +# @top:    #optional The file name of the backing image within the image chain,
>>> +#                    which contains the topmost data to be committed down.
>>> +#                    If not specified, this is one layer below the active
>>> +#                    layer (i.e. active->backing_hd).
>>
>> Why isn't active the default any more? I know, we don't support it yet,
>> but long term this is what makes most sense as a default.
>>
> 
> Eric had a similar question, and asked if anyone had any preference -
> this was my response:
> 
> ---
> 
> I guess I don't have a strong preference either - I originally had it
> the other way, but then that meant the default in the current
> implementation was actually an error.
> 
> Also, I assumed (danger!) that the most common use of commit would be a
> snapshot, followed by a commit of active->backing_hd. With that
> assumption, it seemed like a sane default.

Actually, I envision my personal use case to be:

Take a snapshot, do an experiment in the guest (such as install a
questionable package), and then either roll back to the snapshot
(experiment failed) or commit the active (experiment worked, no need to
have a snapshot any more); either way, taking the snapshot created a new
temporary file name, and after I make my decision on whether to commit
or discard the snapshot, I want to get back to the original file name.
Since the snapshot effectively created a temporary file name, I'd rather
not have to know the name of that temporary file just to pass it to an
explicit 'top' argument when committing the active layer.

Having to specify 'top' to avoid an error when not committing the active
layer is not as bad as getting the defaults wrong for when we do add
support for committing the active layer.

> ---
> 
> I can certainly revert back to having the active layer be the top, if
> that is the preference.

Given that it has been asked again, I'd say yes, go ahead and revert to
the behavior of defaulting 'top' to the active layer.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2012-09-26 14:34 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-25 16:29 [Qemu-devel] [PATCH v2 0/7] Live block commit Jeff Cody
2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 1/7] block: add support functions for live commit, to find and delete images Jeff Cody
2012-09-26 13:53   ` Kevin Wolf
2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 2/7] block: add live block commit functionality Jeff Cody
2012-09-25 18:12   ` Eric Blake
2012-09-25 18:58     ` Jeff Cody
2012-09-25 19:05       ` Eric Blake
2012-09-26 14:03   ` Kevin Wolf
2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 3/7] blockdev: rename block_stream_cb to a generic block_job_cb Jeff Cody
2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 4/7] block: helper function, to find the base image of a chain Jeff Cody
2012-09-25 19:13   ` Eric Blake
2012-09-25 19:45     ` Jeff Cody
2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 5/7] QAPI: add command for live block commit, 'block-commit' Jeff Cody
2012-09-25 19:42   ` Eric Blake
2012-09-25 19:57     ` Jeff Cody
2012-09-26 14:13   ` Kevin Wolf
2012-09-26 14:25     ` Jeff Cody
2012-09-26 14:33       ` Kevin Wolf
2012-09-26 14:34       ` Eric Blake
2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 6/7] qemu-iotests: add initial tests for live block commit Jeff Cody
2012-09-25 18:02   ` Eric Blake
2012-09-25 18:53     ` Jeff Cody
2012-09-25 16:29 ` [Qemu-devel] [PATCH v2 7/7] block: after creating a live snapshot, make old image read-only Jeff Cody
2012-09-26 14:20   ` Kevin Wolf
2012-09-26 14:21     ` Jeff Cody

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).