qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints
@ 2015-05-21  4:52 Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 01/15] docs: block replication's description Wen Congyang
                   ` (16 more replies)
  0 siblings, 17 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:52 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Stefan Hajnoczi, Yang Hongyang

Block replication is a very important feature which is used for
continuous checkpoints(for example: COLO).

Usage:
Please refer to docs/block-replication.txt

You can get the patch here:
https://github.com/wencongyang/qemu-colo/commits/block-replication-v5

The other newest COLO patchse will be sent soon.

TODO:
1. Continuous block replication. It will be started after basic functions
   are accepted.

Changs Log:
V5:
1. Address the comments from Gong Lei
2. Speed the failover up. The secondary vm can take over very quickly even
   if there are too many I/O requests.
V4:
1. Introduce a new driver replication to avoid touch nbd and qcow2.
V3:
1: use error_setg() instead of error_set()
2. Add a new block job API
3. Active disk, hidden disk and nbd target uses the same AioContext
4. Add a testcase to test new hbitmap API
V2:
1. Redesign the secondary qemu(use image-fleecing)
2. Use Error objects to return error message
3. Address the comments from Max Reitz and Eric Blake
Wen Congyang (15):
  docs: block replication's description
  allow writing to the backing file
  Allow creating backup jobs when opening BDS
  block: Parse "backing_reference" option to reference existing BDS
  Backup: clear all bitmap when doing block checkpoint
  Don't allow a disk use backing reference target
  Add new block driver interface to connect/disconnect the remote target
  NBD client: implement block driver interfaces to connect/disconnect
    NBD server
  Introduce a new -drive option to control whether to connect to remote
    target
  NBD client: connect to nbd server later
  Add new block driver interfaces to control block replication
  skip nbd_target when starting block replication
  quorum: implement block driver interfaces for block replication
  quorum: allow ignoring child errors
  Implement new driver for block replication

 block.c                    | 272 +++++++++++++++++++++++++++-
 block/Makefile.objs        |   3 +-
 block/backup.c             |  13 ++
 block/nbd.c                |  69 +++++--
 block/quorum.c             | 142 ++++++++++++++-
 block/replication.c        | 441 +++++++++++++++++++++++++++++++++++++++++++++
 blockdev.c                 |   8 +
 blockjob.c                 |  10 +
 docs/block-replication.txt | 179 ++++++++++++++++++
 include/block/block.h      |  10 +
 include/block/block_int.h  |  18 ++
 include/block/blockjob.h   |  12 ++
 qapi/block.json            |  16 ++
 qemu-options.hx            |   4 +
 tests/qemu-iotests/051     |  13 ++
 tests/qemu-iotests/051.out |  13 ++
 16 files changed, 1193 insertions(+), 30 deletions(-)
 create mode 100644 block/replication.c
 create mode 100644 docs/block-replication.txt

-- 
2.1.0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 01/15] docs: block replication's description
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
@ 2015-05-21  4:52 ` Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 02/15] allow writing to the backing file Wen Congyang
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:52 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 docs/block-replication.txt | 179 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 179 insertions(+)
 create mode 100644 docs/block-replication.txt

diff --git a/docs/block-replication.txt b/docs/block-replication.txt
new file mode 100644
index 0000000..a29f51a
--- /dev/null
+++ b/docs/block-replication.txt
@@ -0,0 +1,179 @@
+Block replication
+----------------------------------------
+Copyright Fujitsu, Corp. 2015
+Copyright (c) 2015 Intel Corporation
+Copyright (c) 2015 HUAWEI TECHNOLOGIES CO., LTD.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+Block replication is used for continuous checkpoints. It is designed
+for COLO (COurse-grain LOck-stepping) where the Secondary VM is running.
+It can also be applied for FT/HA (Fault-tolerance/High Assurance) scenario,
+where the Secondary VM is not running.
+
+This document gives an overview of block replication's design.
+
+== Background ==
+High availability solutions such as micro checkpoint and COLO will do
+consecutive checkpoints. The VM state of Primary VM and Secondary VM is
+identical right after a VM checkpoint, but becomes different as the VM
+executes till the next checkpoint. To support disk contents checkpoint,
+the modified disk contents in the Secondary VM must be buffered, and are
+only dropped at next checkpoint time. To reduce the network transportation
+effort at the time of checkpoint, the disk modification operations of
+Primary disk are asynchronously forwarded to the Secondary node.
+
+== Workflow ==
+The following is the image of block replication workflow:
+
+        +----------------------+            +------------------------+
+        |Primary Write Requests|            |Secondary Write Requests|
+        +----------------------+            +------------------------+
+                  |                                       |
+                  |                                      (4)
+                  |                                       V
+                  |                              /-------------\
+                  |      Copy and Forward        |             |
+                  |---------(1)----------+       | Disk Buffer |
+                  |                      |       |             |
+                  |                     (3)      \-------------/
+                  |                 speculative      ^
+                  |                write through    (2)
+                  |                      |           |
+                  V                      V           |
+           +--------------+           +----------------+
+           | Primary Disk |           | Secondary Disk |
+           +--------------+           +----------------+
+
+    1) Primary write requests will be copied and forwarded to Secondary
+       QEMU.
+    2) Before Primary write requests are written to Secondary disk, the
+       original sector content will be read from Secondary disk and
+       buffered in the Disk buffer, but it will not overwrite the existing
+       sector content(it could be from either "Secondary Write Requests" or
+       previous COW of "Primary Write Requests") in the Disk buffer.
+    3) Primary write requests will be written to Secondary disk.
+    4) Secondary write requests will be buffered in the Disk buffer and it
+       will overwrite the existing sector content in the buffer.
+
+== Architecture ==
+We are going to implement block replication from many basic
+blocks that are already in QEMU.
+
+         virtio-blk       ||
+             ^            ||                            .----------
+             |            ||                            | Secondary
+        1 Quorum          ||                            '----------
+         /      \         ||
+        /        \        ||
+   Primary    2 filter
+     disk         ^                                                             virtio-blk
+                  |                                                                  ^
+                3 NBD  ------->  3 NBD                                               |
+                client    ||     server                                          2 filter
+                          ||        ^                                                ^
+--------.                 ||        |                                                |
+Primary |                 ||  Secondary disk <--------- hidden-disk 5 <--------- active-disk 4
+--------'                 ||        |          backing        ^       backing
+                          ||        |                         |
+                          ||        |                         |
+                          ||        '-------------------------'
+                          ||           drive-backup sync=none
+
+1) The disk on the primary is represented by a block device with two
+children, providing replication between a primary disk and the host that
+runs the secondary VM. The read pattern for quorum can be extended to
+make the primary always read from the local disk instead of going through
+NBD.
+
+2) The new block filter(the name is replication) will control the block
+replication.
+
+3) The secondary disk receives writes from the primary VM through QEMU's
+embedded NBD server (speculative write-through).
+
+4) The disk on the secondary is represented by a custom block device
+(called active-disk). It should be an empty disk, and the format should
+support bdrv_make_empty() and backing file.
+
+5) The hidden-disk is created automatically. It buffers the original content
+that is modified by the primary VM. It should also be an empty disk, and
+the driver supports bdrv_make_empty() and backing file.
+
+== Failure Handling ==
+There are 6 internal errors when block replication is running:
+1. I/O error on primary disk
+2. Forwarding primay write requests failed
+3. Bacup failed or writing to secondary disk failed
+4. I/O error on secondary disk
+5. I/O error on active disk
+6. Making active disk or hidden disk empty failed
+In case 1 and 5, we just report the error to the disk layer. In case 2, 3,
+4 and 6, we just report block replication's error to FT/HA manager(which
+decides when to do a new checkpoint, when to do failover).
+There is one internal error when doing failover:
+1. Commiting the data in active disk/hidden disk to secondary disk failed
+We just to report this error to FT/HA manager.
+
+== New block driver interface ==
+We add three block driver interfaces to control block replication:
+a. bdrv_start_replication()
+   Start block replication, called in migration/checkpoint thread.
+   We must call bdrv_start_replication() in secondary QEMU before
+   calling bdrv_start_replication() in primary QEMU. The caller
+   must hold the I/O mutex lock if it is in migration/checkpoint
+   thread.
+b. bdrv_do_checkpoint()
+   This interface is called after all VM state is transferred to
+   Secondary QEMU. The Disk buffer will be dropped in this interface.
+   The caller must hold the I/O mutex lock if it is in migration/checkpoint
+   thread.
+c. bdrv_stop_replication()
+   It is called on failover. We will flush the Disk buffer into
+   Secondary Disk and stop block replication. The vm should be stopped
+   before calling it. The caller must hold the I/O mutex lock if it is
+   in migration/checkpoint thread.
+
+== Usage ==
+Primary:
+  -drive if=xxx,driver=quorum,read-pattern=fifo,no-connect=on,\
+         children.0.file.filename=1.raw,\
+         children.0.driver=raw,\
+         children.1.file.driver=nbd,\
+         children.1.file.host=xxx,\
+         children.1.file.port=xxx,\
+         children.1.file.export=xxx,\
+         children.1.driver=replication,\
+         children.1.mode=primary,\
+         children.1.ignore-errors=on
+  Note:
+  1. NBD Client should not be the first child of quorum.
+  2. There should be only one NBD Client.
+  3. host is the secondary physical machine's hostname or IP
+  4. Each disk must have its own export name.
+  5. It is all a single argument to -drive, and you should ignore
+     the leading whitespace.
+
+Secondary:
+  -drive if=none,driver=raw,file=1.raw,id=nbd_target1 \
+  -drive if=xxx,driver=replication,mode=secondary,export=xxx,\
+         file.file.filename=active_disk.qcow2,\
+         file.driver=qcow2,\
+         file.backing_reference.drive_id=nbd_target1,\
+         file.backing_reference.hidden-disk.file.filename=hidden_disk.qcow2,\
+         file.backing_reference.hidden-disk.driver=qcow2,\
+         file.backing_reference.hidden-disk.allow-write-backing-file=on
+  Then run qmp command:
+    nbd-server-start host:port
+  Note:
+  1. The export name for the same disk must be the same in primary
+     and secondary QEMU command line
+  2. The qmp command nbd-server-start must be run before running the
+     qmp command migrate on primary QEMU
+  3. Don't use nbd-server-start's other options
+  4. Active disk, hidden disk and nbd target's length should be the
+     same.
+  5. It is better to put active disk and hidden disk in ramdisk.
+  6. It is all a single argument to -drive, and you should ignore
+     the leading whitespace.
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 02/15] allow writing to the backing file
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 01/15] docs: block replication's description Wen Congyang
@ 2015-05-21  4:52 ` Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 03/15] Allow creating backup jobs when opening BDS Wen Congyang
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:52 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 49d472f..adffa0e 100644
--- a/block.c
+++ b/block.c
@@ -1051,6 +1051,20 @@ out:
     bdrv_refresh_limits(bs, NULL);
 }
 
+#define ALLOW_WRITE_BACKING_FILE    "allow-write-backing-file"
+static QemuOptsList backing_file_opts = {
+    .name = "backing_file",
+    .head = QTAILQ_HEAD_INITIALIZER(backing_file_opts.head),
+    .desc = {
+        {
+            .name = ALLOW_WRITE_BACKING_FILE,
+            .type = QEMU_OPT_BOOL,
+            .help = "allow write to backing file",
+        },
+        { /* end of list */ }
+    },
+};
+
 /*
  * Opens the backing file for a BlockDriverState if not yet open
  *
@@ -1065,6 +1079,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
     int ret = 0;
     BlockDriverState *backing_hd;
     Error *local_err = NULL;
+    QemuOpts *opts = NULL;
+    int flags;
 
     if (bs->backing_hd != NULL) {
         QDECREF(options);
@@ -1077,6 +1093,21 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
     }
 
     bs->open_flags &= ~BDRV_O_NO_BACKING;
+    flags = bdrv_backing_flags(bs->open_flags);
+
+    opts = qemu_opts_create(&backing_file_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        ret = -EINVAL;
+        error_propagate(errp, local_err);
+        QDECREF(options);
+        goto free_exit;
+    }
+    if (qemu_opt_get_bool(opts, ALLOW_WRITE_BACKING_FILE, false)) {
+        flags |= BDRV_O_RDWR;
+    }
+    qemu_opts_del(opts);
+
     if (qdict_haskey(options, "file.filename")) {
         backing_filename[0] = '\0';
     } else if (bs->backing_file[0] == '\0' && qdict_size(options) == 0) {
@@ -1109,7 +1140,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
     assert(bs->backing_hd == NULL);
     ret = bdrv_open(&backing_hd,
                     *backing_filename ? backing_filename : NULL, NULL, options,
-                    bdrv_backing_flags(bs->open_flags), NULL, &local_err);
+                    flags, NULL, &local_err);
     if (ret < 0) {
         bdrv_unref(backing_hd);
         backing_hd = NULL;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 03/15] Allow creating backup jobs when opening BDS
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 01/15] docs: block replication's description Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 02/15] allow writing to the backing file Wen Congyang
@ 2015-05-21  4:52 ` Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 04/15] block: Parse "backing_reference" option to reference existing BDS Wen Congyang
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:52 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jeff Cody, Jiang Yunhong,
	Dong Eddie, Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi,
	Yang Hongyang, zhanghailiang

When opening BDS, we need to create backup jobs for
image-fleecing.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Jeff Cody <jcody@redhat.com>
---
 block/Makefile.objs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 0d8c2a4..8dd9b9c 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -21,10 +21,10 @@ block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
 block-obj-$(CONFIG_LIBSSH2) += ssh.o
 block-obj-y += accounting.o
 block-obj-y += write-threshold.o
+block-obj-y += backup.o
 
 common-obj-y += stream.o
 common-obj-y += commit.o
-common-obj-y += backup.o
 
 iscsi.o-cflags     := $(LIBISCSI_CFLAGS)
 iscsi.o-libs       := $(LIBISCSI_LIBS)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 04/15] block: Parse "backing_reference" option to reference existing BDS
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (2 preceding siblings ...)
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 03/15] Allow creating backup jobs when opening BDS Wen Congyang
@ 2015-05-21  4:52 ` Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 05/15] Backup: clear all bitmap when doing block checkpoint Wen Congyang
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:52 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Usage:
-drive file=xxx,id=Y, \
-drive file=xxxx,id=X,backing_reference.drive_id=Y,backing_reference.hidden-disk.*

It will create such backing chain:
               {virtio-blk dev 'Y'}                                      {virtio-blk dev 'X'}
                         |                                                          |
                         |                                                          |
                         v                                                          v

    [base] <- [mid] <- ( Y )  <----------------- (hidden target) <--------------- ( X )

                         v                              ^
                         v                              ^
                         v                              ^
                         v                              ^
                         >>>> drive-backup sync=none >>>>

X's backing file is hidden-disk, and hidden-disk's backing file is Y.
Disk Y may be opened or reopened in read-write mode, so A block backup
job is automatically created: source is Y and target is hidden disk.
Active disk X, hidden disk, and Y are all on the same AioContext.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block.c                    | 154 ++++++++++++++++++++++++++++++++++++++++++++-
 include/block/block.h      |   1 +
 include/block/block_int.h  |   1 +
 tests/qemu-iotests/051     |  13 ++++
 tests/qemu-iotests/051.out |  13 ++++
 5 files changed, 179 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index adffa0e..220ba9a 100644
--- a/block.c
+++ b/block.c
@@ -1157,6 +1157,119 @@ free_exit:
     return ret;
 }
 
+static void backing_reference_completed(void *opaque, int ret)
+{
+    BlockDriverState *hidden_disk = opaque;
+
+    assert(!hidden_disk->backing_reference);
+}
+
+static int bdrv_open_backing_reference_file(BlockDriverState *bs,
+                                            QDict *options, Error **errp)
+{
+    const char *backing_name;
+    QDict *hidden_disk_options = NULL;
+    BlockDriverState *backing_hd, *hidden_disk;
+    BlockBackend *backing_blk;
+    AioContext *aio_context;
+    Error *local_err = NULL;
+    int ret = 0;
+
+    backing_name = qdict_get_try_str(options, "drive_id");
+    if (!backing_name) {
+        error_setg(errp, "Backing reference needs option drive_id");
+        ret = -EINVAL;
+        goto free_exit;
+    }
+    qdict_del(options, "drive_id");
+
+    qdict_extract_subqdict(options, &hidden_disk_options, "hidden-disk.");
+    if (!qdict_size(hidden_disk_options)) {
+        error_setg(errp, "Backing reference needs option hidden-disk.*");
+        ret = -EINVAL;
+        goto free_exit;
+    }
+
+    if (qdict_size(options)) {
+        const QDictEntry *entry = qdict_first(options);
+        error_setg(errp, "Backing reference used by '%s' doesn't support "
+                   "the option '%s'", bdrv_get_device_name(bs), entry->key);
+        ret = -EINVAL;
+        goto free_exit;
+    }
+
+    backing_blk = blk_by_name(backing_name);
+    if (!backing_blk) {
+        error_setg(errp, "Device '%s' not found", backing_name);
+        ret = -ENOENT;
+        goto free_exit;
+    }
+
+    backing_hd = blk_bs(backing_blk);
+    /* Backing reference itself? */
+    if (backing_hd == bs || bdrv_find_overlay(backing_hd, bs)) {
+        error_setg(errp, "Backing reference itself");
+        ret = -EINVAL;
+        goto free_exit;
+    }
+
+    if (bdrv_op_is_blocked(backing_hd, BLOCK_OP_TYPE_BACKING_REFERENCE,
+                           errp)) {
+        ret = -EBUSY;
+        goto free_exit;
+    }
+
+    /* hidden-disk is bs's backing file */
+    ret = bdrv_open_backing_file(bs, hidden_disk_options, errp);
+    hidden_disk_options = NULL;
+    if (ret < 0) {
+        goto free_exit;
+    }
+
+    hidden_disk = bs->backing_hd;
+    if (!hidden_disk->drv || !hidden_disk->drv->supports_backing) {
+        ret = -EINVAL;
+        error_setg(errp, "Hidden disk's driver doesn't support backing files");
+        goto free_exit;
+    }
+
+    bdrv_set_backing_hd(hidden_disk, backing_hd);
+    bdrv_ref(backing_hd);
+
+    /*
+     * backing hd may be opened or reopened in read-write mode, so we
+     * should backup backing hd to hidden disk
+     */
+    bdrv_op_unblock(hidden_disk, BLOCK_OP_TYPE_BACKUP_TARGET,
+                    bs->backing_blocker);
+    bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_SOURCE,
+                    hidden_disk->backing_blocker);
+
+    bdrv_ref(hidden_disk);
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+    bdrv_set_aio_context(backing_hd, aio_context);
+    backup_start(backing_hd, hidden_disk, 0, MIRROR_SYNC_MODE_NONE, NULL,
+                 BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
+                 backing_reference_completed, hidden_disk, &local_err);
+    aio_context_release(aio_context);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        bdrv_unref(hidden_disk);
+        /* FIXME, use which errno? */
+        ret = -EIO;
+        goto free_exit;
+    }
+
+    bs->backing_reference = true;
+
+free_exit:
+    QDECREF(hidden_disk_options);
+    QDECREF(options);
+    return ret;
+}
+
 /*
  * Opens a disk image whose options are given as BlockdevRef in another block
  * device's options.
@@ -1410,13 +1523,37 @@ int bdrv_open(BlockDriverState **pbs, const char *filename,
 
     /* If there is a backing file, use it */
     if ((flags & BDRV_O_NO_BACKING) == 0) {
-        QDict *backing_options;
+        QDict *backing_options, *backing_reference_options;
 
+        qdict_extract_subqdict(options, &backing_reference_options,
+                               "backing_reference.");
         qdict_extract_subqdict(options, &backing_options, "backing.");
-        ret = bdrv_open_backing_file(bs, backing_options, &local_err);
-        if (ret < 0) {
+
+        if (qdict_size(backing_reference_options) &&
+            qdict_size(backing_options)) {
+            error_setg(&local_err,
+                       "Option \"backing_reference.*\" and \"backing.*\""
+                       " cannot be used together");
+            ret = -EINVAL;
+            QDECREF(backing_reference_options);
+            QDECREF(backing_options);
             goto close_and_fail;
         }
+        if (qdict_size(backing_reference_options)) {
+            QDECREF(backing_options);
+            ret = bdrv_open_backing_reference_file(bs,
+                                                   backing_reference_options,
+                                                   &local_err);
+            if (ret) {
+                goto close_and_fail;
+            }
+        } else {
+            QDECREF(backing_reference_options);
+            ret = bdrv_open_backing_file(bs, backing_options, &local_err);
+            if (ret < 0) {
+                goto close_and_fail;
+            }
+        }
     }
 
     bdrv_refresh_filename(bs);
@@ -1747,6 +1884,14 @@ void bdrv_close(BlockDriverState *bs)
     if (bs->drv) {
         if (bs->backing_hd) {
             BlockDriverState *backing_hd = bs->backing_hd;
+            if (bs->backing_reference) {
+                assert(backing_hd->backing_hd);
+                if (backing_hd->backing_hd->job) {
+                    block_job_cancel(backing_hd->backing_hd->job);
+                }
+                bdrv_set_backing_hd(backing_hd, NULL);
+                bdrv_unref(backing_hd->backing_hd);
+            }
             bdrv_set_backing_hd(bs, NULL);
             bdrv_unref(backing_hd);
         }
@@ -3725,6 +3870,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
 
     bs->aio_context = new_context;
 
+    if (bs->backing_reference) {
+        bdrv_attach_aio_context(bs->backing_hd->backing_hd, new_context);
+    }
     if (bs->backing_hd) {
         bdrv_attach_aio_context(bs->backing_hd, new_context);
     }
diff --git a/include/block/block.h b/include/block/block.h
index 7d1a717..006c941 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -159,6 +159,7 @@ typedef enum BlockOpType {
     BLOCK_OP_TYPE_RESIZE,
     BLOCK_OP_TYPE_STREAM,
     BLOCK_OP_TYPE_REPLACE,
+    BLOCK_OP_TYPE_BACKING_REFERENCE,
     BLOCK_OP_TYPE_MAX,
 } BlockOpType;
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index db29b74..49ca826 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -364,6 +364,7 @@ struct BlockDriverState {
     QDict *full_open_options;
     char exact_filename[PATH_MAX];
 
+    bool backing_reference;
     BlockDriverState *backing_hd;
     BlockDriverState *file;
 
diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 0360f37..fd67f40 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -116,6 +116,19 @@ run_qemu -drive file="$TEST_IMG",file.backing.driver=file,file.backing.filename=
 run_qemu -drive file="$TEST_IMG",file.backing.driver=qcow2,file.backing.file.filename="$TEST_IMG.orig"
 
 echo
+echo === Backing file reference ===
+echo
+
+run_qemu -drive file="$TEST_IMG",if=none,id=drive0 \
+    -drive file="$TEST_IMG",driver=qcow2,backing_reference.drive_id=drive0,backing_reference.hidden-disk.filename="$TEST_IMG.hidden"
+
+run_qemu -drive file="$TEST_IMG",if=none,id=drive0 \
+    -drive file="$TEST_IMG",driver=qcow2,backing_reference.drive_id=drive0,backing_reference.hidden-disk.filename="$TEST_IMG.hidden",backing.file.filename="$TEST_IMG.orig"
+
+run_qemu -drive file="$TEST_IMG",if=none,id=drive0 \
+    -drive file="$TEST_IMG",driver=qcow2,file.backing_reference.drive_id=drive0,file.backing_reference.hidden-disk.filename="$TEST_IMG.hidden",file.backing.file.filename="$TEST_IMG.orig"
+
+echo
 echo === Enable and disable lazy refcounting on the command line, plus some invalid values ===
 echo
 
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index 2890eac..cb8340b 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -75,6 +75,19 @@ Testing: -drive file=TEST_DIR/t.qcow2,file.backing.driver=qcow2,file.backing.fil
 QEMU_PROG: -drive file=TEST_DIR/t.qcow2,file.backing.driver=qcow2,file.backing.file.filename=TEST_DIR/t.qcow2.orig: Driver doesn't support backing files
 
 
+=== Backing file reference ===
+
+Testing: -drive file=TEST_DIR/t.qcow2,if=none,id=drive0 -drive file=TEST_DIR/t.qcow2,driver=qcow2,backing_reference.drive_id=drive0,backing_reference.hidden-disk.filename=TEST_DIR/t.qcow2.hidden
+QEMU X.Y.Z monitor - type 'help' for more information
+(qemu) q^[[K^[[Dqu^[[K^[[D^[[Dqui^[[K^[[D^[[D^[[Dquit^[[K
+
+Testing: -drive file=TEST_DIR/t.qcow2,if=none,id=drive0 -drive file=TEST_DIR/t.qcow2,driver=qcow2,backing_reference.drive_id=drive0,backing_reference.hidden-disk.filename=TEST_DIR/t.qcow2.hidden,backing.file.filename=TEST_DIR/t.qcow2.orig
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,driver=qcow2,backing=drive0,backing.file.filename=TEST_DIR/t.qcow2.orig: Option "backing_reference.*" and "backing.*" cannot be used together
+
+Testing: -drive file=TEST_DIR/t.qcow2,if=none,id=drive0 -drive file=TEST_DIR/t.qcow2,driver=qcow2,file.backing_reference.drive_id=drive0,file.backing_reference.hidden-disk.filename=TEST_DIR/t.qcow2.hidden,file.backing.file.filename=TEST_DIR/t.qcow2.orig
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,driver=qcow2,file.backing=drive0,file.backing.file.filename=TEST_DIR/t.qcow2.orig: Option "backing_reference.*" and "backing.*" cannot be used together
+
+
 === Enable and disable lazy refcounting on the command line, plus some invalid values ===
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,lazy-refcounts=on
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 05/15] Backup: clear all bitmap when doing block checkpoint
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (3 preceding siblings ...)
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 04/15] block: Parse "backing_reference" option to reference existing BDS Wen Congyang
@ 2015-05-21  4:52 ` Wen Congyang
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 06/15] Don't allow a disk use backing reference target Wen Congyang
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:52 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jeff Cody, Jiang Yunhong,
	Dong Eddie, Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi,
	Yang Hongyang, zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Jeff Cody <jcody@redhat.com>
---
 block/backup.c           | 13 +++++++++++++
 blockjob.c               | 10 ++++++++++
 include/block/blockjob.h | 12 ++++++++++++
 3 files changed, 35 insertions(+)

diff --git a/block/backup.c b/block/backup.c
index d3f648d..d3d8ba7 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -210,11 +210,24 @@ static void backup_iostatus_reset(BlockJob *job)
     bdrv_iostatus_reset(s->target);
 }
 
+static void backup_do_checkpoint(BlockJob *job, Error **errp)
+{
+    BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
+
+    if (backup_job->sync_mode != MIRROR_SYNC_MODE_NONE) {
+        error_setg(errp, "this feature or command is not currently supported");
+        return;
+    }
+
+    hbitmap_reset_all(backup_job->bitmap);
+}
+
 static const BlockJobDriver backup_job_driver = {
     .instance_size  = sizeof(BackupBlockJob),
     .job_type       = BLOCK_JOB_TYPE_BACKUP,
     .set_speed      = backup_set_speed,
     .iostatus_reset = backup_iostatus_reset,
+    .do_checkpoint  = backup_do_checkpoint,
 };
 
 static BlockErrorAction backup_error_action(BackupBlockJob *job,
diff --git a/blockjob.c b/blockjob.c
index 2755465..9d2128a 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -399,3 +399,13 @@ void block_job_defer_to_main_loop(BlockJob *job,
 
     qemu_bh_schedule(data->bh);
 }
+
+void block_job_do_checkpoint(BlockJob *job, Error **errp)
+{
+    if (!job->driver->do_checkpoint) {
+        error_setg(errp, "this feature or command is not currently supported");
+        return;
+    }
+
+    job->driver->do_checkpoint(job, errp);
+}
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 57d8ef1..b832dc3 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -50,6 +50,9 @@ typedef struct BlockJobDriver {
      * manually.
      */
     void (*complete)(BlockJob *job, Error **errp);
+
+    /** Optional callback for job types that support checkpoint. */
+    void (*do_checkpoint)(BlockJob *job, Error **errp);
 } BlockJobDriver;
 
 /**
@@ -348,4 +351,13 @@ void block_job_defer_to_main_loop(BlockJob *job,
                                   BlockJobDeferToMainLoopFn *fn,
                                   void *opaque);
 
+/**
+ * block_job_do_checkpoint:
+ * @job: The job.
+ * @errp: Error object.
+ *
+ * Do block checkpoint on the specified job.
+ */
+void block_job_do_checkpoint(BlockJob *job, Error **errp);
+
 #endif
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 06/15] Don't allow a disk use backing reference target
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (4 preceding siblings ...)
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 05/15] Backup: clear all bitmap when doing block checkpoint Wen Congyang
@ 2015-05-21  4:52 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 07/15] Add new block driver interface to connect/disconnect the remote target Wen Congyang
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:52 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/block.c b/block.c
index 220ba9a..1b362ba 100644
--- a/block.c
+++ b/block.c
@@ -1206,6 +1206,14 @@ static int bdrv_open_backing_reference_file(BlockDriverState *bs,
     }
 
     backing_hd = blk_bs(backing_blk);
+    /* Don't allow a disk use backing reference target */
+    ret = blk_attach_dev(backing_hd->blk, bs);
+    if (ret < 0) {
+        error_setg(errp, "backing_hd %s is used by the other device model",
+                   backing_name);
+        goto free_exit;
+    }
+
     /* Backing reference itself? */
     if (backing_hd == bs || bdrv_find_overlay(backing_hd, bs)) {
         error_setg(errp, "Backing reference itself");
@@ -1889,6 +1897,7 @@ void bdrv_close(BlockDriverState *bs)
                 if (backing_hd->backing_hd->job) {
                     block_job_cancel(backing_hd->backing_hd->job);
                 }
+                blk_detach_dev(backing_hd->backing_hd->blk, bs);
                 bdrv_set_backing_hd(backing_hd, NULL);
                 bdrv_unref(backing_hd->backing_hd);
             }
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 07/15] Add new block driver interface to connect/disconnect the remote target
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (5 preceding siblings ...)
  2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 06/15] Don't allow a disk use backing reference target Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 08/15] NBD client: implement block driver interfaces to connect/disconnect NBD server Wen Congyang
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

In some cases, we want to connect/disconnect the remote target when
we need, not in bdrv_open()/bdrv_close().

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block.c                   | 24 ++++++++++++++++++++++++
 include/block/block.h     |  3 +++
 include/block/block_int.h |  3 +++
 3 files changed, 30 insertions(+)

diff --git a/block.c b/block.c
index 1b362ba..9a8ef9f 100644
--- a/block.c
+++ b/block.c
@@ -4191,3 +4191,27 @@ BlockAcctStats *bdrv_get_stats(BlockDriverState *bs)
 {
     return &bs->stats;
 }
+
+void bdrv_connect(BlockDriverState *bs, Error **errp)
+{
+    BlockDriver *drv = bs->drv;
+
+    if (drv && drv->bdrv_connect) {
+        drv->bdrv_connect(bs, errp);
+    } else if (bs->file) {
+        bdrv_connect(bs->file, errp);
+    } else {
+        error_setg(errp, "this feature or command is not currently supported");
+    }
+}
+
+void bdrv_disconnect(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+
+    if (drv && drv->bdrv_disconnect) {
+        drv->bdrv_disconnect(bs);
+    } else if (bs->file) {
+        bdrv_disconnect(bs->file);
+    }
+}
diff --git a/include/block/block.h b/include/block/block.h
index 006c941..e479a34 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -592,4 +592,7 @@ void bdrv_flush_io_queue(BlockDriverState *bs);
 
 BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
 
+void bdrv_connect(BlockDriverState *bs, Error **errp);
+void bdrv_disconnect(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 49ca826..da29fa7 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -290,6 +290,9 @@ struct BlockDriver {
      */
     int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
+    void (*bdrv_connect)(BlockDriverState *bs, Error **errp);
+    void (*bdrv_disconnect)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 08/15] NBD client: implement block driver interfaces to connect/disconnect NBD server
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (6 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 07/15] Add new block driver interface to connect/disconnect the remote target Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 09/15] Introduce a new -drive option to control whether to connect to remote target Wen Congyang
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block/nbd.c | 67 ++++++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 49 insertions(+), 18 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 2176186..bc9477a 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -44,6 +44,8 @@
 typedef struct BDRVNBDState {
     NbdClientSession client;
     QemuOpts *socket_opts;
+    char *export;
+    bool connected;
 } BDRVNBDState;
 
 static int nbd_parse_uri(const char *filename, QDict *options)
@@ -254,34 +256,56 @@ static int nbd_establish_connection(BlockDriverState *bs, Error **errp)
     return sock;
 }
 
-static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
-                    Error **errp)
+static void nbd_connect_server(BlockDriverState *bs, Error **errp)
 {
     BDRVNBDState *s = bs->opaque;
-    char *export = NULL;
-    int result, sock;
-    Error *local_err = NULL;
-
-    /* Pop the config into our state object. Exit if invalid. */
-    nbd_config(s, options, &export, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        return -EINVAL;
-    }
+    int sock;
 
     /* establish TCP connection, return error if it fails
      * TODO: Configurable retry-until-timeout behaviour.
      */
     sock = nbd_establish_connection(bs, errp);
     if (sock < 0) {
-        g_free(export);
-        return sock;
+        return;
     }
 
     /* NBD handshake */
-    result = nbd_client_init(bs, sock, export, errp);
-    g_free(export);
-    return result;
+    nbd_client_init(bs, sock, s->export, errp);
+
+    s->connected = true;
+}
+
+static void nbd_disconnect_server(BlockDriverState *bs)
+{
+    BDRVNBDState *s = bs->opaque;
+
+    if (s->connected) {
+        nbd_client_close(bs);
+        s->connected = false;
+    }
+}
+
+static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
+                    Error **errp)
+{
+    BDRVNBDState *s = bs->opaque;
+    Error *local_err = NULL;
+
+    /* Pop the config into our state object. Exit if invalid. */
+    nbd_config(s, options, &s->export, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return -EINVAL;
+    }
+
+    nbd_connect_server(bs, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        g_free(s->export);
+        return -EINVAL;
+    }
+
+    return 0;
 }
 
 static int nbd_co_readv(BlockDriverState *bs, int64_t sector_num,
@@ -318,7 +342,8 @@ static void nbd_close(BlockDriverState *bs)
     BDRVNBDState *s = bs->opaque;
 
     qemu_opts_del(s->socket_opts);
-    nbd_client_close(bs);
+    nbd_disconnect_server(bs);
+    g_free(s->export);
 }
 
 static int64_t nbd_getlength(BlockDriverState *bs)
@@ -400,6 +425,8 @@ static BlockDriver bdrv_nbd = {
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_attach_aio_context,
     .bdrv_refresh_filename      = nbd_refresh_filename,
+    .bdrv_connect               = nbd_connect_server,
+    .bdrv_disconnect            = nbd_disconnect_server,
 };
 
 static BlockDriver bdrv_nbd_tcp = {
@@ -418,6 +445,8 @@ static BlockDriver bdrv_nbd_tcp = {
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_attach_aio_context,
     .bdrv_refresh_filename      = nbd_refresh_filename,
+    .bdrv_connect               = nbd_connect_server,
+    .bdrv_disconnect            = nbd_disconnect_server,
 };
 
 static BlockDriver bdrv_nbd_unix = {
@@ -436,6 +465,8 @@ static BlockDriver bdrv_nbd_unix = {
     .bdrv_detach_aio_context    = nbd_detach_aio_context,
     .bdrv_attach_aio_context    = nbd_attach_aio_context,
     .bdrv_refresh_filename      = nbd_refresh_filename,
+    .bdrv_connect               = nbd_connect_server,
+    .bdrv_disconnect            = nbd_disconnect_server,
 };
 
 static void bdrv_nbd_init(void)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 09/15] Introduce a new -drive option to control whether to connect to remote target
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (7 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 08/15] NBD client: implement block driver interfaces to connect/disconnect NBD server Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 10/15] NBD client: connect to nbd server later Wen Congyang
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 blockdev.c            | 8 ++++++++
 include/block/block.h | 1 +
 qemu-options.hx       | 4 ++++
 3 files changed, 13 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 95a0c6a..0c457ee 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -429,6 +429,10 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
         qdict_put(bs_opts, "driver", qstring_from_str(buf));
     }
 
+    if (qemu_opt_get_bool(opts, "no-connect", false)) {
+        bdrv_flags |= BDRV_O_NO_CONNECT;
+    }
+
     /* disk I/O throttling */
     memset(&cfg, 0, sizeof(cfg));
     cfg.buckets[THROTTLE_BPS_TOTAL].avg =
@@ -3195,6 +3199,10 @@ QemuOptsList qemu_common_drive_opts = {
             .name = "detect-zeroes",
             .type = QEMU_OPT_STRING,
             .help = "try to optimize zero writes (off, on, unmap)",
+        },{
+            .name = "no-connect",
+            .type = QEMU_OPT_BOOL,
+            .help = "enable whether to connect remote target"
         },
         { /* end of list */ }
     },
diff --git a/include/block/block.h b/include/block/block.h
index e479a34..ed5b8e5 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -87,6 +87,7 @@ typedef struct HDGeometry {
 #define BDRV_O_PROTOCOL    0x8000  /* if no block driver is explicitly given:
                                       select an appropriate protocol driver,
                                       ignoring the format layer */
+#define BDRV_O_NO_CONNECT  0x10000 /* do not connect to remote target */
 
 #define BDRV_O_CACHE_MASK  (BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH)
 
diff --git a/qemu-options.hx b/qemu-options.hx
index ec356f6..c41e3e3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -464,6 +464,7 @@ DEF("drive", HAS_ARG, QEMU_OPTION_drive,
     "       [[,bps_max=bm]|[[,bps_rd_max=rm][,bps_wr_max=wm]]]\n"
     "       [[,iops_max=im]|[[,iops_rd_max=irm][,iops_wr_max=iwm]]]\n"
     "       [[,iops_size=is]]\n"
+    "       [,no-connect=on|off]\n"
     "                use 'file' as a drive image\n", QEMU_ARCH_ALL)
 STEXI
 @item -drive @var{option}[,@var{option}[,@var{option}[,...]]]
@@ -525,6 +526,9 @@ file sectors into the image file.
 conversion of plain zero writes by the OS to driver specific optimized
 zero write commands. You may even choose "unmap" if @var{discard} is set
 to "unmap" to allow a zero write to be converted to an UNMAP operation.
+@item no-connect=@var{no-connect}
+@var{no-connect} is "on" or "off", and enables whether to connect to remote
+target when open the drive. The default value is "off".
 @end table
 
 By default, the @option{cache=writeback} mode is used. It will report data
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 10/15] NBD client: connect to nbd server later
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (8 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 09/15] Introduce a new -drive option to control whether to connect to remote target Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 11/15] Add new block driver interfaces to control block replication Wen Congyang
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

The secondary qemu starts later than the primary qemu, so we
cannot connect to nbd server in bdrv_open(). Introduce a new
open flags to control it.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block/nbd.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index bc9477a..4964cf8 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -298,11 +298,13 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
-    nbd_connect_server(bs, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        g_free(s->export);
-        return -EINVAL;
+    if (!(flags & BDRV_O_NO_CONNECT)) {
+        nbd_connect_server(bs, &local_err);
+        if (local_err) {
+            error_propagate(errp, local_err);
+            g_free(s->export);
+            return -EINVAL;
+        }
     }
 
     return 0;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 11/15] Add new block driver interfaces to control block replication
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (9 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 10/15] NBD client: connect to nbd server later Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 12/15] skip nbd_target when starting " Wen Congyang
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Luiz Capitulino, Gonglei, Stefan Hajnoczi,
	Yang Hongyang, Michael Roth, zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c                   | 40 ++++++++++++++++++++++++++++++++++++++++
 include/block/block.h     |  5 +++++
 include/block/block_int.h | 14 ++++++++++++++
 qapi/block.json           | 16 ++++++++++++++++
 4 files changed, 75 insertions(+)

diff --git a/block.c b/block.c
index 9a8ef9f..74fd3d6 100644
--- a/block.c
+++ b/block.c
@@ -4215,3 +4215,43 @@ void bdrv_disconnect(BlockDriverState *bs)
         bdrv_disconnect(bs->file);
     }
 }
+
+void bdrv_start_replication(BlockDriverState *bs, ReplicationMode mode,
+                            Error **errp)
+{
+    BlockDriver *drv = bs->drv;
+
+    if (drv && drv->bdrv_start_replication) {
+        drv->bdrv_start_replication(bs, mode, errp);
+    } else if (bs->file) {
+        bdrv_start_replication(bs->file, mode, errp);
+    } else {
+        error_setg(errp, "this feature or command is not currently supported");
+    }
+}
+
+void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp)
+{
+    BlockDriver *drv = bs->drv;
+
+    if (drv && drv->bdrv_do_checkpoint) {
+        drv->bdrv_do_checkpoint(bs, errp);
+    } else if (bs->file) {
+        bdrv_do_checkpoint(bs->file, errp);
+    } else {
+        error_setg(errp, "this feature or command is not currently supported");
+    }
+}
+
+void bdrv_stop_replication(BlockDriverState *bs, bool failover, Error **errp)
+{
+    BlockDriver *drv = bs->drv;
+
+    if (drv && drv->bdrv_stop_replication) {
+        drv->bdrv_stop_replication(bs, failover, errp);
+    } else if (bs->file) {
+        bdrv_stop_replication(bs->file, failover, errp);
+    } else {
+        error_setg(errp, "this feature or command is not currently supported");
+    }
+}
diff --git a/include/block/block.h b/include/block/block.h
index ed5b8e5..c98adfb 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -596,4 +596,9 @@ BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
 void bdrv_connect(BlockDriverState *bs, Error **errp);
 void bdrv_disconnect(BlockDriverState *bs);
 
+void bdrv_start_replication(BlockDriverState *bs, ReplicationMode mode,
+                            Error **errp);
+void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp);
+void bdrv_stop_replication(BlockDriverState *bs, bool failover, Error **errp);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index da29fa7..761e6ff 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -293,6 +293,20 @@ struct BlockDriver {
     void (*bdrv_connect)(BlockDriverState *bs, Error **errp);
     void (*bdrv_disconnect)(BlockDriverState *bs);
 
+    void (*bdrv_start_replication)(BlockDriverState *bs, ReplicationMode mode,
+                                   Error **errp);
+    /* Drop Disk buffer when doing checkpoint. */
+    void (*bdrv_do_checkpoint)(BlockDriverState *bs, Error **errp);
+    /*
+     * After failover, we should flush Disk buffer into secondary disk
+     * and stop block replication.
+     *
+     * If the guest is shutdown, we should drop Disk buffer and stop
+     * block representation.
+     */
+    void (*bdrv_stop_replication)(BlockDriverState *bs, bool failover,
+                                  Error **errp);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
diff --git a/qapi/block.json b/qapi/block.json
index aad645c..04dc4c2 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -40,6 +40,22 @@
   'data': ['auto', 'none', 'lba', 'large', 'rechs']}
 
 ##
+# @ReplicationMode
+#
+# An enumeration of replication modes.
+#
+# @unprotected: Replication is not started or after failover.
+#
+# @primary: Primary mode, the vm's state will be sent to secondary QEMU.
+#
+# @secondary: Secondary mode, receive the vm's state from primary QEMU.
+#
+# Since: 2.4
+##
+{ 'enum' : 'ReplicationMode',
+  'data' : ['unprotected', 'primary', 'secondary']}
+
+##
 # @BlockdevSnapshotInternal
 #
 # @device: the name of the device to generate the snapshot from
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 12/15] skip nbd_target when starting block replication
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (10 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 11/15] Add new block driver interfaces to control block replication Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 13/15] quorum: implement block driver interfaces for " Wen Congyang
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/block.c b/block.c
index 74fd3d6..961e395 100644
--- a/block.c
+++ b/block.c
@@ -4221,6 +4221,10 @@ void bdrv_start_replication(BlockDriverState *bs, ReplicationMode mode,
 {
     BlockDriver *drv = bs->drv;
 
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKING_REFERENCE, NULL)) {
+        return;
+    }
+
     if (drv && drv->bdrv_start_replication) {
         drv->bdrv_start_replication(bs, mode, errp);
     } else if (bs->file) {
@@ -4234,6 +4238,10 @@ void bdrv_do_checkpoint(BlockDriverState *bs, Error **errp)
 {
     BlockDriver *drv = bs->drv;
 
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKING_REFERENCE, NULL)) {
+        return;
+    }
+
     if (drv && drv->bdrv_do_checkpoint) {
         drv->bdrv_do_checkpoint(bs, errp);
     } else if (bs->file) {
@@ -4247,6 +4255,10 @@ void bdrv_stop_replication(BlockDriverState *bs, bool failover, Error **errp)
 {
     BlockDriver *drv = bs->drv;
 
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKING_REFERENCE, NULL)) {
+        return;
+    }
+
     if (drv && drv->bdrv_stop_replication) {
         drv->bdrv_stop_replication(bs, failover, errp);
     } else if (bs->file) {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 13/15] quorum: implement block driver interfaces for block replication
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (11 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 12/15] skip nbd_target when starting " Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 14/15] quorum: allow ignoring child errors Wen Congyang
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block/quorum.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/block/quorum.c b/block/quorum.c
index f91ef75..3eb8fd3 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -82,6 +82,8 @@ typedef struct BDRVQuorumState {
                             */
 
     QuorumReadPattern read_pattern;
+
+    int replication_index; /* store which child supports block replication */
 } BDRVQuorumState;
 
 typedef struct QuorumAIOCB QuorumAIOCB;
@@ -972,6 +974,7 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
     }
 
     g_free(opened);
+    s->replication_index = -1;
     goto exit;
 
 close_exit:
@@ -1061,6 +1064,77 @@ static void quorum_refresh_filename(BlockDriverState *bs)
     bs->full_open_options = opts;
 }
 
+static void quorum_start_replication(BlockDriverState *bs, ReplicationMode mode,
+                                     Error **errp)
+{
+    BDRVQuorumState *s = bs->opaque;
+    int count = 0, i, index;
+    Error *local_err = NULL;
+
+    /*
+     * TODO: support REPLICATION_MODE_SECONDARY if we allow secondary
+     * QEMU becoming primary QEMU.
+     */
+    if (mode != REPLICATION_MODE_PRIMARY) {
+        error_setg(errp, "Invalid parameter 'mode'");
+        return;
+    }
+
+    if (s->read_pattern != QUORUM_READ_PATTERN_FIFO) {
+        error_setg(errp, "Invalid parameter 'read pattern'");
+        return;
+    }
+
+    for (i = 0; i < s->num_children; i++) {
+        bdrv_start_replication(s->bs[i], mode, &local_err);
+        if (local_err) {
+            error_free(local_err);
+            local_err = NULL;
+        } else {
+            count++;
+            index = i;
+        }
+    }
+
+    if (count == 0) {
+        /* No child supports block replication */
+        error_setg(errp, "this feature or command is not currently supported");
+    } else if (count > 1) {
+        for (i = 0; i < s->num_children; i++) {
+            bdrv_stop_replication(s->bs[i], false, NULL);
+        }
+        error_setg(errp, "too many children support block replication");
+    } else {
+        s->replication_index = index;
+    }
+}
+
+static void quorum_do_checkpoint(BlockDriverState *bs, Error **errp)
+{
+    BDRVQuorumState *s = bs->opaque;
+
+    if (s->replication_index < 0) {
+        error_setg(errp, "Block replication is not started");
+        return;
+    }
+
+    bdrv_do_checkpoint(s->bs[s->replication_index], errp);
+}
+
+static void quorum_stop_replication(BlockDriverState *bs, bool failover,
+                                    Error **errp)
+{
+    BDRVQuorumState *s = bs->opaque;
+
+    if (s->replication_index < 0) {
+        error_setg(errp, "Block replication is not started");
+        return;
+    }
+
+    bdrv_stop_replication(s->bs[s->replication_index], failover, errp);
+    s->replication_index = -1;
+}
+
 static BlockDriver bdrv_quorum = {
     .format_name                        = "quorum",
     .protocol_name                      = "quorum",
@@ -1084,6 +1158,10 @@ static BlockDriver bdrv_quorum = {
 
     .is_filter                          = true,
     .bdrv_recurse_is_first_non_filter   = quorum_recurse_is_first_non_filter,
+
+    .bdrv_start_replication             = quorum_start_replication,
+    .bdrv_do_checkpoint                 = quorum_do_checkpoint,
+    .bdrv_stop_replication              = quorum_stop_replication,
 };
 
 static void bdrv_quorum_init(void)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 14/15] quorum: allow ignoring child errors
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (12 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 13/15] quorum: implement block driver interfaces for " Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 15/15] Implement new driver for block replication Wen Congyang
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

If the child is not ready, read/write/getlength/flush will
return -errno. It is not critical error, and can be ignored:
1. read/write:
   Just not report the error event.
2. getlength:
   just ignore it. If all children's getlength return -errno,
   and be ignored, return -EIO.
3. flush:
   Just ignore it. If all children's getlength return -errno,
   and be ignored, return 0.

Usage: children.x.ignore-errors=true

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block/quorum.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 57 insertions(+), 7 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index 3eb8fd3..2f86fa1 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -30,6 +30,7 @@
 #define QUORUM_OPT_BLKVERIFY      "blkverify"
 #define QUORUM_OPT_REWRITE        "rewrite-corrupted"
 #define QUORUM_OPT_READ_PATTERN   "read-pattern"
+#define QUORUM_CHILDREN_OPT_IGNORE_ERRORS   "ignore-errors"
 
 /* This union holds a vote hash value */
 typedef union QuorumVoteValue {
@@ -65,6 +66,7 @@ typedef struct QuorumVotes {
 /* the following structure holds the state of one quorum instance */
 typedef struct BDRVQuorumState {
     BlockDriverState **bs; /* children BlockDriverStates */
+    bool *ignore_errors;   /* ignore children's error? */
     int num_children;      /* children count */
     int threshold;         /* if less than threshold children reads gave the
                             * same result a quorum error occurs.
@@ -99,6 +101,7 @@ typedef struct QuorumChildRequest {
     uint8_t *buf;
     int ret;
     QuorumAIOCB *parent;
+    int index;
 } QuorumChildRequest;
 
 /* Quorum will use the following structure to track progress of each read/write
@@ -211,6 +214,7 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
         acb->qcrs[i].buf = NULL;
         acb->qcrs[i].ret = 0;
         acb->qcrs[i].parent = acb;
+        acb->qcrs[i].index = i;
     }
 
     return acb;
@@ -304,7 +308,7 @@ static void quorum_aio_cb(void *opaque, int ret)
     acb->count++;
     if (ret == 0) {
         acb->success_count++;
-    } else {
+    } else if (!s->ignore_errors[sacb->index]) {
         quorum_report_bad(acb, sacb->aiocb->bs->node_name, ret);
     }
     assert(acb->count <= s->num_children);
@@ -719,19 +723,31 @@ static BlockAIOCB *quorum_aio_writev(BlockDriverState *bs,
 static int64_t quorum_getlength(BlockDriverState *bs)
 {
     BDRVQuorumState *s = bs->opaque;
-    int64_t result;
+    int64_t result = -EIO;
     int i;
 
     /* check that all file have the same length */
-    result = bdrv_getlength(s->bs[0]);
-    if (result < 0) {
-        return result;
-    }
-    for (i = 1; i < s->num_children; i++) {
+    for (i = 0; i < s->num_children; i++) {
         int64_t value = bdrv_getlength(s->bs[i]);
+
         if (value < 0) {
             return value;
         }
+
+        if (value == 0 && s->ignore_errors[i]) {
+            /*
+             * If the child is not ready, it cannot return -errno,
+             * otherwise refresh_total_sectors() will fail when
+             * we open the child.
+             */
+            continue;
+        }
+
+        if (result == -EIO) {
+            result = value;
+            continue;
+        }
+
         if (value != result) {
             return -EIO;
         }
@@ -769,6 +785,9 @@ static coroutine_fn int quorum_co_flush(BlockDriverState *bs)
 
     for (i = 0; i < s->num_children; i++) {
         result = bdrv_co_flush(s->bs[i]);
+        if (result < 0 && s->ignore_errors[i]) {
+            result = 0;
+        }
         result_value.l = result;
         quorum_count_vote(&error_votes, &result_value, i);
     }
@@ -843,6 +862,19 @@ static QemuOptsList quorum_runtime_opts = {
     },
 };
 
+static QemuOptsList quorum_children_common_opts = {
+    .name = "quorum children",
+    .head = QTAILQ_HEAD_INITIALIZER(quorum_children_common_opts.head),
+    .desc = {
+        {
+            .name = QUORUM_CHILDREN_OPT_IGNORE_ERRORS,
+            .type = QEMU_OPT_BOOL,
+            .help = "ignore child I/O error",
+        },
+        { /* end of list */ }
+    },
+};
+
 static int parse_read_pattern(const char *opt)
 {
     int i;
@@ -938,11 +970,14 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
     /* allocate the children BlockDriverState array */
     s->bs = g_new0(BlockDriverState *, s->num_children);
     opened = g_new0(bool, s->num_children);
+    s->ignore_errors = g_new0(bool, s->num_children);
 
     for (i = 0, lentry = qlist_first(list); lentry;
          lentry = qlist_next(lentry), i++) {
         QDict *d;
         QString *string;
+        QemuOpts *children_opts = NULL;
+        bool value;
 
         switch (qobject_type(lentry->value))
         {
@@ -950,6 +985,20 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
             case QTYPE_QDICT:
                 d = qobject_to_qdict(lentry->value);
                 QINCREF(d);
+
+                children_opts = qemu_opts_create(&quorum_children_common_opts,
+                                                 NULL, 0, &error_abort);
+                qemu_opts_absorb_qdict(children_opts, d, &local_err);
+                if (local_err) {
+                    ret = -EINVAL;
+                    qemu_opts_del(children_opts);
+                    goto close_exit;
+                }
+                value = qemu_opt_get_bool(children_opts,
+                                          QUORUM_CHILDREN_OPT_IGNORE_ERRORS,
+                                          false);
+                s->ignore_errors[i]  = value;
+                qemu_opts_del(children_opts);
                 ret = bdrv_open(&s->bs[i], NULL, NULL, d, flags, NULL,
                                 &local_err);
                 break;
@@ -1008,6 +1057,7 @@ static void quorum_close(BlockDriverState *bs)
     }
 
     g_free(s->bs);
+    g_free(s->ignore_errors);
 }
 
 static void quorum_detach_aio_context(BlockDriverState *bs)
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH COLO-Block v5 15/15] Implement new driver for block replication
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (13 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 14/15] quorum: allow ignoring child errors Wen Congyang
@ 2015-05-21  4:53 ` Wen Congyang
  2015-05-21  6:06 ` [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
  2015-05-28  4:59 ` Wen Congyang
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  4:53 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, qemu block, Lai Jiangshan, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Gonglei, Stefan Hajnoczi, Yang Hongyang,
	zhanghailiang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 block/Makefile.objs |   1 +
 block/replication.c | 441 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 442 insertions(+)
 create mode 100644 block/replication.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 8dd9b9c..8b1c587 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -22,6 +22,7 @@ block-obj-$(CONFIG_LIBSSH2) += ssh.o
 block-obj-y += accounting.o
 block-obj-y += write-threshold.o
 block-obj-y += backup.o
+block-obj-y += replication.o
 
 common-obj-y += stream.o
 common-obj-y += commit.o
diff --git a/block/replication.c b/block/replication.c
new file mode 100644
index 0000000..da3b512
--- /dev/null
+++ b/block/replication.c
@@ -0,0 +1,441 @@
+#include "qemu-common.h"
+#include "block/block_int.h"
+#include "block/blockjob.h"
+#include "block/nbd.h"
+
+typedef struct BDRVReplicationState {
+    ReplicationMode mode;
+    int replication_state;
+    char *export_name;
+    NBDExport *exp;
+    BlockDriverState *active_disk;
+    BlockDriverState *hidden_disk;
+    BlockDriverState *secondary_disk; /* nbd target */
+    int error;
+} BDRVReplicationState;
+
+enum {
+    BLOCK_REPLICATION_NONE,     /* block replication is not started */
+    BLOCK_REPLICATION_RUNNING,  /* block replication is running */
+    BLOCK_REPLICATION_DONE,     /* block replication is done(failover) */
+};
+
+#define COMMIT_CLUSTER_BITS 16
+#define COMMIT_CLUSTER_SIZE (1 << COMMIT_CLUSTER_BITS)
+#define COMMIT_SECTORS_PER_CLUSTER (COMMIT_CLUSTER_SIZE / BDRV_SECTOR_SIZE)
+
+static void replication_stop(BlockDriverState *bs, bool failover, Error **errp);
+
+#define NBD_OPT_EXPORT          "export"
+#define REPLICATION_MODE        "mode"
+static QemuOptsList replication_runtime_opts = {
+    .name = "replication",
+    .head = QTAILQ_HEAD_INITIALIZER(replication_runtime_opts.head),
+    .desc = {
+        {
+            .name = REPLICATION_MODE,
+            .type = QEMU_OPT_STRING,
+        },
+        {
+            .name = NBD_OPT_EXPORT,
+            .type = QEMU_OPT_STRING,
+            .help = "The NBD server name",
+        },
+        { /* end of list */ }
+    },
+};
+
+static int replication_open(BlockDriverState *bs, QDict *options,
+                            int flags, Error **errp)
+{
+    int ret;
+    BDRVReplicationState *s = bs->opaque;;
+    Error *local_err = NULL;
+    QemuOpts *opts = NULL;
+    const char *mode;
+
+    ret = -EINVAL;
+    opts = qemu_opts_create(&replication_runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        goto fail;
+    }
+
+    mode = qemu_opt_get(opts, REPLICATION_MODE);
+    if (!mode) {
+        error_setg(&local_err, "Missing the option mode");
+        goto fail;
+    }
+
+    if (!strcmp(mode, "primary")) {
+        s->mode = REPLICATION_MODE_PRIMARY;
+    } else if (!strcmp(mode, "secondary")) {
+        s->mode = REPLICATION_MODE_SECONDARY;
+    } else {
+        error_setg(&local_err,
+                   "The option mode's value should be primary or secondary");
+        goto fail;
+    }
+
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        s->export_name = g_strdup(qemu_opt_get(opts, NBD_OPT_EXPORT));
+        if (!s->export_name) {
+            error_setg(&local_err, "Missing the option export");
+            goto fail;
+        }
+    }
+
+    return 0;
+
+fail:
+    qemu_opts_del(opts);
+    /* propagate error */
+    if (local_err) {
+        error_propagate(errp, local_err);
+    }
+    return ret;
+}
+
+static void replication_close(BlockDriverState *bs)
+{
+    BDRVReplicationState *s = bs->opaque;
+
+    if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
+        replication_stop(bs, false, NULL);
+    }
+
+    g_free(s->export_name);
+}
+
+static int64_t replication_getlength(BlockDriverState *bs)
+{
+    return bdrv_getlength(bs->file);
+}
+
+static int replication_get_io_status(BDRVReplicationState *s)
+{
+    switch (s->replication_state) {
+    case BLOCK_REPLICATION_NONE:
+        return -EIO;
+    case BLOCK_REPLICATION_RUNNING:
+        return 0;
+    case BLOCK_REPLICATION_DONE:
+        return s->mode == REPLICATION_MODE_PRIMARY ? -EIO : 1;
+    default:
+        abort();
+    }
+}
+
+static int replication_return_value(BDRVReplicationState *s, int ret)
+{
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        return ret;
+    }
+
+    if (ret < 0) {
+        s->error = ret;
+        ret = 0;
+    }
+
+    return ret;
+}
+
+static coroutine_fn int replication_co_readv(BlockDriverState *bs,
+                                             int64_t sector_num,
+                                             int remaining_sectors,
+                                             QEMUIOVector *qiov)
+{
+    BDRVReplicationState *s = bs->opaque;
+    int ret;
+
+    if (s->mode == REPLICATION_MODE_PRIMARY) {
+        /* We only use it to forward primary write requests */
+        return -EIO;
+    }
+
+    ret = replication_get_io_status(s);
+    if (ret < 0) {
+        return ret;
+    }
+
+    /*
+     * After failover, because we don't commit active disk/hidden disk
+     * to secondary disk(nbd target), so we should read from active disk
+     * directly.
+     */
+    ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
+    return replication_return_value(s, ret);
+}
+
+static coroutine_fn int replication_co_writev(BlockDriverState *bs,
+                                              int64_t sector_num,
+                                              int remaining_sectors,
+                                              QEMUIOVector *qiov)
+{
+    BDRVReplicationState *s = bs->opaque;
+    QEMUIOVector hd_qiov;
+    uint64_t bytes_done = 0;
+    BlockDriverState *top = bs->file;
+    BlockDriverState *base = s->secondary_disk;
+    BlockDriverState *target;
+    int ret, n;
+
+    ret = replication_get_io_status(s);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (ret == 0) {
+        ret = bdrv_co_writev(bs->file, sector_num, remaining_sectors, qiov);
+        return replication_return_value(s, ret);
+    }
+
+    /*
+     * Only write to active disk if the sectors have
+     * already been allocated in active disk/hidden disk.
+     */
+    qemu_iovec_init(&hd_qiov, qiov->niov);
+    while (remaining_sectors > 0) {
+        ret = bdrv_is_allocated_above(top, base, sector_num,
+                                      remaining_sectors, &n);
+        if (ret < 0) {
+            return ret;
+        }
+
+        qemu_iovec_reset(&hd_qiov);
+        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, n * 512);
+
+        target = ret ? top: base;
+        ret = bdrv_co_writev(target, sector_num, n, &hd_qiov);
+        if (ret < 0) {
+            return ret;
+        }
+
+        remaining_sectors -= n;
+        sector_num += n;
+        bytes_done += n * BDRV_SECTOR_SIZE;
+    }
+
+    return 0;
+}
+
+static coroutine_fn int replication_co_discard(BlockDriverState *bs,
+                                               int64_t sector_num,
+                                               int nb_sectors)
+{
+    BDRVReplicationState *s = bs->opaque;
+    int ret;
+
+    ret = replication_get_io_status(s);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (ret == 1) {
+        /* It is secondary qemu and we are after failover */
+        ret = bdrv_co_discard(s->secondary_disk, sector_num, nb_sectors);
+        if (ret) {
+            return ret;
+        }
+    }
+
+    ret = bdrv_co_discard(bs->file, sector_num, nb_sectors);
+    return replication_return_value(s, ret);
+}
+
+static bool replication_recurse_is_first_non_filter(BlockDriverState *bs,
+                                                    BlockDriverState *candidate)
+{
+    return bdrv_recurse_is_first_non_filter(bs->file, candidate);
+}
+
+static void secondary_do_checkpoint(BDRVReplicationState *s, Error **errp)
+{
+    Error *local_err = NULL;
+    int ret;
+
+    if (!s->secondary_disk->job) {
+        error_setg(errp, "Backup job is cancelled unexpectedly");
+        return;
+    }
+
+    block_job_do_checkpoint(s->secondary_disk->job, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    ret = s->active_disk->drv->bdrv_make_empty(s->active_disk);
+    if (ret < 0) {
+        error_setg(errp, "Cannot make active disk empty");
+        return;
+    }
+
+    ret = s->hidden_disk->drv->bdrv_make_empty(s->hidden_disk);
+    if (ret < 0) {
+        error_setg(errp, "Cannot make hidden disk empty");
+        return;
+    }
+}
+
+static void replication_start(BlockDriverState *bs, ReplicationMode mode,
+                              Error **errp)
+{
+    BDRVReplicationState *s = bs->opaque;
+    int64_t active_length, hidden_length, nbd_length;
+    Error *local_err = NULL;
+
+    if (s->replication_state != BLOCK_REPLICATION_NONE) {
+        error_setg(errp, "Block replication is running or done");
+        return;
+    }
+
+    if (s->mode != mode) {
+        error_setg(errp, "Invalid parameter 'mode'");
+        return;
+    }
+
+    switch (s->mode) {
+    case REPLICATION_MODE_PRIMARY:
+        bdrv_connect(bs->file, errp);
+        break;
+    case REPLICATION_MODE_SECONDARY:
+        if (!bs->file->backing_reference) {
+            error_setg(errp, "Active disk doesn't use backing_reference");
+        }
+
+        s->active_disk = bs->file;
+        s->hidden_disk = s->active_disk->backing_hd;
+        s->secondary_disk = s->hidden_disk->backing_hd;
+
+        if (!s->secondary_disk->job ||
+            s->secondary_disk->job->driver->job_type != BLOCK_JOB_TYPE_BACKUP) {
+            error_setg(errp, "Backup job is cancelled unexpectedly");
+            return;
+        }
+
+        /* verify the length */
+        active_length = bdrv_getlength(s->active_disk);
+        hidden_length = bdrv_getlength(s->hidden_disk);
+        nbd_length = bdrv_getlength(s->secondary_disk);
+        if (active_length < 0 || hidden_length < 0 || nbd_length < 0 ||
+            active_length != hidden_length || hidden_length != nbd_length) {
+            error_setg(errp, "active disk, hidden disk, nbd target's length"
+                       " are not the same");
+            return;
+        }
+
+        if (!s->active_disk->drv->bdrv_make_empty ||
+            !s->hidden_disk->drv->bdrv_make_empty) {
+            error_setg(errp,
+                       "active disk or hidden disk doesn't support make_empty");
+            return;
+        }
+        break;
+    default:
+        abort();
+    }
+
+    s->replication_state = BLOCK_REPLICATION_RUNNING;
+
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        secondary_do_checkpoint(s, errp);
+
+        /* start NBD server */
+        s->exp = nbd_export_new(s->secondary_disk->blk, 0, -1, 0, NULL,
+                                &local_err);
+        if (!s->exp) {
+            s->replication_state = BLOCK_REPLICATION_NONE;
+            error_propagate(errp, local_err);
+            return;
+        }
+        nbd_export_set_name(s->exp, s->export_name);
+    }
+}
+
+static void replication_do_checkpoint(BlockDriverState *bs, Error **errp)
+{
+    BDRVReplicationState *s = bs->opaque;
+
+    if (s->replication_state != BLOCK_REPLICATION_RUNNING) {
+        error_setg(errp, "Block replication is not running");
+        return;
+    }
+
+    if (s->error) {
+        error_setg(errp, "I/O error occurs");
+        return;
+    }
+
+    if (s->mode == REPLICATION_MODE_SECONDARY) {
+        secondary_do_checkpoint(s, errp);
+    }
+}
+
+static void replication_stop(BlockDriverState *bs, bool failover, Error **errp)
+{
+    BDRVReplicationState *s = bs->opaque;
+
+    if (s->replication_state != BLOCK_REPLICATION_RUNNING) {
+        error_setg(errp, "Block replication is not running");
+        return;
+    }
+
+    s->replication_state = BLOCK_REPLICATION_DONE;
+
+    switch (s->mode) {
+    case REPLICATION_MODE_PRIMARY:
+        bdrv_disconnect(bs->file);
+        break;
+    case REPLICATION_MODE_SECONDARY:
+        /* stop NBD server */
+        nbd_export_close(s->exp);
+        nbd_export_put(s->exp);
+
+        if (!failover) {
+            secondary_do_checkpoint(s, errp);
+            return;
+        }
+
+        if (!s->secondary_disk->job ||
+            s->secondary_disk->job->driver->job_type != BLOCK_JOB_TYPE_BACKUP) {
+            error_setg(errp, "Backup job is cancelled unexpectedly");
+            return;
+        }
+
+        block_job_cancel(s->secondary_disk->job);
+        break;
+    default:
+        abort();
+    }
+}
+
+BlockDriver bdrv_replication = {
+    .format_name                = "replication",
+    .protocol_name              = "replication",
+    .instance_size              = sizeof(BDRVReplicationState),
+
+    .bdrv_open                  = replication_open,
+    .bdrv_close                 = replication_close,
+
+    .bdrv_getlength             = replication_getlength,
+    .bdrv_co_readv              = replication_co_readv,
+    .bdrv_co_writev             = replication_co_writev,
+    .bdrv_co_discard            = replication_co_discard,
+
+    .is_filter                  = true,
+    .bdrv_recurse_is_first_non_filter = replication_recurse_is_first_non_filter,
+
+    .bdrv_start_replication     = replication_start,
+    .bdrv_do_checkpoint         = replication_do_checkpoint,
+    .bdrv_stop_replication      = replication_stop,
+
+    .has_variable_length        = true,
+};
+
+static void bdrv_replication_init(void)
+{
+    bdrv_register(&bdrv_replication);
+}
+
+block_init(bdrv_replication_init);
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (14 preceding siblings ...)
  2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 15/15] Implement new driver for block replication Wen Congyang
@ 2015-05-21  6:06 ` Wen Congyang
  2015-05-28  4:59 ` Wen Congyang
  16 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-05-21  6:06 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, Lai Jiangshan, qemu block, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Stefan Hajnoczi, Yang Hongyang

On 05/21/2015 12:52 PM, Wen Congyang wrote:
> Block replication is a very important feature which is used for
> continuous checkpoints(for example: COLO).
> 
> Usage:
> Please refer to docs/block-replication.txt
> 
> You can get the patch here:
> https://github.com/wencongyang/qemu-colo/commits/block-replication-v5
> 
> The other newest COLO patchse will be sent soon.
> 
> TODO:
> 1. Continuous block replication. It will be started after basic functions
>    are accepted.

I forgot to methin that: you should apply the following patch first:
http://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg01317.html

Thanks
Wen Congyang

> 
> Changs Log:
> V5:
> 1. Address the comments from Gong Lei
> 2. Speed the failover up. The secondary vm can take over very quickly even
>    if there are too many I/O requests.
> V4:
> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
> V3:
> 1: use error_setg() instead of error_set()
> 2. Add a new block job API
> 3. Active disk, hidden disk and nbd target uses the same AioContext
> 4. Add a testcase to test new hbitmap API
> V2:
> 1. Redesign the secondary qemu(use image-fleecing)
> 2. Use Error objects to return error message
> 3. Address the comments from Max Reitz and Eric Blake
> Wen Congyang (15):
>   docs: block replication's description
>   allow writing to the backing file
>   Allow creating backup jobs when opening BDS
>   block: Parse "backing_reference" option to reference existing BDS
>   Backup: clear all bitmap when doing block checkpoint
>   Don't allow a disk use backing reference target
>   Add new block driver interface to connect/disconnect the remote target
>   NBD client: implement block driver interfaces to connect/disconnect
>     NBD server
>   Introduce a new -drive option to control whether to connect to remote
>     target
>   NBD client: connect to nbd server later
>   Add new block driver interfaces to control block replication
>   skip nbd_target when starting block replication
>   quorum: implement block driver interfaces for block replication
>   quorum: allow ignoring child errors
>   Implement new driver for block replication
> 
>  block.c                    | 272 +++++++++++++++++++++++++++-
>  block/Makefile.objs        |   3 +-
>  block/backup.c             |  13 ++
>  block/nbd.c                |  69 +++++--
>  block/quorum.c             | 142 ++++++++++++++-
>  block/replication.c        | 441 +++++++++++++++++++++++++++++++++++++++++++++
>  blockdev.c                 |   8 +
>  blockjob.c                 |  10 +
>  docs/block-replication.txt | 179 ++++++++++++++++++
>  include/block/block.h      |  10 +
>  include/block/block_int.h  |  18 ++
>  include/block/blockjob.h   |  12 ++
>  qapi/block.json            |  16 ++
>  qemu-options.hx            |   4 +
>  tests/qemu-iotests/051     |  13 ++
>  tests/qemu-iotests/051.out |  13 ++
>  16 files changed, 1193 insertions(+), 30 deletions(-)
>  create mode 100644 block/replication.c
>  create mode 100644 docs/block-replication.txt
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints
  2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
                   ` (15 preceding siblings ...)
  2015-05-21  6:06 ` [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
@ 2015-05-28  4:59 ` Wen Congyang
  2015-06-04  5:34   ` Wen Congyang
  16 siblings, 1 reply; 19+ messages in thread
From: Wen Congyang @ 2015-05-28  4:59 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, Lai Jiangshan, qemu block, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Stefan Hajnoczi, Yang Hongyang

Ping...

On 05/21/2015 12:52 PM, Wen Congyang wrote:
> Block replication is a very important feature which is used for
> continuous checkpoints(for example: COLO).
> 
> Usage:
> Please refer to docs/block-replication.txt
> 
> You can get the patch here:
> https://github.com/wencongyang/qemu-colo/commits/block-replication-v5
> 
> The other newest COLO patchse will be sent soon.
> 
> TODO:
> 1. Continuous block replication. It will be started after basic functions
>    are accepted.
> 
> Changs Log:
> V5:
> 1. Address the comments from Gong Lei
> 2. Speed the failover up. The secondary vm can take over very quickly even
>    if there are too many I/O requests.
> V4:
> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
> V3:
> 1: use error_setg() instead of error_set()
> 2. Add a new block job API
> 3. Active disk, hidden disk and nbd target uses the same AioContext
> 4. Add a testcase to test new hbitmap API
> V2:
> 1. Redesign the secondary qemu(use image-fleecing)
> 2. Use Error objects to return error message
> 3. Address the comments from Max Reitz and Eric Blake
> Wen Congyang (15):
>   docs: block replication's description
>   allow writing to the backing file
>   Allow creating backup jobs when opening BDS
>   block: Parse "backing_reference" option to reference existing BDS
>   Backup: clear all bitmap when doing block checkpoint
>   Don't allow a disk use backing reference target
>   Add new block driver interface to connect/disconnect the remote target
>   NBD client: implement block driver interfaces to connect/disconnect
>     NBD server
>   Introduce a new -drive option to control whether to connect to remote
>     target
>   NBD client: connect to nbd server later
>   Add new block driver interfaces to control block replication
>   skip nbd_target when starting block replication
>   quorum: implement block driver interfaces for block replication
>   quorum: allow ignoring child errors
>   Implement new driver for block replication
> 
>  block.c                    | 272 +++++++++++++++++++++++++++-
>  block/Makefile.objs        |   3 +-
>  block/backup.c             |  13 ++
>  block/nbd.c                |  69 +++++--
>  block/quorum.c             | 142 ++++++++++++++-
>  block/replication.c        | 441 +++++++++++++++++++++++++++++++++++++++++++++
>  blockdev.c                 |   8 +
>  blockjob.c                 |  10 +
>  docs/block-replication.txt | 179 ++++++++++++++++++
>  include/block/block.h      |  10 +
>  include/block/block_int.h  |  18 ++
>  include/block/blockjob.h   |  12 ++
>  qapi/block.json            |  16 ++
>  qemu-options.hx            |   4 +
>  tests/qemu-iotests/051     |  13 ++
>  tests/qemu-iotests/051.out |  13 ++
>  16 files changed, 1193 insertions(+), 30 deletions(-)
>  create mode 100644 block/replication.c
>  create mode 100644 docs/block-replication.txt
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints
  2015-05-28  4:59 ` Wen Congyang
@ 2015-06-04  5:34   ` Wen Congyang
  0 siblings, 0 replies; 19+ messages in thread
From: Wen Congyang @ 2015-06-04  5:34 UTC (permalink / raw)
  To: qemu devel, Fam Zheng, Max Reitz, Paolo Bonzini
  Cc: Kevin Wolf, Lai Jiangshan, qemu block, Jiang Yunhong, Dong Eddie,
	Dr. David Alan Gilbert, Stefan Hajnoczi, Yang Hongyang

On 05/28/2015 12:59 PM, Wen Congyang wrote:
> Ping...

Does anybody have time to review it.

Thanks
Wen Congyang

> 
> On 05/21/2015 12:52 PM, Wen Congyang wrote:
>> Block replication is a very important feature which is used for
>> continuous checkpoints(for example: COLO).
>>
>> Usage:
>> Please refer to docs/block-replication.txt
>>
>> You can get the patch here:
>> https://github.com/wencongyang/qemu-colo/commits/block-replication-v5
>>
>> The other newest COLO patchse will be sent soon.
>>
>> TODO:
>> 1. Continuous block replication. It will be started after basic functions
>>    are accepted.
>>
>> Changs Log:
>> V5:
>> 1. Address the comments from Gong Lei
>> 2. Speed the failover up. The secondary vm can take over very quickly even
>>    if there are too many I/O requests.
>> V4:
>> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
>> V3:
>> 1: use error_setg() instead of error_set()
>> 2. Add a new block job API
>> 3. Active disk, hidden disk and nbd target uses the same AioContext
>> 4. Add a testcase to test new hbitmap API
>> V2:
>> 1. Redesign the secondary qemu(use image-fleecing)
>> 2. Use Error objects to return error message
>> 3. Address the comments from Max Reitz and Eric Blake
>> Wen Congyang (15):
>>   docs: block replication's description
>>   allow writing to the backing file
>>   Allow creating backup jobs when opening BDS
>>   block: Parse "backing_reference" option to reference existing BDS
>>   Backup: clear all bitmap when doing block checkpoint
>>   Don't allow a disk use backing reference target
>>   Add new block driver interface to connect/disconnect the remote target
>>   NBD client: implement block driver interfaces to connect/disconnect
>>     NBD server
>>   Introduce a new -drive option to control whether to connect to remote
>>     target
>>   NBD client: connect to nbd server later
>>   Add new block driver interfaces to control block replication
>>   skip nbd_target when starting block replication
>>   quorum: implement block driver interfaces for block replication
>>   quorum: allow ignoring child errors
>>   Implement new driver for block replication
>>
>>  block.c                    | 272 +++++++++++++++++++++++++++-
>>  block/Makefile.objs        |   3 +-
>>  block/backup.c             |  13 ++
>>  block/nbd.c                |  69 +++++--
>>  block/quorum.c             | 142 ++++++++++++++-
>>  block/replication.c        | 441 +++++++++++++++++++++++++++++++++++++++++++++
>>  blockdev.c                 |   8 +
>>  blockjob.c                 |  10 +
>>  docs/block-replication.txt | 179 ++++++++++++++++++
>>  include/block/block.h      |  10 +
>>  include/block/block_int.h  |  18 ++
>>  include/block/blockjob.h   |  12 ++
>>  qapi/block.json            |  16 ++
>>  qemu-options.hx            |   4 +
>>  tests/qemu-iotests/051     |  13 ++
>>  tests/qemu-iotests/051.out |  13 ++
>>  16 files changed, 1193 insertions(+), 30 deletions(-)
>>  create mode 100644 block/replication.c
>>  create mode 100644 docs/block-replication.txt
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-06-04  5:30 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-21  4:52 [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 01/15] docs: block replication's description Wen Congyang
2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 02/15] allow writing to the backing file Wen Congyang
2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 03/15] Allow creating backup jobs when opening BDS Wen Congyang
2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 04/15] block: Parse "backing_reference" option to reference existing BDS Wen Congyang
2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 05/15] Backup: clear all bitmap when doing block checkpoint Wen Congyang
2015-05-21  4:52 ` [Qemu-devel] [PATCH COLO-Block v5 06/15] Don't allow a disk use backing reference target Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 07/15] Add new block driver interface to connect/disconnect the remote target Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 08/15] NBD client: implement block driver interfaces to connect/disconnect NBD server Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 09/15] Introduce a new -drive option to control whether to connect to remote target Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 10/15] NBD client: connect to nbd server later Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 11/15] Add new block driver interfaces to control block replication Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 12/15] skip nbd_target when starting " Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 13/15] quorum: implement block driver interfaces for " Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 14/15] quorum: allow ignoring child errors Wen Congyang
2015-05-21  4:53 ` [Qemu-devel] [PATCH COLO-Block v5 15/15] Implement new driver for block replication Wen Congyang
2015-05-21  6:06 ` [Qemu-devel] [PATCH COLO-Block v5 00/15] Block replication for continuous checkpoints Wen Congyang
2015-05-28  4:59 ` Wen Congyang
2015-06-04  5:34   ` Wen Congyang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).