From: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
To: zhanghailiang <zhang.zhanghailiang@huawei.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: stefanha@redhat.com, kwolf@redhat.com, mreitz@redhat.com,
	pbonzini@redhat.com, wency@cn.fujitsu.com,
	Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH RFC v2 1/6] docs/block-replication: Add description for shared-disk case
Date: Tue, 20 Dec 2016 19:23:29 +0800
Message-ID: <585914B1.1030309@cn.fujitsu.com>
In-Reply-To: <1480926904-17596-2-git-send-email-zhang.zhanghailiang@huawei.com>

On 12/05/2016 04:34 PM, zhanghailiang wrote:
> Introduce the scenario of shared-disk block replication
> and how to use it.
>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
> v2:
> - fix some problems found by Changlong
> ---
>   docs/block-replication.txt | 139 +++++++++++++++++++++++++++++++++++++++++++--
>   1 file changed, 135 insertions(+), 4 deletions(-)
>
> diff --git a/docs/block-replication.txt b/docs/block-replication.txt
> index 6bde673..fbfe005 100644
> --- a/docs/block-replication.txt
> +++ b/docs/block-replication.txt
> @@ -24,7 +24,7 @@ only dropped at next checkpoint time. To reduce the network transportation
>   effort during a vmstate checkpoint, the disk modification operations of
>   the Primary disk are asynchronously forwarded to the Secondary node.
>
> -== Workflow ==
> +== Non-shared disk workflow ==
>   The following is the image of block replication workflow:
>
>           +----------------------+            +------------------------+
> @@ -57,7 +57,7 @@ The following is the image of block replication workflow:
>       4) Secondary write requests will be buffered in the Disk buffer and it
>          will overwrite the existing sector content in the buffer.
>
> -== Architecture ==
> +== Non-shared disk architecture ==
>   We are going to implement block replication from many basic
>   blocks that are already in QEMU.
>
> @@ -106,6 +106,74 @@ any state that would otherwise be lost by the speculative write-through
>   of the NBD server into the secondary disk. So before block replication,
>   the primary disk and secondary disk should contain the same data.
>
> +== Shared Disk Mode Workflow ==
> +The following is the image of block replication workflow:
> +
> +        +----------------------+            +------------------------+
> +        |Primary Write Requests|            |Secondary Write Requests|
> +        +----------------------+            +------------------------+
> +                  |                                       |
> +                  |                                      (4)
> +                  |                                       V
> +                  |                              /-------------\
> +                  | (2)Forward and write through |             |
> +                  | +--------------------------> | Disk Buffer |
> +                  | |                            |             |
> +                  | |                            \-------------/
> +                  | |(1)read                           |
> +                  | |                                  |
> +       (3)write   | |                                  | backing file
> +                  V |                                  |
> +                 +-----------------------------+       |
> +                 | Shared Disk                 | <-----+
> +                 +-----------------------------+
> +
> +    1) Primary writes will read original data and forward it to Secondary
> +       QEMU.
> +    2) Before Primary write requests are written to Shared disk, the
> +       original sector content will be read from Shared disk and
> +       forwarded and buffered in the Disk buffer on the secondary site,
> +       but it will not overwrite the existing sector content (it could be
> +       from either "Secondary Write Requests" or previous COW of "Primary
> +       Write Requests") in the Disk buffer.
> +    3) Primary write requests will be written to Shared disk.
> +    4) Secondary write requests will be buffered in the Disk buffer and it
> +       will overwrite the existing sector content in the buffer.
> +
> +== Shared Disk Mode Architecture ==
> +We are going to implement block replication from many basic
> +blocks that are already in QEMU.
> +         virtio-blk                     ||                               .----------
> +             /                          ||                               | Secondary
> +            /                           ||                               '----------
> +           /                            ||                                 virtio-blk
> +          /                             ||                                      |
> +          |                             ||                               replication(5)
> +          |                    NBD  -------->   NBD   (2)                       |
> +          |                  client     ||    server ---> hidden disk <-- active disk(4)
> +          |                     ^       ||                      |
> +          |              replication(1) ||                      |
> +          |                     |       ||                      |
> +          |   +-----------------'       ||                      |
> +         (3)  |drive-backup sync=none   ||                      |
> +--------. |   +-----------------+       ||                      |
> +Primary | |                     |       ||           backing    |
> +--------' |                     |       ||                      |
> +          V                     |                               |
> +       +-------------------------------------------+            |
> +       |               shared disk                 | <----------+
> +       +-------------------------------------------+
> +
> +
> +    1) Primary writes will read original data and forward it to Secondary
> +       QEMU.
> +    2) The hidden-disk buffers the original content that is modified by the
> +       primary VM. It should also be an empty disk, and the driver supports
> +       bdrv_make_empty() and backing file.
> +    3) Primary write requests will be written to Shared disk.
> +    4) Secondary write requests will be buffered in the active disk and it
> +       will overwrite the existing sector content in the buffer.
> +
>   == Failure Handling ==
>   There are 7 internal errors when block replication is running:
>   1. I/O error on primary disk
> @@ -145,7 +213,7 @@ d. replication_stop_all()
>      things except failover. The caller must hold the I/O mutex lock if it is
>      in migration/checkpoint thread.
>
> -== Usage ==
> +== Non-shared disk usage ==
>   Primary:
>     -drive if=xxx,driver=quorum,read-pattern=fifo,id=colo1,vote-threshold=1,\
>            children.0.file.filename=1.raw,\
> @@ -234,6 +302,69 @@ Secondary:
>     The primary host is down, so we should do the following thing:
>     { 'execute': 'nbd-server-stop' }
>
> +== Shared disk usage ==
> +Primary:
> + -drive if=virtio,id=primary_disk0,file.filename=1.raw,driver=raw
> +
> +Issue qmp command:
> +  { 'execute': 'blockdev-add',
> +    'arguments': {
> +        'driver': 'replication',
> +        'node-name': 'rep',
> +        'mode': 'primary',
> +        'shared-disk-id': 'primary_disk0',
> +        'shared-disk': true,
> +        'file': {
> +            'driver': 'nbd',
> +            'export': 'hidden_disk0',
> +            'server': {
> +                'type': 'inet',
> +                'data': {
> +                    'host': 'xxx.xxx.xxx.xxx',
> +                    'port': 'yyy'
> +                }
> +            }
> +        }
> +     }
> +  }
> +
> +Secondary:
> + -drive if=none,driver=qcow2,file.filename=/mnt/ramfs/hidden_disk.img,id=hidden_disk0,\
> +        backing.driver=raw,backing.file.filename=1.raw \
> + -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\
> +        file.driver=qcow2,top-id=active-disk0,\
> +        file.file.filename=/mnt/ramfs/active_disk.img,\
> +        file.backing=hidden_disk0,shared-disk=on
> +
> +Issue qmp command:
> +1. { 'execute': 'nbd-server-start',
> +     'arguments': {
> +        'addr': {
> +            'type': 'inet',
> +            'data': {
> +                'host': '0',
> +                'port': 'yyy'
> +            }
> +        }
> +     }
> +   }
> +2. { 'execute': 'nbd-server-add',
> +     'arguments': {
> +        'device': 'hidden_disk0',
> +        'writable': true
> +    }
> +  }
> +
> +After Failover:
> +Primary:
> +  { 'execute': 'x-blockdev-del',
> +    'arguments': {
> +        'node-name': 'rep'
> +    }
> +  }
> +
> +Secondary:
> +  {'execute': 'nbd-server-stop' }
> +
>   TODO:
>   1. Continuous block replication
> -2. Shared disk
>

Looks good to me

Reviewed-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
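
For readers following the thread, steps 1) to 4) of the quoted
"Shared Disk Mode Workflow" can be sketched as a small toy model. This
is only an illustration under stated assumptions, not the QEMU
implementation: the class and method names are hypothetical, sectors
are modelled as dictionary keys, the hidden disk and active disk are
collapsed into a single "disk buffer", and checkpoint() assumes the
buffer is simply emptied at checkpoint time.

```python
class SharedDiskModel:
    """Toy model of the shared-disk workflow (hypothetical names)."""

    def __init__(self, shared_disk):
        # The shared disk is visible to both Primary and Secondary.
        self.shared_disk = shared_disk
        # Secondary-side disk buffer; the shared disk acts as its backing file.
        self.disk_buffer = {}

    def primary_write(self, sector, data):
        # (1) Read the original sector content from the shared disk.
        original = self.shared_disk.get(sector)
        # (2) Forward it to the Secondary's buffer, but do not overwrite
        #     content already buffered there (from a Secondary write or an
        #     earlier copy-on-write of a Primary write).
        self.disk_buffer.setdefault(sector, original)
        # (3) Only then write the new data to the shared disk.
        self.shared_disk[sector] = data

    def secondary_write(self, sector, data):
        # (4) Secondary writes stay in the buffer and overwrite whatever
        #     is buffered for that sector.
        self.disk_buffer[sector] = data

    def secondary_read(self, sector):
        # Reads hit the buffer first and fall through to the backing file.
        if sector in self.disk_buffer:
            return self.disk_buffer[sector]
        return self.shared_disk.get(sector)

    def checkpoint(self):
        # Assumption: buffered content is dropped at checkpoint time.
        self.disk_buffer.clear()


model = SharedDiskModel({0: b"old"})
model.primary_write(0, b"new")          # Primary overwrites sector 0
view_after_primary_write = model.secondary_read(0)
model.secondary_write(0, b"sec")        # Secondary modifies its own view
view_after_secondary_write = model.secondary_read(0)
model.checkpoint()                      # buffer dropped, views converge
view_after_checkpoint = model.secondary_read(0)
```

In this sketch the Secondary keeps seeing the pre-write content of a
sector until the checkpoint, even though the Primary has already
written to the shared disk. That is the property the copy-on-write in
steps 1) and 2) is meant to provide.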
