From: Lukas Straub <lukasstraub2@web.de>
To: "Zhang, Chen" <chen.zhang@intel.com>
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-block <qemu-block@nongnu.org>,
Wen Congyang <wencongyang2@huawei.com>,
Jason Wang <jasowang@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>, Max Reitz <mreitz@redhat.com>,
Xie Changlong <xiechanglong.d@gmail.com>
Subject: Re: [PATCH v6 4/4] colo: Update Documentation for continuous replication
Date: Wed, 9 Oct 2019 17:16:35 +0200
Message-ID: <20191009171635.0cb4fa12@luklap>
In-Reply-To: <9CFF81C0F6B98A43A459C9EDAD400D78062A7867@shsmsx102.ccr.corp.intel.com>
On Wed, 9 Oct 2019 08:36:52 +0000
"Zhang, Chen" <chen.zhang@intel.com> wrote:
> > -----Original Message-----
> > From: Lukas Straub <lukasstraub2@web.de>
> > Sent: Saturday, October 5, 2019 9:06 PM
> > To: qemu-devel <qemu-devel@nongnu.org>
> > Cc: Zhang, Chen <chen.zhang@intel.com>; Jason Wang
> > <jasowang@redhat.com>; Wen Congyang <wencongyang2@huawei.com>;
> > Xie Changlong <xiechanglong.d@gmail.com>; Kevin Wolf
> > <kwolf@redhat.com>; Max Reitz <mreitz@redhat.com>; qemu-block
> > <qemu-block@nongnu.org>
> > Subject: [PATCH v6 4/4] colo: Update Documentation for continuous
> > replication
> >
> > Document the qemu command-line and qmp commands for continuous
> > replication
> >
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > ---
> > docs/COLO-FT.txt | 213 +++++++++++++++++++++++++++----------
> > docs/block-replication.txt | 28 +++--
> > 2 files changed, 174 insertions(+), 67 deletions(-)
> >
> > diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt index
> > ad24680d13..bc1a0ccb99 100644
> > --- a/docs/COLO-FT.txt
> > +++ b/docs/COLO-FT.txt
> > @@ -145,35 +145,65 @@ The diagram just shows the main qmp command,
> > you can get the detail in test procedure.
> >
> > ...
> >
> > +Note: Here we are running both instances on the same Host for testing,
> > +change the IP Addresses if you want to run it on two Hosts. Initially
> > +127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host.
> > +
> > +== Startup qemu ==
> > +1. Primary:
> > +Note: Initially, $imagefolder/primary.qcow2 needs to be copied to all Hosts.
> > +# imagefolder="/mnt/vms/colo-test-primary"
> > +
> > +# qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 512 -smp
> > 1 -qmp stdio \
> > + -device piix3-usb-uhci -device usb-tablet -name primary \
> > + -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper
> > \
> > + -device rtl8139,id=e0,netdev=hn0 \
> > + -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server,nowait \
> > + -chardev socket,id=compare1,host=0.0.0.0,port=9004,server,wait \
>
> We should change this to host=127.0.0.1, to be consistent with the expressions below.
Hi,
This (and the IPs in the QMP commands below) needs to stay this way, because
it is a listening port: with 127.0.0.1 it would only listen on the loopback
address and would not be reachable from another node, for example. With
0.0.0.0 it listens on all interfaces.
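To illustrate the difference, here is a minimal Python sketch. It is not part
of the patch, and the helper name listen_scope is made up for this example; it
only classifies a listen address the way the paragraph above describes.

```python
import ipaddress


def listen_scope(host: str) -> str:
    """Describe who can reach a server socket bound to `host`.

    Hypothetical helper for illustration only; QEMU does not provide it.
    """
    addr = ipaddress.ip_address(host)
    if addr.is_unspecified:   # 0.0.0.0 (or :: for IPv6)
        return "all interfaces"
    if addr.is_loopback:      # 127.0.0.0/8 (or ::1 for IPv6)
        return "loopback only"
    return "single interface"


if __name__ == "__main__":
    for host in ("0.0.0.0", "127.0.0.1", "192.168.1.10"):
        print(host, "->", listen_scope(host))
```

In the command line above, mirror0 and compare1 are the sockets the Secondary
connects to (ports 9003 and 9004), so they use 0.0.0.0. compare0 and
compare_out are only connected to from within the same QEMU process, so they
can stay on 127.0.0.1.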
> > + -chardev socket,id=compare0,host=127.0.0.1,port=9001,server,nowait \
> > + -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
> > + -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server,nowait
> > \
> > + -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
> > + -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
> > + -object filter-
> > redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
> > + -object filter-
> > redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
> > + -object iothread,id=iothread1 \
> > + -object
> > +colo-compare,id=comp0,primary_in=compare0-
> > 0,secondary_in=compare1,\
> > +outdev=compare_out0,iothread=iothread1 \
> > + -drive
> > +if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
> > +children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=q
> > +cow2 -S
> > +
> > +2. Secondary:
> > +# imagefolder="/mnt/vms/colo-test-secondary"
> > +# primary_ip=127.0.0.1
> > +
> > +# qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G
> > +
> > +# qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G
> > +
>
> The active disk and hidden disk only need to be created once; we can note that here.
OK, I will note that. But I will wait until the block changes are reviewed before sending the next version.
Regards,
Lukas Straub
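As a rough sketch of the "create only once" idea, a wrapper script could skip
images that already exist. This is only an illustration under my own
assumptions: the function name missing_image_commands is hypothetical, and the
file names and the 10G size are simply taken from the quoted commands.

```python
import os
import tempfile


def missing_image_commands(imagefolder: str, size: str = "10G") -> list:
    """Return qemu-img command lines only for images that do not exist yet.

    Hypothetical helper; it builds the commands but does not run them.
    """
    commands = []
    for name in ("secondary-active.qcow2", "secondary-hidden.qcow2"):
        path = os.path.join(imagefolder, name)
        if not os.path.exists(path):
            commands.append(["qemu-img", "create", "-f", "qcow2", path, size])
    return commands


if __name__ == "__main__":
    # Demo against an empty temporary folder: both images are reported missing.
    with tempfile.TemporaryDirectory() as folder:
        for cmd in missing_image_commands(folder):
            print(" ".join(cmd))
```

Each returned list could be passed to subprocess.run() as is; on a second run
nothing would be returned, since both files already exist.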
> > +# qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 512 -smp
> > 1 -qmp stdio \
> > + -device piix3-usb-uhci -device usb-tablet -name secondary \
> > + -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper
> > \
> > + -device rtl8139,id=e0,netdev=hn0 \
> > + -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect=1 \
> > + -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect=1 \
> > + -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
> > + -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
> > + -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
> > + -drive
> > if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow
> > 2 \
> > + -drive
> > +if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,
> > +\
> > +top-id=childs0,file.file.filename=$imagefolder/secondary-active.qcow2,\
> > +file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secon
> > +dary-hidden.qcow2,\
> > +file.backing.backing=parent0 \
> > + -drive
> > +if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
> > +children.0=childs0 \
> > + -incoming tcp:0.0.0.0:9998
> > +
> > +
> > +3. On Secondary VM's QEMU monitor, issue command
> > {'execute':'qmp_capabilities'}
> > -{ 'execute': 'nbd-server-start',
> > - 'arguments': {'addr': {'type': 'inet', 'data': {'host': 'xx.xx.xx.xx', 'port':
> > '8889'} } } -}
> > -{'execute': 'nbd-server-add', 'arguments': {'device': 'secondary-disk0',
> > 'writable': true } }
> > +{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet',
> > +'data': {'host': '0.0.0.0', 'port': '9999'} } } }
> > +{'execute': 'nbd-server-add', 'arguments': {'device': 'parent0',
> > +'writable': true } }
> >
> > Note:
> > a. The qmp command nbd-server-start and nbd-server-add must be run
> > @@ -182,44 +212,113 @@ Note:
> > same.
> > c. It is better to put active disk and hidden disk in ramdisk.
> >
> > -3. On Primary VM's QEMU monitor, issue command:
> > +4. On Primary VM's QEMU monitor, issue command:
> > {'execute':'qmp_capabilities'}
> > -{ 'execute': 'human-monitor-command',
> > - 'arguments': {'command-line': 'drive_add -n buddy
> > driver=replication,mode=primary,file.driver=nbd,file.host=xx.xx.xx.xx,file.p
> > ort=8889,file.export=secondary-disk0,node-name=nbd_client0'}}
> > -{ 'execute':'x-blockdev-change', 'arguments':{'parent': 'primary-disk0',
> > 'node': 'nbd_client0' } } -{ 'execute': 'migrate-set-capabilities',
> > - 'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
> > -{ 'execute': 'migrate', 'arguments': {'uri': 'tcp:xx.xx.xx.xx:8888' } }
> > +{'execute': 'human-monitor-command', 'arguments': {'command-line':
> > +'drive_add -n buddy
> > +driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,fil
> > +e.port=9999,file.export=parent0,node-name=replication0'}}
> > +{'execute': 'x-blockdev-change', 'arguments':{'parent': 'colo-disk0',
> > +'node': 'replication0' } }
> > +{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [
> > +{'capability': 'x-colo', 'state': true } ] } }
> > +{'execute': 'migrate', 'arguments': {'uri': 'tcp:127.0.0.2:9998' } }
> >
> > Note:
> > a. There should be only one NBD Client for each primary disk.
> > - b. xx.xx.xx.xx is the secondary physical machine's hostname or IP
> > - c. The qmp command line must be run after running qmp command line in
> > + b. The qmp command line must be run after running qmp command line in
> > secondary qemu.
> >
> > -4. After the above steps, you will see, whenever you make changes to PVM,
> > SVM will be synced.
> > +5. After the above steps, you will see, whenever you make changes to PVM,
> > SVM will be synced.
> > You can issue command '{ "execute": "migrate-set-parameters" ,
> > "arguments":{ "x-checkpoint-delay": 2000 } }'
> > -to change the checkpoint period time
> > +to change the idle checkpoint period time
> > +
> > +6. Failover test
> > +You can kill one of the VMs and Failover on the surviving VM:
> > +
> > +If you killed the Secondary, then follow "Primary Failover". After
> > +that, if you want to resume the replication, follow "Primary resume
> > replication"
> > +
> > +If you killed the Primary, then follow "Secondary Failover". After
> > +that, if you want to resume the replication, follow "Secondary resume
> > replication"
> > +
> > +== Primary Failover ==
> > +The Secondary died, resume on the Primary
> > +
> > +{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0',
> > +'child': 'children.1'} }
> > +{'execute': 'human-monitor-command', 'arguments':{ 'command-line':
> > +'drive_del replication0' } }
> > +{'execute': 'object-del', 'arguments':{ 'id': 'comp0' } }
> > +{'execute': 'object-del', 'arguments':{ 'id': 'iothread1' } }
> > +{'execute': 'object-del', 'arguments':{ 'id': 'm0' } }
> > +{'execute': 'object-del', 'arguments':{ 'id': 'redire0' } }
> > +{'execute': 'object-del', 'arguments':{ 'id': 'redire1' } }
> > +{'execute': 'x-colo-lost-heartbeat' }
> > +
> > +== Secondary Failover ==
> > +The Primary died, resume on the Secondary and prepare to become the
> > new
> > +Primary
> > +
> > +{'execute': 'nbd-server-stop'}
> > +{'execute': 'x-colo-lost-heartbeat'}
> > +
> > +{'execute': 'object-del', 'arguments':{ 'id': 'f2' } }
> > +{'execute': 'object-del', 'arguments':{ 'id': 'f1' } }
> > +{'execute': 'chardev-remove', 'arguments':{ 'id': 'red1' } }
> > +{'execute': 'chardev-remove', 'arguments':{ 'id': 'red0' } }
> > +
> > +{'execute': 'chardev-add', 'arguments':{ 'id': 'mirror0', 'backend':
> > +{'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host':
> > +'0.0.0.0', 'port': '9003' } }, 'server': true } } } }
>
> Same as I said before.
>
> The other statements look good to me.
>
> Thanks
> Zhang Chen
>
> > +{'execute': 'chardev-add', 'arguments':{ 'id': 'compare1', 'backend':
> > +{'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host':
> > +'0.0.0.0', 'port': '9004' } }, 'server': true } } } }
> > +{'execute': 'chardev-add', 'arguments':{ 'id': 'compare0', 'backend':
> > +{'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host':
> > +'127.0.0.1', 'port': '9001' } }, 'server': true } } } }
> > +{'execute': 'chardev-add', 'arguments':{ 'id': 'compare0-0', 'backend':
> > +{'type': 'socket', 'data': {'addr': { 'type': 'inet', 'data': { 'host':
> > +'127.0.0.1', 'port': '9001' } }, 'server': false } } } }
> > +{'execute': 'chardev-add', 'arguments':{ 'id': 'compare_out',
> > +'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet',
> > +'data': { 'host': '127.0.0.1', 'port': '9005' } }, 'server': true } } }
> > +}
> > +{'execute': 'chardev-add', 'arguments':{ 'id': 'compare_out0',
> > +'backend': {'type': 'socket', 'data': {'addr': { 'type': 'inet',
> > +'data': { 'host': '127.0.0.1', 'port': '9005' } }, 'server': false } }
> > +} }
> > +
> > +== Primary resume replication ==
> > +Resume replication after new Secondary is up.
> > +
> > +Start the new Secondary (Steps 2 and 3 above), then on the Primary:
> > +{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0',
> > +'job-id': 'resync', 'target': 'nbd://127.0.0.2:9999/parent0', 'mode':
> > +'existing', 'format': 'raw', 'sync': 'full'} }
> > +
> > +Wait until disk is synced, then:
> > +{'execute': 'stop'}
> > +{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync'} }
> > +
> > +{'execute': 'human-monitor-command', 'arguments':{ 'command-line':
> > +'drive_add -n buddy
> > +driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,fil
> > +e.port=9999,file.export=parent0,node-name=replication0'}}
> > +{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0',
> > +'node': 'replication0' } }
> > +
> > +{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror',
> > +'id': 'm0', 'props': { 'netdev': 'hn0', 'queue': 'tx', 'outdev':
> > +'mirror0' } } }
> > +{'execute': 'object-add', 'arguments':{ 'qom-type':
> > +'filter-redirector', 'id': 'redire0', 'props': { 'netdev': 'hn0',
> > +'queue': 'rx', 'indev': 'compare_out' } } }
> > +{'execute': 'object-add', 'arguments':{ 'qom-type':
> > +'filter-redirector', 'id': 'redire1', 'props': { 'netdev': 'hn0',
> > +'queue': 'rx', 'outdev': 'compare0' } } }
> > +{'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id':
> > +'iothread1' } }
> > +{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare',
> > +'id': 'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in':
> > +'compare1', 'outdev': 'compare_out0', 'iothread': 'iothread1' } } }
> > +
> > +{'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [
> > +{'capability': 'x-colo', 'state': true } ] } }
> > +{'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.2:9998' } }
> > +
> > +Note:
> > +If this Primary previously was a Secondary, then we need to insert the
> > +filters before the filter-rewriter by using the
> > +"'insert': 'before', 'position': 'id=rew0'" Options. See below.
> > +
> > +== Secondary resume replication ==
> > +Become Primary and resume replication after new Secondary is up. Note
> > +that now 127.0.0.1 is the Secondary and 127.0.0.2 is the Primary.
> > +
> > +Start the new Secondary (Steps 2 and 3 above, but with
> > +primary_ip=127.0.0.2), then on the old Secondary:
> > +{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0',
> > +'job-id': 'resync', 'target': 'nbd://127.0.0.1:9999/parent0', 'mode':
> > +'existing', 'format': 'raw', 'sync': 'full'} }
> > +
> > +Wait until disk is synced, then:
> > +{'execute': 'stop'}
> > +{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync' } }
> >
> > -5. Failover test
> > -You can kill Primary VM and run 'x_colo_lost_heartbeat' in Secondary VM's -
> > monitor at the same time, then SVM will failover and client will not detect
> > this -change.
> > +{'execute': 'human-monitor-command', 'arguments':{ 'command-line':
> > +'drive_add -n buddy
> > +driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,fil
> > +e.port=9999,file.export=parent0,node-name=replication0'}}
> > +{'execute': 'x-blockdev-change', 'arguments':{ 'parent': 'colo-disk0',
> > +'node': 'replication0' } }
> >
> > -Before issuing '{ "execute": "x-colo-lost-heartbeat" }' command, we have to
> > -issue block related command to stop block replication.
> > -Primary:
> > - Remove the nbd child from the quorum:
> > - { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child':
> > 'children.1'}}
> > - { 'execute': 'human-monitor-command','arguments': {'command-line':
> > 'drive_del blk-buddy0'}}
> > - Note: there is no qmp command to remove the blockdev now
> > +{'execute': 'object-add', 'arguments':{ 'qom-type': 'filter-mirror',
> > +'id': 'm0', 'props': { 'insert': 'before', 'position': 'id=rew0',
> > +'netdev': 'hn0', 'queue': 'tx', 'outdev': 'mirror0' } } }
> > +{'execute': 'object-add', 'arguments':{ 'qom-type':
> > +'filter-redirector', 'id': 'redire0', 'props': { 'insert': 'before',
> > +'position': 'id=rew0', 'netdev': 'hn0', 'queue': 'rx', 'indev':
> > +'compare_out' } } }
> > +{'execute': 'object-add', 'arguments':{ 'qom-type':
> > +'filter-redirector', 'id': 'redire1', 'props': { 'insert': 'before',
> > +'position': 'id=rew0', 'netdev': 'hn0', 'queue': 'rx', 'outdev':
> > +'compare0' } } }
> > +{'execute': 'object-add', 'arguments':{ 'qom-type': 'iothread', 'id':
> > +'iothread1' } }
> > +{'execute': 'object-add', 'arguments':{ 'qom-type': 'colo-compare',
> > +'id': 'comp0', 'props': { 'primary_in': 'compare0-0', 'secondary_in':
> > +'compare1', 'outdev': 'compare_out0', 'iothread': 'iothread1' } } }
> >
> > -Secondary:
> > - The primary host is down, so we should do the following thing:
> > - { 'execute': 'nbd-server-stop' }
> > +{'execute': 'migrate-set-capabilities', 'arguments':{ 'capabilities': [
> > +{'capability': 'x-colo', 'state': true } ] } }
> > +{'execute': 'migrate', 'arguments':{ 'uri': 'tcp:127.0.0.1:9998' } }
> >
> > == TODO ==
> > -1. Support continuous VM replication.
> > -2. Support shared storage.
> > -3. Develop the heartbeat part.
> > -4. Reduce checkpoint VM’s downtime while doing checkpoint.
> > +1. Support shared storage.
> > +2. Develop the heartbeat part.
> > +3. Reduce checkpoint VM’s downtime while doing checkpoint.
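For reference, the six chardev-add commands in the quoted "Secondary Failover"
section differ only in id, host, port and the server flag. The following
Python sketch generates them from a table. The helper chardev_add and the
table name FAILOVER_CHARDEVS are made up for this example; the ids, hosts and
ports are copied from the quoted patch. It emits standard double-quoted JSON
instead of the single-quoted form used in the document.

```python
import json


def chardev_add(chardev_id: str, host: str, port: int, server: bool) -> dict:
    """Build a QMP 'chardev-add' command for an inet socket chardev."""
    return {
        "execute": "chardev-add",
        "arguments": {
            "id": chardev_id,
            "backend": {
                "type": "socket",
                "data": {
                    # The quoted commands pass the port as a string.
                    "addr": {"type": "inet",
                             "data": {"host": host, "port": str(port)}},
                    "server": server,
                },
            },
        },
    }


# (id, host, port, server) as listed in "Secondary Failover".
FAILOVER_CHARDEVS = [
    ("mirror0", "0.0.0.0", 9003, True),
    ("compare1", "0.0.0.0", 9004, True),
    ("compare0", "127.0.0.1", 9001, True),
    ("compare0-0", "127.0.0.1", 9001, False),
    ("compare_out", "127.0.0.1", 9005, True),
    ("compare_out0", "127.0.0.1", 9005, False),
]

if __name__ == "__main__":
    # One JSON command per line, ready to paste into a '-qmp stdio' session.
    for entry in FAILOVER_CHARDEVS:
        print(json.dumps(chardev_add(*entry)))
```

Only the two sockets the new Secondary has to reach (mirror0 and compare1)
listen on 0.0.0.0; the rest stay on the loopback address.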