From: Eric Blake
Date: Wed, 5 Oct 2016 08:37:13 -0500
Subject: Re: [Qemu-devel] [PATCH COLO-Frame (Base) v20 16/17] docs: Add documentation for COLO feature
To: zhanghailiang, amit.shah@redhat.com, quintela@redhat.com
Cc: xiecl.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com, zhangchen.fnst@cn.fujitsu.com, qemu-devel@nongnu.org, dgilbert@redhat.com
Message-ID: <52ae0dae-e536-7362-e175-e3e36a130026@redhat.com>
In-Reply-To: <1475138797-9908-17-git-send-email-zhang.zhanghailiang@huawei.com>
References: <1475138797-9908-1-git-send-email-zhang.zhanghailiang@huawei.com> <1475138797-9908-17-git-send-email-zhang.zhanghailiang@huawei.com>

On 09/29/2016 03:46 AM, zhanghailiang wrote:
> Introduce the design of COLO, and how to test it.
>
> Signed-off-by: zhanghailiang
> ---
>  docs/COLO-FT.txt | 190 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 190 insertions(+)
>  create mode 100644 docs/COLO-FT.txt
>

> +
> +== Background ==
> +Virtual machine (VM) replication is a well known technique for providing
> +application-agnostic software-implemented hardware fault tolerance
> +"non-stop service".

Do you want s/tolerance/tolerance, also known as/ ?

> +== Architecture ==
> +
> +The architecture of COLO is shown in the bellow diagram.

s/bellow diagram/diagram below/

> +It consists of a pair of networked physical nodes:
> +The primary node running the PVM, and the secondary node running the SVM
> +to maintain a valid replica of the PVM.
> +PVM and SVM execute in parallel and generate output of response packets for
> +client requests according to the application semantics.
> +
> +The incoming packets from the client or external network are received by the
> +primary node, and then forwarded to the secondary node, so that Both the PVM

s/Both/both/

> +and the SVM are stimulated with the same requests.
> +
> +COLO receives the outbound packets from both the PVM and SVM and compares them
> +before allowing the output to be sent to clients.
> +
> +The SVM is qualified as a valid replica of the PVM, as long as it generates
> +identical responses to all client requests.
> +Once the differences in the outputs
> +are detected between the PVM and SVM, COLO withholds transmission of the
> +outbound packets until it has successfully synchronized the PVM state to the SVM.
> +
> +== Components introduction ==
> +
> +You can see there are several components in COLO's diagram of architecture.
> +Their functions are described as bellow.

s/as bellow/below/

> +
> +HeartBeat:
> +Runs on both the primary and secondary nodes, to periodically check platform
> +availability. When the primary node suffers a hardware fail-stop failure,
> +the heartbeat stops responding, the secondary node will trigger a failover
> +as soon as it determines the absence.
> +
> +COLO disk Manager:
> +When primary VM writes data into image, the colo disk manger captures this data
> +and send it to secondary VM’s which makes sure the context of secondary VM's

s/send/sends/

> +image is consentient with the context of primary VM 's image.

s/consentient/consistent/
s/VM 's/VM's/

> +For more details, please refer to docs/block-replication.txt.
> +
> +Checkpoint/Failover Controller:
> +Modifications of save/restore flow to realize continuous migration,
> +to make sure the state of VM in Secondary side always be consistent with VM in

s/always be/is always/

> +Primary side.
> +
> +COLO Proxy:
> +Delivers packets to Primary and Seconday, and then compare the responses from
> +both side. Then decide whether to start a checkpoint according to some rules.
> +
> +Note:
> + a. HeartBeat is not been realized, so you need to trigger failover process

s/is/has/
s/realized/implemented yet/

Is this note going to be stale once heartbeat is implemented?

> + by using 'x-colo-lost-heartbeat' command.
> + b. COLO proxy compents is work-in-process, it only support periodic checkpoint

s/compents is/components are a/

> + mode now, just as Micro-checkpointing.
> +
> +3. On Primary VM's QEMU monitor, issue command:
> +{'execute':'qmp_capabilities'}
> +{ 'execute': 'human-monitor-command',
> + 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xx.xx.xx.xx,file.port=8889,file.export=colo-disk0,node-name=node0'}}

It would be really nice if we could get this done through QMP
blockdev-add instead of HMP drive_add.

> +
> +Before issuing '{ "execute": "x-colo-lost-heartbeat" }' command, we have to
> +issue block related command to stop block replication.
> +Primary:
> + Remove the nbd child from the quorum:
> + { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}}
> + { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del blk-buddy0'}}
> + Note: there is no qmp command to remove the blockdev now

Don't we have x-blockdev-del?

> +
> +Secondary:
> + The primary host is down, so we should do the following thing:
> + { 'execute': 'nbd-server-stop' }
> +
> +== TODO ==
> +1. Support continuously VM replication.

s/continuously/continuous/

> +2. Support shared storage.
> +3. Develop the heartbeat part.
> +4. Reduce checkpoint VM’s downtime while do checkpoint.
s/do/doing/

>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org