From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49324)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bso1H-0005qO-GQ for qemu-devel@nongnu.org; Sat, 08 Oct 2016 05:34:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1bso1C-0001wn-Hh
 for qemu-devel@nongnu.org; Sat, 08 Oct 2016 05:34:15 -0400
Received: from szxga03-in.huawei.com ([119.145.14.66]:2418)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bso1B-0001oN-J9 for qemu-devel@nongnu.org; Sat, 08 Oct 2016 05:34:10 -0400
References: <1475138797-9908-1-git-send-email-zhang.zhanghailiang@huawei.com>
 <1475138797-9908-17-git-send-email-zhang.zhanghailiang@huawei.com>
 <52ae0dae-e536-7362-e175-e3e36a130026@redhat.com>
From: Hailiang Zhang
Message-ID: <57F8BD25.8020809@huawei.com>
Date: Sat, 8 Oct 2016 17:32:21 +0800
MIME-Version: 1.0
In-Reply-To: <52ae0dae-e536-7362-e175-e3e36a130026@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH COLO-Frame (Base) v20 16/17] docs: Add documentation for COLO feature
To: Eric Blake, amit.shah@redhat.com, quintela@redhat.com
Cc: xiecl.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com,
 zhangchen.fnst@cn.fujitsu.com, qemu-devel@nongnu.org, dgilbert@redhat.com

On 2016/10/5 21:37, Eric Blake wrote:
> On 09/29/2016 03:46 AM, zhanghailiang wrote:
>> Introduce the design of COLO, and how to test it.
>>
>> Signed-off-by: zhanghailiang
>> ---
>>  docs/COLO-FT.txt | 190 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 190 insertions(+)
>>  create mode 100644 docs/COLO-FT.txt
>>
>
>> +
>> +== Background ==
>> +Virtual machine (VM) replication is a well known technique for providing
>> +application-agnostic software-implemented hardware fault tolerance
>> +"non-stop service".
>
> Do you want s/tolerance/tolerance, also known as/ ?
>

Yes, that is more appropriate.

>
>> +== Architecture ==
>> +
>> +The architecture of COLO is shown in the bellow diagram.
>
> s/bellow diagram/diagram below/
>
>> +It consists of a pair of networked physical nodes:
>> +The primary node running the PVM, and the secondary node running the SVM
>> +to maintain a valid replica of the PVM.
>> +PVM and SVM execute in parallel and generate output of response packets for
>> +client requests according to the application semantics.
>> +
>> +The incoming packets from the client or external network are received by the
>> +primary node, and then forwarded to the secondary node, so that Both the PVM
>
> s/Both/both/
>
>> +and the SVM are stimulated with the same requests.
>> +
>> +COLO receives the outbound packets from both the PVM and SVM and compares them
>> +before allowing the output to be sent to clients.
>> +
>> +The SVM is qualified as a valid replica of the PVM, as long as it generates
>> +identical responses to all client requests. Once the differences in the outputs
>> +are detected between the PVM and SVM, COLO withholds transmission of the
>> +outbound packets until it has successfully synchronized the PVM state to the SVM.
>> +
>
>> +== Components introduction ==
>> +
>> +You can see there are several components in COLO's diagram of architecture.
>> +Their functions are described as bellow.
>
> s/as bellow/below/
>
>> +
>> +HeartBeat:
>> +Runs on both the primary and secondary nodes, to periodically check platform
>> +availability. When the primary node suffers a hardware fail-stop failure,
>> +the heartbeat stops responding, the secondary node will trigger a failover
>> +as soon as it determines the absence.
>> +
>> +COLO disk Manager:
>> +When primary VM writes data into image, the colo disk manger captures this data
>> +and send it to secondary VM’s which makes sure the context of secondary VM's
>
> s/send/sends/
>
>> +image is consentient with the context of primary VM 's image.
>
> s/consentient/consistent/
> s/VM 's/VM's/
>
>> +For more details, please refer to docs/block-replication.txt.
>> +
>> +Checkpoint/Failover Controller:
>> +Modifications of save/restore flow to realize continuous migration,
>> +to make sure the state of VM in Secondary side always be consistent with VM in
>
> s/always be/is always/
>
>> +Primary side.
>> +
>> +COLO Proxy:
>> +Delivers packets to Primary and Seconday, and then compare the responses from
>> +both side. Then decide whether to start a checkpoint according to some rules.
>> +
>> +Note:
>> + a. HeartBeat is not been realized, so you need to trigger failover process
>
> s/is/has/
> s/realized/implemented yet/
>
> Is this note going to be stale once heartbeat is implemented?
>

Yes, but we're not sure whether it is suitable to implement it in QEMU.

>> + by using 'x-colo-lost-heartbeat' command.
>> + b. COLO proxy compents is work-in-process, it only support periodic checkpoint
>
> s/compents is/components are a/
>
>> + mode now, just as Micro-checkpointing.
>> +
>
>> +3. On Primary VM's QEMU monitor, issue command:
>> +{'execute':'qmp_capabilities'}
>> +{ 'execute': 'human-monitor-command',
>> + 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xx.xx.xx.xx,file.port=8889,file.export=colo-disk0,node-name=node0'}}
>
> It would be really nice if we could get this done through QMP
> blockdev-add instead of HMP drive_add.
>

You are right, but this command doesn't support NBD drives upstream yet.
I saw that Max has sent a patch set to support it. I will update this
after his patches have been merged.

>> +
>> +Before issuing '{ "execute": "x-colo-lost-heartbeat" }' command, we have to
>> +issue block related command to stop block replication.
>> +Primary:
>> + Remove the nbd child from the quorum:
>> + { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}}
>> + { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del blk-buddy0'}}
>> + Note: there is no qmp command to remove the blockdev now
>
> Don't we have x-blockdev-del?
>

Yes, we can use this command; I'll fix it in the next version.

>> +
>> +Secondary:
>> + The primary host is down, so we should do the following thing:
>> + { 'execute': 'nbd-server-stop' }
>> +
>> +== TODO ==
>> +1. Support continuously VM replication.
>
> s/continuously/continuous/
>
>> +2. Support shared storage.
>> +3. Develop the heartbeat part.
>> +4. Reduce checkpoint VM’s downtime while do checkpoint.
>
> s/do/doing/
>
>>

All of the above typos and grammatical mistakes will be fixed in the next
version, thanks!

Hailiang

>
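For reference, the primary-side failover sequence quoted above can be
scripted. The following is only a rough Python sketch and is not part of
the patch: the helper names primary_failover_commands and to_wire are made
up for illustration, the IDs ('colo-disk0', 'children.1', 'blk-buddy0') are
simply the ones used in the quoted text, and it still uses drive_del rather
than x-blockdev-del. It only builds the QMP messages; connecting to the
monitor socket and reading the replies is left out.

```python
import json


def primary_failover_commands(parent="colo-disk0", child="children.1",
                              drive_id="blk-buddy0", negotiate=True):
    """Build the QMP messages for failover on the primary side, in order.

    All default IDs are taken from the quoted documentation and are
    assumptions about the local setup; adjust them to match your VM.
    """
    commands = []
    if negotiate:
        # Capability negotiation is required once per QMP connection,
        # before any other command is accepted.
        commands.append({"execute": "qmp_capabilities"})
    # Stop block replication first: detach the NBD child from the quorum.
    commands.append({"execute": "x-blockdev-change",
                     "arguments": {"parent": parent, "child": child}})
    # Then delete the drive through the HMP passthrough command.
    commands.append({"execute": "human-monitor-command",
                     "arguments": {"command-line": f"drive_del {drive_id}"}})
    # Only after block replication is stopped, trigger the failover itself.
    commands.append({"execute": "x-colo-lost-heartbeat"})
    return commands


def to_wire(commands):
    """Serialize the messages as one JSON object per line."""
    return "".join(json.dumps(cmd) + "\n" for cmd in commands)


if __name__ == "__main__":
    print(to_wire(primary_failover_commands()), end="")
```

Running the script prints one JSON command per line in the order the
document requires: block replication is stopped first, and
x-colo-lost-heartbeat is issued last.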