From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49324)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <zhang.zhanghailiang@huawei.com>) id 1bso1H-0005qO-GQ
	for qemu-devel@nongnu.org; Sat, 08 Oct 2016 05:34:16 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <zhang.zhanghailiang@huawei.com>) id 1bso1C-0001wn-Hh
	for qemu-devel@nongnu.org; Sat, 08 Oct 2016 05:34:15 -0400
Received: from szxga03-in.huawei.com ([119.145.14.66]:2418)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <zhang.zhanghailiang@huawei.com>) id 1bso1B-0001oN-J9
	for qemu-devel@nongnu.org; Sat, 08 Oct 2016 05:34:10 -0400
References: <1475138797-9908-1-git-send-email-zhang.zhanghailiang@huawei.com>
	<1475138797-9908-17-git-send-email-zhang.zhanghailiang@huawei.com>
	<52ae0dae-e536-7362-e175-e3e36a130026@redhat.com>
From: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Message-ID: <57F8BD25.8020809@huawei.com>
Date: Sat, 8 Oct 2016 17:32:21 +0800
MIME-Version: 1.0
In-Reply-To: <52ae0dae-e536-7362-e175-e3e36a130026@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH COLO-Frame (Base) v20 16/17] docs: Add
 documentation for COLO feature
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Eric Blake <eblake@redhat.com>, amit.shah@redhat.com, quintela@redhat.com
Cc: xiecl.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com, zhangchen.fnst@cn.fujitsu.com, qemu-devel@nongnu.org, dgilbert@redhat.com

On 2016/10/5 21:37, Eric Blake wrote:
> On 09/29/2016 03:46 AM, zhanghailiang wrote:
>> Introduce the design of COLO, and how to test it.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> ---
>>   docs/COLO-FT.txt | 190 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 190 insertions(+)
>>   create mode 100644 docs/COLO-FT.txt
>>
>
>> +
>> +== Background ==
>> +Virtual machine (VM) replication is a well known technique for providing
>> +application-agnostic software-implemented hardware fault tolerance
>> +"non-stop service".
>
> Do you want s/tolerance/tolerance, also known as/ ?
>

Yes, that is more appropriate.

>
>> +== Architecture ==
>> +
>> +The architecture of COLO is shown in the bellow diagram.
>
> s/bellow diagram/diagram below/
>

>> +It consists of a pair of networked physical nodes:
>> +The primary node running the PVM, and the secondary node running the SVM
>> +to maintain a valid replica of the PVM.
>> +PVM and SVM execute in parallel and generate output of response packets for
>> +client requests according to the application semantics.
>> +
>> +The incoming packets from the client or external network are received by the
>> +primary node, and then forwarded to the secondary node, so that Both the PVM
>
> s/Both/both/
>

>> +and the SVM are stimulated with the same requests.
>> +
>> +COLO receives the outbound packets from both the PVM and SVM and compares them
>> +before allowing the output to be sent to clients.
>> +
>> +The SVM is qualified as a valid replica of the PVM, as long as it generates
>> +identical responses to all client requests. Once the differences in the outputs
>> +are detected between the PVM and SVM, COLO withholds transmission of the
>> +outbound packets until it has successfully synchronized the PVM state to the SVM.
>> +
>
>> +== Components introduction ==
>> +
>> +You can see there are several components in COLO's diagram of architecture.
>> +Their functions are described as bellow.
>
> s/as bellow/below/
>

>> +
>> +HeartBeat:
>> +Runs on both the primary and secondary nodes, to periodically check platform
>> +availability. When the primary node suffers a hardware fail-stop failure,
>> +the heartbeat stops responding, the secondary node will trigger a failover
>> +as soon as it determines the absence.
>> +
>> +COLO disk Manager:
>> +When primary VM writes data into image, the colo disk manger captures this data
>> +and send it to secondary VM’s which makes sure the context of secondary VM's
>
> s/send/sends/
>

>> +image is consentient with the context of primary VM 's image.
>
> s/consentient/consistent/
> s/VM 's/VM's/
>

>> +For more details, please refer to docs/block-replication.txt.
>> +
>> +Checkpoint/Failover Controller:
>> +Modifications of save/restore flow to realize continuous migration,
>> +to make sure the state of VM in Secondary side always be consistent with VM in
>
> s/always be/is always/
>

>> +Primary side.
>> +
>> +COLO Proxy:
>> +Delivers packets to Primary and Seconday, and then compare the responses from
>> +both side. Then decide whether to start a checkpoint according to some rules.
>> +
>> +Note:
>> + a. HeartBeat is not been realized, so you need to trigger failover process
>
> s/is/has/
> s/realized/implemented yet/
>
> Is this note going to be stale once heartbeat is implemented?
>

Yes, it will be, but we're not sure whether it is suitable to implement heartbeat in QEMU.

>> +    by using 'x-colo-lost-heartbeat' command.
>> + b. COLO proxy compents is work-in-process, it only support periodic checkpoint
>
> s/compents is/components are a/
>

>> +    mode now, just as Micro-checkpointing.
>> +
>
>> +3. On Primary VM's QEMU monitor, issue command:
>> +{'execute':'qmp_capabilities'}
>> +{ 'execute': 'human-monitor-command',
>> +  'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xx.xx.xx.xx,file.port=8889,file.export=colo-disk0,node-name=node0'}}
>
> It would be really nice if we could get this done through QMP
> blockdev-add instead of HMP drive_add.
>

You are right, but blockdev-add doesn't support NBD drives upstream yet.
I saw that Max has sent a patch set to support it. I will update this after his
patches have been merged.

>> +
>> +Before issuing '{ "execute": "x-colo-lost-heartbeat" }' command, we have to
>> +issue block related command to stop block replication.
>> +Primary:
>> +  Remove the nbd child from the quorum:
>> +  { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}}
>> +  { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del blk-buddy0'}}
>> +  Note: there is no qmp command to remove the blockdev now
>
> Don't we have x-blockdev-del?
>

Yes, we can use this command. I'll fix it in the next version.

>> +
>> +Secondary:
>> +  The primary host is down, so we should do the following thing:
>> +  { 'execute': 'nbd-server-stop' }
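
To make the ordering quoted above explicit, here is a minimal sketch that only builds the QMP messages for each side and prints them in order: block-related commands first, then 'x-colo-lost-heartbeat'. The command names and arguments are copied from the patch text; the helper name failover_commands() is hypothetical, and nothing is sent to a real monitor. The primary side still uses drive_del as in this version of the patch, since the x-blockdev-del form is not shown in this thread.

```python
import json


def failover_commands(side):
    """Return the QMP commands, in order, for one side of a COLO failover.

    Hypothetical helper for illustration only; the command payloads are
    taken verbatim from the quoted docs/COLO-FT.txt text.
    """
    if side == "primary":
        # Stop block replication: drop the nbd child from the quorum,
        # then delete the now-unused drive.
        block_cmds = [
            {"execute": "x-blockdev-change",
             "arguments": {"parent": "colo-disk0", "child": "children.1"}},
            {"execute": "human-monitor-command",
             "arguments": {"command-line": "drive_del blk-buddy0"}},
        ]
    elif side == "secondary":
        # The primary host is down, so only the NBD server needs stopping.
        block_cmds = [{"execute": "nbd-server-stop"}]
    else:
        raise ValueError("side must be 'primary' or 'secondary'")

    # The failover trigger always comes after the block-related commands.
    return block_cmds + [{"execute": "x-colo-lost-heartbeat"}]


if __name__ == "__main__":
    for side in ("primary", "secondary"):
        for cmd in failover_commands(side):
            print(side, json.dumps(cmd))
```

Each printed line is one JSON message that would be issued on that side's QMP monitor, in the order shown.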
>> +
>> +== TODO ==
>> +1. Support continuously VM replication.
>
> s/continuously/continuous/
>
>> +2. Support shared storage.
>> +3. Develop the heartbeat part.
>> +4. Reduce checkpoint VM’s downtime while do checkpoint.
>
> s/do/doing/
>
>>

All the above typos and grammatical mistakes will be fixed in the next version, thanks!

Hailiang