From: Amit Shah <amit.shah@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Juan Quintela <quintela@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
qemu list <qemu-devel@nongnu.org>,
zhang.zhanghailiang@huawei.com
Subject: [Qemu-devel] [PULL 16/18] docs: Add documentation for COLO feature
Date: Sun, 30 Oct 2016 16:17:08 +0530
Message-ID: <1477824430-1460-17-git-send-email-amit.shah@redhat.com>
In-Reply-To: <1477824430-1460-1-git-send-email-amit.shah@redhat.com>
From: zhanghailiang <zhang.zhanghailiang@huawei.com>
Introduce the design of COLO, and how to test it.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
---
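Not part of the patch: a minimal sketch, assuming the QMP commands from
steps 2 and 3 of the test procedure in docs/COLO-FT.txt below, of how the two
command sequences could be generated as strict JSON. The helper names
(secondary_commands, primary_commands, to_qmp_lines) and the 192.0.2.10
address are hypothetical; the command names and arguments are copied from
the document. The sketch only builds the command text and does not connect
to QEMU.

```python
import json


def secondary_commands(nbd_host, nbd_port=8889, export="secondary-disk0"):
    """QMP commands for the secondary QEMU (step 2 of the test procedure)."""
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "nbd-server-start",
         "arguments": {"addr": {"type": "inet",
                                "data": {"host": nbd_host,
                                         "port": str(nbd_port)}}}},
        {"execute": "nbd-server-add",
         "arguments": {"device": export, "writable": True}},
    ]


def primary_commands(secondary_host, nbd_port=8889, migrate_port=8888,
                     export="secondary-disk0", parent="primary-disk0",
                     node="nbd_client0"):
    """QMP commands for the primary QEMU (step 3 of the test procedure)."""
    # The export name must match the device exported by the secondary.
    drive_add = (
        "drive_add -n buddy driver=replication,mode=primary,"
        "file.driver=nbd,file.host={host},file.port={port},"
        "file.export={export},node-name={node}"
    ).format(host=secondary_host, port=nbd_port, export=export, node=node)
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "human-monitor-command",
         "arguments": {"command-line": drive_add}},
        {"execute": "x-blockdev-change",
         "arguments": {"parent": parent, "node": node}},
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "x-colo", "state": True}]}},
        {"execute": "migrate",
         "arguments": {"uri": "tcp:{}:{}".format(secondary_host,
                                                 migrate_port)}},
    ]


def to_qmp_lines(commands):
    """Serialize each command as one line of strict JSON."""
    return [json.dumps(cmd) for cmd in commands]


if __name__ == "__main__":
    host = "192.0.2.10"  # hypothetical placeholder, not a real host
    print("# secondary, run first")
    print("\n".join(to_qmp_lines(secondary_commands(host))))
    print("# primary, run second")
    print("\n".join(to_qmp_lines(primary_commands(host))))
```

Generating both sequences from one set of parameters keeps the NBD export
name, port and node name consistent between the two sides, which is where
hand-typed commands tend to diverge.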
docs/COLO-FT.txt | 189 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 189 insertions(+)
create mode 100644 docs/COLO-FT.txt
diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
new file mode 100644
index 0000000..6282938
--- /dev/null
+++ b/docs/COLO-FT.txt
@@ -0,0 +1,189 @@
+COarse-grained LOck-stepping Virtual Machines for Non-stop Service
+----------------------------------------
+Copyright (c) 2016 Intel Corporation
+Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+Copyright (c) 2016 Fujitsu, Corp.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+This document gives an overview of COLO's design and how to use it.
+
+== Background ==
+Virtual machine (VM) replication is a well known technique for providing
+application-agnostic software-implemented hardware fault tolerance,
+also known as "non-stop service".
+
+COLO (COarse-grained LOck-stepping) is a high availability solution.
+The primary VM (PVM) and secondary VM (SVM) run in parallel. They receive the
+same requests from the client and generate responses in parallel too.
+If the response packets from PVM and SVM are identical, they are released
+immediately. Otherwise, a VM checkpoint (on demand) is conducted.
+
+== Architecture ==
+
+The architecture of COLO is shown in the diagram below.
+It consists of a pair of networked physical nodes:
+The primary node running the PVM, and the secondary node running the SVM
+to maintain a valid replica of the PVM.
+PVM and SVM execute in parallel and generate output of response packets for
+client requests according to the application semantics.
+
+The incoming packets from the client or external network are received by the
+primary node, and then forwarded to the secondary node, so that both the PVM
+and the SVM are stimulated with the same requests.
+
+COLO receives the outbound packets from both the PVM and SVM and compares them
+before allowing the output to be sent to clients.
+
+The SVM is qualified as a valid replica of the PVM, as long as it generates
+identical responses to all client requests. Once the differences in the outputs
+are detected between the PVM and SVM, COLO withholds transmission of the
+outbound packets until it has successfully synchronized the PVM state to the SVM.
+
+ Primary Node Secondary Node
+ +------------+ +-----------------------+ +------------------------+ +------------+
+ | | | HeartBeat |<----->| HeartBeat | | |
+ | Primary VM | +-----------|-----------+ +-----------|------------+ |Secondary VM|
+ | | | | | |
+ | | +-----------|-----------+ +-----------|------------+ | |
+ | | |QEMU +---v----+ | |QEMU +----v---+ | | |
+ | | | |Failover| | | |Failover| | | |
+ | | | +--------+ | | +--------+ | | |
+ | | | +---------------+ | | +---------------+ | | |
+ | | | | VM Checkpoint |-------------->| VM Checkpoint | | | |
+ | | | +---------------+ | | +---------------+ | | |
+ | | | | | | | |
+ |Requests<---------------------------^------------------------------------------>Requests|
+ |Responses----------------------\ /--|--------------\ /------------------------Responses|
+ | | | | | | | | | | | | |
+ | | | +-----------+ | | | | | | | +------------+ | | |
+ | | | | COLO disk | | | | | | | | | COLO disk | | | |
+ | | | | Manager |-|-|--|--------------|--|->| Manager | | | |
+ | | | +|----------+ | | | | | | | +-----------|+ | | |
+ | | | | | | | | | | | | | | |
+ +------------+ +--|------------|-|--|--+ +---|--|--------------|--+ +------------+
+ | | | | | | |
+ +-------------+ | +----------v-v--|--+ +---|--v-----------+ | +-------------+
+ | VM Monitor | | | COLO Proxy | | COLO Proxy | | | VM Monitor |
+ | | | |(compare packet) | | (adjust sequence)| | | |
+ +-------------+ | +----------|----^--+ +------------------+ | +-------------+
+ | | | |
+ +------------------|------------|----|--+ +---------------------|------------------+
+ | Kernel | | | | | Kernel | |
+ +------------------|------------|----|--+ +---------------------|------------------+
+ | | | |
+ +--------------v+ +--------v----|--+ +------------------+ +v-------------+
+ | Storage | |External Network| | External Network | | Storage |
+ +---------------+ +----------------+ +------------------+ +--------------+
+
+== Components introduction ==
+
+The architecture diagram above shows several components.
+Their functions are described below.
+
+HeartBeat:
+Runs on both the primary and secondary nodes to periodically check platform
+availability. When the primary node suffers a hardware fail-stop failure,
+the heartbeat stops responding, and the secondary node triggers a failover
+as soon as it detects the absence.
+
+COLO disk Manager:
+When the primary VM writes data to its image, the COLO disk manager captures
+the data and sends it to the secondary node, which keeps the content of the
+secondary VM's image consistent with the content of the primary VM's image.
+For more details, please refer to docs/block-replication.txt.
+
+Checkpoint/Failover Controller:
+Modifies the save/restore flow to realize continuous migration, so that
+the state of the VM on the secondary side is always consistent with the VM on
+the primary side.
+
+COLO Proxy:
+Delivers packets to the primary and secondary, and then compares the responses
+from both sides. It then decides whether to start a checkpoint according to
+some rules. Please refer to docs/colo-proxy.txt for more information.
+
+Note:
+HeartBeat has not been implemented yet, so you need to trigger the failover
+process manually with the 'x-colo-lost-heartbeat' command.
+
+== Test procedure ==
+1. Start up QEMU
+Primary:
+# qemu-kvm -enable-kvm -m 2048 -smp 2 -qmp stdio -vnc :7 -name primary \
+ -device piix3-usb-uhci \
+ -device usb-tablet -netdev tap,id=hn0,vhost=off \
+ -device virtio-net-pci,id=net-pci0,netdev=hn0 \
+ -drive if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
+ children.0.file.filename=1.raw,\
+ children.0.driver=raw -S
+Secondary:
+# qemu-kvm -enable-kvm -m 2048 -smp 2 -qmp stdio -vnc :7 -name secondary \
+ -device piix3-usb-uhci \
+ -device usb-tablet -netdev tap,id=hn0,vhost=off \
+ -device virtio-net-pci,id=net-pci0,netdev=hn0 \
+ -drive if=none,id=secondary-disk0,file.filename=1.raw,driver=raw,node-name=node0 \
+ -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\
+ file.driver=qcow2,top-id=active-disk0,\
+ file.file.filename=/mnt/ramfs/active_disk.img,\
+ file.backing.driver=qcow2,\
+ file.backing.file.filename=/mnt/ramfs/hidden_disk.img,\
+ file.backing.backing=secondary-disk0 \
+ -incoming tcp:0:8888
+
+2. On the Secondary VM's QEMU monitor, issue the commands:
+{'execute':'qmp_capabilities'}
+{ 'execute': 'nbd-server-start',
+ 'arguments': {'addr': {'type': 'inet', 'data': {'host': 'xx.xx.xx.xx', 'port': '8889'} } }
+}
+{'execute': 'nbd-server-add', 'arguments': {'device': 'secondary-disk0', 'writable': true } }
+
+Note:
+ a. The qmp commands nbd-server-start and nbd-server-add must be run
+ before running the qmp command migrate on the primary QEMU.
+ b. The active disk, hidden disk and NBD target must be the same
+ length.
+ c. It is better to put the active disk and hidden disk on a ramdisk.
+
+3. On the Primary VM's QEMU monitor, issue the commands:
+{'execute':'qmp_capabilities'}
+{ 'execute': 'human-monitor-command',
+ 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xx.xx.xx.xx,file.port=8889,file.export=secondary-disk0,node-name=nbd_client0'}}
+{ 'execute':'x-blockdev-change', 'arguments':{'parent': 'primary-disk0', 'node': 'nbd_client0' } }
+{ 'execute': 'migrate-set-capabilities',
+ 'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
+{ 'execute': 'migrate', 'arguments': {'uri': 'tcp:xx.xx.xx.xx:8888' } }
+
+ Note:
+ a. There should be only one NBD Client for each primary disk.
+ b. xx.xx.xx.xx is the secondary physical machine's hostname or IP.
+ c. These qmp commands must be run after the qmp commands in step 2 have
+ been run on the secondary QEMU.
+
+4. After the above steps, whenever you make changes to the PVM, the SVM will be synced.
+You can issue command '{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }'
+to change the checkpoint period (the value is in milliseconds).
+
+5. Failover test
+You can kill the Primary VM and run 'x_colo_lost_heartbeat' in the Secondary VM's
+monitor at the same time; the SVM will then fail over and the client will not
+detect the change.
+
+Before issuing the '{ "execute": "x-colo-lost-heartbeat" }' command, we have to
+issue block-related commands to stop block replication.
+Primary:
+ Remove the NBD child from the quorum:
+ { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'primary-disk0', 'child': 'children.1'}}
+ { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del blk-buddy0'}}
+ Note: there is currently no qmp command to remove the blockdev.
+
+Secondary:
+ The primary host is down, so we should do the following:
+ { 'execute': 'nbd-server-stop' }
+
+== TODO ==
+1. Support continuous VM replication.
+2. Support shared storage.
+3. Develop the heartbeat part.
+4. Reduce the VM's downtime while doing a checkpoint.
--
2.7.4