* [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test
@ 2026-02-03 10:15 Lukas Straub
2026-02-03 10:15 ` [PATCH v5 01/16] MAINTAINERS: Add myself as maintainer for COLO migration framework Lukas Straub
` (15 more replies)
0 siblings, 16 replies; 33+ messages in thread
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub, Juan Quintela
Hello everyone,
This series contains some cleanups for COLO migration, adds multifd support
for it and adds a COLO migration unit test.
Regards,
Lukas
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
Changes in v5:
- Remove unused imports from multifd-colo.c
- Mention the checkpoint overhead of reset to the Q35 fix
- Link to v4: https://lore.kernel.org/qemu-devel/20260130-colo_unit_test_multifd-v4-0-7115ab6f0e77@web.de
Changes in v4:
- Add cleanup patches to remove migration_incoming_colo_enabled() and MIG_CMD_ENABLE_COLO
- Add more comments to the colo unit test
- Call colo_release_ram_cache() after multifd threads terminate
- Link to v3: https://lore.kernel.org/qemu-devel/20260125-colo_unit_test_multifd-v3-0-ae926ccd8eae@web.de
Changes in v3:
- Address Peter's review comments.
- Fix COLO with Q35 machine
- Link to v2: https://lore.kernel.org/qemu-devel/20260117-colo_unit_test_multifd-v2-0-ab521777fa51@web.de
Changes in v2:
- Fix review comments
- Hide stderr in colo migration test since the logged errors are expected
- Add benchmarking data for multifd
- Add myself as maintainer for COLO migration framework
- Link to v1: https://lore.kernel.org/qemu-devel/20251230-colo_unit_test_multifd-v1-0-f9734bc74c71@web.de
---
Lukas Straub (16):
MAINTAINERS: Add myself as maintainer for COLO migration framework
MAINTAINERS: Remove Hailiang Zhang from COLO migration framework
colo: Setup ram cache in normal migration path
colo: Replace migration_incoming_colo_enabled() with migrate_colo()
colo: Remove ENABLE_COLO loadvm command functions
colo: Don't send ENABLE_COLO command
ram: Remove colo special-casing
Move ram state receive into multifd_ram_state_recv()
multifd: Add COLO support
Call colo_release_ram_cache() after multifd threads terminate
colo: Fix crash during device vmstate load
migration-test: Add COLO migration unit test
Convert colo main documentation to restructuredText
qemu-colo.rst: Miscellaneous changes
qemu-colo.rst: Add my copyright
qemu-colo.rst: Simplify the block replication setup
MAINTAINERS | 6 +-
docs/COLO-FT.txt | 334 ----------------------------------
docs/system/index.rst | 1 +
docs/system/qemu-colo.rst | 362 +++++++++++++++++++++++++++++++++++++
include/migration/colo.h | 3 -
migration/colo.c | 11 +-
migration/meson.build | 2 +-
migration/migration.c | 61 ++-----
migration/multifd-colo.c | 44 +++++
migration/multifd-colo.h | 26 +++
migration/multifd-nocomp.c | 10 +-
migration/multifd.c | 19 +-
migration/multifd.h | 5 +-
migration/ram.c | 12 +-
migration/savevm.c | 30 +--
migration/savevm.h | 1 -
migration/trace-events | 1 -
tests/qtest/meson.build | 7 +-
tests/qtest/migration-test.c | 1 +
tests/qtest/migration/colo-tests.c | 206 +++++++++++++++++++++
tests/qtest/migration/framework.h | 5 +
21 files changed, 720 insertions(+), 427 deletions(-)
---
base-commit: b377abc220fc53e9cab2aac3c73fc20be6d85eea
change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46
Best regards,
--
Lukas Straub <lukasstraub2@web.de>
* [PATCH v5 01/16] MAINTAINERS: Add myself as maintainer for COLO migration framework
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
I am ready to maintain it.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 9b7ed4fccb1dd0572abbb52ecdc9f0b217fea13a..a9b29cbd528633b25adb6ed7ab2162a2d11b179f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3849,6 +3849,7 @@ F: qapi/yank.json
COLO Framework
M: Hailiang Zhang <zhanghailiang@xfusion.com>
+M: Lukas Straub <lukasstraub2@web.de>
S: Maintained
F: migration/colo*
F: include/migration/colo.h
--
2.39.5
* [PATCH v5 02/16] MAINTAINERS: Remove Hailiang Zhang from COLO migration framework
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
His last email to the mailing list is from December 2021:
https://lore.kernel.org/qemu-devel/20211214075424.6920-1-zhanghailiang@xfusion.com/
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 1 -
1 file changed, 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index a9b29cbd528633b25adb6ed7ab2162a2d11b179f..ea170280580af6e3ebc586c3cb9bf6e144b30c11 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3848,7 +3848,6 @@ F: include/qemu/yank.h
F: qapi/yank.json
COLO Framework
-M: Hailiang Zhang <zhanghailiang@xfusion.com>
M: Lukas Straub <lukasstraub2@web.de>
S: Maintained
F: migration/colo*
--
2.39.5
* [PATCH v5 03/16] colo: Setup ram cache in normal migration path
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
Since commit 121ccedc2b ("migration: block incoming colo when capability
is disabled"), the x-colo capability must always be enabled on the incoming
side. So migration_incoming_colo_enabled() and migrate_colo() are equivalent,
with migrate_colo() being easier to reason about since it is always true
during the whole migration.
Use migrate_colo() to initialize the ram cache in the normal migration path.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
migration/migration.c | 18 ++++++++++++++----
migration/savevm.c | 14 +-------------
2 files changed, 15 insertions(+), 17 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index b103a82fc0b83009d01d238ff16c0a542d83509f..a73d842ad8b060dc84273ade36ef7dc8b87421f3 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -630,10 +630,6 @@ int migration_incoming_enable_colo(Error **errp)
return -EINVAL;
}
- if (ram_block_discard_disable(true)) {
- error_setg(errp, "COLO: cannot disable RAM discard");
- return -EBUSY;
- }
migration_colo_enabled = true;
return 0;
}
@@ -770,6 +766,20 @@ process_incoming_migration_co(void *opaque)
assert(mis->from_src_file);
+ if (migrate_colo()) {
+ if (ram_block_discard_disable(true)) {
+ error_setg(&local_err, "COLO: cannot disable RAM discard");
+ goto fail;
+ }
+
+ ret = colo_init_ram_cache(&local_err);
+ if (ret) {
+ error_prepend(&local_err, "failed to init colo RAM cache: %d: ",
+ ret);
+ goto fail;
+ }
+ }
+
mis->largest_page_size = qemu_ram_pagesize_largest();
postcopy_state_set(POSTCOPY_INCOMING_NONE);
migrate_set_state(&mis->state, MIGRATION_STATUS_SETUP,
diff --git a/migration/savevm.c b/migration/savevm.c
index 3dc812a7bbb4e8f5321114c9919d4619798fed5e..0353ac2d0de819b6547a1f771e6a4c3b8fb1e4ef 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2407,19 +2407,7 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis,
Error **errp)
{
ERRP_GUARD();
- int ret;
-
- ret = migration_incoming_enable_colo(errp);
- if (ret < 0) {
- return ret;
- }
-
- ret = colo_init_ram_cache(errp);
- if (ret) {
- error_prepend(errp, "failed to init colo RAM cache: %d: ", ret);
- migration_incoming_disable_colo();
- }
- return ret;
+ return migration_incoming_enable_colo(errp);
}
static int loadvm_postcopy_handle_switchover_start(Error **errp)
--
2.39.5
* [PATCH v5 04/16] colo: Replace migration_incoming_colo_enabled() with migrate_colo()
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
Since commit 121ccedc2b ("migration: block incoming colo when capability
is disabled"), the x-colo capability must always be enabled on the incoming
side. So migration_incoming_colo_enabled() and migrate_colo() are equivalent,
with migrate_colo() being easier to reason about since it is always true
during the whole migration.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
include/migration/colo.h | 1 -
migration/colo.c | 2 +-
migration/migration.c | 9 ++-------
migration/ram.c | 2 +-
4 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/include/migration/colo.h b/include/migration/colo.h
index d4fe422e4d335d3bef4f860f56400fcd73287a0e..2496a968cc1ce709f706c0efe57e4f765f163d3c 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -27,7 +27,6 @@ bool migration_in_colo_state(void);
/* loadvm */
int migration_incoming_enable_colo(Error **errp);
void migration_incoming_disable_colo(void);
-bool migration_incoming_colo_enabled(void);
bool migration_incoming_in_colo_state(void);
COLOMode get_colo_mode(void);
diff --git a/migration/colo.c b/migration/colo.c
index db783f6fa77500386d923dd97e522883027e71d8..8dfd39b035c48590fcebeb20459f01fb37fb67d1 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -933,7 +933,7 @@ void coroutine_fn colo_incoming_co(void)
QemuThread th;
assert(bql_locked());
- assert(migration_incoming_colo_enabled());
+ assert(migrate_colo());
qemu_thread_create(&th, MIGRATION_THREAD_DST_COLO,
colo_process_incoming_thread,
diff --git a/migration/migration.c b/migration/migration.c
index a73d842ad8b060dc84273ade36ef7dc8b87421f3..bc8ce64ff5000b0eb634a20b22e5f3e3289b9707 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -605,11 +605,6 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis,
}
static bool migration_colo_enabled;
-bool migration_incoming_colo_enabled(void)
-{
- return migration_colo_enabled;
-}
-
void migration_incoming_disable_colo(void)
{
ram_block_discard_disable(false);
@@ -739,7 +734,7 @@ static void process_incoming_migration_bh(void *opaque)
} else {
runstate_set(RUN_STATE_PAUSED);
}
- } else if (migration_incoming_colo_enabled()) {
+ } else if (migrate_colo()) {
migration_incoming_disable_colo();
vm_start();
} else {
@@ -807,7 +802,7 @@ process_incoming_migration_co(void *opaque)
goto fail;
}
- if (migration_incoming_colo_enabled()) {
+ if (migrate_colo()) {
/* yield until COLO exit */
colo_incoming_co();
}
diff --git a/migration/ram.c b/migration/ram.c
index fc7ece2c1a10f34aa5a91f58cbe42ea418d7c078..aebf77aa0b861e00516d6f1090aebefdd0d97e54 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4370,7 +4370,7 @@ static int ram_load_precopy(QEMUFile *f)
* speed of the migration, but it obviously reduce the downtime of
* back-up all SVM'S memory in COLO preparing stage.
*/
- if (migration_incoming_colo_enabled()) {
+ if (migrate_colo()) {
if (migration_incoming_in_colo_state()) {
/* In COLO stage, put all pages into cache temporarily */
host = colo_cache_from_block_offset(block, addr, true);
--
2.39.5
* [PATCH v5 05/16] colo: Remove ENABLE_COLO loadvm command functions
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
These functions are no longer needed now that the x-colo capability is
required on the incoming side.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
include/migration/colo.h | 2 --
migration/migration.c | 26 --------------------------
migration/savevm.c | 10 ----------
3 files changed, 38 deletions(-)
diff --git a/include/migration/colo.h b/include/migration/colo.h
index 2496a968cc1ce709f706c0efe57e4f765f163d3c..8f94054a10760d0f2598f080643f45f9944cf051 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -25,8 +25,6 @@ void migrate_start_colo_process(MigrationState *s);
bool migration_in_colo_state(void);
/* loadvm */
-int migration_incoming_enable_colo(Error **errp);
-void migration_incoming_disable_colo(void);
bool migration_incoming_in_colo_state(void);
COLOMode get_colo_mode(void);
diff --git a/migration/migration.c b/migration/migration.c
index bc8ce64ff5000b0eb634a20b22e5f3e3289b9707..3f3fc5276bb067ae1960e4b675b33208ad641b23 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -604,31 +604,6 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis,
return migrate_send_rp_message_req_pages(mis, rb, start);
}
-static bool migration_colo_enabled;
-void migration_incoming_disable_colo(void)
-{
- ram_block_discard_disable(false);
- migration_colo_enabled = false;
-}
-
-int migration_incoming_enable_colo(Error **errp)
-{
-#ifndef CONFIG_REPLICATION
- error_setg(errp, "ENABLE_COLO command come in migration stream, but the "
- "replication module is not built in");
- return -ENOTSUP;
-#endif
-
- if (!migrate_colo()) {
- error_setg(errp, "ENABLE_COLO command come in migration stream"
- ", but x-colo capability is not set");
- return -EINVAL;
- }
-
- migration_colo_enabled = true;
- return 0;
-}
-
void migrate_add_address(SocketAddress *address)
{
MigrationIncomingState *mis = migration_incoming_get_current();
@@ -735,7 +710,6 @@ static void process_incoming_migration_bh(void *opaque)
runstate_set(RUN_STATE_PAUSED);
}
} else if (migrate_colo()) {
- migration_incoming_disable_colo();
vm_start();
} else {
runstate_set(global_state_get_runstate());
diff --git a/migration/savevm.c b/migration/savevm.c
index 0353ac2d0de819b6547a1f771e6a4c3b8fb1e4ef..413688b75f4bee6cb10878eb51886cf6ba14872d 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2403,13 +2403,6 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis,
return 0;
}
-static int loadvm_process_enable_colo(MigrationIncomingState *mis,
- Error **errp)
-{
- ERRP_GUARD();
- return migration_incoming_enable_colo(errp);
-}
-
static int loadvm_postcopy_handle_switchover_start(Error **errp)
{
SaveStateEntry *se;
@@ -2528,9 +2521,6 @@ static int loadvm_process_command(QEMUFile *f, Error **errp)
case MIG_CMD_RECV_BITMAP:
return loadvm_handle_recv_bitmap(mis, len, errp);
- case MIG_CMD_ENABLE_COLO:
- return loadvm_process_enable_colo(mis, errp);
-
case MIG_CMD_SWITCHOVER_START:
return loadvm_postcopy_handle_switchover_start(errp);
}
--
2.39.5
* [PATCH v5 06/16] colo: Don't send ENABLE_COLO command
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
We only support COLO with the same version on both sides so this is
not needed anymore.
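For illustration only: the diff below keeps the enum slot as
MIG_CMD_UNUSED_0 instead of deleting it. A plausible reason (an assumption,
it is not stated in this patch) is that the commands following it keep their
numeric values. The sketch uses shortened, hypothetical DEMO_* command lists,
not the real enum qemu_vm_cmd:

```c
#include <assert.h>

/* Hypothetical, shortened command lists; not the real qemu_vm_cmd enum. */
enum demo_cmd_old {
    DEMO_OLD_PACKAGED,
    DEMO_OLD_ENABLE_COLO,
    DEMO_OLD_POSTCOPY_RESUME,
};

/* Slot kept as a placeholder: later values do not move. */
enum demo_cmd_placeholder {
    DEMO_KEEP_PACKAGED,
    DEMO_KEEP_UNUSED_0,
    DEMO_KEEP_POSTCOPY_RESUME,
};

/* Slot deleted outright: every later value shifts down by one. */
enum demo_cmd_deleted {
    DEMO_DEL_PACKAGED,
    DEMO_DEL_POSTCOPY_RESUME,
};

/* Returns 1 if the two numeric command values are identical. */
static int demo_same_value(int old_value, int new_value)
{
    return old_value == new_value;
}

/* Checks both layouts against the old numbering. */
static void demo_check_layouts(void)
{
    /* With the placeholder, the later command keeps its value. */
    assert(demo_same_value(DEMO_OLD_POSTCOPY_RESUME,
                           DEMO_KEEP_POSTCOPY_RESUME));
    /* Without it, the later command would get a different value. */
    assert(!demo_same_value(DEMO_OLD_POSTCOPY_RESUME,
                            DEMO_DEL_POSTCOPY_RESUME));
}
```

Calling demo_check_layouts() from a main() passes silently; swapping the two
layouts in the assertions would trip them.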
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
migration/migration.c | 5 -----
migration/savevm.c | 8 +-------
migration/savevm.h | 1 -
migration/trace-events | 1 -
4 files changed, 1 insertion(+), 14 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 3f3fc5276bb067ae1960e4b675b33208ad641b23..5515be1bf305b40ba0b590136df18a53451872c5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3496,11 +3496,6 @@ static void *migration_thread(void *opaque)
qemu_savevm_send_postcopy_advise(s->to_dst_file);
}
- if (migrate_colo()) {
- /* Notify migration destination that we enable COLO */
- qemu_savevm_send_colo_enable(s->to_dst_file);
- }
-
if (migrate_auto_converge()) {
/* Start RAMBlock dirty bitmap sync timer */
cpu_throttle_dirty_sync_timer(true);
diff --git a/migration/savevm.c b/migration/savevm.c
index 413688b75f4bee6cb10878eb51886cf6ba14872d..a3af09616a7bd22194ffba3cfb7cc4cf15fc88e0 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -90,7 +90,7 @@ enum qemu_vm_cmd {
were previously sent during
precopy but are dirty. */
MIG_CMD_PACKAGED, /* Send a wrapped stream within this stream */
- MIG_CMD_ENABLE_COLO, /* Enable COLO */
+ MIG_CMD_UNUSED_0, /* Unused since 11.0 */
MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */
MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */
MIG_CMD_SWITCHOVER_START, /* Switchover start notification */
@@ -1092,12 +1092,6 @@ static void qemu_savevm_command_send(QEMUFile *f,
qemu_fflush(f);
}
-void qemu_savevm_send_colo_enable(QEMUFile *f)
-{
- trace_savevm_send_colo_enable();
- qemu_savevm_command_send(f, MIG_CMD_ENABLE_COLO, 0, NULL);
-}
-
void qemu_savevm_send_ping(QEMUFile *f, uint32_t value)
{
uint32_t buf;
diff --git a/migration/savevm.h b/migration/savevm.h
index 125a2507b7279412bcb0745b95a774874c31c54f..0a1e5bfd1ca125565a4c90c6f31b2f8c94404117 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -62,7 +62,6 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
uint16_t len,
uint64_t *start_list,
uint64_t *length_list);
-void qemu_savevm_send_colo_enable(QEMUFile *f);
void qemu_savevm_live_state(QEMUFile *f);
int qemu_save_device_state(QEMUFile *f);
diff --git a/migration/trace-events b/migration/trace-events
index 91d7506634c9f110e8f0b5f9183728058fe6542a..cfd4d58a0f82ec299ca9e8a9260dd3c3a210cece 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -37,7 +37,6 @@ savevm_send_ping(uint32_t val) "0x%x"
savevm_send_postcopy_listen(void) ""
savevm_send_postcopy_run(void) ""
savevm_send_postcopy_resume(void) ""
-savevm_send_colo_enable(void) ""
savevm_send_recv_bitmap(char *name) "%s"
savevm_send_switchover_start(void) ""
savevm_state_setup(void) ""
--
2.39.5
* [PATCH v5 07/16] ram: Remove colo special-casing
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
We only enter COLO state after the precopy migration has finished,
so this if-branch is always taken.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
migration/ram.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index aebf77aa0b861e00516d6f1090aebefdd0d97e54..979751f61b30d6c4b878866b5011507e7c519176 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3116,12 +3116,12 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
RAMBlock *block;
int ret, max_hg_page_size;
- /* migration has already setup the bitmap, reuse it. */
- if (!migration_in_colo_state()) {
- if (ram_init_all(rsp, errp) != 0) {
- return -1;
- }
+ assert(!migration_in_colo_state());
+
+ if (ram_init_all(rsp, errp) != 0) {
+ return -1;
}
+
(*rsp)->pss[RAM_CHANNEL_PRECOPY].pss_channel = f;
/*
--
2.39.5
* [PATCH v5 08/16] Move ram state receive into multifd_ram_state_recv()
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
This is in preparation for the next patch.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
migration/multifd.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index ad6261688fdf98a5c7f4ee9fb80ba2901201a33e..332e6fc58053462419f3171f6c320ac37648ef7b 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1253,6 +1253,15 @@ static int multifd_device_state_recv(MultiFDRecvParams *p, Error **errp)
return ret;
}
+static int multifd_ram_state_recv(MultiFDRecvParams *p, Error **errp)
+{
+ int ret;
+
+ ret = multifd_recv_state->ops->recv(p, errp);
+
+ return ret;
+}
+
static void *multifd_recv_thread(void *opaque)
{
MigrationState *s = migrate_get_current();
@@ -1387,7 +1396,7 @@ static void *multifd_recv_thread(void *opaque)
assert(use_packets);
ret = multifd_device_state_recv(p, &local_err);
} else {
- ret = multifd_recv_state->ops->recv(p, &local_err);
+ ret = multifd_ram_state_recv(p, &local_err);
}
if (ret != 0) {
break;
--
2.39.5
* [PATCH v5 09/16] multifd: Add COLO support
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub, Juan Quintela
Like in the normal ram_load() path, put the received pages into the
colo cache and mark the pages in the bitmap so that they will be
flushed to the guest later.
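As a rough illustration of the paragraph above (a toy model, not the code in
this patch), the receive side can be sketched as follows. All Demo*/demo_*
names, the page size and the page count are made up for the sketch. Only the
behaviour follows the description: data always lands in the cache; before
COLO state it is also copied to the guest, and in COLO state the page is only
marked dirty and flushed later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_PAGE_SIZE = 16, DEMO_NUM_PAGES = 4 };

/* Hypothetical stand-in for a RAM block with its COLO cache. */
typedef struct {
    uint8_t guest[DEMO_NUM_PAGES][DEMO_PAGE_SIZE];
    uint8_t cache[DEMO_NUM_PAGES][DEMO_PAGE_SIZE];
    bool dirty[DEMO_NUM_PAGES];
} DemoBlock;

/* Receive one page; the data is always written to the cache first. */
static void demo_recv_page(DemoBlock *b, size_t page,
                           const uint8_t *data, bool in_colo_state)
{
    memcpy(b->cache[page], data, DEMO_PAGE_SIZE);
    if (in_colo_state) {
        /* Guest memory is untouched until the next flush. */
        b->dirty[page] = true;
    } else {
        /* Precopy: keep guest and cache in sync, no dirty bit needed. */
        memcpy(b->guest[page], b->cache[page], DEMO_PAGE_SIZE);
    }
}

/* Copy all dirty pages from the cache to the guest; returns the count. */
static size_t demo_flush_cache(DemoBlock *b)
{
    size_t flushed = 0;

    for (size_t i = 0; i < DEMO_NUM_PAGES; i++) {
        if (b->dirty[i]) {
            memcpy(b->guest[i], b->cache[i], DEMO_PAGE_SIZE);
            b->dirty[i] = false;
            flushed++;
        }
    }
    return flushed;
}
```

In the patch itself the two halves are split between
multifd_colo_prepare_recv() (bitmap) and multifd_colo_process_recv() (copy to
guest during precopy); the flush is done by the existing COLO code.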
Multifd with COLO is useful to reduce the VM pause time during checkpointing
for latency-sensitive workloads. In such workloads the worst-case latency
is especially important.
Also, this is already worth it for the precopy phase, as it helps with
convergence. Moreover, multifd migration is the preferred way to do migration
nowadays, and this allows using multifd compression with COLO.
Benchmark:
Cluster nodes
- Intel Xeon E5-2630 v3
- 48 GB RAM
- 10G Ethernet
Guest
- Windows Server 2016
- 6 GB RAM
- 4 cores
Workload
- Upload a file to the guest with SMB to simulate moderate
memory dirtying
- Measure the memory transfer time portion of each checkpoint
- 600ms COLO checkpoint interval
Results
  Plain
    idle mean:  4.50ms 99per: 10.33ms
    load mean: 24.30ms 99per: 78.05ms
  Multifd-4
    idle mean:  6.48ms 99per: 10.41ms
    load mean: 14.12ms 99per: 31.27ms
Evaluation
While multifd has slightly higher latency when the guest idles, it is about
10ms faster under load and, more importantly, its worst-case latency is
less than 1/2 of plain under load, as can be seen in the 99th percentile.
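The arithmetic behind the evaluation can be rechecked with a few lines of C.
This is only a sanity check on the figures listed under Results; the demo_*
names are made up for the sketch.

```c
#include <assert.h>

/* Figures copied from the Results above, in milliseconds. */
static const double demo_plain_idle_mean = 4.50;
static const double demo_plain_load_mean = 24.30;
static const double demo_plain_load_p99 = 78.05;
static const double demo_multifd_idle_mean = 6.48;
static const double demo_multifd_load_mean = 14.12;
static const double demo_multifd_load_p99 = 31.27;

/* Difference a - b in milliseconds. */
static double demo_diff_ms(double a, double b)
{
    return a - b;
}

/* Ratio num / den. */
static double demo_ratio(double num, double den)
{
    return num / den;
}

/* Recheck the three claims made in the evaluation. */
static void demo_check_evaluation(void)
{
    /* Idle: multifd mean is roughly 2ms higher than plain. */
    double idle = demo_diff_ms(demo_multifd_idle_mean, demo_plain_idle_mean);
    assert(idle > 1.9 && idle < 2.1);

    /* Load: multifd mean is roughly 10ms lower than plain. */
    double load = demo_diff_ms(demo_plain_load_mean, demo_multifd_load_mean);
    assert(load > 10.0 && load < 10.5);

    /* Load: multifd 99th percentile is below half of plain. */
    assert(demo_ratio(demo_multifd_load_p99, demo_plain_load_p99) < 0.5);
}
```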
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 1 +
migration/meson.build | 2 +-
migration/multifd-colo.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
migration/multifd-colo.h | 26 ++++++++++++++++++++++++++
migration/multifd-nocomp.c | 10 +++++++++-
migration/multifd.c | 8 ++++++++
migration/multifd.h | 5 ++++-
7 files changed, 93 insertions(+), 3 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index ea170280580af6e3ebc586c3cb9bf6e144b30c11..70e8b9cae59a1768ad9966d1291bd358a0712573 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3851,6 +3851,7 @@ COLO Framework
M: Lukas Straub <lukasstraub2@web.de>
S: Maintained
F: migration/colo*
+F: migration/multifd-colo.*
F: include/migration/colo.h
F: include/migration/failover.h
F: docs/COLO-FT.txt
diff --git a/migration/meson.build b/migration/meson.build
index c7f39bdb55239ecb0e775c77b90a1aa9e6a4a9ce..c9f0f5f9f2137536497e53e960ce70654ad1b394 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -39,7 +39,7 @@ system_ss.add(files(
), gnutls, zlib)
if get_option('replication').allowed()
- system_ss.add(files('colo-failover.c', 'colo.c'))
+ system_ss.add(files('colo-failover.c', 'colo.c', 'multifd-colo.c'))
else
system_ss.add(files('colo-stubs.c'))
endif
diff --git a/migration/multifd-colo.c b/migration/multifd-colo.c
new file mode 100644
index 0000000000000000000000000000000000000000..f160c6543414d3e157a444d613c96df4c5f0e602
--- /dev/null
+++ b/migration/multifd-colo.c
@@ -0,0 +1,44 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * multifd colo implementation
+ *
+ * Copyright (c) Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "multifd.h"
+#include "multifd-colo.h"
+#include "migration/colo.h"
+#include "system/ramblock.h"
+
+void multifd_colo_prepare_recv(MultiFDRecvParams *p)
+{
+ /*
+ * While we're still in precopy state (not yet in colo state), we copy
+ * received pages to both guest and cache. No need to set dirty bits,
+ * since guest and cache memory are in sync.
+ */
+ if (migration_incoming_in_colo_state()) {
+ colo_record_bitmap(p->block, p->normal, p->normal_num);
+ colo_record_bitmap(p->block, p->zero, p->zero_num);
+ }
+}
+
+void multifd_colo_process_recv(MultiFDRecvParams *p)
+{
+ if (!migration_incoming_in_colo_state()) {
+ for (int i = 0; i < p->normal_num; i++) {
+ void *guest = p->block->host + p->normal[i];
+ void *cache = p->host + p->normal[i];
+ memcpy(guest, cache, multifd_ram_page_size());
+ }
+ for (int i = 0; i < p->zero_num; i++) {
+ void *guest = p->block->host + p->zero[i];
+ memset(guest, 0, multifd_ram_page_size());
+ }
+ }
+}
diff --git a/migration/multifd-colo.h b/migration/multifd-colo.h
new file mode 100644
index 0000000000000000000000000000000000000000..82eaf3f48c47de2f090f9de52f9d57a337d4754a
--- /dev/null
+++ b/migration/multifd-colo.h
@@ -0,0 +1,26 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * multifd colo header
+ *
+ * Copyright (c) Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_MIGRATION_MULTIFD_COLO_H
+#define QEMU_MIGRATION_MULTIFD_COLO_H
+
+#ifdef CONFIG_REPLICATION
+
+void multifd_colo_prepare_recv(MultiFDRecvParams *p);
+void multifd_colo_process_recv(MultiFDRecvParams *p);
+
+#else
+
+static inline void multifd_colo_prepare_recv(MultiFDRecvParams *p) {}
+static inline void multifd_colo_process_recv(MultiFDRecvParams *p) {}
+
+#endif
+#endif
diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index 9be79b3b8e00371ebff9e112766c225bec260bf7..9f7a792fa761b3bc30b971b35f464103a61787f0 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -16,6 +16,7 @@
#include "file.h"
#include "migration-stats.h"
#include "multifd.h"
+#include "multifd-colo.h"
#include "options.h"
#include "migration.h"
#include "qapi/error.h"
@@ -269,7 +270,6 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
return -1;
}
- p->host = p->block->host;
for (i = 0; i < p->normal_num; i++) {
uint64_t offset = be64_to_cpu(packet->offset[i]);
@@ -294,6 +294,14 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
p->zero[i] = offset;
}
+ if (migrate_colo()) {
+ multifd_colo_prepare_recv(p);
+ assert(p->block->colo_cache);
+ p->host = p->block->colo_cache;
+ } else {
+ p->host = p->block->host;
+ }
+
return 0;
}
diff --git a/migration/multifd.c b/migration/multifd.c
index 332e6fc58053462419f3171f6c320ac37648ef7b..220ed8564960fdabc58e4baa069dd252c8ad293c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -29,6 +29,7 @@
#include "qemu-file.h"
#include "trace.h"
#include "multifd.h"
+#include "multifd-colo.h"
#include "options.h"
#include "qemu/yank.h"
#include "io/channel-file.h"
@@ -1258,6 +1259,13 @@ static int multifd_ram_state_recv(MultiFDRecvParams *p, Error **errp)
int ret;
ret = multifd_recv_state->ops->recv(p, errp);
+ if (ret != 0) {
+ return ret;
+ }
+
+ if (migrate_colo()) {
+ multifd_colo_process_recv(p);
+ }
return ret;
}
diff --git a/migration/multifd.h b/migration/multifd.h
index 89a395aef2b09a6762c45b5361e0ab63256feff6..fbc35702b062fdc3213ce92baed35994f5967c2b 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -279,7 +279,10 @@ typedef struct {
uint64_t packets_recved;
/* ramblock */
RAMBlock *block;
- /* ramblock host address */
+ /*
+ * Normally, it points to ramblock's host address. When COLO
+ * is enabled, it points to the mirror cache for the ramblock.
+ */
uint8_t *host;
/* buffers to recv */
struct iovec *iov;
--
2.39.5
* [PATCH v5 10/16] Call colo_release_ram_cache() after multifd threads terminate
2026-02-03 10:15 [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (8 preceding siblings ...)
2026-02-03 10:15 ` [PATCH v5 09/16] multifd: Add COLO support Lukas Straub
@ 2026-02-03 10:15 ` Lukas Straub
2026-02-09 16:27 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 11/16] colo: Fix crash during device vmstate load Lukas Straub
` (5 subsequent siblings)
15 siblings, 1 reply; 33+ messages in thread
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
The multifd threads may still access the COLO cache, so release it
only after they terminate.
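
As an illustration only (not QEMU code), the ordering described above can be sketched with plain pthreads: the shared buffer is freed only after every worker that may still write to it has been joined. All names here (SketchWorker, sketch_recv_and_release, the sizes) are hypothetical stand-ins and do not appear in this series. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_WORKERS 4
#define SKETCH_CHUNK 4096

/* Hypothetical stand-in for one receive thread's parameters. */
typedef struct {
    uint8_t *cache;   /* shared buffer, stand-in for the COLO cache */
    size_t offset;    /* region of the buffer this worker writes to */
} SketchWorker;

static void *sketch_worker_fn(void *opaque)
{
    SketchWorker *w = opaque;

    /* The worker writes into the shared cache until it exits. */
    memset(w->cache + w->offset, 0xab, SKETCH_CHUNK);
    return NULL;
}

/*
 * Start the workers, join them all, and only then free the cache.
 * Returns the number of bytes the workers wrote, or -1 on allocation
 * failure.
 */
static long sketch_recv_and_release(void)
{
    pthread_t th[SKETCH_WORKERS];
    SketchWorker w[SKETCH_WORKERS];
    uint8_t *cache = calloc(SKETCH_WORKERS, SKETCH_CHUNK);
    long filled = 0;
    int started = 0;

    if (!cache) {
        return -1;
    }

    for (int i = 0; i < SKETCH_WORKERS; i++) {
        w[i].cache = cache;
        w[i].offset = (size_t)i * SKETCH_CHUNK;
        if (pthread_create(&th[i], NULL, sketch_worker_fn, &w[i]) != 0) {
            break;
        }
        started++;
    }

    /* Wait for every worker first: they may still touch the cache. */
    for (int i = 0; i < started; i++) {
        pthread_join(th[i], NULL);
    }

    for (size_t i = 0; i < (size_t)started * SKETCH_CHUNK; i++) {
        filled += cache[i] == 0xab;
    }

    /* No thread can reference the cache any more, so freeing is safe. */
    free(cache);
    return filled;
}
```

Freeing the buffer before the join loop would be the analogue of the use-after-free this patch avoids.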
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
migration/colo.c | 3 ---
migration/migration.c | 3 +++
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/migration/colo.c b/migration/colo.c
index 8dfd39b035c48590fcebeb20459f01fb37fb67d1..d3534d1a32ad82f02101ac092ebf818a0caee6f2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -947,7 +947,4 @@ void coroutine_fn colo_incoming_co(void)
/* Wait checkpoint incoming thread exit before free resource */
qemu_thread_join(&th);
bql_lock();
-
- /* We hold the global BQL, so it is safe here */
- colo_release_ram_cache();
}
diff --git a/migration/migration.c b/migration/migration.c
index 5515be1bf305b40ba0b590136df18a53451872c5..9e3f73f27766196ea8673bf9a58c97d5b8b1672f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -454,6 +454,9 @@ void migration_incoming_state_destroy(void)
* BQL and retake unconditionally.
*/
assert(bql_locked());
+ if (migrate_colo()) {
+ colo_release_ram_cache();
+ }
qemu_loadvm_state_cleanup(mis);
if (mis->to_src_file) {
--
2.39.5
* [PATCH v5 11/16] colo: Fix crash during device vmstate load
2026-02-03 10:15 [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (9 preceding siblings ...)
2026-02-03 10:15 ` [PATCH v5 10/16] Call colo_release_ram_cache() after multifd threads terminate Lukas Straub
@ 2026-02-03 10:15 ` Lukas Straub
2026-02-03 10:15 ` [PATCH v5 12/16] migration-test: Add COLO migration unit test Lukas Straub
` (4 subsequent siblings)
15 siblings, 0 replies; 33+ messages in thread
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
With COLO we load device vmstate during each checkpoint, on top of
a VM that was already running. Some devices expect a reset before
loading vmstate on such a previously running VM.
This fixes a crash when using COLO with the Q35 machine type.
The reset adds 10-20 ms of overhead to the checkpointing process in my
testing.
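
A toy model of the problem, for illustration only: a device whose load routine assumes it starts from the reset state. The SketchDevice type and the sketch_* helpers are hypothetical and are not QEMU APIs. They only show why resetting before each load matters when the target has already been running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device: 'reg' is migrated, 'irq_raised' is runtime-only. */
typedef struct {
    bool irq_raised;
    uint32_t reg;
} SketchDevice;

static void sketch_device_reset(SketchDevice *d)
{
    d->irq_raised = false;
    d->reg = 0;
}

/* Running the guest leaves runtime state behind. */
static void sketch_device_run(SketchDevice *d)
{
    d->reg++;
    d->irq_raised = true;
}

/*
 * Load saved state. Like some real devices, this assumes the device is
 * freshly reset; it returns -1 if leftover runtime state is found.
 */
static int sketch_device_load(SketchDevice *d, uint32_t saved_reg)
{
    if (d->irq_raised) {
        return -1;
    }
    d->reg = saved_reg;
    return 0;
}

/* One checkpoint on the secondary: optionally reset, then load. */
static int sketch_checkpoint(SketchDevice *d, uint32_t saved_reg,
                             bool reset_first)
{
    if (reset_first) {
        sketch_device_reset(d);
    }
    return sketch_device_load(d, saved_reg);
}
```

In this model, a checkpoint without the reset fails on a device that has already run, while the same checkpoint with the reset succeeds.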
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
migration/colo.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/migration/colo.c b/migration/colo.c
index d3534d1a32ad82f02101ac092ebf818a0caee6f2..afab8eeb14d09c1db9b235121c5845b11a80deba 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -727,6 +727,12 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
bql_lock();
vmstate_loading = true;
+ /*
+ * With colo we load device vmstate during each checkpoint, on top of
+ * a vm that was already running. Some devices expect a reset before
+ * loading vmstate on such a previously running vm.
+ */
+ qemu_system_reset(SHUTDOWN_CAUSE_SNAPSHOT_LOAD);
colo_flush_ram_cache();
ret = qemu_load_device_state(fb, errp);
if (ret < 0) {
--
2.39.5
* [PATCH v5 12/16] migration-test: Add COLO migration unit test
2026-02-03 10:15 [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (10 preceding siblings ...)
2026-02-03 10:15 ` [PATCH v5 11/16] colo: Fix crash during device vmstate load Lukas Straub
@ 2026-02-03 10:15 ` Lukas Straub
2026-02-09 15:56 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 13/16] Convert colo main documentation to restructuredText Lukas Straub
` (3 subsequent siblings)
15 siblings, 1 reply; 33+ messages in thread
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
Add a qtest covering COLO migration and failover, with and without multifd.
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Tested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 1 +
tests/qtest/meson.build | 7 +-
tests/qtest/migration-test.c | 1 +
tests/qtest/migration/colo-tests.c | 206 +++++++++++++++++++++++++++++++++++++
tests/qtest/migration/framework.h | 5 +
5 files changed, 219 insertions(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 70e8b9cae59a1768ad9966d1291bd358a0712573..8e63e0a08fc7417036986f27c2d910eb99d8a96a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3854,6 +3854,7 @@ F: migration/colo*
F: migration/multifd-colo.*
F: include/migration/colo.h
F: include/migration/failover.h
+F: tests/qtest/migration/colo-tests.c
F: docs/COLO-FT.txt
COLO Proxy
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index dfb83650c643d884daad53a66034ab7aa8c45509..624f7744ec9bd81c8823075b966bc95f7750a667 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -371,6 +371,11 @@ if gnutls.found()
endif
endif
+migration_colo_files = []
+if get_option('replication').allowed()
+ migration_colo_files = [files('migration/colo-tests.c')]
+endif
+
qtests = {
'aspeed_hace-test': files('aspeed-hace-utils.c', 'aspeed_hace-test.c'),
'aspeed_smc-test': files('aspeed-smc-utils.c', 'aspeed_smc-test.c'),
@@ -382,7 +387,7 @@ qtests = {
'migration/migration-util.c') + dbus_vmstate1,
'erst-test': files('erst-test.c'),
'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
- 'migration-test': test_migration_files + migration_tls_files,
+ 'migration-test': test_migration_files + migration_tls_files + migration_colo_files,
'pxe-test': files('boot-sector.c'),
'pnv-xive2-test': files('pnv-xive2-common.c', 'pnv-xive2-flush-sync.c',
'pnv-xive2-nvpg_bar.c'),
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 08936871741535c926eeac40a7d7c3f461c72fd0..e582f05c7dc2673dbd05a936df8feb6c964b5bbc 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -55,6 +55,7 @@ int main(int argc, char **argv)
migration_test_add_precopy(env);
migration_test_add_cpr(env);
migration_test_add_misc(env);
+ migration_test_add_colo(env);
ret = g_test_run();
diff --git a/tests/qtest/migration/colo-tests.c b/tests/qtest/migration/colo-tests.c
new file mode 100644
index 0000000000000000000000000000000000000000..e64b7aadabf24ff87046988eb75dd34a7d3e34d8
--- /dev/null
+++ b/tests/qtest/migration/colo-tests.c
@@ -0,0 +1,206 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * QTest testcases for COLO migration
+ *
+ * Copyright (c) 2025 Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "migration/framework.h"
+#include "migration/migration-qmp.h"
+#include "migration/migration-util.h"
+#include "qemu/module.h"
+
+static int test_colo_common(MigrateCommon *args,
+ bool failover_during_checkpoint,
+ bool primary_failover)
+{
+ QTestState *from, *to;
+ void *data_hook = NULL;
+
+ /*
+ * For the COLO test, both VMs will run in parallel. Thus both VMs want to
+ * open the image read/write at the same time. Using read-only=on is not
+ * possible here, because ide-hd does not support read-only backing image.
+ *
+ * So use -snapshot, where each qemu instance creates its own writable
+ * snapshot internally while leaving the real image read-only.
+ */
+ args->start.opts_source = "-snapshot";
+ args->start.opts_target = "-snapshot";
+
+ /*
+ * COLO migration code logs many errors when the migration socket
+ * is shut down. These are expected, so we hide them here.
+ */
+ args->start.hide_stderr = true;
+
+ /*
+ * Test yank with the out-of-band capability, since that is how it is
+ * used in production.
+ */
+ args->start.oob = true;
+ args->start.caps[MIGRATION_CAPABILITY_X_COLO] = true;
+
+ if (migrate_start(&from, &to, args->listen_uri, &args->start)) {
+ return -1;
+ }
+
+ migrate_set_parameter_int(from, "x-checkpoint-delay", 300);
+
+ if (args->start_hook) {
+ data_hook = args->start_hook(from, to);
+ }
+
+ migrate_ensure_converge(from);
+ wait_for_serial("src_serial");
+
+ migrate_qmp(from, to, args->connect_uri, NULL, "{}");
+
+ wait_for_migration_status(from, "colo", NULL);
+ wait_for_resume(to, get_dst());
+
+ wait_for_serial("src_serial");
+ wait_for_serial("dest_serial");
+
+ /* wait for 3 checkpoints */
+ for (int i = 0; i < 3; i++) {
+ qtest_qmp_eventwait(to, "RESUME");
+ wait_for_serial("src_serial");
+ wait_for_serial("dest_serial");
+ }
+
+ if (failover_during_checkpoint) {
+ qtest_qmp_eventwait(to, "STOP");
+ }
+ if (primary_failover) {
+ qtest_qmp_assert_success(from, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
+ "'arguments': {'instances':"
+ "[{'type': 'migration'}]}}");
+ qtest_qmp_assert_success(from, "{'execute': 'x-colo-lost-heartbeat'}");
+ wait_for_serial("src_serial");
+ } else {
+ qtest_qmp_assert_success(to, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
+ "'arguments': {'instances':"
+ "[{'type': 'migration'}]}}");
+ qtest_qmp_assert_success(to, "{'execute': 'x-colo-lost-heartbeat'}");
+ wait_for_serial("dest_serial");
+ }
+
+ if (args->end_hook) {
+ args->end_hook(from, to, data_hook);
+ }
+
+ migrate_end(from, to, !primary_failover);
+
+ return 0;
+}
+
+static void test_colo_plain_common(MigrateCommon *args,
+ bool failover_during_checkpoint,
+ bool primary_failover)
+{
+ args->listen_uri = "tcp:127.0.0.1:0";
+ test_colo_common(args, failover_during_checkpoint, primary_failover);
+}
+
+static void *hook_start_multifd(QTestState *from, QTestState *to)
+{
+ return migrate_hook_start_precopy_tcp_multifd_common(from, to, "none");
+}
+
+static void test_colo_multifd_common(MigrateCommon *args,
+ bool failover_during_checkpoint,
+ bool primary_failover)
+{
+ args->listen_uri = "defer";
+ args->start_hook = hook_start_multifd;
+ args->start.caps[MIGRATION_CAPABILITY_MULTIFD] = true;
+ test_colo_common(args, failover_during_checkpoint, primary_failover);
+}
+
+static void test_colo_plain_primary_failover(char *name, MigrateCommon *args)
+{
+ test_colo_plain_common(args, false, true);
+}
+
+static void test_colo_plain_secondary_failover(char *name, MigrateCommon *args)
+{
+ test_colo_plain_common(args, false, false);
+}
+
+static void test_colo_multifd_primary_failover(char *name, MigrateCommon *args)
+{
+ test_colo_multifd_common(args, false, true);
+}
+
+static void test_colo_multifd_secondary_failover(char *name,
+ MigrateCommon *args)
+{
+ test_colo_multifd_common(args, false, false);
+}
+
+static void test_colo_plain_primary_failover_checkpoint(char *name,
+ MigrateCommon *args)
+{
+ test_colo_plain_common(args, true, true);
+}
+
+static void test_colo_plain_secondary_failover_checkpoint(char *name,
+ MigrateCommon *args)
+{
+ test_colo_plain_common(args, true, false);
+}
+
+static void test_colo_multifd_primary_failover_checkpoint(char *name,
+ MigrateCommon *args)
+{
+ test_colo_multifd_common(args, true, true);
+}
+
+static void test_colo_multifd_secondary_failover_checkpoint(char *name,
+ MigrateCommon *args)
+{
+ test_colo_multifd_common(args, true, false);
+}
+
+void migration_test_add_colo(MigrationTestEnv *env)
+{
+ /*
+ * COLO crashes with TCG accelerator.
+ */
+ if (!env->has_kvm) {
+ g_test_skip("COLO requires KVM accelerator");
+ return;
+ }
+
+ if (!env->full_set) {
+ return;
+ }
+
+ migration_test_add("/migration/colo/plain/primary_failover",
+ test_colo_plain_primary_failover);
+ migration_test_add("/migration/colo/plain/secondary_failover",
+ test_colo_plain_secondary_failover);
+
+ migration_test_add("/migration/colo/multifd/primary_failover",
+ test_colo_multifd_primary_failover);
+ migration_test_add("/migration/colo/multifd/secondary_failover",
+ test_colo_multifd_secondary_failover);
+
+ migration_test_add("/migration/colo/plain/primary_failover_checkpoint",
+ test_colo_plain_primary_failover_checkpoint);
+ migration_test_add("/migration/colo/plain/secondary_failover_checkpoint",
+ test_colo_plain_secondary_failover_checkpoint);
+
+ migration_test_add("/migration/colo/multifd/primary_failover_checkpoint",
+ test_colo_multifd_primary_failover_checkpoint);
+ migration_test_add("/migration/colo/multifd/secondary_failover_checkpoint",
+ test_colo_multifd_secondary_failover_checkpoint);
+}
diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
index 40984d04930da2d181326d9f6a742bde49018103..80eef758932ce9c301ed6c0f6383d18756144870 100644
--- a/tests/qtest/migration/framework.h
+++ b/tests/qtest/migration/framework.h
@@ -264,5 +264,10 @@ void migration_test_add_file(MigrationTestEnv *env);
void migration_test_add_precopy(MigrationTestEnv *env);
void migration_test_add_cpr(MigrationTestEnv *env);
void migration_test_add_misc(MigrationTestEnv *env);
+#ifdef CONFIG_REPLICATION
+void migration_test_add_colo(MigrationTestEnv *env);
+#else
+static inline void migration_test_add_colo(MigrationTestEnv *env) {}
+#endif
#endif /* TEST_FRAMEWORK_H */
--
2.39.5
* [PATCH v5 13/16] Convert colo main documentation to restructuredText
2026-02-03 10:15 [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (11 preceding siblings ...)
2026-02-03 10:15 ` [PATCH v5 12/16] migration-test: Add COLO migration unit test Lukas Straub
@ 2026-02-03 10:15 ` Lukas Straub
2026-02-06 7:47 ` Zhang Chen
2026-02-03 10:15 ` [PATCH v5 14/16] qemu-colo.rst: Miscellaneous changes Lukas Straub
` (2 subsequent siblings)
15 siblings, 1 reply; 33+ messages in thread
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 2 +-
docs/COLO-FT.txt | 334 ------------------------------------------
docs/system/index.rst | 1 +
docs/system/qemu-colo.rst | 360 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 362 insertions(+), 335 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 8e63e0a08fc7417036986f27c2d910eb99d8a96a..f645590b8b940919bdc84ad585ee493f5452fc20 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3855,7 +3855,7 @@ F: migration/multifd-colo.*
F: include/migration/colo.h
F: include/migration/failover.h
F: tests/qtest/migration/colo-tests.c
-F: docs/COLO-FT.txt
+F: docs/system/qemu-colo.rst
COLO Proxy
M: Zhang Chen <zhangckid@gmail.com>
diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
deleted file mode 100644
index 2283a09c080b8996f9767eeb415e8d4fbdc940af..0000000000000000000000000000000000000000
--- a/docs/COLO-FT.txt
+++ /dev/null
@@ -1,334 +0,0 @@
-COarse-grained LOck-stepping Virtual Machines for Non-stop Service
-----------------------------------------
-Copyright (c) 2016 Intel Corporation
-Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
-Copyright (c) 2016 Fujitsu, Corp.
-
-This work is licensed under the terms of the GNU GPL, version 2 or later.
-See the COPYING file in the top-level directory.
-
-This document gives an overview of COLO's design and how to use it.
-
-== Background ==
-Virtual machine (VM) replication is a well known technique for providing
-application-agnostic software-implemented hardware fault tolerance,
-also known as "non-stop service".
-
-COLO (COarse-grained LOck-stepping) is a high availability solution.
-Both primary VM (PVM) and secondary VM (SVM) run in parallel. They receive the
-same request from client, and generate response in parallel too.
-If the response packets from PVM and SVM are identical, they are released
-immediately. Otherwise, a VM checkpoint (on demand) is conducted.
-
-== Architecture ==
-
-The architecture of COLO is shown in the diagram below.
-It consists of a pair of networked physical nodes:
-The primary node running the PVM, and the secondary node running the SVM
-to maintain a valid replica of the PVM.
-PVM and SVM execute in parallel and generate output of response packets for
-client requests according to the application semantics.
-
-The incoming packets from the client or external network are received by the
-primary node, and then forwarded to the secondary node, so that both the PVM
-and the SVM are stimulated with the same requests.
-
-COLO receives the outbound packets from both the PVM and SVM and compares them
-before allowing the output to be sent to clients.
-
-The SVM is qualified as a valid replica of the PVM, as long as it generates
-identical responses to all client requests. Once the differences in the outputs
-are detected between the PVM and SVM, COLO withholds transmission of the
-outbound packets until it has successfully synchronized the PVM state to the SVM.
-
- Primary Node Secondary Node
-+------------+ +-----------------------+ +------------------------+ +------------+
-| | | HeartBeat +<----->+ HeartBeat | | |
-| Primary VM | +-----------+-----------+ +-----------+------------+ |Secondary VM|
-| | | | | |
-| | +-----------|-----------+ +-----------|------------+ | |
-| | |QEMU +---v----+ | |QEMU +----v---+ | | |
-| | | |Failover| | | |Failover| | | |
-| | | +--------+ | | +--------+ | | |
-| | | +---------------+ | | +---------------+ | | |
-| | | | VM Checkpoint +-------------->+ VM Checkpoint | | | |
-| | | +---------------+ | | +---------------+ | | |
-|Requests<--------------------------\ /-----------------\ /--------------------->Requests|
-| | | ^ ^ | | | | | | |
-|Responses+---------------------\ /-|-|------------\ /-------------------------+Responses|
-| | | | | | | | | | | | | | | |
-| | | +-----------+ | | | | | | | | | | +----------+ | | |
-| | | | COLO disk | | | | | | | | | | | | COLO disk| | | |
-| | | | Manager +---------------------------->| Manager | | | |
-| | | ++----------+ v v | | | | | v v | +---------++ | | |
-| | | |+-----------+-+-+-++| | ++-+--+-+---------+ | | | |
-| | | || COLO Proxy || | | COLO Proxy | | | | |
-| | | || (compare packet || | |(adjust sequence | | | | |
-| | | ||and mirror packet)|| | | and ACK) | | | | |
-| | | |+------------+---+-+| | +-----------------+ | | | |
-+------------+ +-----------------------+ +------------------------+ +------------+
-+------------+ | | | | +------------+
-| VM Monitor | | | | | | VM Monitor |
-+------------+ | | | | +------------+
-+---------------------------------------+ +----------------------------------------+
-| Kernel | | | | | Kernel | |
-+---------------------------------------+ +----------------------------------------+
- | | | |
- +--------------v+ +---------v---+--+ +------------------+ +v-------------+
- | Storage | |External Network| | External Network | | Storage |
- +---------------+ +----------------+ +------------------+ +--------------+
-
-
-== Components introduction ==
-
-You can see there are several components in COLO's diagram of architecture.
-Their functions are described below.
-
-HeartBeat:
-Runs on both the primary and secondary nodes, to periodically check platform
-availability. When the primary node suffers a hardware fail-stop failure,
-the heartbeat stops responding, the secondary node will trigger a failover
-as soon as it determines the absence.
-
-COLO disk Manager:
-When primary VM writes data into image, the colo disk manager captures this data
-and sends it to secondary VM's which makes sure the context of secondary VM's
-image is consistent with the context of primary VM 's image.
-For more details, please refer to docs/block-replication.txt.
-
-Checkpoint/Failover Controller:
-Modifications of save/restore flow to realize continuous migration,
-to make sure the state of VM in Secondary side is always consistent with VM in
-Primary side.
-
-COLO Proxy:
-Delivers packets to Primary and Secondary, and then compare the responses from
-both side. Then decide whether to start a checkpoint according to some rules.
-Please refer to docs/colo-proxy.txt for more information.
-
-Note:
-HeartBeat has not been implemented yet, so you need to trigger failover process
-by using 'x-colo-lost-heartbeat' command.
-
-== COLO operation status ==
-
-+-----------------+
-| |
-| Start COLO |
-| |
-+--------+--------+
- |
- | Main qmp command:
- | migrate-set-capabilities with x-colo
- | migrate
- |
- v
-+--------+--------+
-| |
-| COLO running |
-| |
-+--------+--------+
- |
- | Main qmp command:
- | x-colo-lost-heartbeat
- | or
- | some error happened
- v
-+--------+--------+
-| | send qmp event:
-| COLO failover | COLO_EXIT
-| |
-+-----------------+
-
-COLO use the qmp command to switch and report operation status.
-The diagram just shows the main qmp command, you can get the detail
-in test procedure.
-
-== Test procedure ==
-Note: Here we are running both instances on the same host for testing,
-change the IP Addresses if you want to run it on two hosts. Initially
-127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host.
-
-== Startup qemu ==
-1. Primary:
-Note: Initially, $imagefolder/primary.qcow2 needs to be copied to all hosts.
-You don't need to change any IP's here, because 0.0.0.0 listens on any
-interface. The chardev's with 127.0.0.1 IP's loopback to the local qemu
-instance.
-
-# imagefolder="/mnt/vms/colo-test-primary"
-
-# qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
- -device piix3-usb-uhci -device usb-tablet -name primary \
- -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
- -device rtl8139,id=e0,netdev=hn0 \
- -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server=on,wait=off \
- -chardev socket,id=compare1,host=0.0.0.0,port=9004,server=on,wait=on \
- -chardev socket,id=compare0,host=127.0.0.1,port=9001,server=on,wait=off \
- -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
- -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server=on,wait=off \
- -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
- -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
- -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
- -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
- -object iothread,id=iothread1 \
- -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
-outdev=compare_out0,iothread=iothread1 \
- -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
-children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S
-
-2. Secondary:
-Note: Active and hidden images need to be created only once and the
-size should be the same as primary.qcow2. Again, you don't need to change
-any IP's here, except for the $primary_ip variable.
-
-# imagefolder="/mnt/vms/colo-test-secondary"
-# primary_ip=127.0.0.1
-
-# qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G
-
-# qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G
-
-# qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
- -device piix3-usb-uhci -device usb-tablet -name secondary \
- -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
- -device rtl8139,id=e0,netdev=hn0 \
- -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect-ms=1000 \
- -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect-ms=1000 \
- -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
- -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
- -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
- -drive if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow2 \
- -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
-top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
-file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
-file.backing.backing=parent0 \
- -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
-children.0=childs0 \
- -incoming tcp:0.0.0.0:9998
-
-
-3. On Secondary VM's QEMU monitor, issue command
-{"execute":"qmp_capabilities"}
-{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
-{"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", "data": {"host": "0.0.0.0", "port": "9999"} } } }
-{"execute": "nbd-server-add", "arguments": {"device": "parent0", "writable": true } }
-
-Note:
- a. The qmp command nbd-server-start and nbd-server-add must be run
- before running the qmp command migrate on primary QEMU
- b. Active disk, hidden disk and nbd target's length should be the
- same.
- c. It is better to put active disk and hidden disk in ramdisk. They
- will be merged into the parent disk on failover.
-
-4. On Primary VM's QEMU monitor, issue command:
-{"execute":"qmp_capabilities"}
-{"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
-{"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
-{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
-{"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
-
- Note:
- a. There should be only one NBD Client for each primary disk.
- b. The qmp command line must be run after running qmp command line in
- secondary qemu.
-
-5. After the above steps, you will see, whenever you make changes to PVM, SVM will be synced.
-You can issue command '{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }'
-to change the idle checkpoint period time
-
-6. Failover test
-You can kill one of the VMs and Failover on the surviving VM:
-
-If you killed the Secondary, then follow "Primary Failover". After that,
-if you want to resume the replication, follow "Primary resume replication"
-
-If you killed the Primary, then follow "Secondary Failover". After that,
-if you want to resume the replication, follow "Secondary resume replication"
-
-== Primary Failover ==
-The Secondary died, resume on the Primary
-
-{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
-{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
-{"execute": "object-del", "arguments":{ "id": "comp0" } }
-{"execute": "object-del", "arguments":{ "id": "iothread1" } }
-{"execute": "object-del", "arguments":{ "id": "m0" } }
-{"execute": "object-del", "arguments":{ "id": "redire0" } }
-{"execute": "object-del", "arguments":{ "id": "redire1" } }
-{"execute": "x-colo-lost-heartbeat" }
-
-== Secondary Failover ==
-The Primary died, resume on the Secondary and prepare to become the new Primary
-
-{"execute": "nbd-server-stop"}
-{"execute": "x-colo-lost-heartbeat"}
-
-{"execute": "object-del", "arguments":{ "id": "f2" } }
-{"execute": "object-del", "arguments":{ "id": "f1" } }
-{"execute": "chardev-remove", "arguments":{ "id": "red1" } }
-{"execute": "chardev-remove", "arguments":{ "id": "red0" } }
-
-{"execute": "chardev-add", "arguments":{ "id": "mirror0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9003" } }, "server": true } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare1", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9004" } }, "server": true } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": true } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare0-0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": false } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare_out", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": true } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare_out0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": false } } } }
-
-== Primary resume replication ==
-Resume replication after new Secondary is up.
-
-Start the new Secondary (Steps 2 and 3 above), then on the Primary:
-{"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.2:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
-
-Wait until disk is synced, then:
-{"execute": "stop"}
-{"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
-
-{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
-{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
-
-{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
-{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
-{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
-{"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
-{"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
-
-{"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
-{"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.2:9998" } }
-
-Note:
-If this Primary previously was a Secondary, then we need to insert the
-filters before the filter-rewriter by using the
-""insert": "before", "position": "id=rew0"" Options. See below.
-
-== Secondary resume replication ==
-Become Primary and resume replication after new Secondary is up. Note
-that now 127.0.0.1 is the Secondary and 127.0.0.2 is the Primary.
-
-Start the new Secondary (Steps 2 and 3 above, but with primary_ip=127.0.0.2),
-then on the old Secondary:
-{"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.1:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
-
-Wait until disk is synced, then:
-{"execute": "stop"}
-{"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
-
-{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
-{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
-
-{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
-{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
-{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
-{"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
-{"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
-
-{"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
-{"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
-
-== TODO ==
-1. Support shared storage.
-2. Develop the heartbeat part.
-3. Reduce checkpoint VM’s downtime while doing checkpoint.
diff --git a/docs/system/index.rst b/docs/system/index.rst
index 427b020483104f6589878bbf255a367ae114c61b..6268c41aea9c74dc3e59d896b5ae082360bfbb1a 100644
--- a/docs/system/index.rst
+++ b/docs/system/index.rst
@@ -41,3 +41,4 @@ or Hypervisor.Framework.
igvm
vm-templating
sriov
+ qemu-colo
diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
new file mode 100644
index 0000000000000000000000000000000000000000..4b5fbbf398f8a5c4ea6baad615bde94b2b4678d2
--- /dev/null
+++ b/docs/system/qemu-colo.rst
@@ -0,0 +1,360 @@
+Qemu COLO Fault Tolerance
+=========================
+
+| Copyright (c) 2016 Intel Corporation
+| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+| Copyright (c) 2016 Fujitsu, Corp.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+This document gives an overview of COLO's design and how to use it.
+
+Background
+----------
+Virtual machine (VM) replication is a well known technique for providing
+application-agnostic software-implemented hardware fault tolerance,
+also known as "non-stop service".
+
+COLO (COarse-grained LOck-stepping) is a high availability solution.
+Both the primary VM (PVM) and secondary VM (SVM) run in parallel. They receive
+the same requests from the client and generate responses in parallel too.
+If the response packets from PVM and SVM are identical, they are released
+immediately. Otherwise, a VM checkpoint (on demand) is conducted.
+
+Architecture
+------------
+The architecture of COLO is shown in the diagram below.
+It consists of a pair of networked physical nodes:
+The primary node running the PVM, and the secondary node running the SVM
+to maintain a valid replica of the PVM.
+PVM and SVM execute in parallel and generate output of response packets for
+client requests according to the application semantics.
+
+The incoming packets from the client or external network are received by the
+primary node, and then forwarded to the secondary node, so that both the PVM
+and the SVM are stimulated with the same requests.
+
+COLO receives the outbound packets from both the PVM and SVM and compares them
+before allowing the output to be sent to clients.
+
+The SVM is qualified as a valid replica of the PVM, as long as it generates
+identical responses to all client requests. Once the differences in the outputs
+are detected between the PVM and SVM, COLO withholds transmission of the
+outbound packets until it has successfully synchronized the PVM state to the SVM.
+
+Overview::
+
+ Primary Node Secondary Node
+ +------------+ +-----------------------+ +------------------------+ +------------+
+ | | | HeartBeat +<----->+ HeartBeat | | |
+ | Primary VM | +-----------+-----------+ +-----------+------------+ |Secondary VM|
+ | | | | | |
+ | | +-----------|-----------+ +-----------|------------+ | |
+ | | |QEMU +---v----+ | |QEMU +----v---+ | | |
+ | | | |Failover| | | |Failover| | | |
+ | | | +--------+ | | +--------+ | | |
+ | | | +---------------+ | | +---------------+ | | |
+ | | | | VM Checkpoint +-------------->+ VM Checkpoint | | | |
+ | | | +---------------+ | | +---------------+ | | |
+ |Requests<--------------------------\ /-----------------\ /--------------------->Requests|
+ | | | ^ ^ | | | | | | |
+ |Responses+---------------------\ /-|-|------------\ /-------------------------+Responses|
+ | | | | | | | | | | | | | | | |
+ | | | +-----------+ | | | | | | | | | | +----------+ | | |
+ | | | | COLO disk | | | | | | | | | | | | COLO disk| | | |
+ | | | | Manager +---------------------------->| Manager | | | |
+ | | | ++----------+ v v | | | | | v v | +---------++ | | |
+ | | | |+-----------+-+-+-++| | ++-+--+-+---------+ | | | |
+ | | | || COLO Proxy || | | COLO Proxy | | | | |
+ | | | || (compare packet || | |(adjust sequence | | | | |
+ | | | ||and mirror packet)|| | | and ACK) | | | | |
+ | | | |+------------+---+-+| | +-----------------+ | | | |
+ +------------+ +-----------------------+ +------------------------+ +------------+
+ +------------+ | | | | +------------+
+ | VM Monitor | | | | | | VM Monitor |
+ +------------+ | | | | +------------+
+ +---------------------------------------+ +----------------------------------------+
+ | Kernel | | | | | Kernel | |
+ +---------------------------------------+ +----------------------------------------+
+ | | | |
+ +--------------v+ +---------v---+--+ +------------------+ +v-------------+
+ | Storage | |External Network| | External Network | | Storage |
+ +---------------+ +----------------+ +------------------+ +--------------+
+
+Components introduction
+^^^^^^^^^^^^^^^^^^^^^^^
+You can see there are several components in COLO's diagram of architecture.
+Their functions are described below.
+
+HeartBeat
+~~~~~~~~~
+Runs on both the primary and secondary nodes, to periodically check platform
+availability. When the primary node suffers a hardware fail-stop failure,
+the heartbeat stops responding and the secondary node triggers a failover
+as soon as it detects the absence.
+
+COLO disk Manager
+~~~~~~~~~~~~~~~~~
+When the primary VM writes data to its image, the COLO disk manager captures
+this data and sends it to the secondary VM, which makes sure the content of the
+secondary VM's image is consistent with the content of the primary VM's image.
+For more details, please refer to docs/block-replication.txt.
+
+Checkpoint/Failover Controller
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Modifications of save/restore flow to realize continuous migration,
+to make sure the state of VM in Secondary side is always consistent with VM in
+Primary side.
+
+COLO Proxy
+~~~~~~~~~~
+Delivers packets to the Primary and Secondary, then compares the responses from
+both sides and decides whether to start a checkpoint according to some rules.
+Please refer to docs/colo-proxy.txt for more information.
+
+Note:
+HeartBeat has not been implemented yet, so you need to trigger the failover
+process using the ``x-colo-lost-heartbeat`` command.
+
+COLO operation status
+^^^^^^^^^^^^^^^^^^^^^
+
+Overview::
+
+ +-----------------+
+ | |
+ | Start COLO |
+ | |
+ +--------+--------+
+ |
+ | Main qmp command:
+ | migrate-set-capabilities with x-colo
+ | migrate
+ |
+ v
+ +--------+--------+
+ | |
+ | COLO running |
+ | |
+ +--------+--------+
+ |
+ | Main qmp command:
+ | x-colo-lost-heartbeat
+ | or
+ | some error happened
+ v
+ +--------+--------+
+ | | send qmp event:
+ | COLO failover | COLO_EXIT
+ | |
+ +-----------------+
+
+
+COLO uses QMP commands to switch and report operation status.
+The diagram shows only the main QMP commands; you can find the details
+in test procedure.
+
+Test procedure
+--------------
+Note: Here we are running both instances on the same host for testing,
+change the IP Addresses if you want to run it on two hosts. Initially
+``127.0.0.1`` is the Primary Host and ``127.0.0.2`` is the Secondary Host.
+
+Startup qemu
+^^^^^^^^^^^^
+**1. Primary**:
+Note: Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
+You don't need to change any IP's here, because ``0.0.0.0`` listens on any
+interface. The chardev's with ``127.0.0.1`` IP's loopback to the local qemu
+instance::
+
+ # imagefolder="/mnt/vms/colo-test-primary"
+
+ # qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
+ -device piix3-usb-uhci -device usb-tablet -name primary \
+ -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
+ -device rtl8139,id=e0,netdev=hn0 \
+ -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server=on,wait=off \
+ -chardev socket,id=compare1,host=0.0.0.0,port=9004,server=on,wait=on \
+ -chardev socket,id=compare0,host=127.0.0.1,port=9001,server=on,wait=off \
+ -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
+ -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server=on,wait=off \
+ -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
+ -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
+ -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
+ -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
+ -object iothread,id=iothread1 \
+ -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
+ outdev=compare_out0,iothread=iothread1 \
+ -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
+ children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S
+
+
+**2. Secondary**:
+Note: Active and hidden images need to be created only once and the
+size should be the same as ``primary.qcow2``. Again, you don't need to change
+any IP's here, except for the ``$primary_ip`` variable::
+
+ # imagefolder="/mnt/vms/colo-test-secondary"
+ # primary_ip=127.0.0.1
+
+ # qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G
+
+ # qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G
+
+ # qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
+ -device piix3-usb-uhci -device usb-tablet -name secondary \
+ -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
+ -device rtl8139,id=e0,netdev=hn0 \
+ -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect-ms=1000 \
+ -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect-ms=1000 \
+ -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
+ -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
+ -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
+ -drive if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow2 \
+ -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
+ top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
+ file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
+ file.backing.backing=parent0 \
+ -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
+ children.0=childs0 \
+ -incoming tcp:0.0.0.0:9998
+
+
+**3.** On Secondary VM's QEMU monitor, issue command::
+
+ {"execute":"qmp_capabilities"}
+ {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
+ {"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", "data": {"host": "0.0.0.0", "port": "9999"} } } }
+ {"execute": "nbd-server-add", "arguments": {"device": "parent0", "writable": true } }
+
+Note:
+ a. The qmp command ``nbd-server-start`` and ``nbd-server-add`` must be run
+ before running the qmp command migrate on primary QEMU
+ b. Active disk, hidden disk and nbd target's length should be the
+ same.
+ c. It is better to put active disk and hidden disk in ramdisk. They
+ will be merged into the parent disk on failover.
+
+**4.** On Primary VM's QEMU monitor, issue command::
+
+ {"execute":"qmp_capabilities"}
+ {"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
+ {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
+ {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
+ {"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
+
+Note:
+ a. There should be only one NBD Client for each primary disk.
+ b. The qmp command line must be run after running qmp command line in
+ secondary qemu.
+
+**5.** After the above steps, whenever you make changes to the PVM, the SVM will be synced.
+You can issue command ``{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }``
+to change the idle checkpoint period.
+
+Failover test
+^^^^^^^^^^^^^
+You can kill one of the VMs and Failover on the surviving VM:
+
+If you killed the Secondary, then follow "Primary Failover".
+After that, if you want to resume the replication, follow "Primary resume replication"
+
+If you killed the Primary, then follow "Secondary Failover".
+After that, if you want to resume the replication, follow "Secondary resume replication"
+
+Primary Failover
+~~~~~~~~~~~~~~~~
+The Secondary died, resume on the Primary::
+
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
+ {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
+ {"execute": "object-del", "arguments":{ "id": "comp0" } }
+ {"execute": "object-del", "arguments":{ "id": "iothread1" } }
+ {"execute": "object-del", "arguments":{ "id": "m0" } }
+ {"execute": "object-del", "arguments":{ "id": "redire0" } }
+ {"execute": "object-del", "arguments":{ "id": "redire1" } }
+ {"execute": "x-colo-lost-heartbeat" }
+
+Secondary Failover
+~~~~~~~~~~~~~~~~~~
+The Primary died, resume on the Secondary and prepare to become the new Primary::
+
+ {"execute": "nbd-server-stop"}
+ {"execute": "x-colo-lost-heartbeat"}
+
+ {"execute": "object-del", "arguments":{ "id": "f2" } }
+ {"execute": "object-del", "arguments":{ "id": "f1" } }
+ {"execute": "chardev-remove", "arguments":{ "id": "red1" } }
+ {"execute": "chardev-remove", "arguments":{ "id": "red0" } }
+
+ {"execute": "chardev-add", "arguments":{ "id": "mirror0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9003" } }, "server": true } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare1", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9004" } }, "server": true } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": true } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare0-0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": false } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare_out", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": true } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare_out0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": false } } } }
+
+Primary resume replication
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+Resume replication after new Secondary is up.
+
+Start the new Secondary (Steps 2 and 3 above), then on the Primary::
+
+ {"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.2:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
+
+Wait until disk is synced, then::
+
+ {"execute": "stop"}
+ {"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
+
+ {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
+
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
+
+ {"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
+ {"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.2:9998" } }
+
+Note:
+If this Primary was previously a Secondary, the filters need to be inserted
+before the filter-rewriter by using the
+``"insert": "before", "position": "id=rew0"`` options. See below.
+
+Secondary resume replication
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Become Primary and resume replication after new Secondary is up. Note
+that now 127.0.0.1 is the Secondary and 127.0.0.2 is the Primary.
+
+Start the new Secondary (Steps 2 and 3 above, but with primary_ip=127.0.0.2),
+then on the old Secondary::
+
+ {"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.1:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
+
+Wait until disk is synced, then::
+
+ {"execute": "stop"}
+ {"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
+
+ {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
+
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
+
+ {"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
+ {"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
+
+TODO
+----
+1. Support shared storage.
+2. Develop the heartbeat part.
+3. Reduce checkpoint VM’s downtime while doing checkpoint.
--
2.39.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
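Editorial sketch, not part of the patch: the six ``chardev-add`` commands in the "Secondary Failover" section above differ only in id, host, port and the ``server`` flag. Assuming a hypothetical helper named `chardev_add` (not a QEMU API, just an illustration), they could be generated instead of typed by hand. The ids, hosts and ports are copied from the document; everything else is an assumption.

```python
import json

# (id, host, port, server) tuples copied from the "Secondary Failover" section.
CHARDEVS = [
    ("mirror0", "0.0.0.0", "9003", True),
    ("compare1", "0.0.0.0", "9004", True),
    ("compare0", "127.0.0.1", "9001", True),
    ("compare0-0", "127.0.0.1", "9001", False),
    ("compare_out", "127.0.0.1", "9005", True),
    ("compare_out0", "127.0.0.1", "9005", False),
]


def chardev_add(chardev_id, host, port, server):
    """Build one QMP chardev-add command as a dict (hypothetical helper)."""
    return {
        "execute": "chardev-add",
        "arguments": {
            "id": chardev_id,
            "backend": {
                "type": "socket",
                "data": {
                    "addr": {
                        "type": "inet",
                        "data": {"host": host, "port": port},
                    },
                    "server": server,
                },
            },
        },
    }


def secondary_failover_chardevs():
    """Return the chardev-add commands in the order the document lists them."""
    return [chardev_add(*entry) for entry in CHARDEVS]


if __name__ == "__main__":
    # Print one JSON command per line, ready to paste into a QMP session.
    for command in secondary_failover_chardevs():
        print(json.dumps(command))
```

Each server chardev is listed before the client chardev that connects to the same port, matching the order in the document.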
* [PATCH v5 14/16] qemu-colo.rst: Miscellaneous changes
2026-02-03 10:15 [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (12 preceding siblings ...)
2026-02-03 10:15 ` [PATCH v5 13/16] Convert colo main documentation to restructuredText Lukas Straub
@ 2026-02-03 10:15 ` Lukas Straub
2026-02-06 7:52 ` Zhang Chen
2026-02-03 10:15 ` [PATCH v5 15/16] qemu-colo.rst: Add my copyright Lukas Straub
2026-02-03 10:15 ` [PATCH v5 16/16] qemu-colo.rst: Simplify the block replication setup Lukas Straub
15 siblings, 1 reply; 33+ messages in thread
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
docs/system/qemu-colo.rst | 35 ++++++++++++++++++-----------------
1 file changed, 18 insertions(+), 17 deletions(-)
diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
index 4b5fbbf398f8a5c4ea6baad615bde94b2b4678d2..a70e61aa09391cda933031535fa982d27cf6654b 100644
--- a/docs/system/qemu-colo.rst
+++ b/docs/system/qemu-colo.rst
@@ -1,13 +1,6 @@
Qemu COLO Fault Tolerance
=========================
-| Copyright (c) 2016 Intel Corporation
-| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
-| Copyright (c) 2016 Fujitsu, Corp.
-
-This work is licensed under the terms of the GNU GPL, version 2 or later.
-See the COPYING file in the top-level directory.
-
This document gives an overview of COLO's design and how to use it.
Background
@@ -82,8 +75,8 @@ Overview::
| Storage | |External Network| | External Network | | Storage |
+---------------+ +----------------+ +------------------+ +--------------+
-Components introduction
-^^^^^^^^^^^^^^^^^^^^^^^
+Components
+^^^^^^^^^^
You can see there are several components in COLO's diagram of architecture.
Their functions are described below.
@@ -157,14 +150,21 @@ in test procedure.
Test procedure
--------------
-Note: Here we are running both instances on the same host for testing,
+
+Setup
+^^^^^
+
+Here we are running both instances on the same host for testing,
change the IP Addresses if you want to run it on two hosts. Initially
``127.0.0.1`` is the Primary Host and ``127.0.0.2`` is the Secondary Host.
+COLO uses double the guest RAM size on the secondary side. The QEMU version
+should be the same on both hosts.
+
Startup qemu
^^^^^^^^^^^^
**1. Primary**:
-Note: Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
+Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
You don't need to change any IP's here, because ``0.0.0.0`` listens on any
interface. The chardev's with ``127.0.0.1`` IP's loopback to the local qemu
instance::
@@ -192,7 +192,7 @@ instance::
**2. Secondary**:
-Note: Active and hidden images need to be created only once and the
+Active and hidden images need to be created only once and the
size should be the same as ``primary.qcow2``. Again, you don't need to change
any IP's here, except for the ``$primary_ip`` variable::
@@ -353,8 +353,9 @@ Wait until disk is synced, then::
{"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
{"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
-TODO
-----
-1. Support shared storage.
-2. Develop the heartbeat part.
-3. Reduce checkpoint VM’s downtime while doing checkpoint.
+| Copyright (c) 2016 Intel Corporation
+| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+| Copyright (c) 2016 Fujitsu, Corp.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
--
2.39.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 15/16] qemu-colo.rst: Add my copyright
2026-02-03 10:15 [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (13 preceding siblings ...)
2026-02-03 10:15 ` [PATCH v5 14/16] qemu-colo.rst: Miscellaneous changes Lukas Straub
@ 2026-02-03 10:15 ` Lukas Straub
2026-02-03 10:15 ` [PATCH v5 16/16] qemu-colo.rst: Simplify the block replication setup Lukas Straub
15 siblings, 0 replies; 33+ messages in thread
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
I have so far contributed 61 commits to the COLO project, warranting
the addition of my copyright to this file.
Reviewed-by: Zhang Chen <zhangckid@gmail.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
docs/system/qemu-colo.rst | 1 +
1 file changed, 1 insertion(+)
diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
index a70e61aa09391cda933031535fa982d27cf6654b..75abbd80298df79223cb8e70064a5dc83d70f4eb 100644
--- a/docs/system/qemu-colo.rst
+++ b/docs/system/qemu-colo.rst
@@ -356,6 +356,7 @@ Wait until disk is synced, then::
| Copyright (c) 2016 Intel Corporation
| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
| Copyright (c) 2016 Fujitsu, Corp.
+| Copyright (c) 2026 Lukas Straub <lukasstraub2@web.de>
This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.
--
2.39.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCH v5 16/16] qemu-colo.rst: Simplify the block replication setup
2026-02-03 10:15 [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (14 preceding siblings ...)
2026-02-03 10:15 ` [PATCH v5 15/16] qemu-colo.rst: Add my copyright Lukas Straub
@ 2026-02-03 10:15 ` Lukas Straub
15 siblings, 0 replies; 33+ messages in thread
From: Lukas Straub @ 2026-02-03 10:15 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub
On the primary side we don't actually need the replication
block driver, since it only passes through all I/O.
So simplify the setup and also use 'blockdev-add' instead of
'human-monitor-command'.
This is how my clients use colo in production.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
docs/system/qemu-colo.rst | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
index 75abbd80298df79223cb8e70064a5dc83d70f4eb..f7d3b6439cf3401a58c412634239d1a43999a10e 100644
--- a/docs/system/qemu-colo.rst
+++ b/docs/system/qemu-colo.rst
@@ -240,8 +240,8 @@ Note:
**4.** On Primary VM's QEMU monitor, issue command::
{"execute":"qmp_capabilities"}
- {"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
- {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
+ {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.2", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
+ {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "nbd0" } }
{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
{"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
@@ -269,7 +269,7 @@ Primary Failover
The Secondary died, resume on the Primary::
{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
- {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
+ {"execute": "blockdev-del", "arguments": {"node-name": "nbd0"} }
{"execute": "object-del", "arguments":{ "id": "comp0" } }
{"execute": "object-del", "arguments":{ "id": "iothread1" } }
{"execute": "object-del", "arguments":{ "id": "m0" } }
@@ -309,8 +309,8 @@ Wait until disk is synced, then::
{"execute": "stop"}
{"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
- {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
- {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
+ {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.2", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "nbd0" } }
{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
@@ -341,8 +341,8 @@ Wait until disk is synced, then::
{"execute": "stop"}
{"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
- {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
- {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
+ {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.1", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "nbd0" } }
{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
--
2.39.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
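Editorial sketch, not part of the patch: the simplified primary-side setup above boils down to two QMP commands to attach the NBD node and two to detach it on failover. Assuming hypothetical helper names `nbd_attach_commands` and `nbd_detach_commands` (illustrative only), the command bodies from the diff can be built like this. The argument values mirror the patch; the defaults are assumptions taken from the test procedure.

```python
import json


def nbd_attach_commands(secondary_host, port="9999", export="parent0",
                        node_name="nbd0", parent="colo-disk0"):
    """Commands that add the NBD client node and attach it to the quorum."""
    blockdev_add = {
        "execute": "blockdev-add",
        "arguments": {
            "driver": "nbd",
            "node-name": node_name,
            "server": {"type": "inet", "host": secondary_host, "port": port},
            "export": export,
            "detect-zeroes": "on",
        },
    }
    attach = {
        "execute": "x-blockdev-change",
        "arguments": {"parent": parent, "node": node_name},
    }
    return [blockdev_add, attach]


def nbd_detach_commands(node_name="nbd0", parent="colo-disk0"):
    """Commands from "Primary Failover": drop the quorum child, then the node."""
    detach = {
        "execute": "x-blockdev-change",
        "arguments": {"parent": parent, "child": "children.1"},
    }
    blockdev_del = {
        "execute": "blockdev-del",
        "arguments": {"node-name": node_name},
    }
    return [detach, blockdev_del]


if __name__ == "__main__":
    for command in nbd_attach_commands("127.0.0.2") + nbd_detach_commands():
        print(json.dumps(command))
```

For the "Secondary resume replication" case, the same helper would be called with ``127.0.0.1`` as the host, as in the last hunk of the diff.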
* Re: [PATCH v5 09/16] multifd: Add COLO support
2026-02-03 10:15 ` [PATCH v5 09/16] multifd: Add COLO support Lukas Straub
@ 2026-02-04 18:13 ` Fabiano Rosas
2026-02-09 16:25 ` Peter Xu
1 sibling, 0 replies; 33+ messages in thread
From: Fabiano Rosas @ 2026-02-04 18:13 UTC (permalink / raw)
To: Lukas Straub, qemu-devel
Cc: Peter Xu, Laurent Vivier, Paolo Bonzini, Zhang Chen,
Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Lukas Straub, Juan Quintela
Lukas Straub <lukasstraub2@web.de> writes:
> Like in the normal ram_load() path, put the received pages into the
> colo cache and mark the pages in the bitmap so that they will be
> flushed to the guest later.
>
> Multifd with COLO is useful to reduce the VM pause time during checkpointing
> for latency sensitive workloads. In such workloads the worst-case latency
> is especially important.
>
> Also, this is already worth it for the precopy phase as it helps with
> converging. Moreover, multifd migration is the preferred way to do migration
> nowadays, and this allows multifd compression to be used with COLO.
>
> Benchmark:
> Cluster nodes
> - Intel Xeon E5-2630 v3
> - 48 GB RAM
> - 10G Ethernet
> Guest
> - Windows Server 2016
> - 6Gb RAM
> - 4 cores
> Workload
> - Upload a file to the guest with SMB to simulate moderate
> memory dirtying
> - Measure the memory transfer time portion of each checkpoint
> - 600ms COLO checkpoint interval
>
> Results
> Plain
> idle mean: 4.50ms 99per: 10.33ms
> load mean: 24.30ms 99per: 78.05ms
> Multifd-4
> idle mean: 6.48ms 99per: 10.41ms
> load mean: 14.12ms 99per: 31.27ms
>
> Evaluation
> While multifd has slightly higher mean latency when the guest idles, it is
> about 10ms faster on average under load and, more importantly, its worst-case
> latency is less than 1/2 of plain under load, as the 99th percentile shows.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
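
The two claims in the quoted evaluation can be checked directly against the quoted results. The sketch below only redoes that arithmetic; the numbers are copied from the "Results" table in the commit message, while the dictionary layout and the compare() helper are assumptions made for illustration.

```python
# Figures copied from the "Results" table in the quoted commit message,
# as (mean, 99th percentile) in milliseconds.
results = {
    "plain":     {"idle": (4.50, 10.33), "load": (24.30, 78.05)},
    "multifd-4": {"idle": (6.48, 10.41), "load": (14.12, 31.27)},
}


def compare(state):
    """Return (mean saving in ms, multifd/plain ratio of the 99th percentile)."""
    plain_mean, plain_p99 = results["plain"][state]
    mfd_mean, mfd_p99 = results["multifd-4"][state]
    return plain_mean - mfd_mean, mfd_p99 / plain_p99


if __name__ == "__main__":
    for state in ("idle", "load"):
        saving, ratio = compare(state)
        print(f"{state}: mean saving {saving:+.2f} ms, p99 ratio {ratio:.2f}")
```

Under load the mean saving comes out at roughly 10.18 ms and the 99th percentile ratio at roughly 0.40, which matches "10ms faster" and "less than 1/2 of plain". When idle the saving is negative (about 2 ms slower on average), which matches the "slightly higher latency" remark.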
* Re: [PATCH v5 13/16] Convert colo main documentation to restructuredText
2026-02-03 10:15 ` [PATCH v5 13/16] Convert colo main documentation to restructuredText Lukas Straub
@ 2026-02-06 7:47 ` Zhang Chen
0 siblings, 0 replies; 33+ messages in thread
From: Zhang Chen @ 2026-02-06 7:47 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Laurent Vivier,
Paolo Bonzini, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 3, 2026 at 6:15 PM Lukas Straub <lukasstraub2@web.de> wrote:
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Zhang Chen <zhangckid@gmail.com>
Thanks
Chen
> ---
> MAINTAINERS | 2 +-
> docs/COLO-FT.txt | 334 ------------------------------------------
> docs/system/index.rst | 1 +
> docs/system/qemu-colo.rst | 360 ++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 362 insertions(+), 335 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8e63e0a08fc7417036986f27c2d910eb99d8a96a..f645590b8b940919bdc84ad585ee493f5452fc20 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3855,7 +3855,7 @@ F: migration/multifd-colo.*
> F: include/migration/colo.h
> F: include/migration/failover.h
> F: tests/qtest/migration/colo-tests.c
> -F: docs/COLO-FT.txt
> +F: docs/system/qemu-colo.rst
>
> COLO Proxy
> M: Zhang Chen <zhangckid@gmail.com>
> diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
> deleted file mode 100644
> index 2283a09c080b8996f9767eeb415e8d4fbdc940af..0000000000000000000000000000000000000000
> --- a/docs/COLO-FT.txt
> +++ /dev/null
> @@ -1,334 +0,0 @@
> -COarse-grained LOck-stepping Virtual Machines for Non-stop Service
> -----------------------------------------
> -Copyright (c) 2016 Intel Corporation
> -Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> -Copyright (c) 2016 Fujitsu, Corp.
> -
> -This work is licensed under the terms of the GNU GPL, version 2 or later.
> -See the COPYING file in the top-level directory.
> -
> -This document gives an overview of COLO's design and how to use it.
> -
> -== Background ==
> -Virtual machine (VM) replication is a well known technique for providing
> -application-agnostic software-implemented hardware fault tolerance,
> -also known as "non-stop service".
> -
> -COLO (COarse-grained LOck-stepping) is a high availability solution.
> -Both primary VM (PVM) and secondary VM (SVM) run in parallel. They receive the
> -same request from client, and generate response in parallel too.
> -If the response packets from PVM and SVM are identical, they are released
> -immediately. Otherwise, a VM checkpoint (on demand) is conducted.
> -
> -== Architecture ==
> -
> -The architecture of COLO is shown in the diagram below.
> -It consists of a pair of networked physical nodes:
> -The primary node running the PVM, and the secondary node running the SVM
> -to maintain a valid replica of the PVM.
> -PVM and SVM execute in parallel and generate output of response packets for
> -client requests according to the application semantics.
> -
> -The incoming packets from the client or external network are received by the
> -primary node, and then forwarded to the secondary node, so that both the PVM
> -and the SVM are stimulated with the same requests.
> -
> -COLO receives the outbound packets from both the PVM and SVM and compares them
> -before allowing the output to be sent to clients.
> -
> -The SVM is qualified as a valid replica of the PVM, as long as it generates
> -identical responses to all client requests. Once the differences in the outputs
> -are detected between the PVM and SVM, COLO withholds transmission of the
> -outbound packets until it has successfully synchronized the PVM state to the SVM.
> -
> - Primary Node Secondary Node
> -+------------+ +-----------------------+ +------------------------+ +------------+
> -| | | HeartBeat +<----->+ HeartBeat | | |
> -| Primary VM | +-----------+-----------+ +-----------+------------+ |Secondary VM|
> -| | | | | |
> -| | +-----------|-----------+ +-----------|------------+ | |
> -| | |QEMU +---v----+ | |QEMU +----v---+ | | |
> -| | | |Failover| | | |Failover| | | |
> -| | | +--------+ | | +--------+ | | |
> -| | | +---------------+ | | +---------------+ | | |
> -| | | | VM Checkpoint +-------------->+ VM Checkpoint | | | |
> -| | | +---------------+ | | +---------------+ | | |
> -|Requests<--------------------------\ /-----------------\ /--------------------->Requests|
> -| | | ^ ^ | | | | | | |
> -|Responses+---------------------\ /-|-|------------\ /-------------------------+Responses|
> -| | | | | | | | | | | | | | | |
> -| | | +-----------+ | | | | | | | | | | +----------+ | | |
> -| | | | COLO disk | | | | | | | | | | | | COLO disk| | | |
> -| | | | Manager +---------------------------->| Manager | | | |
> -| | | ++----------+ v v | | | | | v v | +---------++ | | |
> -| | | |+-----------+-+-+-++| | ++-+--+-+---------+ | | | |
> -| | | || COLO Proxy || | | COLO Proxy | | | | |
> -| | | || (compare packet || | |(adjust sequence | | | | |
> -| | | ||and mirror packet)|| | | and ACK) | | | | |
> -| | | |+------------+---+-+| | +-----------------+ | | | |
> -+------------+ +-----------------------+ +------------------------+ +------------+
> -+------------+ | | | | +------------+
> -| VM Monitor | | | | | | VM Monitor |
> -+------------+ | | | | +------------+
> -+---------------------------------------+ +----------------------------------------+
> -| Kernel | | | | | Kernel | |
> -+---------------------------------------+ +----------------------------------------+
> - | | | |
> - +--------------v+ +---------v---+--+ +------------------+ +v-------------+
> - | Storage | |External Network| | External Network | | Storage |
> - +---------------+ +----------------+ +------------------+ +--------------+
> -
> -
> -== Components introduction ==
> -
> -You can see there are several components in COLO's diagram of architecture.
> -Their functions are described below.
> -
> -HeartBeat:
> -Runs on both the primary and secondary nodes, to periodically check platform
> -availability. When the primary node suffers a hardware fail-stop failure,
> -the heartbeat stops responding, the secondary node will trigger a failover
> -as soon as it determines the absence.
> -
> -COLO disk Manager:
> -When primary VM writes data into image, the colo disk manager captures this data
> -and sends it to secondary VM's which makes sure the context of secondary VM's
> -image is consistent with the context of primary VM 's image.
> -For more details, please refer to docs/block-replication.txt.
> -
> -Checkpoint/Failover Controller:
> -Modifications of save/restore flow to realize continuous migration,
> -to make sure the state of VM in Secondary side is always consistent with VM in
> -Primary side.
> -
> -COLO Proxy:
> -Delivers packets to Primary and Secondary, and then compare the responses from
> -both side. Then decide whether to start a checkpoint according to some rules.
> -Please refer to docs/colo-proxy.txt for more information.
> -
> -Note:
> -HeartBeat has not been implemented yet, so you need to trigger failover process
> -by using 'x-colo-lost-heartbeat' command.
> -
> -== COLO operation status ==
> -
> -+-----------------+
> -| |
> -| Start COLO |
> -| |
> -+--------+--------+
> - |
> - | Main qmp command:
> - | migrate-set-capabilities with x-colo
> - | migrate
> - |
> - v
> -+--------+--------+
> -| |
> -| COLO running |
> -| |
> -+--------+--------+
> - |
> - | Main qmp command:
> - | x-colo-lost-heartbeat
> - | or
> - | some error happened
> - v
> -+--------+--------+
> -| | send qmp event:
> -| COLO failover | COLO_EXIT
> -| |
> -+-----------------+
> -
> -COLO use the qmp command to switch and report operation status.
> -The diagram just shows the main qmp command, you can get the detail
> -in test procedure.
> -
> -== Test procedure ==
> -Note: Here we are running both instances on the same host for testing,
> -change the IP Addresses if you want to run it on two hosts. Initially
> -127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host.
> -
> -== Startup qemu ==
> -1. Primary:
> -Note: Initially, $imagefolder/primary.qcow2 needs to be copied to all hosts.
> -You don't need to change any IP's here, because 0.0.0.0 listens on any
> -interface. The chardev's with 127.0.0.1 IP's loopback to the local qemu
> -instance.
> -
> -# imagefolder="/mnt/vms/colo-test-primary"
> -
> -# qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
> - -device piix3-usb-uhci -device usb-tablet -name primary \
> - -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
> - -device rtl8139,id=e0,netdev=hn0 \
> - -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server=on,wait=off \
> - -chardev socket,id=compare1,host=0.0.0.0,port=9004,server=on,wait=on \
> - -chardev socket,id=compare0,host=127.0.0.1,port=9001,server=on,wait=off \
> - -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
> - -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server=on,wait=off \
> - -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
> - -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
> - -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
> - -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
> - -object iothread,id=iothread1 \
> - -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
> -outdev=compare_out0,iothread=iothread1 \
> - -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
> -children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S
> -
> -2. Secondary:
> -Note: Active and hidden images need to be created only once and the
> -size should be the same as primary.qcow2. Again, you don't need to change
> -any IP's here, except for the $primary_ip variable.
> -
> -# imagefolder="/mnt/vms/colo-test-secondary"
> -# primary_ip=127.0.0.1
> -
> -# qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G
> -
> -# qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G
> -
> -# qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
> - -device piix3-usb-uhci -device usb-tablet -name secondary \
> - -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
> - -device rtl8139,id=e0,netdev=hn0 \
> - -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect-ms=1000 \
> - -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect-ms=1000 \
> - -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
> - -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
> - -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
> - -drive if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow2 \
> - -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
> -top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
> -file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
> -file.backing.backing=parent0 \
> - -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
> -children.0=childs0 \
> - -incoming tcp:0.0.0.0:9998
> -
> -
> -3. On Secondary VM's QEMU monitor, issue command
> -{"execute":"qmp_capabilities"}
> -{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
> -{"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", "data": {"host": "0.0.0.0", "port": "9999"} } } }
> -{"execute": "nbd-server-add", "arguments": {"device": "parent0", "writable": true } }
> -
> -Note:
> - a. The qmp command nbd-server-start and nbd-server-add must be run
> - before running the qmp command migrate on primary QEMU
> - b. Active disk, hidden disk and nbd target's length should be the
> - same.
> - c. It is better to put active disk and hidden disk in ramdisk. They
> - will be merged into the parent disk on failover.
> -
> -4. On Primary VM's QEMU monitor, issue command:
> -{"execute":"qmp_capabilities"}
> -{"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
> -{"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
> -{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
> -{"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
> -
> - Note:
> - a. There should be only one NBD Client for each primary disk.
> - b. The qmp command line must be run after running qmp command line in
> - secondary qemu.
> -
> -5. After the above steps, you will see, whenever you make changes to PVM, SVM will be synced.
> -You can issue command '{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }'
> -to change the idle checkpoint period time
> -
> -6. Failover test
> -You can kill one of the VMs and Failover on the surviving VM:
> -
> -If you killed the Secondary, then follow "Primary Failover". After that,
> -if you want to resume the replication, follow "Primary resume replication"
> -
> -If you killed the Primary, then follow "Secondary Failover". After that,
> -if you want to resume the replication, follow "Secondary resume replication"
> -
> -== Primary Failover ==
> -The Secondary died, resume on the Primary
> -
> -{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
> -{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
> -{"execute": "object-del", "arguments":{ "id": "comp0" } }
> -{"execute": "object-del", "arguments":{ "id": "iothread1" } }
> -{"execute": "object-del", "arguments":{ "id": "m0" } }
> -{"execute": "object-del", "arguments":{ "id": "redire0" } }
> -{"execute": "object-del", "arguments":{ "id": "redire1" } }
> -{"execute": "x-colo-lost-heartbeat" }
> -
> -== Secondary Failover ==
> -The Primary died, resume on the Secondary and prepare to become the new Primary
> -
> -{"execute": "nbd-server-stop"}
> -{"execute": "x-colo-lost-heartbeat"}
> -
> -{"execute": "object-del", "arguments":{ "id": "f2" } }
> -{"execute": "object-del", "arguments":{ "id": "f1" } }
> -{"execute": "chardev-remove", "arguments":{ "id": "red1" } }
> -{"execute": "chardev-remove", "arguments":{ "id": "red0" } }
> -
> -{"execute": "chardev-add", "arguments":{ "id": "mirror0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9003" } }, "server": true } } } }
> -{"execute": "chardev-add", "arguments":{ "id": "compare1", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9004" } }, "server": true } } } }
> -{"execute": "chardev-add", "arguments":{ "id": "compare0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": true } } } }
> -{"execute": "chardev-add", "arguments":{ "id": "compare0-0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": false } } } }
> -{"execute": "chardev-add", "arguments":{ "id": "compare_out", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": true } } } }
> -{"execute": "chardev-add", "arguments":{ "id": "compare_out0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": false } } } }
> -
> -== Primary resume replication ==
> -Resume replication after new Secondary is up.
> -
> -Start the new Secondary (Steps 2 and 3 above), then on the Primary:
> -{"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.2:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
> -
> -Wait until disk is synced, then:
> -{"execute": "stop"}
> -{"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
> -
> -{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
> -{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
> -
> -{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
> -{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
> -{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
> -{"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
> -{"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
> -
> -{"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
> -{"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.2:9998" } }
> -
> -Note:
> -If this Primary previously was a Secondary, then we need to insert the
> -filters before the filter-rewriter by using the
> -""insert": "before", "position": "id=rew0"" Options. See below.
> -
> -== Secondary resume replication ==
> -Become Primary and resume replication after new Secondary is up. Note
> -that now 127.0.0.1 is the Secondary and 127.0.0.2 is the Primary.
> -
> -Start the new Secondary (Steps 2 and 3 above, but with primary_ip=127.0.0.2),
> -then on the old Secondary:
> -{"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.1:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
> -
> -Wait until disk is synced, then:
> -{"execute": "stop"}
> -{"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
> -
> -{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
> -{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
> -
> -{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
> -{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
> -{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
> -{"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
> -{"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
> -
> -{"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
> -{"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
> -
> -== TODO ==
> -1. Support shared storage.
> -2. Develop the heartbeat part.
> -3. Reduce checkpoint VM’s downtime while doing checkpoint.
> diff --git a/docs/system/index.rst b/docs/system/index.rst
> index 427b020483104f6589878bbf255a367ae114c61b..6268c41aea9c74dc3e59d896b5ae082360bfbb1a 100644
> --- a/docs/system/index.rst
> +++ b/docs/system/index.rst
> @@ -41,3 +41,4 @@ or Hypervisor.Framework.
> igvm
> vm-templating
> sriov
> + qemu-colo
> diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
> new file mode 100644
> index 0000000000000000000000000000000000000000..4b5fbbf398f8a5c4ea6baad615bde94b2b4678d2
> --- /dev/null
> +++ b/docs/system/qemu-colo.rst
> @@ -0,0 +1,360 @@
> +QEMU COLO Fault Tolerance
> +=========================
> +
> +| Copyright (c) 2016 Intel Corporation
> +| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> +| Copyright (c) 2016 Fujitsu, Corp.
> +
> +This work is licensed under the terms of the GNU GPL, version 2 or later.
> +See the COPYING file in the top-level directory.
> +
> +This document gives an overview of COLO's design and how to use it.
> +
> +Background
> +----------
> +Virtual machine (VM) replication is a well known technique for providing
> +application-agnostic software-implemented hardware fault tolerance,
> +also known as "non-stop service".
> +
> +COLO (COarse-grained LOck-stepping) is a high availability solution.
> +Both primary VM (PVM) and secondary VM (SVM) run in parallel. They receive the
> +same request from client, and generate response in parallel too.
> +If the response packets from PVM and SVM are identical, they are released
> +immediately. Otherwise, a VM checkpoint (on demand) is conducted.
> +
> +Architecture
> +------------
> +The architecture of COLO is shown in the diagram below.
> +It consists of a pair of networked physical nodes:
> +The primary node running the PVM, and the secondary node running the SVM
> +to maintain a valid replica of the PVM.
> +PVM and SVM execute in parallel and generate output of response packets for
> +client requests according to the application semantics.
> +
> +The incoming packets from the client or external network are received by the
> +primary node, and then forwarded to the secondary node, so that both the PVM
> +and the SVM are stimulated with the same requests.
> +
> +COLO receives the outbound packets from both the PVM and SVM and compares them
> +before allowing the output to be sent to clients.
> +
> +The SVM is qualified as a valid replica of the PVM, as long as it generates
> +identical responses to all client requests. Once the differences in the outputs
> +are detected between the PVM and SVM, COLO withholds transmission of the
> +outbound packets until it has successfully synchronized the PVM state to the SVM.
> +
> +Overview::
> +
> + Primary Node Secondary Node
> + +------------+ +-----------------------+ +------------------------+ +------------+
> + | | | HeartBeat +<----->+ HeartBeat | | |
> + | Primary VM | +-----------+-----------+ +-----------+------------+ |Secondary VM|
> + | | | | | |
> + | | +-----------|-----------+ +-----------|------------+ | |
> + | | |QEMU +---v----+ | |QEMU +----v---+ | | |
> + | | | |Failover| | | |Failover| | | |
> + | | | +--------+ | | +--------+ | | |
> + | | | +---------------+ | | +---------------+ | | |
> + | | | | VM Checkpoint +-------------->+ VM Checkpoint | | | |
> + | | | +---------------+ | | +---------------+ | | |
> + |Requests<--------------------------\ /-----------------\ /--------------------->Requests|
> + | | | ^ ^ | | | | | | |
> + |Responses+---------------------\ /-|-|------------\ /-------------------------+Responses|
> + | | | | | | | | | | | | | | | |
> + | | | +-----------+ | | | | | | | | | | +----------+ | | |
> + | | | | COLO disk | | | | | | | | | | | | COLO disk| | | |
> + | | | | Manager +---------------------------->| Manager | | | |
> + | | | ++----------+ v v | | | | | v v | +---------++ | | |
> + | | | |+-----------+-+-+-++| | ++-+--+-+---------+ | | | |
> + | | | || COLO Proxy || | | COLO Proxy | | | | |
> + | | | || (compare packet || | |(adjust sequence | | | | |
> + | | | ||and mirror packet)|| | | and ACK) | | | | |
> + | | | |+------------+---+-+| | +-----------------+ | | | |
> + +------------+ +-----------------------+ +------------------------+ +------------+
> + +------------+ | | | | +------------+
> + | VM Monitor | | | | | | VM Monitor |
> + +------------+ | | | | +------------+
> + +---------------------------------------+ +----------------------------------------+
> + | Kernel | | | | | Kernel | |
> + +---------------------------------------+ +----------------------------------------+
> + | | | |
> + +--------------v+ +---------v---+--+ +------------------+ +v-------------+
> + | Storage | |External Network| | External Network | | Storage |
> + +---------------+ +----------------+ +------------------+ +--------------+
> +
> +Components introduction
> +^^^^^^^^^^^^^^^^^^^^^^^
> +The architecture diagram above shows several COLO components.
> +Their functions are described below.
> +
> +HeartBeat
> +~~~~~~~~~
> +Runs on both the primary and secondary nodes, to periodically check platform
> +availability. When the primary node suffers a hardware fail-stop failure,
> +the heartbeat stops responding and the secondary node triggers a failover
> +as soon as it detects the absence.
> +
> +COLO disk Manager
> +~~~~~~~~~~~~~~~~~
> +When the primary VM writes data to its image, the COLO disk manager captures
> +this data and sends it to the secondary VM, which ensures that the content of
> +the secondary VM's image stays consistent with that of the primary VM's image.
> +For more details, please refer to docs/block-replication.txt.
> +
> +Checkpoint/Failover Controller
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Modifies the save/restore flow to implement continuous migration, so that
> +the state of the VM on the Secondary side is always consistent with the VM on
> +the Primary side.
> +
> +COLO Proxy
> +~~~~~~~~~~
> +Delivers packets to the Primary and Secondary, then compares the responses from
> +both sides and decides whether to start a checkpoint according to some rules.
> +Please refer to docs/colo-proxy.txt for more information.
> +
> +Note:
> +HeartBeat has not been implemented yet, so you need to trigger the failover
> +process with the ``x-colo-lost-heartbeat`` command.
> +
> +COLO operation status
> +^^^^^^^^^^^^^^^^^^^^^
> +
> +Overview::
> +
> + +-----------------+
> + | |
> + | Start COLO |
> + | |
> + +--------+--------+
> + |
> + | Main qmp command:
> + | migrate-set-capabilities with x-colo
> + | migrate
> + |
> + v
> + +--------+--------+
> + | |
> + | COLO running |
> + | |
> + +--------+--------+
> + |
> + | Main qmp command:
> + | x-colo-lost-heartbeat
> + | or
> + | some error happened
> + v
> + +--------+--------+
> + | | send qmp event:
> + | COLO failover | COLO_EXIT
> + | |
> + +-----------------+
> +
> +
> +COLO uses QMP commands to switch and report its operation status.
> +The diagram shows only the main QMP commands; the details are in the
> +test procedure.
> +
> +Test procedure
> +--------------
> +Note: Here we are running both instances on the same host for testing;
> +change the IP addresses if you want to run it on two hosts. Initially
> +``127.0.0.1`` is the Primary Host and ``127.0.0.2`` is the Secondary Host.
> +
> +Startup qemu
> +^^^^^^^^^^^^
> +**1. Primary**:
> +Note: Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
> +You don't need to change any IPs here, because ``0.0.0.0`` listens on any
> +interface. The chardevs with ``127.0.0.1`` addresses loop back to the local
> +QEMU instance::
> +
> + # imagefolder="/mnt/vms/colo-test-primary"
> +
> + # qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
> + -device piix3-usb-uhci -device usb-tablet -name primary \
> + -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
> + -device rtl8139,id=e0,netdev=hn0 \
> + -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server=on,wait=off \
> + -chardev socket,id=compare1,host=0.0.0.0,port=9004,server=on,wait=on \
> + -chardev socket,id=compare0,host=127.0.0.1,port=9001,server=on,wait=off \
> + -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
> + -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server=on,wait=off \
> + -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
> + -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
> + -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
> + -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
> + -object iothread,id=iothread1 \
> + -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
> + outdev=compare_out0,iothread=iothread1 \
> + -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
> + children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S
> +
> +
> +**2. Secondary**:
> +Note: Active and hidden images need to be created only once and the
> +size should be the same as ``primary.qcow2``. Again, you don't need to change
> +any IPs here, except for the ``$primary_ip`` variable::
> +
> + # imagefolder="/mnt/vms/colo-test-secondary"
> + # primary_ip=127.0.0.1
> +
> + # qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G
> +
> + # qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G
> +
> + # qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
> + -device piix3-usb-uhci -device usb-tablet -name secondary \
> + -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
> + -device rtl8139,id=e0,netdev=hn0 \
> + -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect-ms=1000 \
> + -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect-ms=1000 \
> + -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
> + -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
> + -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
> + -drive if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow2 \
> + -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
> + top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
> + file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
> + file.backing.backing=parent0 \
> + -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
> + children.0=childs0 \
> + -incoming tcp:0.0.0.0:9998
> +
> +
> +**3.** On Secondary VM's QEMU monitor, issue command::
> +
> + {"execute":"qmp_capabilities"}
> + {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
> + {"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", "data": {"host": "0.0.0.0", "port": "9999"} } } }
> + {"execute": "nbd-server-add", "arguments": {"device": "parent0", "writable": true } }
> +
> +Note:
> + a. The QMP commands ``nbd-server-start`` and ``nbd-server-add`` must be run
> + before running the QMP command ``migrate`` on the primary QEMU.
> + b. Active disk, hidden disk and nbd target's length should be the
> + same.
> + c. It is better to put active disk and hidden disk in ramdisk. They
> + will be merged into the parent disk on failover.
> +
> +**4.** On Primary VM's QEMU monitor, issue command::
> +
> + {"execute":"qmp_capabilities"}
> + {"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
> + {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
> + {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
> + {"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
> +
> +Note:
> + a. There should be only one NBD Client for each primary disk.
> + b. The qmp command line must be run after running qmp command line in
> + secondary qemu.
> +
> +**5.** After the above steps, you will see, whenever you make changes to PVM, SVM will be synced.
> +You can issue command ``{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }``
> +to change the idle checkpoint period time
> +
> +Failover test
> +^^^^^^^^^^^^^
> +You can kill one of the VMs and Failover on the surviving VM:
> +
> +If you killed the Secondary, then follow "Primary Failover".
> +After that, if you want to resume the replication, follow "Primary resume replication"
> +
> +If you killed the Primary, then follow "Secondary Failover".
> +After that, if you want to resume the replication, follow "Secondary resume replication"
> +
> +Primary Failover
> +~~~~~~~~~~~~~~~~
> +The Secondary died, resume on the Primary::
> +
> + {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
> + {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
> + {"execute": "object-del", "arguments":{ "id": "comp0" } }
> + {"execute": "object-del", "arguments":{ "id": "iothread1" } }
> + {"execute": "object-del", "arguments":{ "id": "m0" } }
> + {"execute": "object-del", "arguments":{ "id": "redire0" } }
> + {"execute": "object-del", "arguments":{ "id": "redire1" } }
> + {"execute": "x-colo-lost-heartbeat" }
> +
> +Secondary Failover
> +~~~~~~~~~~~~~~~~~~
> +The Primary died, resume on the Secondary and prepare to become the new Primary::
> +
> + {"execute": "nbd-server-stop"}
> + {"execute": "x-colo-lost-heartbeat"}
> +
> + {"execute": "object-del", "arguments":{ "id": "f2" } }
> + {"execute": "object-del", "arguments":{ "id": "f1" } }
> + {"execute": "chardev-remove", "arguments":{ "id": "red1" } }
> + {"execute": "chardev-remove", "arguments":{ "id": "red0" } }
> +
> + {"execute": "chardev-add", "arguments":{ "id": "mirror0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9003" } }, "server": true } } } }
> + {"execute": "chardev-add", "arguments":{ "id": "compare1", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9004" } }, "server": true } } } }
> + {"execute": "chardev-add", "arguments":{ "id": "compare0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": true } } } }
> + {"execute": "chardev-add", "arguments":{ "id": "compare0-0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": false } } } }
> + {"execute": "chardev-add", "arguments":{ "id": "compare_out", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": true } } } }
> + {"execute": "chardev-add", "arguments":{ "id": "compare_out0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": false } } } }
> +
> +Primary resume replication
> +~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Resume replication after new Secondary is up.
> +
> +Start the new Secondary (Steps 2 and 3 above), then on the Primary::
> +
> + {"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.2:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
> +
> +Wait until disk is synced, then::
> +
> + {"execute": "stop"}
> + {"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
> +
> + {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
> + {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
> +
> + {"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
> + {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
> + {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
> + {"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
> + {"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
> +
> + {"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
> + {"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.2:9998" } }
> +
> +Note:
> +If this Primary previously was a Secondary, then we need to insert the
> +filters before the filter-rewriter by using the
> +""insert": "before", "position": "id=rew0"" Options. See below.
> +
> +Secondary resume replication
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +Become Primary and resume replication after new Secondary is up. Note
> +that now 127.0.0.1 is the Secondary and 127.0.0.2 is the Primary.
> +
> +Start the new Secondary (Steps 2 and 3 above, but with primary_ip=127.0.0.2),
> +then on the old Secondary::
> +
> + {"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.1:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
> +
> +Wait until disk is synced, then::
> +
> + {"execute": "stop"}
> + {"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
> +
> + {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
> + {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
> +
> + {"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
> + {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
> + {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
> + {"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
> + {"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
> +
> + {"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
> + {"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
> +
> +TODO
> +----
> +1. Support shared storage.
> +2. Develop the heartbeat part.
> +3. Reduce checkpoint VM’s downtime while doing checkpoint.
>
> --
> 2.39.5
>
* Re: [PATCH v5 14/16] qemu-colo.rst: Miscellaneous changes
2026-02-03 10:15 ` [PATCH v5 14/16] qemu-colo.rst: Miscellaneous changes Lukas Straub
@ 2026-02-06 7:52 ` Zhang Chen
0 siblings, 0 replies; 33+ messages in thread
From: Zhang Chen @ 2026-02-06 7:52 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Peter Xu, Fabiano Rosas, Laurent Vivier,
Paolo Bonzini, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 3, 2026 at 6:16 PM Lukas Straub <lukasstraub2@web.de> wrote:
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Zhang Chen <zhangckid@gmail.com>
By the way, I will move the docs/colo-proxy.txt file in the same way.
Thanks
Chen
> ---
> docs/system/qemu-colo.rst | 35 ++++++++++++++++++-----------------
> 1 file changed, 18 insertions(+), 17 deletions(-)
>
> diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
> index 4b5fbbf398f8a5c4ea6baad615bde94b2b4678d2..a70e61aa09391cda933031535fa982d27cf6654b 100644
> --- a/docs/system/qemu-colo.rst
> +++ b/docs/system/qemu-colo.rst
> @@ -1,13 +1,6 @@
> Qemu COLO Fault Tolerance
> =========================
>
> -| Copyright (c) 2016 Intel Corporation
> -| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> -| Copyright (c) 2016 Fujitsu, Corp.
> -
> -This work is licensed under the terms of the GNU GPL, version 2 or later.
> -See the COPYING file in the top-level directory.
> -
> This document gives an overview of COLO's design and how to use it.
>
> Background
> @@ -82,8 +75,8 @@ Overview::
> | Storage | |External Network| | External Network | | Storage |
> +---------------+ +----------------+ +------------------+ +--------------+
>
> -Components introduction
> -^^^^^^^^^^^^^^^^^^^^^^^
> +Components
> +^^^^^^^^^^
> You can see there are several components in COLO's diagram of architecture.
> Their functions are described below.
>
> @@ -157,14 +150,21 @@ in test procedure.
>
> Test procedure
> --------------
> -Note: Here we are running both instances on the same host for testing,
> +
> +Setup
> +^^^^^
> +
> +Here we are running both instances on the same host for testing,
> change the IP Addresses if you want to run it on two hosts. Initially
> ``127.0.0.1`` is the Primary Host and ``127.0.0.2`` is the Secondary Host.
>
> +COLO uses double the guest ram size on the secondary side. The Qemu version
> +should be the same on both hosts.
> +
> Startup qemu
> ^^^^^^^^^^^^
> **1. Primary**:
> -Note: Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
> +Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
> You don't need to change any IP's here, because ``0.0.0.0`` listens on any
> interface. The chardev's with ``127.0.0.1`` IP's loopback to the local qemu
> instance::
> @@ -192,7 +192,7 @@ instance::
>
>
> **2. Secondary**:
> -Note: Active and hidden images need to be created only once and the
> +Active and hidden images need to be created only once and the
> size should be the same as ``primary.qcow2``. Again, you don't need to change
> any IP's here, except for the ``$primary_ip`` variable::
>
> @@ -353,8 +353,9 @@ Wait until disk is synced, then::
> {"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
> {"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
>
> -TODO
> -----
> -1. Support shared storage.
> -2. Develop the heartbeat part.
> -3. Reduce checkpoint VM’s downtime while doing checkpoint.
> +| Copyright (c) 2016 Intel Corporation
> +| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> +| Copyright (c) 2016 Fujitsu, Corp.
> +
> +This work is licensed under the terms of the GNU GPL, version 2 or later.
> +See the COPYING file in the top-level directory.
>
> --
> 2.39.5
>
* Re: [PATCH v5 12/16] migration-test: Add COLO migration unit test
2026-02-03 10:15 ` [PATCH v5 12/16] migration-test: Add COLO migration unit test Lukas Straub
@ 2026-02-09 15:56 ` Peter Xu
0 siblings, 0 replies; 33+ messages in thread
From: Peter Xu @ 2026-02-09 15:56 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 03, 2026 at 11:15:18AM +0100, Lukas Straub wrote:
> +void migration_test_add_colo(MigrationTestEnv *env)
> +{
> + /*
> + * COLO crashes with TCG accelerator.
> + */
> + if (!env->has_kvm) {
Lukas,
Per the discussion here:
https://lore.kernel.org/qemu-devel/20260206201050.6a692a34@penguin/
Do you think we can drop this line here with some proper fix in the code
instead?
> + g_test_skip("COLO requires KVM accelerator");
> + return;
> + }
The test itself looks all good here otherwise. If this is the only reason
to respin, we can also do it as a follow up.
Thanks,
--
Peter Xu
* Re: [PATCH v5 03/16] colo: Setup ram cache in normal migration path
2026-02-03 10:15 ` [PATCH v5 03/16] colo: Setup ram cache in normal migration path Lukas Straub
@ 2026-02-09 16:10 ` Peter Xu
0 siblings, 0 replies; 33+ messages in thread
From: Peter Xu @ 2026-02-09 16:10 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 03, 2026 at 11:15:09AM +0100, Lukas Straub wrote:
> Since
> 121ccedc2b migration: block incoming colo when capability is disabled
>
> x-colo capability needs to be always enabled on the incoming side.
> So migration_incoming_colo_enabled() and migrate_colo() are equivalent
> with migrate_colo() being easier to reason about since it is always true
> during the whole migration.
>
> Use migrate_colo() to initialize the ram cache in the normal migration path.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Since the goal of this patch is to drop the COLO migration command, this
looks like an improvement indeed.
Reviewed-by: Peter Xu <peterx@redhat.com>
I'll still comment though for possible future updates..
> ---
> migration/migration.c | 18 ++++++++++++++----
> migration/savevm.c | 14 +-------------
> 2 files changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index b103a82fc0b83009d01d238ff16c0a542d83509f..a73d842ad8b060dc84273ade36ef7dc8b87421f3 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -630,10 +630,6 @@ int migration_incoming_enable_colo(Error **errp)
> return -EINVAL;
> }
>
> - if (ram_block_discard_disable(true)) {
> - error_setg(errp, "COLO: cannot disable RAM discard");
> - return -EBUSY;
> - }
> migration_colo_enabled = true;
> return 0;
> }
> @@ -770,6 +766,20 @@ process_incoming_migration_co(void *opaque)
>
> assert(mis->from_src_file);
>
> + if (migrate_colo()) {
> + if (ram_block_discard_disable(true)) {
IMHO this is something we could have done at migrate_incoming and fail the
QMP command directly, rather than waiting until this late.
> + error_setg(&local_err, "COLO: cannot disable RAM discard");
> + goto fail;
> + }
> +
> + ret = colo_init_ram_cache(&local_err);
This might be more suitable to be put in ram's load_setup() (which should
happen at the initial migration phase, before colo taking snapshots), then
I believe this function can be unexported too.
> + if (ret) {
> + error_prepend(&local_err, "failed to init colo RAM cache: %d: ",
> + ret);
> + goto fail;
> + }
> + }
> +
> mis->largest_page_size = qemu_ram_pagesize_largest();
> postcopy_state_set(POSTCOPY_INCOMING_NONE);
> migrate_set_state(&mis->state, MIGRATION_STATUS_SETUP,
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 3dc812a7bbb4e8f5321114c9919d4619798fed5e..0353ac2d0de819b6547a1f771e6a4c3b8fb1e4ef 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2407,19 +2407,7 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis,
> Error **errp)
> {
> ERRP_GUARD();
> - int ret;
> -
> - ret = migration_incoming_enable_colo(errp);
> - if (ret < 0) {
> - return ret;
> - }
> -
> - ret = colo_init_ram_cache(errp);
> - if (ret) {
> - error_prepend(errp, "failed to init colo RAM cache: %d: ", ret);
> - migration_incoming_disable_colo();
> - }
> - return ret;
> + return migration_incoming_enable_colo(errp);
> }
>
> static int loadvm_postcopy_handle_switchover_start(Error **errp)
>
> --
> 2.39.5
>
--
Peter Xu
* Re: [PATCH v5 04/16] colo: Replace migration_incoming_colo_enabled() with migrate_colo()
2026-02-03 10:15 ` [PATCH v5 04/16] colo: Replace migration_incoming_colo_enabled() with migrate_colo() Lukas Straub
@ 2026-02-09 16:11 ` Peter Xu
0 siblings, 0 replies; 33+ messages in thread
From: Peter Xu @ 2026-02-09 16:11 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 03, 2026 at 11:15:10AM +0100, Lukas Straub wrote:
> Since
> 121ccedc2b migration: block incoming colo when capability is disabled
>
> x-colo capability needs to be always enabled on the incoming side.
> So migration_incoming_colo_enabled() and migrate_colo() are equivalent
> with migrate_colo() being easier to reason about since it is always true
> during the whole migration.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH v5 05/16] colo: Remove ENABLE_COLO loadvm command functions
2026-02-03 10:15 ` [PATCH v5 05/16] colo: Remove ENABLE_COLO loadvm command functions Lukas Straub
@ 2026-02-09 16:13 ` Peter Xu
2026-02-10 13:28 ` Lukas Straub
0 siblings, 1 reply; 33+ messages in thread
From: Peter Xu @ 2026-02-09 16:13 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 03, 2026 at 11:15:11AM +0100, Lukas Straub wrote:
> No need for it anymore now that x-colo capability is required
> on incoming side.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
IIUC this patch needs to be squashed into the next or it will break COLO..
--
Peter Xu
* Re: [PATCH v5 06/16] colo: Don't send ENABLE_COLO command
2026-02-03 10:15 ` [PATCH v5 06/16] colo: Don't send ENABLE_COLO command Lukas Straub
@ 2026-02-09 16:17 ` Peter Xu
2026-02-10 13:29 ` Lukas Straub
0 siblings, 1 reply; 33+ messages in thread
From: Peter Xu @ 2026-02-09 16:17 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 03, 2026 at 11:15:12AM +0100, Lukas Straub wrote:
> We only support COLO with the same version on both sides so this is
> not needed anymore.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Besides squashing with prior patch, another nitpick inline:
> ---
> migration/migration.c | 5 -----
> migration/savevm.c | 8 +-------
> migration/savevm.h | 1 -
> migration/trace-events | 1 -
> 4 files changed, 1 insertion(+), 14 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 3f3fc5276bb067ae1960e4b675b33208ad641b23..5515be1bf305b40ba0b590136df18a53451872c5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3496,11 +3496,6 @@ static void *migration_thread(void *opaque)
> qemu_savevm_send_postcopy_advise(s->to_dst_file);
> }
>
> - if (migrate_colo()) {
> - /* Notify migration destination that we enable COLO */
> - qemu_savevm_send_colo_enable(s->to_dst_file);
> - }
> -
> if (migrate_auto_converge()) {
> /* Start RAMBlock dirty bitmap sync timer */
> cpu_throttle_dirty_sync_timer(true);
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 413688b75f4bee6cb10878eb51886cf6ba14872d..a3af09616a7bd22194ffba3cfb7cc4cf15fc88e0 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -90,7 +90,7 @@ enum qemu_vm_cmd {
> were previously sent during
> precopy but are dirty. */
> MIG_CMD_PACKAGED, /* Send a wrapped stream within this stream */
> - MIG_CMD_ENABLE_COLO, /* Enable COLO */
> + MIG_CMD_UNUSED_0, /* Unused since 11.0 */
IMHO it's not "when unused" that matters, but "when it was used, and used
as what" that matters. E.g. if we received this unused command in some
future QEMU debugging session, we can guess where it came from with that info.
Hence, I'd suggest:
MIG_CMD_DEPRECATED_0, /* Prior to 10.2, used as MIG_CMD_ENABLE_COLO */
I still think DEPRECATED is better here, as it reminds people that we shouldn't
"reuse" it and that it is better left untouched to catch surprises, whereas
"UNUSED" may imply "you can use it now".
Other than that looks all good, thanks.
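
To illustrate why the slot is kept rather than freed: in a C enum, the wire
value of every later command depends on its position, so the placeholder has
to stay in place. A minimal sketch with a shortened, hypothetical command
list (the DEMO_* names are made up for illustration and are not QEMU's real
enum):

```c
#include <assert.h>

/* Hypothetical "before" layout: three commands, values 0, 1, 2. */
enum demo_cmd_old {
    DEMO_OLD_PACKAGED,
    DEMO_OLD_ENABLE_COLO,
    DEMO_OLD_POSTCOPY_RESUME,
};

/* Hypothetical "after" layout: the retired command keeps its slot. */
enum demo_cmd_new {
    DEMO_NEW_PACKAGED,
    DEMO_NEW_DEPRECATED_0,    /* formerly ENABLE_COLO; do not reuse */
    DEMO_NEW_POSTCOPY_RESUME,
};

/* Compile-time check: commands after the retired one must not shift. */
static_assert((int)DEMO_OLD_POSTCOPY_RESUME == (int)DEMO_NEW_POSTCOPY_RESUME,
              "wire value of later commands must stay stable");

/* Returns 1 if the placeholder occupies the retired command's old value. */
int demo_placeholder_keeps_slot(void)
{
    return (int)DEMO_OLD_ENABLE_COLO == (int)DEMO_NEW_DEPRECATED_0;
}
```

Deleting the placeholder instead would move DEMO_NEW_POSTCOPY_RESUME from 2
to 1 and make the static_assert fail, which is the kind of surprise the
comment is meant to prevent.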
> MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */
> MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */
> MIG_CMD_SWITCHOVER_START, /* Switchover start notification */
> @@ -1092,12 +1092,6 @@ static void qemu_savevm_command_send(QEMUFile *f,
> qemu_fflush(f);
> }
>
> -void qemu_savevm_send_colo_enable(QEMUFile *f)
> -{
> - trace_savevm_send_colo_enable();
> - qemu_savevm_command_send(f, MIG_CMD_ENABLE_COLO, 0, NULL);
> -}
> -
> void qemu_savevm_send_ping(QEMUFile *f, uint32_t value)
> {
> uint32_t buf;
> diff --git a/migration/savevm.h b/migration/savevm.h
> index 125a2507b7279412bcb0745b95a774874c31c54f..0a1e5bfd1ca125565a4c90c6f31b2f8c94404117 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -62,7 +62,6 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
> uint16_t len,
> uint64_t *start_list,
> uint64_t *length_list);
> -void qemu_savevm_send_colo_enable(QEMUFile *f);
> void qemu_savevm_live_state(QEMUFile *f);
> int qemu_save_device_state(QEMUFile *f);
>
> diff --git a/migration/trace-events b/migration/trace-events
> index 91d7506634c9f110e8f0b5f9183728058fe6542a..cfd4d58a0f82ec299ca9e8a9260dd3c3a210cece 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -37,7 +37,6 @@ savevm_send_ping(uint32_t val) "0x%x"
> savevm_send_postcopy_listen(void) ""
> savevm_send_postcopy_run(void) ""
> savevm_send_postcopy_resume(void) ""
> -savevm_send_colo_enable(void) ""
> savevm_send_recv_bitmap(char *name) "%s"
> savevm_send_switchover_start(void) ""
> savevm_state_setup(void) ""
>
> --
> 2.39.5
>
--
Peter Xu
* Re: [PATCH v5 07/16] ram: Remove colo special-casing
2026-02-03 10:15 ` [PATCH v5 07/16] ram: Remove colo special-casing Lukas Straub
@ 2026-02-09 16:19 ` Peter Xu
0 siblings, 0 replies; 33+ messages in thread
From: Peter Xu @ 2026-02-09 16:19 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 03, 2026 at 11:15:13AM +0100, Lukas Straub wrote:
> We only enter colo state after the precopy migration is finished
> so this if is always taken.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH v5 08/16] Move ram state receive into multifd_ram_state_recv()
2026-02-03 10:15 ` [PATCH v5 08/16] Move ram state receive into multifd_ram_state_recv() Lukas Straub
@ 2026-02-09 16:20 ` Peter Xu
0 siblings, 0 replies; 33+ messages in thread
From: Peter Xu @ 2026-02-09 16:20 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
Nit: subject can be prefixed with "migration/multifd:".
--
Peter Xu
* Re: [PATCH v5 09/16] multifd: Add COLO support
2026-02-03 10:15 ` [PATCH v5 09/16] multifd: Add COLO support Lukas Straub
2026-02-04 18:13 ` Fabiano Rosas
@ 2026-02-09 16:25 ` Peter Xu
1 sibling, 0 replies; 33+ messages in thread
From: Peter Xu @ 2026-02-09 16:25 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert, Juan Quintela
On Tue, Feb 03, 2026 at 11:15:15AM +0100, Lukas Straub wrote:
> Like in the normal ram_load() path, put the received pages into the
> colo cache and mark the pages in the bitmap so that they will be
> flushed to the guest later.
>
> Multifd with COLO is useful to reduce the VM pause time during checkpointing
> for latency sensitive workloads. In such workloads the worst-case latency
> is especially important.
>
> Also, this is already worth it for the precopy phase as it helps with
> converging. Moreover, multifd migration is the preferred way to do migration
> nowadays and this allows using multifd compression with COLO.
>
> Benchmark:
> Cluster nodes
> - Intel Xeon E5-2630 v3
> - 48GB RAM
> - 10G Ethernet
> Guest
> - Windows Server 2016
> - 6GB RAM
> - 4 cores
> Workload
> - Upload a file to the guest with SMB to simulate moderate
> memory dirtying
> - Measure the memory transfer time portion of each checkpoint
> - 600ms COLO checkpoint interval
>
> Results
> Plain
> idle mean: 4.50ms 99per: 10.33ms
> load mean: 24.30ms 99per: 78.05ms
> Multifd-4
> idle mean: 6.48ms 99per: 10.41ms
> load mean: 14.12ms 99per: 31.27ms
>
> Evaluation
> While multifd has slightly higher latency when the guest idles, it is
> 10ms faster under load and, more importantly, its worst-case latency is
> less than 1/2 of plain under load, as can be seen in the 99th percentile.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
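
As a quick cross-check of the evaluation against the quoted results: the mean
under load drops by 24.30 - 14.12 = 10.18 ms, and the 99th percentile ratio
is 31.27 / 78.05, roughly 0.40, i.e. below 1/2. A small sketch of that
arithmetic (the helper names are hypothetical; only the four numbers come
from the commit message):

```c
#include <assert.h>

/* Absolute improvement of the mean transfer time, in milliseconds. */
double demo_mean_gain_ms(double plain_mean_ms, double multifd_mean_ms)
{
    return plain_mean_ms - multifd_mean_ms;
}

/* Ratio of multifd to plain 99th percentile latency (smaller is better). */
double demo_p99_ratio(double plain_p99_ms, double multifd_p99_ms)
{
    assert(plain_p99_ms > 0.0);
    return multifd_p99_ms / plain_p99_ms;
}
```

With the "load" rows from the results, demo_mean_gain_ms(24.30, 14.12) gives
about 10.18 and demo_p99_ratio(78.05, 31.27) gives about 0.40.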
--
Peter Xu
* Re: [PATCH v5 10/16] Call colo_release_ram_cache() after multifd threads terminate
2026-02-03 10:15 ` [PATCH v5 10/16] Call colo_release_ram_cache() after multifd threads terminate Lukas Straub
@ 2026-02-09 16:27 ` Peter Xu
0 siblings, 0 replies; 33+ messages in thread
From: Peter Xu @ 2026-02-09 16:27 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 03, 2026 at 11:15:16AM +0100, Lukas Straub wrote:
> The multifd threads still may access the colo cache, so release it
> only after they terminate.
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH v5 05/16] colo: Remove ENABLE_COLO loadvm command functions
2026-02-09 16:13 ` Peter Xu
@ 2026-02-10 13:28 ` Lukas Straub
2026-02-10 14:38 ` Peter Xu
0 siblings, 1 reply; 33+ messages in thread
From: Lukas Straub @ 2026-02-10 13:28 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Mon, 9 Feb 2026 11:13:19 -0500
Peter Xu <peterx@redhat.com> wrote:
> On Tue, Feb 03, 2026 at 11:15:11AM +0100, Lukas Straub wrote:
> > No need for it anymore now that x-colo capability is required
> > on incoming side.
> >
> > Reviewed-by: Fabiano Rosas <farosas@suse.de>
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
>
> IIUC this patch needs to be squashed into the next or it will break COLO..
>
No, it's fine actually. Now when we receive the MIG_CMD_ENABLE_COLO
command, we just go to the return 0 at the end of
loadvm_process_command().
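
The behaviour described here can be pictured with a simplified model of such
a dispatcher. This is only a sketch under assumptions, not the actual
loadvm_process_command() code: the DEMO_* names and the "handled" flag are
hypothetical, and it assumes the command id is still known to the validity
check while its case label in the switch is gone:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command ids; ENABLE_COLO is still a valid id. */
enum demo_cmd {
    DEMO_CMD_PING,
    DEMO_CMD_ENABLE_COLO,
    DEMO_CMD_MAX,
};

/*
 * Simplified dispatcher: unknown ids are rejected, known ids without a
 * case label fall out of the switch and reach the final "return 0".
 */
int demo_process_command(int cmd, int *handled)
{
    assert(handled != NULL);
    *handled = 0;

    if (cmd < 0 || cmd >= DEMO_CMD_MAX) {
        return -1;              /* unknown command id: error */
    }

    switch (cmd) {
    case DEMO_CMD_PING:
        *handled = 1;           /* stands in for a real handler */
        return 0;
    default:
        break;                  /* no handler: nothing to do */
    }

    return 0;                   /* accepted and silently ignored */
}
```

In this model, a stream that still carries DEMO_CMD_ENABLE_COLO is accepted
and ignored, so removing the handler alone does not make the load fail.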
* Re: [PATCH v5 06/16] colo: Don't send ENABLE_COLO command
2026-02-09 16:17 ` Peter Xu
@ 2026-02-10 13:29 ` Lukas Straub
0 siblings, 0 replies; 33+ messages in thread
From: Lukas Straub @ 2026-02-10 13:29 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Mon, 9 Feb 2026 11:17:41 -0500
Peter Xu <peterx@redhat.com> wrote:
> On Tue, Feb 03, 2026 at 11:15:12AM +0100, Lukas Straub wrote:
> > We only support COLO with the same version on both sides so this is
> > not needed anymore.
> >
> > Reviewed-by: Fabiano Rosas <farosas@suse.de>
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
>
> Besides squashing with prior patch, another nitpick inline:
>
> > ---
> > migration/migration.c | 5 -----
> > migration/savevm.c | 8 +-------
> > migration/savevm.h | 1 -
> > migration/trace-events | 1 -
> > 4 files changed, 1 insertion(+), 14 deletions(-)
> >
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 3f3fc5276bb067ae1960e4b675b33208ad641b23..5515be1bf305b40ba0b590136df18a53451872c5 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -3496,11 +3496,6 @@ static void *migration_thread(void *opaque)
> > qemu_savevm_send_postcopy_advise(s->to_dst_file);
> > }
> >
> > - if (migrate_colo()) {
> > - /* Notify migration destination that we enable COLO */
> > - qemu_savevm_send_colo_enable(s->to_dst_file);
> > - }
> > -
> > if (migrate_auto_converge()) {
> > /* Start RAMBlock dirty bitmap sync timer */
> > cpu_throttle_dirty_sync_timer(true);
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 413688b75f4bee6cb10878eb51886cf6ba14872d..a3af09616a7bd22194ffba3cfb7cc4cf15fc88e0 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -90,7 +90,7 @@ enum qemu_vm_cmd {
> > were previously sent during
> > precopy but are dirty. */
> > MIG_CMD_PACKAGED, /* Send a wrapped stream within this stream */
> > - MIG_CMD_ENABLE_COLO, /* Enable COLO */
> > + MIG_CMD_UNUSED_0, /* Unused since 11.0 */
>
> IMHO it's not "when unused" that matters, but "when it was used, and used
> as what" that matters. E.g. if we received this unused command in some
> future QEMU debugging session, we can guess where it came from with that info.
>
> Hence, I'd suggest:
>
> MIG_CMD_DEPRECATED_0, /* Prior to 10.2, used as MIG_CMD_ENABLE_COLO */
>
> I still think DEPRECATED is better here, as it reminds people that we shouldn't
> "reuse" it and that it is better left untouched to catch surprises, whereas
> "UNUSED" may imply "you can use it now".
>
> Other than that looks all good, thanks.
Okay, will fix this in the next version.
>
> > MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */
> > MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */
> > MIG_CMD_SWITCHOVER_START, /* Switchover start notification */
> > @@ -1092,12 +1092,6 @@ static void qemu_savevm_command_send(QEMUFile *f,
> > qemu_fflush(f);
> > }
> >
> > -void qemu_savevm_send_colo_enable(QEMUFile *f)
> > -{
> > - trace_savevm_send_colo_enable();
> > - qemu_savevm_command_send(f, MIG_CMD_ENABLE_COLO, 0, NULL);
> > -}
> > -
> > void qemu_savevm_send_ping(QEMUFile *f, uint32_t value)
> > {
> > uint32_t buf;
> > diff --git a/migration/savevm.h b/migration/savevm.h
> > index 125a2507b7279412bcb0745b95a774874c31c54f..0a1e5bfd1ca125565a4c90c6f31b2f8c94404117 100644
> > --- a/migration/savevm.h
> > +++ b/migration/savevm.h
> > @@ -62,7 +62,6 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
> > uint16_t len,
> > uint64_t *start_list,
> > uint64_t *length_list);
> > -void qemu_savevm_send_colo_enable(QEMUFile *f);
> > void qemu_savevm_live_state(QEMUFile *f);
> > int qemu_save_device_state(QEMUFile *f);
> >
> > diff --git a/migration/trace-events b/migration/trace-events
> > index 91d7506634c9f110e8f0b5f9183728058fe6542a..cfd4d58a0f82ec299ca9e8a9260dd3c3a210cece 100644
> > --- a/migration/trace-events
> > +++ b/migration/trace-events
> > @@ -37,7 +37,6 @@ savevm_send_ping(uint32_t val) "0x%x"
> > savevm_send_postcopy_listen(void) ""
> > savevm_send_postcopy_run(void) ""
> > savevm_send_postcopy_resume(void) ""
> > -savevm_send_colo_enable(void) ""
> > savevm_send_recv_bitmap(char *name) "%s"
> > savevm_send_switchover_start(void) ""
> > savevm_state_setup(void) ""
> >
> > --
> > 2.39.5
> >
>
* Re: [PATCH v5 05/16] colo: Remove ENABLE_COLO loadvm command functions
2026-02-10 13:28 ` Lukas Straub
@ 2026-02-10 14:38 ` Peter Xu
2026-02-10 15:34 ` Lukas Straub
0 siblings, 1 reply; 33+ messages in thread
From: Peter Xu @ 2026-02-10 14:38 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, Feb 10, 2026 at 02:28:35PM +0100, Lukas Straub wrote:
> On Mon, 9 Feb 2026 11:13:19 -0500
> Peter Xu <peterx@redhat.com> wrote:
>
> > On Tue, Feb 03, 2026 at 11:15:11AM +0100, Lukas Straub wrote:
> > > No need for it anymore now that x-colo capability is required
> > > on incoming side.
> > >
> > > Reviewed-by: Fabiano Rosas <farosas@suse.de>
> > > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> >
> > IIUC this patch needs to be squashed into the next or it will break COLO..
> >
>
> No it's fine actually. Now when we receive the MIG_CMD_ENABLE_COLO
> command, we just go to the return 0 at the end of
> loadvm_process_command().
Indeed, but we should actually raise an error when receiving deprecated
commands because they're unexpected.

Please still consider merging these two patches.  While at it, we could
change the previous check into a "default" here:

    if (cmd >= MIG_CMD_MAX || cmd == MIG_CMD_INVALID) {
        error_setg(errp, "MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
        return -EINVAL;
    }

Or we just add a "default" to cover deprecated commands.

Thanks,

--
Peter Xu
* Re: [PATCH v5 05/16] colo: Remove ENABLE_COLO loadvm command functions
2026-02-10 14:38 ` Peter Xu
@ 2026-02-10 15:34 ` Lukas Straub
0 siblings, 0 replies; 33+ messages in thread
From: Lukas Straub @ 2026-02-10 15:34 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Li Zhijian,
Dr. David Alan Gilbert
On Tue, 10 Feb 2026 09:38:43 -0500
Peter Xu <peterx@redhat.com> wrote:
> On Tue, Feb 10, 2026 at 02:28:35PM +0100, Lukas Straub wrote:
> > On Mon, 9 Feb 2026 11:13:19 -0500
> > Peter Xu <peterx@redhat.com> wrote:
> >
> > > On Tue, Feb 03, 2026 at 11:15:11AM +0100, Lukas Straub wrote:
> > > > No need for it anymore now that x-colo capability is required
> > > > on incoming side.
> > > >
> > > > Reviewed-by: Fabiano Rosas <farosas@suse.de>
> > > > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > >
> > > IIUC this patch needs to be squashed into the next or it will break COLO..
> > >
> >
> > No it's fine actually. Now when we receive the MIG_CMD_ENABLE_COLO
> > command, we just go to the return 0 at the end of
> > loadvm_process_command().
>
> Indeed, but we should actually raise an error when receiving deprecated
> commands because they're unexpected.
>
> Please still consider merging these two patches.  While at it, we could
> change the previous check into a "default" here:
>
>     if (cmd >= MIG_CMD_MAX || cmd == MIG_CMD_INVALID) {
>         error_setg(errp, "MIG_CMD 0x%x unknown (len 0x%x)", cmd, len);
>         return -EINVAL;
>     }
We can't remove this check: cmd is used as an index into an array further
down, so an invalid value would cause an out-of-bounds access.
>
> Or we just add a "default" to cover deprecated commands.
I will add a default case to the switch that errors out on deprecated
commands.
>
> Thanks,
>
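
The two points above (keep the range check because cmd indexes a table,
and add a "default" for commands without a handler) can be sketched
roughly as follows. This is a standalone illustration under assumptions,
not the actual loadvm_process_command(): the enum, the demo_cmd_args
table, the result codes and demo_process_command() are all hypothetical
names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, shortened command list; NOT QEMU's real enum. */
enum demo_cmd {
    DEMO_CMD_INVALID = 0,
    DEMO_CMD_PING,
    DEMO_CMD_DEPRECATED_0,  /* reserved slot of a retired command */
    DEMO_CMD_MAX
};

/* Hypothetical result codes, so each failure path is distinguishable. */
enum demo_result {
    DEMO_OK = 0,
    DEMO_ERR_UNKNOWN = -1,
    DEMO_ERR_BAD_LEN = -2,
    DEMO_ERR_DEPRECATED = -3
};

/* Expected payload length per command, indexed by command number. */
static const struct {
    uint16_t len;
    const char *name;
} demo_cmd_args[] = {
    [DEMO_CMD_INVALID]      = { 0, "INVALID" },
    [DEMO_CMD_PING]         = { 4, "PING" },
    [DEMO_CMD_DEPRECATED_0] = { 0, "DEPRECATED_0" },
};

static_assert(sizeof(demo_cmd_args) / sizeof(demo_cmd_args[0]) == DEMO_CMD_MAX,
              "table must have one entry per command");

int demo_process_command(uint16_t cmd, uint16_t len)
{
    /* Must stay first: cmd is used as an index into demo_cmd_args[]. */
    if (cmd >= DEMO_CMD_MAX || cmd == DEMO_CMD_INVALID) {
        return DEMO_ERR_UNKNOWN;
    }

    if (len != demo_cmd_args[cmd].len) {
        return DEMO_ERR_BAD_LEN;
    }

    switch (cmd) {
    case DEMO_CMD_PING:
        return DEMO_OK;
    default:
        /* In range, but no handler: treat as a deprecated command. */
        return DEMO_ERR_DEPRECATED;
    }
}
```

With this shape, a retired command number is rejected instead of falling
through to a successful return, while an out-of-range number never
reaches the table lookup.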
Thread overview: 33+ messages
2026-02-03 10:15 [PATCH v5 00/16] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
2026-02-03 10:15 ` [PATCH v5 01/16] MAINTAINERS: Add myself as maintainer for COLO migration framework Lukas Straub
2026-02-03 10:15 ` [PATCH v5 02/16] MAINTAINERS: Remove Hailiang Zhang from " Lukas Straub
2026-02-03 10:15 ` [PATCH v5 03/16] colo: Setup ram cache in normal migration path Lukas Straub
2026-02-09 16:10 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 04/16] colo: Replace migration_incoming_colo_enabled() with migrate_colo() Lukas Straub
2026-02-09 16:11 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 05/16] colo: Remove ENABLE_COLO loadvm command functions Lukas Straub
2026-02-09 16:13 ` Peter Xu
2026-02-10 13:28 ` Lukas Straub
2026-02-10 14:38 ` Peter Xu
2026-02-10 15:34 ` Lukas Straub
2026-02-03 10:15 ` [PATCH v5 06/16] colo: Don't send ENABLE_COLO command Lukas Straub
2026-02-09 16:17 ` Peter Xu
2026-02-10 13:29 ` Lukas Straub
2026-02-03 10:15 ` [PATCH v5 07/16] ram: Remove colo special-casing Lukas Straub
2026-02-09 16:19 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 08/16] Move ram state receive into multifd_ram_state_recv() Lukas Straub
2026-02-09 16:20 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 09/16] multifd: Add COLO support Lukas Straub
2026-02-04 18:13 ` Fabiano Rosas
2026-02-09 16:25 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 10/16] Call colo_release_ram_cache() after multifd threads terminate Lukas Straub
2026-02-09 16:27 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 11/16] colo: Fix crash during device vmstate load Lukas Straub
2026-02-03 10:15 ` [PATCH v5 12/16] migration-test: Add COLO migration unit test Lukas Straub
2026-02-09 15:56 ` Peter Xu
2026-02-03 10:15 ` [PATCH v5 13/16] Convert colo main documentation to restructuredText Lukas Straub
2026-02-06 7:47 ` Zhang Chen
2026-02-03 10:15 ` [PATCH v5 14/16] qemu-colo.rst: Miscellaneous changes Lukas Straub
2026-02-06 7:52 ` Zhang Chen
2026-02-03 10:15 ` [PATCH v5 15/16] qemu-colo.rst: Add my copyright Lukas Straub
2026-02-03 10:15 ` [PATCH v5 16/16] qemu-colo.rst: Simplify the block replication setup Lukas Straub