* [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test
@ 2026-01-17 14:09 Lukas Straub
2026-01-17 14:09 ` [PATCH v2 1/8] MAINTAINERS: Add myself as maintainer for COLO migration framework Lukas Straub
` (7 more replies)
0 siblings, 8 replies; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub,
Juan Quintela
Hello everyone,
This adds COLO multifd support and migration unit tests for COLO migration
and failover.
Regards,
Lukas
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
Changes in v2:
- Fix review comments
- Hide stderr in colo migration test since the logged errors are expected
- Add benchmarking data for multifd
- Add myself as maintainer for COLO migration framework
- Link to v1: https://lore.kernel.org/qemu-devel/20251230-colo_unit_test_multifd-v1-0-f9734bc74c71@web.de
---
Lukas Straub (8):
MAINTAINERS: Add myself as maintainer for COLO migration framework
MAINTAINERS: Remove Hailiang Zhang from COLO migration framework
Move ram state receive into multifd_ram_state_recv()
multifd: Add COLO support
migration-test: Add COLO migration unit test
Convert colo main documentation to restructuredText
qemu-colo.rst: Miscellaneous changes
qemu-colo.rst: Simplify the block replication setup
MAINTAINERS | 6 +-
docs/COLO-FT.txt | 334 ----------------------------------
docs/system/index.rst | 1 +
docs/system/qemu-colo.rst | 355 +++++++++++++++++++++++++++++++++++++
migration/meson.build | 2 +-
migration/multifd-colo.c | 49 +++++
migration/multifd-colo.h | 26 +++
migration/multifd.c | 23 ++-
migration/multifd.h | 1 +
tests/qtest/meson.build | 7 +-
tests/qtest/migration-test.c | 1 +
tests/qtest/migration/colo-tests.c | 113 ++++++++++++
tests/qtest/migration/framework.c | 87 ++++++++-
tests/qtest/migration/framework.h | 10 ++
14 files changed, 675 insertions(+), 340 deletions(-)
---
base-commit: 42a5675aa9dd718f395ca3279098051dfdbbc6e1
change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46
Best regards,
--
Lukas Straub <lukasstraub2@web.de>
* [PATCH v2 1/8] MAINTAINERS: Add myself as maintainer for COLO migration framework
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
@ 2026-01-17 14:09 ` Lukas Straub
2026-01-20 17:32 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 2/8] MAINTAINERS: Remove Hailiang Zhang from " Lukas Straub
` (6 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub
I am ready to maintain it.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index de8246c3ffdbfb73d8d3df06cb1fffd80a707522..38691feea8941635c7ce45f30a822030016e922f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3835,6 +3835,7 @@ F: qapi/yank.json
COLO Framework
M: Hailiang Zhang <zhanghailiang@xfusion.com>
+M: Lukas Straub <lukasstraub2@web.de>
S: Maintained
F: migration/colo*
F: include/migration/colo.h
--
2.39.5
* [PATCH v2 2/8] MAINTAINERS: Remove Hailiang Zhang from COLO migration framework
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
2026-01-17 14:09 ` [PATCH v2 1/8] MAINTAINERS: Add myself as maintainer for COLO migration framework Lukas Straub
@ 2026-01-17 14:09 ` Lukas Straub
2026-01-20 17:32 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 3/8] Move ram state receive into multifd_ram_state_recv() Lukas Straub
` (5 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub
His last email to the mailing list is from December 2021:
https://lore.kernel.org/qemu-devel/20211214075424.6920-1-zhanghailiang@xfusion.com/
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 1 -
1 file changed, 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 38691feea8941635c7ce45f30a822030016e922f..563804345fec68ee72793dbb7c1b7e5be4c32083 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3834,7 +3834,6 @@ F: include/qemu/yank.h
F: qapi/yank.json
COLO Framework
-M: Hailiang Zhang <zhanghailiang@xfusion.com>
M: Lukas Straub <lukasstraub2@web.de>
S: Maintained
F: migration/colo*
--
2.39.5
* [PATCH v2 3/8] Move ram state receive into multifd_ram_state_recv()
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
2026-01-17 14:09 ` [PATCH v2 1/8] MAINTAINERS: Add myself as maintainer for COLO migration framework Lukas Straub
2026-01-17 14:09 ` [PATCH v2 2/8] MAINTAINERS: Remove Hailiang Zhang from " Lukas Straub
@ 2026-01-17 14:09 ` Lukas Straub
2026-01-20 17:14 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 4/8] multifd: Add COLO support Lukas Straub
` (4 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub
This is in preparation for the next patch.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
migration/multifd.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index bf6da85af8a1e207235ce06b8dbace33beded6d8..8e71171fb7a17726ba7eb0705e293c41e8aa32ec 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1265,6 +1265,15 @@ static int multifd_device_state_recv(MultiFDRecvParams *p, Error **errp)
return ret;
}
+static int multifd_ram_state_recv(MultiFDRecvParams *p, Error **errp)
+{
+ int ret;
+
+ ret = multifd_recv_state->ops->recv(p, errp);
+
+ return ret;
+}
+
static void *multifd_recv_thread(void *opaque)
{
MigrationState *s = migrate_get_current();
@@ -1399,7 +1408,7 @@ static void *multifd_recv_thread(void *opaque)
assert(use_packets);
ret = multifd_device_state_recv(p, &local_err);
} else {
- ret = multifd_recv_state->ops->recv(p, &local_err);
+ ret = multifd_ram_state_recv(p, &local_err);
}
if (ret != 0) {
break;
--
2.39.5
* [PATCH v2 4/8] multifd: Add COLO support
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (2 preceding siblings ...)
2026-01-17 14:09 ` [PATCH v2 3/8] Move ram state receive into multifd_ram_state_recv() Lukas Straub
@ 2026-01-17 14:09 ` Lukas Straub
2026-01-20 17:13 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 5/8] migration-test: Add COLO migration unit test Lukas Straub
` (3 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub,
Juan Quintela
As in the normal ram_load() path, put the received pages into the
colo cache and mark the pages in the bitmap so that they will be
flushed to the guest later.
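
To illustrate the idea outside of QEMU, here is a minimal standalone sketch
of the cache-then-flush scheme described above. It is a toy model, not the
code in this patch: ToyBlock, toy_recv_page(), toy_flush_cache() and the
tiny page size are all hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16u /* hypothetical; real target pages are larger */
#define TOY_NUM_PAGES 4u

/* Hypothetical stand-in for a RAM block that has a COLO cache. */
typedef struct {
    uint8_t guest[TOY_NUM_PAGES * TOY_PAGE_SIZE]; /* memory the guest sees */
    uint8_t cache[TOY_NUM_PAGES * TOY_PAGE_SIZE]; /* COLO cache */
    bool dirty[TOY_NUM_PAGES];                    /* pages to flush later */
} ToyBlock;

/* Receive one page: it always lands in the cache first. */
static void toy_recv_page(ToyBlock *b, unsigned page, const uint8_t *data,
                          bool in_colo_state)
{
    size_t off = (size_t)page * TOY_PAGE_SIZE;

    assert(page < TOY_NUM_PAGES);
    memcpy(b->cache + off, data, TOY_PAGE_SIZE);

    if (in_colo_state) {
        /* Guest memory is left untouched; remember the page for the flush. */
        b->dirty[page] = true;
    } else {
        /* Precopy: keep guest and cache in sync, no dirty bit needed. */
        memcpy(b->guest + off, b->cache + off, TOY_PAGE_SIZE);
    }
}

/* Copy every dirty page from the cache to the guest; return the count. */
static unsigned toy_flush_cache(ToyBlock *b)
{
    unsigned flushed = 0;

    for (unsigned page = 0; page < TOY_NUM_PAGES; page++) {
        if (b->dirty[page]) {
            size_t off = (size_t)page * TOY_PAGE_SIZE;

            memcpy(b->guest + off, b->cache + off, TOY_PAGE_SIZE);
            b->dirty[page] = false;
            flushed++;
        }
    }
    return flushed;
}
```

In this model a page received in COLO state only becomes visible to the
guest after toy_flush_cache() runs, which mirrors the "flushed to the guest
later" wording above.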
Multifd with COLO is useful to reduce the VM pause time during checkpointing
for latency-sensitive workloads. In such workloads the worst-case latency
is especially important.
Also, multifd migration is the preferred way to do migration nowadays, and
this makes it possible to use multifd compression with COLO.
Benchmark:
Cluster nodes
- Intel Xeon E5-2630 v3
- 48GB RAM
- 10G Ethernet
Guest
- Windows Server 2016
- 6GB RAM
- 4 cores
Workload
- Upload a file to the guest with SMB to simulate moderate
memory dirtying
- Measure the memory transfer time portion of each checkpoint
- 600ms COLO checkpoint interval
Results
Plain
idle mean: 4.50ms 99per: 10.33ms
load mean: 24.30ms 99per: 78.05ms
Multifd-4
idle mean: 6.48ms 99per: 10.41ms
load mean: 14.12ms 99per: 31.27ms
Evaluation
While multifd has slightly higher latency when the guest idles, it is
about 10ms faster on average under load and, more importantly, its
worst-case latency under load is less than half that of plain, as can be
seen in the 99th percentile.
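
As a quick check of the figures in the results table, the small sketch below
recomputes the two comparisons. The helper names latency_gain() and
latency_ratio() are made up for this example and are not part of the patch;
the inputs are the values reported above.

```c
#include <assert.h>

/* Absolute improvement of multifd over plain, in the unit of the inputs. */
static double latency_gain(double plain_ms, double multifd_ms)
{
    return plain_ms - multifd_ms;
}

/* Ratio multifd/plain; a value below 1.0 means multifd is faster. */
static double latency_ratio(double plain_ms, double multifd_ms)
{
    assert(plain_ms > 0.0);
    return multifd_ms / plain_ms;
}
```

With the numbers above, latency_gain(24.30, 14.12) is about 10.2ms for the
mean under load, and latency_ratio(78.05, 31.27) is about 0.40 for the 99th
percentile under load. For the idle mean, latency_ratio(4.50, 6.48) is about
1.44, which is the slightly higher idle latency mentioned above.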
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 1 +
migration/meson.build | 2 +-
migration/multifd-colo.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++
migration/multifd-colo.h | 26 +++++++++++++++++++++++++
migration/multifd.c | 12 ++++++++++++
migration/multifd.h | 1 +
6 files changed, 90 insertions(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 563804345fec68ee72793dbb7c1b7e5be4c32083..dbb217255c2cf35dc0ce971c2021b130fac5469b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3837,6 +3837,7 @@ COLO Framework
M: Lukas Straub <lukasstraub2@web.de>
S: Maintained
F: migration/colo*
+F: migration/multifd-colo.*
F: include/migration/colo.h
F: include/migration/failover.h
F: docs/COLO-FT.txt
diff --git a/migration/meson.build b/migration/meson.build
index 16909d54c5110fc5d8187fd3a68c4a5b08b59ea7..1e59fe4f1f0bbfffed90df38e8f39fa87bceb9b9 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -40,7 +40,7 @@ system_ss.add(files(
), gnutls, zlib)
if get_option('replication').allowed()
- system_ss.add(files('colo-failover.c', 'colo.c'))
+ system_ss.add(files('colo-failover.c', 'colo.c', 'multifd-colo.c'))
else
system_ss.add(files('colo-stubs.c'))
endif
diff --git a/migration/multifd-colo.c b/migration/multifd-colo.c
new file mode 100644
index 0000000000000000000000000000000000000000..d8d98e79b12ed52c41f341052a682d7786e221b5
--- /dev/null
+++ b/migration/multifd-colo.c
@@ -0,0 +1,49 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * multifd colo implementation
+ *
+ * Copyright (c) Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "exec/target_page.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "ram.h"
+#include "multifd.h"
+#include "options.h"
+#include "io/channel-socket.h"
+#include "migration/colo.h"
+#include "multifd-colo.h"
+#include "system/ramblock.h"
+
+void multifd_colo_prepare_recv(MultiFDRecvParams *p)
+{
+ assert(p->block->colo_cache);
+
+ /*
+ * While we're still in precopy state (not yet in colo state), we copy
+ * received pages to both guest and cache. No need to set dirty bits,
+ * since guest and cache memory are in sync.
+ */
+ if (migration_incoming_in_colo_state()) {
+ colo_record_bitmap(p->block, p->normal, p->normal_num);
+ }
+ p->host = p->block->colo_cache;
+}
+
+void multifd_colo_process_recv(MultiFDRecvParams *p)
+{
+ if (!migration_incoming_in_colo_state()) {
+ for (int i = 0; i < p->normal_num; i++) {
+ void *guest = p->block->host + p->normal[i];
+ void *cache = p->host + p->normal[i];
+ memcpy(guest, cache, multifd_ram_page_size());
+ }
+ }
+ p->host = p->block->host;
+}
diff --git a/migration/multifd-colo.h b/migration/multifd-colo.h
new file mode 100644
index 0000000000000000000000000000000000000000..82eaf3f48c47de2f090f9de52f9d57a337d4754a
--- /dev/null
+++ b/migration/multifd-colo.h
@@ -0,0 +1,26 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * multifd colo header
+ *
+ * Copyright (c) Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_MIGRATION_MULTIFD_COLO_H
+#define QEMU_MIGRATION_MULTIFD_COLO_H
+
+#ifdef CONFIG_REPLICATION
+
+void multifd_colo_prepare_recv(MultiFDRecvParams *p);
+void multifd_colo_process_recv(MultiFDRecvParams *p);
+
+#else
+
+static inline void multifd_colo_prepare_recv(MultiFDRecvParams *p) {}
+static inline void multifd_colo_process_recv(MultiFDRecvParams *p) {}
+
+#endif
+#endif
diff --git a/migration/multifd.c b/migration/multifd.c
index 8e71171fb7a17726ba7eb0705e293c41e8aa32ec..6c85acec3bac134e85cfcee0d32057134f5af8d1 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -29,6 +29,7 @@
#include "qemu-file.h"
#include "trace.h"
#include "multifd.h"
+#include "multifd-colo.h"
#include "threadinfo.h"
#include "options.h"
#include "qemu/yank.h"
@@ -1269,7 +1270,18 @@ static int multifd_ram_state_recv(MultiFDRecvParams *p, Error **errp)
{
int ret;
+ if (migrate_colo()) {
+ multifd_colo_prepare_recv(p);
+ }
+
ret = multifd_recv_state->ops->recv(p, errp);
+ if (ret != 0) {
+ return ret;
+ }
+
+ if (migrate_colo()) {
+ multifd_colo_process_recv(p);
+ }
return ret;
}
diff --git a/migration/multifd.h b/migration/multifd.h
index 9b6d81e7ede024f05d4cd235de95e73840d0bbc4..7036f438fade1baed2442bfdcf8b5d6397c4a448 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -280,6 +280,7 @@ typedef struct {
/* ramblock */
RAMBlock *block;
/* ramblock host address */
+ /* or points to the corresponding address in the colo cache */
uint8_t *host;
/* buffers to recv */
struct iovec *iov;
--
2.39.5
* [PATCH v2 5/8] migration-test: Add COLO migration unit test
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (3 preceding siblings ...)
2026-01-17 14:09 ` [PATCH v2 4/8] multifd: Add COLO support Lukas Straub
@ 2026-01-17 14:09 ` Lukas Straub
2026-01-20 17:23 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 6/8] Convert colo main documentation to restructuredText Lukas Straub
` (2 subsequent siblings)
7 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub
Add a unit test for COLO migration and failover.
COLO does not support the q35 machine type at this time, so the test
forces the pc machine type on x86_64.
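
The tests registered by this patch cover every combination of three choices:
plain or multifd transport, failover on the primary or the secondary, and
failover between or during a checkpoint. The following standalone sketch
shows how the eight test paths follow from those three flags. It is only an
illustration; colo_test_path() is a hypothetical helper and the patch itself
registers each path explicitly with migration_test_add().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Build the qtest path for one combination of the three flags.
 * Returns the snprintf() result: the length the full path needs.
 */
static int colo_test_path(char *buf, size_t len, bool multifd,
                          bool primary_failover, bool during_checkpoint)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "/migration/colo/%s/%s_failover%s",
                    multifd ? "multifd" : "plain",
                    primary_failover ? "primary" : "secondary",
                    during_checkpoint ? "_checkpoint" : "");
}
```

For example, multifd with a secondary failover during a checkpoint yields
/migration/colo/multifd/secondary_failover_checkpoint.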
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 1 +
tests/qtest/meson.build | 7 ++-
tests/qtest/migration-test.c | 1 +
tests/qtest/migration/colo-tests.c | 113 +++++++++++++++++++++++++++++++++++++
tests/qtest/migration/framework.c | 87 +++++++++++++++++++++++++++-
tests/qtest/migration/framework.h | 10 ++++
6 files changed, 217 insertions(+), 2 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index dbb217255c2cf35dc0ce971c2021b130fac5469b..92ca20c9d4186a08519d15bfe8cbd583ab061a8b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3840,6 +3840,7 @@ F: migration/colo*
F: migration/multifd-colo.*
F: include/migration/colo.h
F: include/migration/failover.h
+F: tests/qtest/migration/colo-tests.c
F: docs/COLO-FT.txt
COLO Proxy
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 0f053fb56de5806d3c213e3a26c0b19998ae151a..d0129af4431bb08a94a918a1e40a8f657059d764 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -367,6 +367,11 @@ if gnutls.found()
endif
endif
+migration_colo_files = []
+if get_option('replication').allowed()
+ migration_colo_files = [files('migration/colo-tests.c')]
+endif
+
qtests = {
'aspeed_hace-test': files('aspeed-hace-utils.c', 'aspeed_hace-test.c'),
'aspeed_smc-test': files('aspeed-smc-utils.c', 'aspeed_smc-test.c'),
@@ -378,7 +383,7 @@ qtests = {
'migration/migration-util.c') + dbus_vmstate1,
'erst-test': files('erst-test.c'),
'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
- 'migration-test': test_migration_files + migration_tls_files,
+ 'migration-test': test_migration_files + migration_tls_files + migration_colo_files,
'pxe-test': files('boot-sector.c'),
'pnv-xive2-test': files('pnv-xive2-common.c', 'pnv-xive2-flush-sync.c',
'pnv-xive2-nvpg_bar.c'),
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 08936871741535c926eeac40a7d7c3f461c72fd0..e582f05c7dc2673dbd05a936df8feb6c964b5bbc 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -55,6 +55,7 @@ int main(int argc, char **argv)
migration_test_add_precopy(env);
migration_test_add_cpr(env);
migration_test_add_misc(env);
+ migration_test_add_colo(env);
ret = g_test_run();
diff --git a/tests/qtest/migration/colo-tests.c b/tests/qtest/migration/colo-tests.c
new file mode 100644
index 0000000000000000000000000000000000000000..5004f581e4d9e4e6f54eee6d70a9307b7fd123be
--- /dev/null
+++ b/tests/qtest/migration/colo-tests.c
@@ -0,0 +1,113 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * QTest testcases for COLO migration
+ *
+ * Copyright (c) 2025 Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "migration/framework.h"
+#include "migration/migration-qmp.h"
+#include "migration/migration-util.h"
+#include "qemu/module.h"
+
+static void test_colo_plain_common(MigrateCommon *args,
+ bool failover_during_checkpoint,
+ bool primary_failover)
+{
+ args->listen_uri = "tcp:127.0.0.1:0";
+ test_colo_common(args, failover_during_checkpoint, primary_failover);
+}
+
+static void *hook_start_multifd(QTestState *from, QTestState *to)
+{
+ return migrate_hook_start_precopy_tcp_multifd_common(from, to, "none");
+}
+
+static void test_colo_multifd_common(MigrateCommon *args,
+ bool failover_during_checkpoint,
+ bool primary_failover)
+{
+ args->listen_uri = "defer";
+ args->start_hook = hook_start_multifd;
+ args->start.caps[MIGRATION_CAPABILITY_MULTIFD] = true;
+ test_colo_common(args, failover_during_checkpoint, primary_failover);
+}
+
+static void test_colo_plain_primary_failover(char *name, MigrateCommon *args)
+{
+ test_colo_plain_common(args, false, true);
+}
+
+static void test_colo_plain_secondary_failover(char *name, MigrateCommon *args)
+{
+ test_colo_plain_common(args, false, false);
+}
+
+static void test_colo_multifd_primary_failover(char *name, MigrateCommon *args)
+{
+ test_colo_multifd_common(args, false, true);
+}
+
+static void test_colo_multifd_secondary_failover(char *name,
+ MigrateCommon *args)
+{
+ test_colo_multifd_common(args, false, false);
+}
+
+static void test_colo_plain_primary_failover_checkpoint(char *name,
+ MigrateCommon *args)
+{
+ test_colo_plain_common(args, true, true);
+}
+
+static void test_colo_plain_secondary_failover_checkpoint(char *name,
+ MigrateCommon *args)
+{
+ test_colo_plain_common(args, true, false);
+}
+
+static void test_colo_multifd_primary_failover_checkpoint(char *name,
+ MigrateCommon *args)
+{
+ test_colo_multifd_common(args, true, true);
+}
+
+static void test_colo_multifd_secondary_failover_checkpoint(char *name,
+ MigrateCommon *args)
+{
+ test_colo_multifd_common(args, true, false);
+}
+
+void migration_test_add_colo(MigrationTestEnv *env)
+{
+ if (!env->full_set) {
+ return;
+ }
+
+ migration_test_add("/migration/colo/plain/primary_failover",
+ test_colo_plain_primary_failover);
+ migration_test_add("/migration/colo/plain/secondary_failover",
+ test_colo_plain_secondary_failover);
+
+ migration_test_add("/migration/colo/multifd/primary_failover",
+ test_colo_multifd_primary_failover);
+ migration_test_add("/migration/colo/multifd/secondary_failover",
+ test_colo_multifd_secondary_failover);
+
+ migration_test_add("/migration/colo/plain/primary_failover_checkpoint",
+ test_colo_plain_primary_failover_checkpoint);
+ migration_test_add("/migration/colo/plain/secondary_failover_checkpoint",
+ test_colo_plain_secondary_failover_checkpoint);
+
+ migration_test_add("/migration/colo/multifd/primary_failover_checkpoint",
+ test_colo_multifd_primary_failover_checkpoint);
+ migration_test_add("/migration/colo/multifd/secondary_failover_checkpoint",
+ test_colo_multifd_secondary_failover_checkpoint);
+}
diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
index 57d3b9b7c5a269d31659971e308367bd916d28f6..fe34e7cc7a1a4eeb8d5219f54733bbd8446b0e4e 100644
--- a/tests/qtest/migration/framework.c
+++ b/tests/qtest/migration/framework.c
@@ -315,7 +315,7 @@ int migrate_args(char **from, char **to, const char *uri, MigrateStart *args)
if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
memory_size = "150M";
- if (g_str_equal(arch, "i386")) {
+ if (g_str_equal(arch, "i386") || args->force_pc_machine) {
machine_alias = "pc";
} else {
machine_alias = "q35";
@@ -1066,6 +1066,91 @@ void *migrate_hook_start_precopy_tcp_multifd_common(QTestState *from,
return NULL;
}
+int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
+ bool primary_failover)
+{
+ QTestState *from, *to;
+ void *data_hook = NULL;
+
+ /*
+ * For the COLO test, both VMs will run in parallel. Thus both VMs want to
+ * open the image read/write at the same time. Using read-only=on is not
+ * possible here, because ide-hd does not support read-only backing image.
+ *
+ * So use -snapshot, where each qemu instance creates its own writable
+ * snapshot internally while leaving the real image read-only.
+ */
+ args->start.opts_source = "-snapshot";
+ args->start.opts_target = "-snapshot";
+
+ /*
+ * COLO migration code logs many errors when the migration socket
+ * is shut down. These are expected, so we hide them here.
+ */
+ args->start.hide_stderr = true;
+
+ /*
+ * COLO currently does not work with Q35 machine
+ */
+ args->start.force_pc_machine = true;
+
+ args->start.oob = true;
+ args->start.caps[MIGRATION_CAPABILITY_X_COLO] = true;
+
+ if (migrate_start(&from, &to, args->listen_uri, &args->start)) {
+ return -1;
+ }
+
+ migrate_set_parameter_int(from, "x-checkpoint-delay", 300);
+
+ if (args->start_hook) {
+ data_hook = args->start_hook(from, to);
+ }
+
+ migrate_ensure_converge(from);
+ wait_for_serial("src_serial");
+
+ migrate_qmp(from, to, args->connect_uri, NULL, "{}");
+
+ wait_for_migration_status(from, "colo", NULL);
+ wait_for_resume(to, &dst_state);
+
+ wait_for_serial("src_serial");
+ wait_for_serial("dest_serial");
+
+ /* wait for 3 checkpoints */
+ for (int i = 0; i < 3; i++) {
+ qtest_qmp_eventwait(to, "RESUME");
+ wait_for_serial("src_serial");
+ wait_for_serial("dest_serial");
+ }
+
+ if (failover_during_checkpoint) {
+ qtest_qmp_eventwait(to, "STOP");
+ }
+ if (primary_failover) {
+ qtest_qmp_assert_success(from, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
+ "'arguments': {'instances':"
+ "[{'type': 'migration'}]}}");
+ qtest_qmp_assert_success(from, "{'execute': 'x-colo-lost-heartbeat'}");
+ wait_for_serial("src_serial");
+ } else {
+ qtest_qmp_assert_success(to, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
+ "'arguments': {'instances':"
+ "[{'type': 'migration'}]}}");
+ qtest_qmp_assert_success(to, "{'execute': 'x-colo-lost-heartbeat'}");
+ wait_for_serial("dest_serial");
+ }
+
+ if (args->end_hook) {
+ args->end_hook(from, to, data_hook);
+ }
+
+ migrate_end(from, to, !primary_failover);
+
+ return 0;
+}
+
QTestMigrationState *get_src(void)
{
return &src_state;
diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
index 2ef0f57962605c9e3bc7b7de48e52351e5389138..75088c5fb098a0f95acb1e23585d3b6e8307451e 100644
--- a/tests/qtest/migration/framework.h
+++ b/tests/qtest/migration/framework.h
@@ -139,6 +139,9 @@ typedef struct {
/* Do not connect to target monitor and qtest sockets in qtest_init */
bool defer_target_connect;
+ /* Use pc machine for x86_64 */
+ bool force_pc_machine;
+
/*
* Migration capabilities to be set in both source and
* destination. For unilateral capabilities, use
@@ -248,6 +251,8 @@ void test_postcopy_common(MigrateCommon *args);
void test_postcopy_recovery_common(MigrateCommon *args);
int test_precopy_common(MigrateCommon *args);
void test_file_common(MigrateCommon *args, bool stop_src);
+int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
+ bool primary_failover);
void *migrate_hook_start_precopy_tcp_multifd_common(QTestState *from,
QTestState *to,
const char *method);
@@ -267,5 +272,10 @@ void migration_test_add_file(MigrationTestEnv *env);
void migration_test_add_precopy(MigrationTestEnv *env);
void migration_test_add_cpr(MigrationTestEnv *env);
void migration_test_add_misc(MigrationTestEnv *env);
+#ifdef CONFIG_REPLICATION
+void migration_test_add_colo(MigrationTestEnv *env);
+#else
+static inline void migration_test_add_colo(MigrationTestEnv *env) {}
+#endif
#endif /* TEST_FRAMEWORK_H */
--
2.39.5
* [PATCH v2 6/8] Convert colo main documentation to restructuredText
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (4 preceding siblings ...)
2026-01-17 14:09 ` [PATCH v2 5/8] migration-test: Add COLO migration unit test Lukas Straub
@ 2026-01-17 14:09 ` Lukas Straub
2026-01-20 17:26 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 7/8] qemu-colo.rst: Miscellaneous changes Lukas Straub
2026-01-17 14:09 ` [PATCH v2 8/8] qemu-colo.rst: Simplify the block replication setup Lukas Straub
7 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
MAINTAINERS | 2 +-
docs/COLO-FT.txt | 334 ------------------------------------------
docs/system/index.rst | 1 +
docs/system/qemu-colo.rst | 361 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 363 insertions(+), 335 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 92ca20c9d4186a08519d15bfe8cbd583ab061a8b..4c30dc50d15c74b317443e43920e01b4560b03a5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3841,7 +3841,7 @@ F: migration/multifd-colo.*
F: include/migration/colo.h
F: include/migration/failover.h
F: tests/qtest/migration/colo-tests.c
-F: docs/COLO-FT.txt
+F: docs/system/qemu-colo.rst
COLO Proxy
M: Zhang Chen <zhangckid@gmail.com>
diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
deleted file mode 100644
index 2283a09c080b8996f9767eeb415e8d4fbdc940af..0000000000000000000000000000000000000000
--- a/docs/COLO-FT.txt
+++ /dev/null
@@ -1,334 +0,0 @@
-COarse-grained LOck-stepping Virtual Machines for Non-stop Service
-----------------------------------------
-Copyright (c) 2016 Intel Corporation
-Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
-Copyright (c) 2016 Fujitsu, Corp.
-
-This work is licensed under the terms of the GNU GPL, version 2 or later.
-See the COPYING file in the top-level directory.
-
-This document gives an overview of COLO's design and how to use it.
-
-== Background ==
-Virtual machine (VM) replication is a well known technique for providing
-application-agnostic software-implemented hardware fault tolerance,
-also known as "non-stop service".
-
-COLO (COarse-grained LOck-stepping) is a high availability solution.
-Both primary VM (PVM) and secondary VM (SVM) run in parallel. They receive the
-same request from client, and generate response in parallel too.
-If the response packets from PVM and SVM are identical, they are released
-immediately. Otherwise, a VM checkpoint (on demand) is conducted.
-
-== Architecture ==
-
-The architecture of COLO is shown in the diagram below.
-It consists of a pair of networked physical nodes:
-The primary node running the PVM, and the secondary node running the SVM
-to maintain a valid replica of the PVM.
-PVM and SVM execute in parallel and generate output of response packets for
-client requests according to the application semantics.
-
-The incoming packets from the client or external network are received by the
-primary node, and then forwarded to the secondary node, so that both the PVM
-and the SVM are stimulated with the same requests.
-
-COLO receives the outbound packets from both the PVM and SVM and compares them
-before allowing the output to be sent to clients.
-
-The SVM is qualified as a valid replica of the PVM, as long as it generates
-identical responses to all client requests. Once the differences in the outputs
-are detected between the PVM and SVM, COLO withholds transmission of the
-outbound packets until it has successfully synchronized the PVM state to the SVM.
-
- Primary Node Secondary Node
-+------------+ +-----------------------+ +------------------------+ +------------+
-| | | HeartBeat +<----->+ HeartBeat | | |
-| Primary VM | +-----------+-----------+ +-----------+------------+ |Secondary VM|
-| | | | | |
-| | +-----------|-----------+ +-----------|------------+ | |
-| | |QEMU +---v----+ | |QEMU +----v---+ | | |
-| | | |Failover| | | |Failover| | | |
-| | | +--------+ | | +--------+ | | |
-| | | +---------------+ | | +---------------+ | | |
-| | | | VM Checkpoint +-------------->+ VM Checkpoint | | | |
-| | | +---------------+ | | +---------------+ | | |
-|Requests<--------------------------\ /-----------------\ /--------------------->Requests|
-| | | ^ ^ | | | | | | |
-|Responses+---------------------\ /-|-|------------\ /-------------------------+Responses|
-| | | | | | | | | | | | | | | |
-| | | +-----------+ | | | | | | | | | | +----------+ | | |
-| | | | COLO disk | | | | | | | | | | | | COLO disk| | | |
-| | | | Manager +---------------------------->| Manager | | | |
-| | | ++----------+ v v | | | | | v v | +---------++ | | |
-| | | |+-----------+-+-+-++| | ++-+--+-+---------+ | | | |
-| | | || COLO Proxy || | | COLO Proxy | | | | |
-| | | || (compare packet || | |(adjust sequence | | | | |
-| | | ||and mirror packet)|| | | and ACK) | | | | |
-| | | |+------------+---+-+| | +-----------------+ | | | |
-+------------+ +-----------------------+ +------------------------+ +------------+
-+------------+ | | | | +------------+
-| VM Monitor | | | | | | VM Monitor |
-+------------+ | | | | +------------+
-+---------------------------------------+ +----------------------------------------+
-| Kernel | | | | | Kernel | |
-+---------------------------------------+ +----------------------------------------+
- | | | |
- +--------------v+ +---------v---+--+ +------------------+ +v-------------+
- | Storage | |External Network| | External Network | | Storage |
- +---------------+ +----------------+ +------------------+ +--------------+
-
-
-== Components introduction ==
-
-You can see there are several components in COLO's diagram of architecture.
-Their functions are described below.
-
-HeartBeat:
-Runs on both the primary and secondary nodes, to periodically check platform
-availability. When the primary node suffers a hardware fail-stop failure,
-the heartbeat stops responding, the secondary node will trigger a failover
-as soon as it determines the absence.
-
-COLO disk Manager:
-When primary VM writes data into image, the colo disk manager captures this data
-and sends it to secondary VM's which makes sure the context of secondary VM's
-image is consistent with the context of primary VM 's image.
-For more details, please refer to docs/block-replication.txt.
-
-Checkpoint/Failover Controller:
-Modifications of save/restore flow to realize continuous migration,
-to make sure the state of VM in Secondary side is always consistent with VM in
-Primary side.
-
-COLO Proxy:
-Delivers packets to Primary and Secondary, and then compare the responses from
-both side. Then decide whether to start a checkpoint according to some rules.
-Please refer to docs/colo-proxy.txt for more information.
-
-Note:
-HeartBeat has not been implemented yet, so you need to trigger failover process
-by using 'x-colo-lost-heartbeat' command.
-
-== COLO operation status ==
-
-+-----------------+
-| |
-| Start COLO |
-| |
-+--------+--------+
- |
- | Main qmp command:
- | migrate-set-capabilities with x-colo
- | migrate
- |
- v
-+--------+--------+
-| |
-| COLO running |
-| |
-+--------+--------+
- |
- | Main qmp command:
- | x-colo-lost-heartbeat
- | or
- | some error happened
- v
-+--------+--------+
-| | send qmp event:
-| COLO failover | COLO_EXIT
-| |
-+-----------------+
-
-COLO use the qmp command to switch and report operation status.
-The diagram just shows the main qmp command, you can get the detail
-in test procedure.
-
-== Test procedure ==
-Note: Here we are running both instances on the same host for testing,
-change the IP Addresses if you want to run it on two hosts. Initially
-127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host.
-
-== Startup qemu ==
-1. Primary:
-Note: Initially, $imagefolder/primary.qcow2 needs to be copied to all hosts.
-You don't need to change any IP's here, because 0.0.0.0 listens on any
-interface. The chardev's with 127.0.0.1 IP's loopback to the local qemu
-instance.
-
-# imagefolder="/mnt/vms/colo-test-primary"
-
-# qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
- -device piix3-usb-uhci -device usb-tablet -name primary \
- -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
- -device rtl8139,id=e0,netdev=hn0 \
- -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server=on,wait=off \
- -chardev socket,id=compare1,host=0.0.0.0,port=9004,server=on,wait=on \
- -chardev socket,id=compare0,host=127.0.0.1,port=9001,server=on,wait=off \
- -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
- -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server=on,wait=off \
- -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
- -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
- -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
- -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
- -object iothread,id=iothread1 \
- -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
-outdev=compare_out0,iothread=iothread1 \
- -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
-children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S
-
-2. Secondary:
-Note: Active and hidden images need to be created only once and the
-size should be the same as primary.qcow2. Again, you don't need to change
-any IP's here, except for the $primary_ip variable.
-
-# imagefolder="/mnt/vms/colo-test-secondary"
-# primary_ip=127.0.0.1
-
-# qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G
-
-# qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G
-
-# qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
- -device piix3-usb-uhci -device usb-tablet -name secondary \
- -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
- -device rtl8139,id=e0,netdev=hn0 \
- -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect-ms=1000 \
- -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect-ms=1000 \
- -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
- -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
- -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
- -drive if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow2 \
- -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
-top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
-file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
-file.backing.backing=parent0 \
- -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
-children.0=childs0 \
- -incoming tcp:0.0.0.0:9998
-
-
-3. On Secondary VM's QEMU monitor, issue command
-{"execute":"qmp_capabilities"}
-{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
-{"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", "data": {"host": "0.0.0.0", "port": "9999"} } } }
-{"execute": "nbd-server-add", "arguments": {"device": "parent0", "writable": true } }
-
-Note:
- a. The qmp command nbd-server-start and nbd-server-add must be run
- before running the qmp command migrate on primary QEMU
- b. Active disk, hidden disk and nbd target's length should be the
- same.
- c. It is better to put active disk and hidden disk in ramdisk. They
- will be merged into the parent disk on failover.
-
-4. On Primary VM's QEMU monitor, issue command:
-{"execute":"qmp_capabilities"}
-{"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
-{"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
-{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
-{"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
-
- Note:
- a. There should be only one NBD Client for each primary disk.
- b. The qmp command line must be run after running qmp command line in
- secondary qemu.
-
-5. After the above steps, you will see, whenever you make changes to PVM, SVM will be synced.
-You can issue command '{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }'
-to change the idle checkpoint period time
-
-6. Failover test
-You can kill one of the VMs and Failover on the surviving VM:
-
-If you killed the Secondary, then follow "Primary Failover". After that,
-if you want to resume the replication, follow "Primary resume replication"
-
-If you killed the Primary, then follow "Secondary Failover". After that,
-if you want to resume the replication, follow "Secondary resume replication"
-
-== Primary Failover ==
-The Secondary died, resume on the Primary
-
-{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
-{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
-{"execute": "object-del", "arguments":{ "id": "comp0" } }
-{"execute": "object-del", "arguments":{ "id": "iothread1" } }
-{"execute": "object-del", "arguments":{ "id": "m0" } }
-{"execute": "object-del", "arguments":{ "id": "redire0" } }
-{"execute": "object-del", "arguments":{ "id": "redire1" } }
-{"execute": "x-colo-lost-heartbeat" }
-
-== Secondary Failover ==
-The Primary died, resume on the Secondary and prepare to become the new Primary
-
-{"execute": "nbd-server-stop"}
-{"execute": "x-colo-lost-heartbeat"}
-
-{"execute": "object-del", "arguments":{ "id": "f2" } }
-{"execute": "object-del", "arguments":{ "id": "f1" } }
-{"execute": "chardev-remove", "arguments":{ "id": "red1" } }
-{"execute": "chardev-remove", "arguments":{ "id": "red0" } }
-
-{"execute": "chardev-add", "arguments":{ "id": "mirror0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9003" } }, "server": true } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare1", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9004" } }, "server": true } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": true } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare0-0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": false } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare_out", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": true } } } }
-{"execute": "chardev-add", "arguments":{ "id": "compare_out0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": false } } } }
-
-== Primary resume replication ==
-Resume replication after new Secondary is up.
-
-Start the new Secondary (Steps 2 and 3 above), then on the Primary:
-{"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.2:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
-
-Wait until disk is synced, then:
-{"execute": "stop"}
-{"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
-
-{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
-{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
-
-{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
-{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
-{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
-{"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
-{"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
-
-{"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
-{"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.2:9998" } }
-
-Note:
-If this Primary previously was a Secondary, then we need to insert the
-filters before the filter-rewriter by using the
-""insert": "before", "position": "id=rew0"" Options. See below.
-
-== Secondary resume replication ==
-Become Primary and resume replication after new Secondary is up. Note
-that now 127.0.0.1 is the Secondary and 127.0.0.2 is the Primary.
-
-Start the new Secondary (Steps 2 and 3 above, but with primary_ip=127.0.0.2),
-then on the old Secondary:
-{"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.1:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
-
-Wait until disk is synced, then:
-{"execute": "stop"}
-{"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
-
-{"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
-{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
-
-{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
-{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
-{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
-{"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
-{"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
-
-{"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
-{"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
-
-== TODO ==
-1. Support shared storage.
-2. Develop the heartbeat part.
-3. Reduce checkpoint VM’s downtime while doing checkpoint.
diff --git a/docs/system/index.rst b/docs/system/index.rst
index 427b020483104f6589878bbf255a367ae114c61b..6268c41aea9c74dc3e59d896b5ae082360bfbb1a 100644
--- a/docs/system/index.rst
+++ b/docs/system/index.rst
@@ -41,3 +41,4 @@ or Hypervisor.Framework.
igvm
vm-templating
sriov
+ qemu-colo
diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
new file mode 100644
index 0000000000000000000000000000000000000000..5b00c6c4c2679153f398ed5a85a5d9cc515630e6
--- /dev/null
+++ b/docs/system/qemu-colo.rst
@@ -0,0 +1,361 @@
+Qemu COLO Fault Tolerance
+=========================
+
+| Copyright (c) 2016 Intel Corporation
+| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+| Copyright (c) 2016 Fujitsu, Corp.
+| Copyright (c) 2026 Lukas Straub <lukasstraub2@web.de>
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+This document gives an overview of COLO's design and how to use it.
+
+Background
+----------
+Virtual machine (VM) replication is a well-known technique for providing
+application-agnostic software-implemented hardware fault tolerance,
+also known as "non-stop service".
+
+COLO (COarse-grained LOck-stepping) is a high availability solution.
+Both primary VM (PVM) and secondary VM (SVM) run in parallel. They receive the
+same requests from the client, and generate responses in parallel too.
+If the response packets from PVM and SVM are identical, they are released
+immediately. Otherwise, a VM checkpoint (on demand) is conducted.
+
+Architecture
+------------
+The architecture of COLO is shown in the diagram below.
+It consists of a pair of networked physical nodes:
+The primary node running the PVM, and the secondary node running the SVM
+to maintain a valid replica of the PVM.
+PVM and SVM execute in parallel and generate output of response packets for
+client requests according to the application semantics.
+
+The incoming packets from the client or external network are received by the
+primary node, and then forwarded to the secondary node, so that both the PVM
+and the SVM are stimulated with the same requests.
+
+COLO receives the outbound packets from both the PVM and SVM and compares them
+before allowing the output to be sent to clients.
+
+The SVM qualifies as a valid replica of the PVM as long as it generates
+identical responses to all client requests. Once differences between the
+outputs of the PVM and SVM are detected, COLO withholds transmission of the
+outbound packets until it has successfully synchronized the PVM state to the SVM.
+
+Overview::
+
+ Primary Node Secondary Node
+ +------------+ +-----------------------+ +------------------------+ +------------+
+ | | | HeartBeat +<----->+ HeartBeat | | |
+ | Primary VM | +-----------+-----------+ +-----------+------------+ |Secondary VM|
+ | | | | | |
+ | | +-----------|-----------+ +-----------|------------+ | |
+ | | |QEMU +---v----+ | |QEMU +----v---+ | | |
+ | | | |Failover| | | |Failover| | | |
+ | | | +--------+ | | +--------+ | | |
+ | | | +---------------+ | | +---------------+ | | |
+ | | | | VM Checkpoint +-------------->+ VM Checkpoint | | | |
+ | | | +---------------+ | | +---------------+ | | |
+ |Requests<--------------------------\ /-----------------\ /--------------------->Requests|
+ | | | ^ ^ | | | | | | |
+ |Responses+---------------------\ /-|-|------------\ /-------------------------+Responses|
+ | | | | | | | | | | | | | | | |
+ | | | +-----------+ | | | | | | | | | | +----------+ | | |
+ | | | | COLO disk | | | | | | | | | | | | COLO disk| | | |
+ | | | | Manager +---------------------------->| Manager | | | |
+ | | | ++----------+ v v | | | | | v v | +---------++ | | |
+ | | | |+-----------+-+-+-++| | ++-+--+-+---------+ | | | |
+ | | | || COLO Proxy || | | COLO Proxy | | | | |
+ | | | || (compare packet || | |(adjust sequence | | | | |
+ | | | ||and mirror packet)|| | | and ACK) | | | | |
+ | | | |+------------+---+-+| | +-----------------+ | | | |
+ +------------+ +-----------------------+ +------------------------+ +------------+
+ +------------+ | | | | +------------+
+ | VM Monitor | | | | | | VM Monitor |
+ +------------+ | | | | +------------+
+ +---------------------------------------+ +----------------------------------------+
+ | Kernel | | | | | Kernel | |
+ +---------------------------------------+ +----------------------------------------+
+ | | | |
+ +--------------v+ +---------v---+--+ +------------------+ +v-------------+
+ | Storage | |External Network| | External Network | | Storage |
+ +---------------+ +----------------+ +------------------+ +--------------+
+
+Components introduction
+^^^^^^^^^^^^^^^^^^^^^^^
+You can see there are several components in COLO's diagram of architecture.
+Their functions are described below.
+
+HeartBeat
+~~~~~~~~~
+Runs on both the primary and secondary nodes, to periodically check platform
+availability. When the primary node suffers a hardware fail-stop failure,
+the heartbeat stops responding and the secondary node triggers a failover
+as soon as it detects the absence.
+
+COLO disk Manager
+~~~~~~~~~~~~~~~~~
+When the primary VM writes data to its image, the COLO disk manager captures
+this data and sends it to the secondary VM, which keeps the content of the
+secondary VM's image consistent with the content of the primary VM's image.
+For more details, please refer to docs/block-replication.txt.
+
+Checkpoint/Failover Controller
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Modifies the save/restore flow to implement continuous migration, so that
+the state of the VM on the Secondary side is always consistent with the VM on
+the Primary side.
+
+COLO Proxy
+~~~~~~~~~~
+Delivers packets to Primary and Secondary, and then compares the responses from
+both sides. It then decides whether to start a checkpoint according to some rules.
+Please refer to docs/colo-proxy.txt for more information.
+
+Note:
+HeartBeat has not been implemented yet, so you need to trigger the failover
+process by using the ``x-colo-lost-heartbeat`` command.
+
+COLO operation status
+^^^^^^^^^^^^^^^^^^^^^
+
+Overview::
+
+ +-----------------+
+ | |
+ | Start COLO |
+ | |
+ +--------+--------+
+ |
+ | Main qmp command:
+ | migrate-set-capabilities with x-colo
+ | migrate
+ |
+ v
+ +--------+--------+
+ | |
+ | COLO running |
+ | |
+ +--------+--------+
+ |
+ | Main qmp command:
+ | x-colo-lost-heartbeat
+ | or
+ | some error happened
+ v
+ +--------+--------+
+ | | send qmp event:
+ | COLO failover | COLO_EXIT
+ | |
+ +-----------------+
+
+
+COLO uses QMP commands to switch and report its operation status.
+The diagram shows only the main QMP commands; you can find the details
+in test procedure.
+
+Test procedure
+--------------
+Note: Here we are running both instances on the same host for testing,
+change the IP Addresses if you want to run it on two hosts. Initially
+``127.0.0.1`` is the Primary Host and ``127.0.0.2`` is the Secondary Host.
+
+Startup qemu
+^^^^^^^^^^^^
+**1. Primary**:
+Note: Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
+You don't need to change any IP's here, because ``0.0.0.0`` listens on any
+interface. The chardev's with ``127.0.0.1`` IP's loopback to the local qemu
+instance::
+
+ # imagefolder="/mnt/vms/colo-test-primary"
+
+ # qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
+ -device piix3-usb-uhci -device usb-tablet -name primary \
+ -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
+ -device rtl8139,id=e0,netdev=hn0 \
+ -chardev socket,id=mirror0,host=0.0.0.0,port=9003,server=on,wait=off \
+ -chardev socket,id=compare1,host=0.0.0.0,port=9004,server=on,wait=on \
+ -chardev socket,id=compare0,host=127.0.0.1,port=9001,server=on,wait=off \
+ -chardev socket,id=compare0-0,host=127.0.0.1,port=9001 \
+ -chardev socket,id=compare_out,host=127.0.0.1,port=9005,server=on,wait=off \
+ -chardev socket,id=compare_out0,host=127.0.0.1,port=9005 \
+ -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0 \
+ -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out \
+ -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0 \
+ -object iothread,id=iothread1 \
+ -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,\
+ outdev=compare_out0,iothread=iothread1 \
+ -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
+ children.0.file.filename=$imagefolder/primary.qcow2,children.0.driver=qcow2 -S
+
+
+**2. Secondary**:
+Note: Active and hidden images need to be created only once and the
+size should be the same as ``primary.qcow2``. Again, you don't need to change
+any IP's here, except for the ``$primary_ip`` variable::
+
+ # imagefolder="/mnt/vms/colo-test-secondary"
+ # primary_ip=127.0.0.1
+
+ # qemu-img create -f qcow2 $imagefolder/secondary-active.qcow2 10G
+
+ # qemu-img create -f qcow2 $imagefolder/secondary-hidden.qcow2 10G
+
+ # qemu-system-x86_64 -enable-kvm -cpu qemu64,kvmclock=on -m 512 -smp 1 -qmp stdio \
+ -device piix3-usb-uhci -device usb-tablet -name secondary \
+ -netdev tap,id=hn0,vhost=off,helper=/usr/lib/qemu/qemu-bridge-helper \
+ -device rtl8139,id=e0,netdev=hn0 \
+ -chardev socket,id=red0,host=$primary_ip,port=9003,reconnect-ms=1000 \
+ -chardev socket,id=red1,host=$primary_ip,port=9004,reconnect-ms=1000 \
+ -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
+ -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
+ -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
+ -drive if=none,id=parent0,file.filename=$imagefolder/primary.qcow2,driver=qcow2 \
+ -drive if=none,id=childs0,driver=replication,mode=secondary,file.driver=qcow2,\
+ top-id=colo-disk0,file.file.filename=$imagefolder/secondary-active.qcow2,\
+ file.backing.driver=qcow2,file.backing.file.filename=$imagefolder/secondary-hidden.qcow2,\
+ file.backing.backing=parent0 \
+ -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
+ children.0=childs0 \
+ -incoming tcp:0.0.0.0:9998
+
+
+**3.** On Secondary VM's QEMU monitor, issue command::
+
+ {"execute":"qmp_capabilities"}
+ {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
+ {"execute": "nbd-server-start", "arguments": {"addr": {"type": "inet", "data": {"host": "0.0.0.0", "port": "9999"} } } }
+ {"execute": "nbd-server-add", "arguments": {"device": "parent0", "writable": true } }
+
+Note:
+ a. The QMP commands ``nbd-server-start`` and ``nbd-server-add`` must be run
+ before running the QMP command ``migrate`` on the primary QEMU.
+ b. Active disk, hidden disk and nbd target's length should be the
+ same.
+ c. It is better to put active disk and hidden disk in ramdisk. They
+ will be merged into the parent disk on failover.
+
+**4.** On Primary VM's QEMU monitor, issue command::
+
+ {"execute":"qmp_capabilities"}
+ {"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
+ {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
+ {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
+ {"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
+
+Note:
+ a. There should be only one NBD Client for each primary disk.
+ b. These QMP commands must be run after the QMP commands on the
+ secondary QEMU.
+
+**5.** After the above steps, whenever you make changes to the PVM, the SVM will be synced.
+You can issue the command ``{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }``
+to change the idle checkpoint period.
+
+Failover test
+^^^^^^^^^^^^^
+You can kill one of the VMs and Failover on the surviving VM:
+
+If you killed the Secondary, then follow "Primary Failover".
+After that, if you want to resume the replication, follow "Primary resume replication"
+
+If you killed the Primary, then follow "Secondary Failover".
+After that, if you want to resume the replication, follow "Secondary resume replication"
+
+Primary Failover
+~~~~~~~~~~~~~~~~
+The Secondary died, resume on the Primary::
+
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
+ {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
+ {"execute": "object-del", "arguments":{ "id": "comp0" } }
+ {"execute": "object-del", "arguments":{ "id": "iothread1" } }
+ {"execute": "object-del", "arguments":{ "id": "m0" } }
+ {"execute": "object-del", "arguments":{ "id": "redire0" } }
+ {"execute": "object-del", "arguments":{ "id": "redire1" } }
+ {"execute": "x-colo-lost-heartbeat" }
+
+Secondary Failover
+~~~~~~~~~~~~~~~~~~
+The Primary died, resume on the Secondary and prepare to become the new Primary::
+
+ {"execute": "nbd-server-stop"}
+ {"execute": "x-colo-lost-heartbeat"}
+
+ {"execute": "object-del", "arguments":{ "id": "f2" } }
+ {"execute": "object-del", "arguments":{ "id": "f1" } }
+ {"execute": "chardev-remove", "arguments":{ "id": "red1" } }
+ {"execute": "chardev-remove", "arguments":{ "id": "red0" } }
+
+ {"execute": "chardev-add", "arguments":{ "id": "mirror0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9003" } }, "server": true } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare1", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "0.0.0.0", "port": "9004" } }, "server": true } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": true } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare0-0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9001" } }, "server": false } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare_out", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": true } } } }
+ {"execute": "chardev-add", "arguments":{ "id": "compare_out0", "backend": {"type": "socket", "data": {"addr": { "type": "inet", "data": { "host": "127.0.0.1", "port": "9005" } }, "server": false } } } }
+
+Primary resume replication
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+Resume replication after new Secondary is up.
+
+Start the new Secondary (Steps 2 and 3 above), then on the Primary::
+
+ {"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.2:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
+
+Wait until disk is synced, then::
+
+ {"execute": "stop"}
+ {"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
+
+ {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
+
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
+
+ {"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
+ {"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.2:9998" } }
+
+Note:
+If this Primary previously was a Secondary, then we need to insert the
+filters before the filter-rewriter by using the
+``"insert": "before", "position": "id=rew0"`` options. See below.
+
+Secondary resume replication
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Become Primary and resume replication after the new Secondary is up. Note
+that now ``127.0.0.1`` is the Secondary and ``127.0.0.2`` is the Primary.
+
+Start the new Secondary (Steps 2 and 3 above, but with ``primary_ip=127.0.0.2``),
+then on the old Secondary::
+
+ {"execute": "drive-mirror", "arguments":{ "device": "colo-disk0", "job-id": "resync", "target": "nbd://127.0.0.1:9999/parent0", "mode": "existing", "format": "raw", "sync": "full"} }
+
+Wait until disk is synced, then::
+
+ {"execute": "stop"}
+ {"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
+
+ {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
+
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire1", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "outdev": "compare0" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "iothread", "id": "iothread1" } }
+ {"execute": "object-add", "arguments":{ "qom-type": "colo-compare", "id": "comp0", "primary_in": "compare0-0", "secondary_in": "compare1", "outdev": "compare_out0", "iothread": "iothread1" } }
+
+ {"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
+ {"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
+
+TODO
+----
+1. Support shared storage.
+2. Develop the heartbeat part.
+3. Reduce checkpoint VM’s downtime while doing checkpoint.
--
2.39.5
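
Annotation (not part of the patch): the "Primary Failover" sequence documented
above could be generated programmatically instead of typed by hand. The sketch
below is only an illustration under stated assumptions: the helper names
(qmp, primary_failover_commands, to_wire) are hypothetical and do not exist in
QEMU, the object ids (comp0, iothread1, m0, redire0, redire1, colo-disk0,
replication0) are the ones used in the test procedure, and it assumes
qmp_capabilities has already been negotiated on the monitor.

```python
import json


def qmp(execute, **arguments):
    """Build one QMP command dict; 'arguments' is omitted when empty."""
    cmd = {"execute": execute}
    if arguments:
        cmd["arguments"] = arguments
    return cmd


def primary_failover_commands():
    """Return the documented "Primary Failover" commands, in documented order."""
    cmds = [
        # Detach the replication child from the quorum node, then drop it.
        qmp("x-blockdev-change", parent="colo-disk0", child="children.1"),
        qmp("human-monitor-command",
            **{"command-line": "drive_del replication0"}),
    ]
    # Remove the colo-compare object, its iothread and the net filters.
    for obj_id in ("comp0", "iothread1", "m0", "redire0", "redire1"):
        cmds.append(qmp("object-del", id=obj_id))
    # Finally tell COLO that the peer is gone.
    cmds.append(qmp("x-colo-lost-heartbeat"))
    return cmds


def to_wire(cmds):
    """Serialize commands as newline-delimited JSON, one command per line."""
    return "\n".join(json.dumps(c) for c in cmds) + "\n"


if __name__ == "__main__":
    print(to_wire(primary_failover_commands()), end="")
```

The output is one JSON object per line and can be pasted into a monitor started
with -qmp stdio. Keeping the ordering in one function makes it harder to issue
x-colo-lost-heartbeat before the filters and the replication node are removed.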
* [PATCH v2 7/8] qemu-colo.rst: Miscellaneous changes
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (5 preceding siblings ...)
2026-01-17 14:09 ` [PATCH v2 6/8] Convert colo main documentation to restructuredText Lukas Straub
@ 2026-01-17 14:09 ` Lukas Straub
2026-01-20 17:30 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 8/8] qemu-colo.rst: Simplify the block replication setup Lukas Straub
7 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
docs/system/qemu-colo.rst | 30 ++++++++++++------------------
1 file changed, 12 insertions(+), 18 deletions(-)
diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
index 5b00c6c4c2679153f398ed5a85a5d9cc515630e6..2052e207e57afdd3ab3ab1a447d55f5e2e5b5b87 100644
--- a/docs/system/qemu-colo.rst
+++ b/docs/system/qemu-colo.rst
@@ -1,14 +1,6 @@
Qemu COLO Fault Tolerance
=========================
-| Copyright (c) 2016 Intel Corporation
-| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
-| Copyright (c) 2016 Fujitsu, Corp.
-| Copyright (c) 2026 Lukas Straub <lukasstraub2@web.de>
-
-This work is licensed under the terms of the GNU GPL, version 2 or later.
-See the COPYING file in the top-level directory.
-
This document gives an overview of COLO's design and how to use it.
Background
@@ -83,8 +75,8 @@ Overview::
| Storage | |External Network| | External Network | | Storage |
+---------------+ +----------------+ +------------------+ +--------------+
-Components introduction
-^^^^^^^^^^^^^^^^^^^^^^^
+Components
+^^^^^^^^^^
You can see there are several components in COLO's diagram of architecture.
Their functions are described below.
@@ -158,14 +150,14 @@ in test procedure.
Test procedure
--------------
-Note: Here we are running both instances on the same host for testing,
+Here we are running both instances on the same host for testing,
change the IP Addresses if you want to run it on two hosts. Initially
``127.0.0.1`` is the Primary Host and ``127.0.0.2`` is the Secondary Host.
Startup qemu
^^^^^^^^^^^^
**1. Primary**:
-Note: Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
+Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
You don't need to change any IP's here, because ``0.0.0.0`` listens on any
interface. The chardev's with ``127.0.0.1`` IP's loopback to the local qemu
instance::
@@ -193,7 +185,7 @@ instance::
**2. Secondary**:
-Note: Active and hidden images need to be created only once and the
+Active and hidden images need to be created only once and the
size should be the same as ``primary.qcow2``. Again, you don't need to change
any IP's here, except for the ``$primary_ip`` variable::
@@ -354,8 +346,10 @@ Wait until disk is synced, then::
{"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
{"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
-TODO
-----
-1. Support shared storage.
-2. Develop the heartbeat part.
-3. Reduce checkpoint VM’s downtime while doing checkpoint.
+| Copyright (c) 2016 Intel Corporation
+| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+| Copyright (c) 2016 Fujitsu, Corp.
+| Copyright (c) 2026 Lukas Straub <lukasstraub2@web.de>
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
\ No newline at end of file
--
2.39.5
* [PATCH v2 8/8] qemu-colo.rst: Simplify the block replication setup
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
` (6 preceding siblings ...)
2026-01-17 14:09 ` [PATCH v2 7/8] qemu-colo.rst: Miscellaneous changes Lukas Straub
@ 2026-01-17 14:09 ` Lukas Straub
2026-01-20 17:32 ` Peter Xu
7 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-17 14:09 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Lukas Straub
On the primary side we don't actually need the replication
block driver, since it only passes through all IO.
So simplify the setup and also use 'blockdev-add' instead of
'human-monitor-command'.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
---
docs/system/qemu-colo.rst | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
index 2052e207e57afdd3ab3ab1a447d55f5e2e5b5b87..7e361998d871b2c9a0e8065a15c004a9d841958b 100644
--- a/docs/system/qemu-colo.rst
+++ b/docs/system/qemu-colo.rst
@@ -233,8 +233,8 @@ Note:
**4.** On Primary VM's QEMU monitor, issue command::
{"execute":"qmp_capabilities"}
- {"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
- {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
+ {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.2", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
+ {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "nbd0" } }
{"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
{"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
@@ -262,7 +262,7 @@ Primary Failover
The Secondary died, resume on the Primary::
{"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
- {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
+ {"execute": "blockdev-del", "arguments": {"node-name": "nbd0"} }
{"execute": "object-del", "arguments":{ "id": "comp0" } }
{"execute": "object-del", "arguments":{ "id": "iothread1" } }
{"execute": "object-del", "arguments":{ "id": "m0" } }
@@ -302,8 +302,8 @@ Wait until disk is synced, then::
{"execute": "stop"}
{"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
- {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
- {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
+ {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.2", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "nbd0" } }
{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
@@ -334,8 +334,8 @@ Wait until disk is synced, then::
{"execute": "stop"}
{"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
- {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
- {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
+ {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.1", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
+ {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "nbd0" } }
{"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
{"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
--
2.39.5
* Re: [PATCH v2 4/8] multifd: Add COLO support
2026-01-17 14:09 ` [PATCH v2 4/8] multifd: Add COLO support Lukas Straub
@ 2026-01-20 17:13 ` Peter Xu
2026-01-20 18:05 ` Daniel P. Berrangé
2026-01-21 19:00 ` Lukas Straub
0 siblings, 2 replies; 26+ messages in thread
From: Peter Xu @ 2026-01-20 17:13 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Juan Quintela
On Sat, Jan 17, 2026 at 03:09:11PM +0100, Lukas Straub wrote:
> Like in the normal ram_load() path, put the received pages into the
> colo cache and mark the pages in the bitmap so that they will be
> flushed to the guest later.
>
> Multifd with COLO is useful to reduce the VM pause time during checkpointing
> for latency sensitive workloads. In such workloads the worst-case latency
> is especially important.
>
> Also, multifd migration is the preferred way to do migration nowadays and this
> allows to use multifd compression with COLO.
>
> Benchmark:
> Cluster nodes
> - Intel Xenon E5-2630 v3
> - 48Gb RAM
> - 10G Ethernet
> Guest
> - Windows Server 2016
> - 6Gb RAM
> - 4 cores
> Workload
> - Upload a file to the guest with SMB to simulate moderate
> memory dirtying
> - Measure the memory transfer time portion of each checkpoint
> - 600ms COLO checkpoint interval
>
> Results
> Plain
> idle mean: 4.50ms 99per: 10.33ms
> load mean: 24.30ms 99per: 78.05ms
> Multifd-4
> idle mean: 6.48ms 99per: 10.41ms
> load mean: 14.12ms 99per: 31.27ms
Thanks for the numbers. They're persuasive, at least at first look.
That said, one major question: multifd should only help with throughput
when the CPU is the bottleneck on the sending side, and in your case it's a
10Gbps NIC. Normally any decent CPU should be able to push close to 10Gbps
even without multifd.
In my previous experience, multifd only shows a difference when the
hosts have at least 25GBps+ bandwidth available.
Maybe you turned on compression already? If so, it's worth stating the
compression method chosen and its parameters.
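For scale, here is a back-of-envelope sketch of the most data a 10Gbps link
could carry during the mean transfer times quoted in the benchmark. It
assumes ideal line rate with no protocol overhead and a link that is busy
for the whole window; neither is stated in the thread. sketch_link_bytes()
is a hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on the bytes an ideal link can move in a time window.
 * bits_per_sec: nominal link rate; usec: window length in microseconds.
 * Assumption: no framing or protocol overhead is accounted for.
 */
static uint64_t sketch_link_bytes(uint64_t bits_per_sec, uint64_t usec)
{
    uint64_t bytes_per_sec = bits_per_sec / 8;

    return bytes_per_sec * usec / 1000000;
}
```

Under those assumptions, the plain "load" mean of 24.30ms bounds a
checkpoint at about 30.4MB, and the multifd mean of 14.12ms at about 17.7MB.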
>
> Evaluation
> While multifd has slightly higher latency when the guest idles, it is
> 10ms faster under load and more importantly it's worst case latency is
> less than 1/2 of plain under load as can be seen in the 99. Percentile.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> ---
> MAINTAINERS | 1 +
> migration/meson.build | 2 +-
> migration/multifd-colo.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++
> migration/multifd-colo.h | 26 +++++++++++++++++++++++++
> migration/multifd.c | 12 ++++++++++++
> migration/multifd.h | 1 +
> 6 files changed, 90 insertions(+), 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 563804345fec68ee72793dbb7c1b7e5be4c32083..dbb217255c2cf35dc0ce971c2021b130fac5469b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3837,6 +3837,7 @@ COLO Framework
> M: Lukas Straub <lukasstraub2@web.de>
> S: Maintained
> F: migration/colo*
> +F: migration/multifd-colo.*
> F: include/migration/colo.h
> F: include/migration/failover.h
> F: docs/COLO-FT.txt
> diff --git a/migration/meson.build b/migration/meson.build
> index 16909d54c5110fc5d8187fd3a68c4a5b08b59ea7..1e59fe4f1f0bbfffed90df38e8f39fa87bceb9b9 100644
> --- a/migration/meson.build
> +++ b/migration/meson.build
> @@ -40,7 +40,7 @@ system_ss.add(files(
> ), gnutls, zlib)
>
> if get_option('replication').allowed()
> - system_ss.add(files('colo-failover.c', 'colo.c'))
> + system_ss.add(files('colo-failover.c', 'colo.c', 'multifd-colo.c'))
> else
> system_ss.add(files('colo-stubs.c'))
> endif
> diff --git a/migration/multifd-colo.c b/migration/multifd-colo.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..d8d98e79b12ed52c41f341052a682d7786e221b5
> --- /dev/null
> +++ b/migration/multifd-colo.c
> @@ -0,0 +1,49 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + *
> + * multifd colo implementation
> + *
> + * Copyright (c) Lukas Straub <lukasstraub2@web.de>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "exec/target_page.h"
> +#include "qemu/error-report.h"
> +#include "qapi/error.h"
> +#include "ram.h"
> +#include "multifd.h"
> +#include "options.h"
> +#include "io/channel-socket.h"
> +#include "migration/colo.h"
> +#include "multifd-colo.h"
> +#include "system/ramblock.h"
> +
> +void multifd_colo_prepare_recv(MultiFDRecvParams *p)
> +{
> + assert(p->block->colo_cache);
> +
> + /*
> + * While we're still in precopy state (not yet in colo state), we copy
> + * received pages to both guest and cache. No need to set dirty bits,
> + * since guest and cache memory are in sync.
> + */
> + if (migration_incoming_in_colo_state()) {
> + colo_record_bitmap(p->block, p->normal, p->normal_num);
> + }
> + p->host = p->block->colo_cache;
I should have mentioned this while reviewing the previous version, anyway..
IMHO it would be better to have one place setting p->host instead of
overwriting it.
So instead of hooking before ->recv(), we should do it in
multifd_ram_unfill_packet(), moving the p->host update to the end of the
function and hooking it there with COLO (so that you can still record the
bitmaps, only after normal[] is filled).
Another thing, which might be more important: you seem to have ignored
zero[], but I think you need it. zero[] holds all pages that are now zero
(which may not have been zero before). So IMHO you'll need to record them
as dirty too in COLO's destination bitmap, otherwise you may hit
hard-to-debug RAM corruption.
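The two points above (a single place that chooses p->host, and dirtying
zero[] as well as normal[]) could look roughly like the sketch below. This
is not the QEMU code: SketchBlock, SketchRecvParams, sketch_record_bitmap()
and sketch_unfill_tail() are hypothetical stand-ins for RAMBlock,
MultiFDRecvParams, colo_record_bitmap() and the tail of
multifd_ram_unfill_packet(). The 4096-byte page size and the 64-page bitmap
are assumptions made to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */
#define SKETCH_MAX_PAGES 64u   /* bitmap fits in one 64-bit word */

/* Hypothetical stand-in for a RAMBlock that has a COLO cache. */
typedef struct {
    uint8_t *host;       /* guest memory */
    uint8_t *colo_cache; /* COLO mirror of guest memory */
    uint64_t dirty;      /* one bit per page, set = must be flushed */
} SketchBlock;

/* Hypothetical stand-in for the per-channel receive state. */
typedef struct {
    SketchBlock *block;
    uint8_t *host;          /* where received pages are written */
    const uint64_t *normal; /* byte offsets of pages with data */
    size_t normal_num;
    const uint64_t *zero;   /* byte offsets of pages that are now zero */
    size_t zero_num;
} SketchRecvParams;

/* Mark every page named by an offset list as dirty. */
static void sketch_record_bitmap(SketchBlock *b, const uint64_t *offs,
                                 size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t page = offs[i] / SKETCH_PAGE_SIZE;

        assert(page < SKETCH_MAX_PAGES);
        b->dirty |= UINT64_C(1) << page;
    }
}

/*
 * Runs once after normal[] and zero[] are filled. This is the only place
 * that assigns p->host, so nothing has to reset it after ->recv().
 */
static void sketch_unfill_tail(SketchRecvParams *p, bool colo,
                               bool in_colo_state)
{
    if (!colo) {
        p->host = p->block->host;
        return;
    }
    if (in_colo_state) {
        /* Zero pages change the cache too, so they count as dirty. */
        sketch_record_bitmap(p->block, p->normal, p->normal_num);
        sketch_record_bitmap(p->block, p->zero, p->zero_num);
    }
    p->host = p->block->colo_cache;
}
```

The copy from cache to guest during precopy is left out here; the sketch
only covers the pointer selection and the bitmap recording.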
> +}
> +
> +void multifd_colo_process_recv(MultiFDRecvParams *p)
> +{
> + if (!migration_incoming_in_colo_state()) {
> + for (int i = 0; i < p->normal_num; i++) {
> + void *guest = p->block->host + p->normal[i];
> + void *cache = p->host + p->normal[i];
> + memcpy(guest, cache, multifd_ram_page_size());
> + }
> + }
> + p->host = p->block->host;
Is resetting the pointer required? If not, we can skip it.
> +}
> diff --git a/migration/multifd-colo.h b/migration/multifd-colo.h
> new file mode 100644
> index 0000000000000000000000000000000000000000..82eaf3f48c47de2f090f9de52f9d57a337d4754a
> --- /dev/null
> +++ b/migration/multifd-colo.h
> @@ -0,0 +1,26 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + *
> + * multifd colo header
> + *
> + * Copyright (c) Lukas Straub <lukasstraub2@web.de>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef QEMU_MIGRATION_MULTIFD_COLO_H
> +#define QEMU_MIGRATION_MULTIFD_COLO_H
> +
> +#ifdef CONFIG_REPLICATION
> +
> +void multifd_colo_prepare_recv(MultiFDRecvParams *p);
> +void multifd_colo_process_recv(MultiFDRecvParams *p);
> +
> +#else
> +
> +static inline void multifd_colo_prepare_recv(MultiFDRecvParams *p) {}
> +static inline void multifd_colo_process_recv(MultiFDRecvParams *p) {}
> +
> +#endif
> +#endif
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 8e71171fb7a17726ba7eb0705e293c41e8aa32ec..6c85acec3bac134e85cfcee0d32057134f5af8d1 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -29,6 +29,7 @@
> #include "qemu-file.h"
> #include "trace.h"
> #include "multifd.h"
> +#include "multifd-colo.h"
> #include "threadinfo.h"
> #include "options.h"
> #include "qemu/yank.h"
> @@ -1269,7 +1270,18 @@ static int multifd_ram_state_recv(MultiFDRecvParams *p, Error **errp)
> {
> int ret;
>
> + if (migrate_colo()) {
> + multifd_colo_prepare_recv(p);
> + }
> +
> ret = multifd_recv_state->ops->recv(p, errp);
> + if (ret != 0) {
> + return ret;
> + }
> +
> + if (migrate_colo()) {
> + multifd_colo_process_recv(p);
> + }
>
> return ret;
> }
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 9b6d81e7ede024f05d4cd235de95e73840d0bbc4..7036f438fade1baed2442bfdcf8b5d6397c4a448 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -280,6 +280,7 @@ typedef struct {
> /* ramblock */
> RAMBlock *block;
> /* ramblock host address */
> + /* or points to the corresponding address in the colo cache */
Nit: we can merge the two comments into one /* ... */ block, with some
wording changes:
/*
 * Normally, it points to the ramblock's host address. When COLO is
 * enabled, it points to the mirror cache for the ramblock.
 */
> uint8_t *host;
> /* buffers to recv */
> struct iovec *iov;
>
> --
> 2.39.5
>
--
Peter Xu
* Re: [PATCH v2 3/8] Move ram state receive into multifd_ram_state_recv()
2026-01-17 14:09 ` [PATCH v2 3/8] Move ram state receive into multifd_ram_state_recv() Lukas Straub
@ 2026-01-20 17:14 ` Peter Xu
0 siblings, 0 replies; 26+ messages in thread
From: Peter Xu @ 2026-01-20 17:14 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Sat, Jan 17, 2026 at 03:09:10PM +0100, Lukas Straub wrote:
> This is in preparation for the next patch.
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH v2 5/8] migration-test: Add COLO migration unit test
2026-01-17 14:09 ` [PATCH v2 5/8] migration-test: Add COLO migration unit test Lukas Straub
@ 2026-01-20 17:23 ` Peter Xu
2026-01-21 19:37 ` Lukas Straub
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2026-01-20 17:23 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Sat, Jan 17, 2026 at 03:09:12PM +0100, Lukas Straub wrote:
> Add a COLO migration test for COLO migration and failover.
>
> COLO does not support q35 machine at this time.
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> ---
> MAINTAINERS | 1 +
> tests/qtest/meson.build | 7 ++-
> tests/qtest/migration-test.c | 1 +
> tests/qtest/migration/colo-tests.c | 113 +++++++++++++++++++++++++++++++++++++
> tests/qtest/migration/framework.c | 87 +++++++++++++++++++++++++++-
> tests/qtest/migration/framework.h | 10 ++++
> 6 files changed, 217 insertions(+), 2 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index dbb217255c2cf35dc0ce971c2021b130fac5469b..92ca20c9d4186a08519d15bfe8cbd583ab061a8b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3840,6 +3840,7 @@ F: migration/colo*
> F: migration/multifd-colo.*
> F: include/migration/colo.h
> F: include/migration/failover.h
> +F: tests/qtest/migration/colo-tests.c
> F: docs/COLO-FT.txt
>
> COLO Proxy
> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> index 0f053fb56de5806d3c213e3a26c0b19998ae151a..d0129af4431bb08a94a918a1e40a8f657059d764 100644
> --- a/tests/qtest/meson.build
> +++ b/tests/qtest/meson.build
> @@ -367,6 +367,11 @@ if gnutls.found()
> endif
> endif
>
> +migration_colo_files = []
> +if get_option('replication').allowed()
> + migration_colo_files = [files('migration/colo-tests.c')]
> +endif
> +
> qtests = {
> 'aspeed_hace-test': files('aspeed-hace-utils.c', 'aspeed_hace-test.c'),
> 'aspeed_smc-test': files('aspeed-smc-utils.c', 'aspeed_smc-test.c'),
> @@ -378,7 +383,7 @@ qtests = {
> 'migration/migration-util.c') + dbus_vmstate1,
> 'erst-test': files('erst-test.c'),
> 'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
> - 'migration-test': test_migration_files + migration_tls_files,
> + 'migration-test': test_migration_files + migration_tls_files + migration_colo_files,
> 'pxe-test': files('boot-sector.c'),
> 'pnv-xive2-test': files('pnv-xive2-common.c', 'pnv-xive2-flush-sync.c',
> 'pnv-xive2-nvpg_bar.c'),
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 08936871741535c926eeac40a7d7c3f461c72fd0..e582f05c7dc2673dbd05a936df8feb6c964b5bbc 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -55,6 +55,7 @@ int main(int argc, char **argv)
> migration_test_add_precopy(env);
> migration_test_add_cpr(env);
> migration_test_add_misc(env);
> + migration_test_add_colo(env);
>
> ret = g_test_run();
>
> diff --git a/tests/qtest/migration/colo-tests.c b/tests/qtest/migration/colo-tests.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..5004f581e4d9e4e6f54eee6d70a9307b7fd123be
> --- /dev/null
> +++ b/tests/qtest/migration/colo-tests.c
> @@ -0,0 +1,113 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + *
> + * QTest testcases for COLO migration
> + *
> + * Copyright (c) 2025 Lukas Straub <lukasstraub2@web.de>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "libqtest.h"
> +#include "migration/framework.h"
> +#include "migration/migration-qmp.h"
> +#include "migration/migration-util.h"
> +#include "qemu/module.h"
> +
> +static void test_colo_plain_common(MigrateCommon *args,
> + bool failover_during_checkpoint,
> + bool primary_failover)
> +{
> + args->listen_uri = "tcp:127.0.0.1:0";
> + test_colo_common(args, failover_during_checkpoint, primary_failover);
> +}
> +
> +static void *hook_start_multifd(QTestState *from, QTestState *to)
> +{
> + return migrate_hook_start_precopy_tcp_multifd_common(from, to, "none");
> +}
> +
> +static void test_colo_multifd_common(MigrateCommon *args,
> + bool failover_during_checkpoint,
> + bool primary_failover)
> +{
> + args->listen_uri = "defer";
> + args->start_hook = hook_start_multifd;
> + args->start.caps[MIGRATION_CAPABILITY_MULTIFD] = true;
> + test_colo_common(args, failover_during_checkpoint, primary_failover);
> +}
> +
> +static void test_colo_plain_primary_failover(char *name, MigrateCommon *args)
> +{
> + test_colo_plain_common(args, false, true);
> +}
> +
> +static void test_colo_plain_secondary_failover(char *name, MigrateCommon *args)
> +{
> + test_colo_plain_common(args, false, false);
> +}
> +
> +static void test_colo_multifd_primary_failover(char *name, MigrateCommon *args)
> +{
> + test_colo_multifd_common(args, false, true);
> +}
> +
> +static void test_colo_multifd_secondary_failover(char *name,
> + MigrateCommon *args)
> +{
> + test_colo_multifd_common(args, false, false);
> +}
> +
> +static void test_colo_plain_primary_failover_checkpoint(char *name,
> + MigrateCommon *args)
> +{
> + test_colo_plain_common(args, true, true);
> +}
> +
> +static void test_colo_plain_secondary_failover_checkpoint(char *name,
> + MigrateCommon *args)
> +{
> + test_colo_plain_common(args, true, false);
> +}
> +
> +static void test_colo_multifd_primary_failover_checkpoint(char *name,
> + MigrateCommon *args)
> +{
> + test_colo_multifd_common(args, true, true);
> +}
> +
> +static void test_colo_multifd_secondary_failover_checkpoint(char *name,
> + MigrateCommon *args)
> +{
> + test_colo_multifd_common(args, true, false);
> +}
> +
> +void migration_test_add_colo(MigrationTestEnv *env)
> +{
> + if (!env->full_set) {
> + return;
> + }
> +
> + migration_test_add("/migration/colo/plain/primary_failover",
> + test_colo_plain_primary_failover);
> + migration_test_add("/migration/colo/plain/secondary_failover",
> + test_colo_plain_secondary_failover);
> +
> + migration_test_add("/migration/colo/multifd/primary_failover",
> + test_colo_multifd_primary_failover);
> + migration_test_add("/migration/colo/multifd/secondary_failover",
> + test_colo_multifd_secondary_failover);
> +
> + migration_test_add("/migration/colo/plain/primary_failover_checkpoint",
> + test_colo_plain_primary_failover_checkpoint);
> + migration_test_add("/migration/colo/plain/secondary_failover_checkpoint",
> + test_colo_plain_secondary_failover_checkpoint);
> +
> + migration_test_add("/migration/colo/multifd/primary_failover_checkpoint",
> + test_colo_multifd_primary_failover_checkpoint);
> + migration_test_add("/migration/colo/multifd/secondary_failover_checkpoint",
> + test_colo_multifd_secondary_failover_checkpoint);
> +}
> diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
> index 57d3b9b7c5a269d31659971e308367bd916d28f6..fe34e7cc7a1a4eeb8d5219f54733bbd8446b0e4e 100644
> --- a/tests/qtest/migration/framework.c
> +++ b/tests/qtest/migration/framework.c
> @@ -315,7 +315,7 @@ int migrate_args(char **from, char **to, const char *uri, MigrateStart *args)
> if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
> memory_size = "150M";
>
> - if (g_str_equal(arch, "i386")) {
> + if (g_str_equal(arch, "i386") || args->force_pc_machine) {
The naming is better, thanks. That said, force_pc_machine is still unwanted
if we can drop it. I asked this in v1:
https://lore.kernel.org/qemu-devel/aWltRH6Nra-Tji7w@x1.local/
Can we explore that possibility?
> machine_alias = "pc";
> } else {
> machine_alias = "q35";
> @@ -1066,6 +1066,91 @@ void *migrate_hook_start_precopy_tcp_multifd_common(QTestState *from,
> return NULL;
> }
>
> +int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
> + bool primary_failover)
> +{
> + QTestState *from, *to;
> + void *data_hook = NULL;
> +
> + /*
> + * For the COLO test, both VMs will run in parallel. Thus both VMs want to
> + * open the image read/write at the same time. Using read-only=on is not
> + * possible here, because ide-hd does not support read-only backing image.
> + *
> + * So use -snapshot, where each qemu instance creates its own writable
> + * snapshot internally while leaving the real image read-only.
> + */
> + args->start.opts_source = "-snapshot";
> + args->start.opts_target = "-snapshot";
> +
> + /*
> + * COLO migration code logs many errors when the migration socket
> + * is shut down, these are expected so we hide them here.
> + */
> + args->start.hide_stderr = true;
> +
> + /*
> + * COLO currently does not work with Q35 machine
> + */
> + args->start.force_pc_machine = true;
> +
> + args->start.oob = true;
Just curious: is OOB required in COLO for some reason? I understand the
yank command you use below needs OOB, so the question is really about what
can block in the main thread that is specific to COLO.
> + args->start.caps[MIGRATION_CAPABILITY_X_COLO] = true;
> +
> + if (migrate_start(&from, &to, args->listen_uri, &args->start)) {
> + return -1;
> + }
> +
> + migrate_set_parameter_int(from, "x-checkpoint-delay", 300);
> +
> + if (args->start_hook) {
> + data_hook = args->start_hook(from, to);
> + }
> +
> + migrate_ensure_converge(from);
> + wait_for_serial("src_serial");
> +
> + migrate_qmp(from, to, args->connect_uri, NULL, "{}");
> +
> + wait_for_migration_status(from, "colo", NULL);
> + wait_for_resume(to, &dst_state);
We can move this whole function into colo-tests.c. Here you may want to
use get_dst() instead of referencing dst_state directly.
> +
> + wait_for_serial("src_serial");
> + wait_for_serial("dest_serial");
> +
> + /* wait for 3 checkpoints */
> + for (int i = 0; i < 3; i++) {
> + qtest_qmp_eventwait(to, "RESUME");
> + wait_for_serial("src_serial");
> + wait_for_serial("dest_serial");
> + }
> +
> + if (failover_during_checkpoint) {
> + qtest_qmp_eventwait(to, "STOP");
> + }
> + if (primary_failover) {
> + qtest_qmp_assert_success(from, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
> + "'arguments': {'instances':"
> + "[{'type': 'migration'}]}}");
> + qtest_qmp_assert_success(from, "{'execute': 'x-colo-lost-heartbeat'}");
> + wait_for_serial("src_serial");
> + } else {
> + qtest_qmp_assert_success(to, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
> + "'arguments': {'instances':"
> + "[{'type': 'migration'}]}}");
> + qtest_qmp_assert_success(to, "{'execute': 'x-colo-lost-heartbeat'}");
> + wait_for_serial("dest_serial");
> + }
> +
> + if (args->end_hook) {
> + args->end_hook(from, to, data_hook);
> + }
> +
> + migrate_end(from, to, !primary_failover);
> +
> + return 0;
> +}
> +
> QTestMigrationState *get_src(void)
> {
> return &src_state;
> diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
> index 2ef0f57962605c9e3bc7b7de48e52351e5389138..75088c5fb098a0f95acb1e23585d3b6e8307451e 100644
> --- a/tests/qtest/migration/framework.h
> +++ b/tests/qtest/migration/framework.h
> @@ -139,6 +139,9 @@ typedef struct {
> /* Do not connect to target monitor and qtest sockets in qtest_init */
> bool defer_target_connect;
>
> + /* Use pc machine for x86_64 */
> + bool force_pc_machine;
> +
> /*
> * Migration capabilities to be set in both source and
> * destination. For unilateral capabilities, use
> @@ -248,6 +251,8 @@ void test_postcopy_common(MigrateCommon *args);
> void test_postcopy_recovery_common(MigrateCommon *args);
> int test_precopy_common(MigrateCommon *args);
> void test_file_common(MigrateCommon *args, bool stop_src);
> +int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
> + bool colo_primary_failover);
> void *migrate_hook_start_precopy_tcp_multifd_common(QTestState *from,
> QTestState *to,
> const char *method);
> @@ -267,5 +272,10 @@ void migration_test_add_file(MigrationTestEnv *env);
> void migration_test_add_precopy(MigrationTestEnv *env);
> void migration_test_add_cpr(MigrationTestEnv *env);
> void migration_test_add_misc(MigrationTestEnv *env);
> +#ifdef CONFIG_REPLICATION
> +void migration_test_add_colo(MigrationTestEnv *env);
> +#else
> +static inline void migration_test_add_colo(MigrationTestEnv *env) {};
> +#endif
>
> #endif /* TEST_FRAMEWORK_H */
>
> --
> 2.39.5
>
--
Peter Xu
* Re: [PATCH v2 6/8] Convert colo main documentation to restructuredText
2026-01-17 14:09 ` [PATCH v2 6/8] Convert colo main documentation to restructuredText Lukas Straub
@ 2026-01-20 17:26 ` Peter Xu
2026-01-21 19:44 ` Lukas Straub
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2026-01-20 17:26 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Sat, Jan 17, 2026 at 03:09:13PM +0100, Lukas Straub wrote:
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> ---
> MAINTAINERS | 2 +-
> docs/COLO-FT.txt | 334 ------------------------------------------
> docs/system/index.rst | 1 +
> docs/system/qemu-colo.rst | 361 ++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 363 insertions(+), 335 deletions(-)
Thank you.
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 92ca20c9d4186a08519d15bfe8cbd583ab061a8b..4c30dc50d15c74b317443e43920e01b4560b03a5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3841,7 +3841,7 @@ F: migration/multifd-colo.*
> F: include/migration/colo.h
> F: include/migration/failover.h
> F: tests/qtest/migration/colo-tests.c
> -F: docs/COLO-FT.txt
> +F: docs/devel/qemu-colo.rst
Should we still put it under docs/devel/migration?
The COLO framework is under migration/. COLO tests will be under
tests/qtest/migration/. I still think we should keep the doc under
migration/ too; IOW, when someone touches it we want to get copied too.
I also raised some other requests while discussing the COLO details with
you. If you want, feel free to attach one more patch (after this conversion
patch) that adds those contents to the doc, covering either the doubled
memory consumption of the SVM or the RAM management details.
Thanks,
--
Peter Xu
* Re: [PATCH v2 7/8] qemu-colo.rst: Miscellaneous changes
2026-01-17 14:09 ` [PATCH v2 7/8] qemu-colo.rst: Miscellaneous changes Lukas Straub
@ 2026-01-20 17:30 ` Peter Xu
0 siblings, 0 replies; 26+ messages in thread
From: Peter Xu @ 2026-01-20 17:30 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Sat, Jan 17, 2026 at 03:09:14PM +0100, Lukas Straub wrote:
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> ---
> docs/system/qemu-colo.rst | 30 ++++++++++++------------------
> 1 file changed, 12 insertions(+), 18 deletions(-)
>
> diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
> index 5b00c6c4c2679153f398ed5a85a5d9cc515630e6..2052e207e57afdd3ab3ab1a447d55f5e2e5b5b87 100644
> --- a/docs/system/qemu-colo.rst
> +++ b/docs/system/qemu-colo.rst
> @@ -1,14 +1,6 @@
> Qemu COLO Fault Tolerance
> =========================
>
> -| Copyright (c) 2016 Intel Corporation
> -| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> -| Copyright (c) 2016 Fujitsu, Corp.
> -| Copyright (c) 2026 Lukas Straub <lukasstraub2@web.de>
Hmm, I don't see this copyright line in the current code base. I think you
added it in the previous conversion patch.
When converting, that patch should change nothing in the content and only
convert the format.
If you want to propose new things for the doc, that needs to be done and
reviewed separately.
We'd better not hide real changes within a conversion patch.
Here, I'm not sure you should add your copyright line. IIUC, that can be
added only if you made a prominent contribution to this solution in the
code base. Becoming a maintainer is definitely a blessing, but it does not
by itself justify an additional copyright update. I may be wrong, but
please justify.
> -
> -This work is licensed under the terms of the GNU GPL, version 2 or later.
> -See the COPYING file in the top-level directory.
> -
> This document gives an overview of COLO's design and how to use it.
>
> Background
> @@ -83,8 +75,8 @@ Overview::
> | Storage | |External Network| | External Network | | Storage |
> +---------------+ +----------------+ +------------------+ +--------------+
>
> -Components introduction
> -^^^^^^^^^^^^^^^^^^^^^^^
> +Components
> +^^^^^^^^^^
> You can see there are several components in COLO's diagram of architecture.
> Their functions are described below.
>
> @@ -158,14 +150,14 @@ in test procedure.
>
> Test procedure
> --------------
> -Note: Here we are running both instances on the same host for testing,
> +Here we are running both instances on the same host for testing,
> change the IP Addresses if you want to run it on two hosts. Initially
> ``127.0.0.1`` is the Primary Host and ``127.0.0.2`` is the Secondary Host.
>
> Startup qemu
> ^^^^^^^^^^^^
> **1. Primary**:
> -Note: Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
> +Initially, ``$imagefolder/primary.qcow2`` needs to be copied to all hosts.
> You don't need to change any IP's here, because ``0.0.0.0`` listens on any
> interface. The chardev's with ``127.0.0.1`` IP's loopback to the local qemu
> instance::
> @@ -193,7 +185,7 @@ instance::
>
>
> **2. Secondary**:
> -Note: Active and hidden images need to be created only once and the
> +Active and hidden images need to be created only once and the
> size should be the same as ``primary.qcow2``. Again, you don't need to change
> any IP's here, except for the ``$primary_ip`` variable::
>
> @@ -354,8 +346,10 @@ Wait until disk is synced, then::
> {"execute": "migrate-set-capabilities", "arguments":{ "capabilities": [ {"capability": "x-colo", "state": true } ] } }
> {"execute": "migrate", "arguments":{ "uri": "tcp:127.0.0.1:9998" } }
>
> -TODO
> -----
> -1. Support shared storage.
> -2. Develop the heartbeat part.
> -3. Reduce checkpoint VM’s downtime while doing checkpoint.
> +| Copyright (c) 2016 Intel Corporation
> +| Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> +| Copyright (c) 2016 Fujitsu, Corp.
> +| Copyright (c) 2026 Lukas Straub <lukasstraub2@web.de>
> +
> +This work is licensed under the terms of the GNU GPL, version 2 or later.
> +See the COPYING file in the top-level directory.
> \ No newline at end of file
>
> --
> 2.39.5
>
--
Peter Xu
* Re: [PATCH v2 8/8] qemu-colo.rst: Simplify the block replication setup
2026-01-17 14:09 ` [PATCH v2 8/8] qemu-colo.rst: Simplify the block replication setup Lukas Straub
@ 2026-01-20 17:32 ` Peter Xu
0 siblings, 0 replies; 26+ messages in thread
From: Peter Xu @ 2026-01-20 17:32 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Sat, Jan 17, 2026 at 03:09:15PM +0100, Lukas Straub wrote:
> On the primary side we don't actually need the replication
> block driver, since it only passes trough all IO.
> So simplify the setup and also use 'blockdev-add' instead of
> 'human-monitor-command'.
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> Tested-by: Lukas Straub <lukasstraub2@web.de>
We can drop this line; Tested-by is normally not used on one's own patch.
The proposer should always test their own patch.
I'll leave it to Chen and others to review this patch. Please consider
copying Zhijian and Dave when you repost; you'll have a higher chance of
getting it reviewed.
Thanks,
> ---
> docs/system/qemu-colo.rst | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/docs/system/qemu-colo.rst b/docs/system/qemu-colo.rst
> index 2052e207e57afdd3ab3ab1a447d55f5e2e5b5b87..7e361998d871b2c9a0e8065a15c004a9d841958b 100644
> --- a/docs/system/qemu-colo.rst
> +++ b/docs/system/qemu-colo.rst
> @@ -233,8 +233,8 @@ Note:
> **4.** On Primary VM's QEMU monitor, issue command::
>
> {"execute":"qmp_capabilities"}
> - {"execute": "human-monitor-command", "arguments": {"command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
> - {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "replication0" } }
> + {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.2", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
> + {"execute": "x-blockdev-change", "arguments":{"parent": "colo-disk0", "node": "nbd0" } }
> {"execute": "migrate-set-capabilities", "arguments": {"capabilities": [ {"capability": "x-colo", "state": true } ] } }
> {"execute": "migrate", "arguments": {"uri": "tcp:127.0.0.2:9998" } }
>
> @@ -262,7 +262,7 @@ Primary Failover
> The Secondary died, resume on the Primary::
>
> {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "child": "children.1"} }
> - {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_del replication0" } }
> + {"execute": "blockdev-del", "arguments": {"node-name": "nbd0"} }
> {"execute": "object-del", "arguments":{ "id": "comp0" } }
> {"execute": "object-del", "arguments":{ "id": "iothread1" } }
> {"execute": "object-del", "arguments":{ "id": "m0" } }
> @@ -302,8 +302,8 @@ Wait until disk is synced, then::
> {"execute": "stop"}
> {"execute": "block-job-cancel", "arguments":{ "device": "resync"} }
>
> - {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.2,file.port=9999,file.export=parent0,node-name=replication0"}}
> - {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
> + {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.2", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
> + {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "nbd0" } }
>
> {"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
> {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
> @@ -334,8 +334,8 @@ Wait until disk is synced, then::
> {"execute": "stop"}
> {"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
>
> - {"execute": "human-monitor-command", "arguments":{ "command-line": "drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=127.0.0.1,file.port=9999,file.export=parent0,node-name=replication0"}}
> - {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "replication0" } }
> + {"execute": "blockdev-add", "arguments": {"driver": "nbd", "node-name": "nbd0", "server": {"type": "inet", "host": "127.0.0.1", "port": "9999"}, "export": "parent0", "detect-zeroes": "on"} }
> + {"execute": "x-blockdev-change", "arguments":{ "parent": "colo-disk0", "node": "nbd0" } }
>
> {"execute": "object-add", "arguments":{ "qom-type": "filter-mirror", "id": "m0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "tx", "outdev": "mirror0" } }
> {"execute": "object-add", "arguments":{ "qom-type": "filter-redirector", "id": "redire0", "insert": "before", "position": "id=rew0", "netdev": "hn0", "queue": "rx", "indev": "compare_out" } }
>
> --
> 2.39.5
>
--
Peter Xu
* Re: [PATCH v2 1/8] MAINTAINERS: Add myself as maintainer for COLO migration framework
2026-01-17 14:09 ` [PATCH v2 1/8] MAINTAINERS: Add myself as maintainer for COLO migration framework Lukas Straub
@ 2026-01-20 17:32 ` Peter Xu
2026-01-22 9:54 ` Zhang Chen
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2026-01-20 17:32 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Sat, Jan 17, 2026 at 03:09:08PM +0100, Lukas Straub wrote:
> I am ready to maintain it.
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH v2 2/8] MAINTAINERS: Remove Hailiang Zhang from COLO migration framework
2026-01-17 14:09 ` [PATCH v2 2/8] MAINTAINERS: Remove Hailiang Zhang from " Lukas Straub
@ 2026-01-20 17:32 ` Peter Xu
2026-01-22 9:54 ` Zhang Chen
0 siblings, 1 reply; 26+ messages in thread
From: Peter Xu @ 2026-01-20 17:32 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Sat, Jan 17, 2026 at 03:09:09PM +0100, Lukas Straub wrote:
> His last email to the mailing list is from December 2021:
> https://lore.kernel.org/qemu-devel/20211214075424.6920-1-zhanghailiang@xfusion.com/
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH v2 4/8] multifd: Add COLO support
2026-01-20 17:13 ` Peter Xu
@ 2026-01-20 18:05 ` Daniel P. Berrangé
2026-01-20 19:18 ` Peter Xu
2026-01-21 19:00 ` Lukas Straub
1 sibling, 1 reply; 26+ messages in thread
From: Daniel P. Berrangé @ 2026-01-20 18:05 UTC (permalink / raw)
To: Peter Xu
Cc: Lukas Straub, qemu-devel, Fabiano Rosas, Laurent Vivier,
Paolo Bonzini, Zhang Chen, Hailiang Zhang, Markus Armbruster,
Juan Quintela
On Tue, Jan 20, 2026 at 12:13:58PM -0500, Peter Xu wrote:
> On Sat, Jan 17, 2026 at 03:09:11PM +0100, Lukas Straub wrote:
> > Like in the normal ram_load() path, put the received pages into the
> > colo cache and mark the pages in the bitmap so that they will be
> > flushed to the guest later.
> >
> > Multifd with COLO is useful to reduce the VM pause time during checkpointing
> > for latency sensitive workloads. In such workloads the worst-case latency
> > is especially important.
> >
> > Also, multifd migration is the preferred way to do migration nowadays and this
> > allows to use multifd compression with COLO.
> >
> > Benchmark:
> > Cluster nodes
> > - Intel Xenon E5-2630 v3
> > - 48Gb RAM
> > - 10G Ethernet
> > Guest
> > - Windows Server 2016
> > - 6Gb RAM
> > - 4 cores
> > Workload
> > - Upload a file to the guest with SMB to simulate moderate
> > memory dirtying
> > - Measure the memory transfer time portion of each checkpoint
> > - 600ms COLO checkpoint interval
> >
> > Results
> > Plain
> > idle mean: 4.50ms 99per: 10.33ms
> > load mean: 24.30ms 99per: 78.05ms
> > Multifd-4
> > idle mean: 6.48ms 99per: 10.41ms
> > load mean: 14.12ms 99per: 31.27ms
>
> Thanks for the numbers. They're persuasive at least from 1st look.
>
> Said that, one major question is, multifd should only help with throughput
> when cpu is a bottleneck sending, in your case it's 10Gbps NIC. Normally
> any decent cpu should be able to push closer to 10Gbps even without
> multifd.
That assumes the CPUs used by migration are otherwise idle though. If the
host is busy running guest workloads, only small timeslices may be available
for use by migration threads. Using multifd would better utilize what's
available if multiple host CPUs have partial availability.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Re: [PATCH v2 4/8] multifd: Add COLO support
2026-01-20 18:05 ` Daniel P. Berrangé
@ 2026-01-20 19:18 ` Peter Xu
0 siblings, 0 replies; 26+ messages in thread
From: Peter Xu @ 2026-01-20 19:18 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: Lukas Straub, qemu-devel, Fabiano Rosas, Laurent Vivier,
Paolo Bonzini, Zhang Chen, Hailiang Zhang, Markus Armbruster,
Juan Quintela
On Tue, Jan 20, 2026 at 06:05:15PM +0000, Daniel P. Berrangé wrote:
> On Tue, Jan 20, 2026 at 12:13:58PM -0500, Peter Xu wrote:
> > On Sat, Jan 17, 2026 at 03:09:11PM +0100, Lukas Straub wrote:
> > > Like in the normal ram_load() path, put the received pages into the
> > > colo cache and mark the pages in the bitmap so that they will be
> > > flushed to the guest later.
> > >
> > > Multifd with COLO is useful to reduce the VM pause time during checkpointing
> > > for latency sensitive workloads. In such workloads the worst-case latency
> > > is especially important.
> > >
> > > Also, multifd migration is the preferred way to do migration nowadays and this
> > > allows to use multifd compression with COLO.
> > >
> > > Benchmark:
> > > Cluster nodes
> > > - Intel Xenon E5-2630 v3
> > > - 48Gb RAM
> > > - 10G Ethernet
> > > Guest
> > > - Windows Server 2016
> > > - 6Gb RAM
> > > - 4 cores
> > > Workload
> > > - Upload a file to the guest with SMB to simulate moderate
> > > memory dirtying
> > > - Measure the memory transfer time portion of each checkpoint
> > > - 600ms COLO checkpoint interval
> > >
> > > Results
> > > Plain
> > > idle mean: 4.50ms 99per: 10.33ms
> > > load mean: 24.30ms 99per: 78.05ms
> > > Multifd-4
> > > idle mean: 6.48ms 99per: 10.41ms
> > > load mean: 14.12ms 99per: 31.27ms
> >
> > Thanks for the numbers. They're persuasive at least from 1st look.
> >
> > Said that, one major question is, multifd should only help with throughput
> > when cpu is a bottleneck sending, in your case it's 10Gbps NIC. Normally
> > any decent cpu should be able to push closer to 10Gbps even without
> > multifd.
>
> That assumes the CPUs used by migration are otherwise idle though. If the
> host is busy running guest workloads, only small timeslices may be available
> for use by migration threads. Using multifd would better utilize what's
> available if multiple host CPUs have partial availability.
Hmm, I'm not sure this was the case when the test above was run. I
rarely see a host's CPUs being completely occupied. Say, on a 16-core system
that would mean ~1600% CPU utilization.
I think it's because when a host is hosting VMs, we should
normally have some CPU resources reserved for host housekeeping.
Otherwise I'm not sure how to guarantee general availability of the
host.. and IIUC it may also affect the guest.
Here, IMHO as long as there's >100% CPU resource available on this host
(e.g. out of 1600% on a 16-core system), enabling multifd or not shouldn't
matter much when the NIC is 10Gbps.
An old but decent processor should be able to push 10~15Gbps, and a new
processor should be able to push ~25Gbps or more, with 100% CPU resource.
That's because the scheduler will schedule whatever thread (either the
migration thread alone, or the multifd threads) onto whatever core is
still free (or onto cores that have free cycles).
When all CPUs are occupied, IMHO multifd shouldn't help much
either.. maybe >1 threads make it easier to get scheduled (hence more time
slices from the scheduler), but I believe that's not the major use case for
multifd.. it should really be when there are plenty of CPU resources.
Thanks,
--
Peter Xu
* Re: [PATCH v2 4/8] multifd: Add COLO support
2026-01-20 17:13 ` Peter Xu
2026-01-20 18:05 ` Daniel P. Berrangé
@ 2026-01-21 19:00 ` Lukas Straub
1 sibling, 0 replies; 26+ messages in thread
From: Lukas Straub @ 2026-01-21 19:00 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster, Juan Quintela
On Tue, 20 Jan 2026 12:13:58 -0500
Peter Xu <peterx@redhat.com> wrote:
> On Sat, Jan 17, 2026 at 03:09:11PM +0100, Lukas Straub wrote:
> > Like in the normal ram_load() path, put the received pages into the
> > colo cache and mark the pages in the bitmap so that they will be
> > flushed to the guest later.
> >
> > Multifd with COLO is useful to reduce the VM pause time during checkpointing
> > for latency sensitive workloads. In such workloads the worst-case latency
> > is especially important.
> >
> > Also, multifd migration is the preferred way to do migration nowadays and this
> > allows to use multifd compression with COLO.
> >
> > Benchmark:
> > Cluster nodes
> > - Intel Xenon E5-2630 v3
> > - 48Gb RAM
> > - 10G Ethernet
> > Guest
> > - Windows Server 2016
> > - 6Gb RAM
> > - 4 cores
> > Workload
> > - Upload a file to the guest with SMB to simulate moderate
> > memory dirtying
> > - Measure the memory transfer time portion of each checkpoint
> > - 600ms COLO checkpoint interval
> >
> > Results
> > Plain
> > idle mean: 4.50ms 99per: 10.33ms
> > load mean: 24.30ms 99per: 78.05ms
> > Multifd-4
> > idle mean: 6.48ms 99per: 10.41ms
> > load mean: 14.12ms 99per: 31.27ms
>
> Thanks for the numbers. They're persuasive at least from 1st look.
>
> Said that, one major question is, multifd should only help with throughput
> when cpu is a bottleneck sending, in your case it's 10Gbps NIC. Normally
> any decent cpu should be able to push closer to 10Gbps even without
> multifd.
>
> Per my previous experiences, multifd can only show a difference when the
> hosts have at least 25GBps+ bandwidth available.
>
> Maybe you turned on compression already? If so, worth stating the
> compressor methods chosen / parameters.
No, it's just the old Haswell CPU I guess. My clients also use it on
embedded platforms with embedded/mobile CPUs, so not necessarily the
fastest CPUs.
Btw, I forgot to mention, it's already worth it for the precopy phase
as it helps with converging. And also for my other patch that sends
dirty ram in the time between the checkpoints.
>
> >
> > Evaluation
> > While multifd has slightly higher latency when the guest idles, it is
> > 10ms faster under load and more importantly it's worst case latency is
> > less than 1/2 of plain under load as can be seen in the 99. Percentile.
> >
> > Signed-off-by: Juan Quintela <quintela@redhat.com>
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > ---
> > MAINTAINERS | 1 +
> > migration/meson.build | 2 +-
> > migration/multifd-colo.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++
> > migration/multifd-colo.h | 26 +++++++++++++++++++++++++
> > migration/multifd.c | 12 ++++++++++++
> > migration/multifd.h | 1 +
> > 6 files changed, 90 insertions(+), 1 deletion(-)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 563804345fec68ee72793dbb7c1b7e5be4c32083..dbb217255c2cf35dc0ce971c2021b130fac5469b 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -3837,6 +3837,7 @@ COLO Framework
> > M: Lukas Straub <lukasstraub2@web.de>
> > S: Maintained
> > F: migration/colo*
> > +F: migration/multifd-colo.*
> > F: include/migration/colo.h
> > F: include/migration/failover.h
> > F: docs/COLO-FT.txt
> > diff --git a/migration/meson.build b/migration/meson.build
> > index 16909d54c5110fc5d8187fd3a68c4a5b08b59ea7..1e59fe4f1f0bbfffed90df38e8f39fa87bceb9b9 100644
> > --- a/migration/meson.build
> > +++ b/migration/meson.build
> > @@ -40,7 +40,7 @@ system_ss.add(files(
> > ), gnutls, zlib)
> >
> > if get_option('replication').allowed()
> > - system_ss.add(files('colo-failover.c', 'colo.c'))
> > + system_ss.add(files('colo-failover.c', 'colo.c', 'multifd-colo.c'))
> > else
> > system_ss.add(files('colo-stubs.c'))
> > endif
> > diff --git a/migration/multifd-colo.c b/migration/multifd-colo.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..d8d98e79b12ed52c41f341052a682d7786e221b5
> > --- /dev/null
> > +++ b/migration/multifd-colo.c
> > @@ -0,0 +1,49 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0-or-later
> > + *
> > + * multifd colo implementation
> > + *
> > + * Copyright (c) Lukas Straub <lukasstraub2@web.de>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "exec/target_page.h"
> > +#include "qemu/error-report.h"
> > +#include "qapi/error.h"
> > +#include "ram.h"
> > +#include "multifd.h"
> > +#include "options.h"
> > +#include "io/channel-socket.h"
> > +#include "migration/colo.h"
> > +#include "multifd-colo.h"
> > +#include "system/ramblock.h"
> > +
> > +void multifd_colo_prepare_recv(MultiFDRecvParams *p)
> > +{
> > + assert(p->block->colo_cache);
> > +
> > + /*
> > + * While we're still in precopy state (not yet in colo state), we copy
> > + * received pages to both guest and cache. No need to set dirty bits,
> > + * since guest and cache memory are in sync.
> > + */
> > + if (migration_incoming_in_colo_state()) {
> > + colo_record_bitmap(p->block, p->normal, p->normal_num);
> > + }
> > + p->host = p->block->colo_cache;
>
> I should have mentioned it while reviewing the previous version, anyway..
>
> IMHO it would be better to have one place setting p->host instead of
> overwritting it.
>
> So instead of hooking before ->recv(), we should do it in
> multifd_ram_unfill_packet(), moving the p->host update to the end of
> function and hook it there with COLO (so that you can still record the
> bitmaps, only after normal[]).
Okay, will fix that in the next version.
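A minimal sketch of the suggestion above, assuming simplified stand-in
types: the receive base address is chosen in exactly one place rather than
being overwritten before ->recv(). The names Block, RecvParams and
select_recv_host() are hypothetical and only illustrate the shape; they are
not the real RAMBlock / MultiFDRecvParams definitions or an existing QEMU
helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for RAMBlock; not the real QEMU structure. */
typedef struct {
    uint8_t *host;        /* guest RAM backing this block */
    uint8_t *colo_cache;  /* COLO mirror cache, NULL when COLO is off */
} Block;

/* Simplified stand-in for MultiFDRecvParams. */
typedef struct {
    Block *block;
    uint8_t *host;        /* base address the recv path writes to */
} RecvParams;

/*
 * Hypothetical helper: the single place that decides where received
 * pages land. With COLO enabled they go to the cache, otherwise
 * straight to guest RAM.
 */
static void select_recv_host(RecvParams *p, bool colo_enabled)
{
    if (colo_enabled) {
        assert(p->block->colo_cache != NULL);
        p->host = p->block->colo_cache;
    } else {
        p->host = p->block->host;
    }
}
```

With a single assignment like this, there is nothing left to reset after
->recv() returns.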
>
> Another thing, which might be more important: you seem to have ignored
> zero[], but I think you need it. zero[] keeps all pages that are zeros
> (which may not used to be zeros). So IMHO you'll need to record them as
> dirty too in COLO's dest bitmap, otherwise you may hit hard to debug RAM
> corruptions.
Yes, that is a good catch. I missed that during forward porting.
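As an illustration of the point about zero[], here is a self-contained
sketch that marks the pages of both lists dirty. Everything in it is an
assumption made for the example: the sketch_* names, the flat bitmap layout
and the fixed 4096-byte page size. In the patch itself the recording goes
through colo_record_bitmap(), which is currently only called for normal[].

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Set the dirty bit for every page offset in offsets[0..num-1]. */
static void sketch_record_dirty(unsigned long *bitmap,
                                const uint64_t *offsets, uint32_t num)
{
    for (uint32_t i = 0; i < num; i++) {
        uint64_t page = offsets[i] / SKETCH_PAGE_SIZE;

        bitmap[page / SKETCH_BITS_PER_LONG] |=
            1UL << (page % SKETCH_BITS_PER_LONG);
    }
}

/*
 * Record one received packet. Zero pages are recorded as well: a page
 * that is zero now may have held data at the last checkpoint, so the
 * cache copy still has to be flushed to the guest.
 */
static void sketch_record_packet(unsigned long *bitmap,
                                 const uint64_t *normal, uint32_t normal_num,
                                 const uint64_t *zero, uint32_t zero_num)
{
    sketch_record_dirty(bitmap, normal, normal_num);
    sketch_record_dirty(bitmap, zero, zero_num);
}

/* Return true if the given page index is marked dirty. */
static bool sketch_test_dirty(const unsigned long *bitmap, uint64_t page)
{
    return (bitmap[page / SKETCH_BITS_PER_LONG] >>
            (page % SKETCH_BITS_PER_LONG)) & 1UL;
}
```

If only normal[] were recorded, a page that was rewritten with zeros would
stay clean in the bitmap and the guest would keep its stale contents after
the next flush.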
>
> > +}
> > +
> > +void multifd_colo_process_recv(MultiFDRecvParams *p)
> > +{
> > + if (!migration_incoming_in_colo_state()) {
> > + for (int i = 0; i < p->normal_num; i++) {
> > + void *guest = p->block->host + p->normal[i];
> > + void *cache = p->host + p->normal[i];
> > + memcpy(guest, cache, multifd_ram_page_size());
> > + }
> > + }
> > + p->host = p->block->host;
>
> Is resetting the pointer required? If not, we can skip it.
Not that I can see, I will remove this in the next version.
>
> > +}
> > diff --git a/migration/multifd-colo.h b/migration/multifd-colo.h
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..82eaf3f48c47de2f090f9de52f9d57a337d4754a
> > --- /dev/null
> > +++ b/migration/multifd-colo.h
> > @@ -0,0 +1,26 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0-or-later
> > + *
> > + * multifd colo header
> > + *
> > + * Copyright (c) Lukas Straub <lukasstraub2@web.de>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef QEMU_MIGRATION_MULTIFD_COLO_H
> > +#define QEMU_MIGRATION_MULTIFD_COLO_H
> > +
> > +#ifdef CONFIG_REPLICATION
> > +
> > +void multifd_colo_prepare_recv(MultiFDRecvParams *p);
> > +void multifd_colo_process_recv(MultiFDRecvParams *p);
> > +
> > +#else
> > +
> > +static inline void multifd_colo_prepare_recv(MultiFDRecvParams *p) {}
> > +static inline void multifd_colo_process_recv(MultiFDRecvParams *p) {}
> > +
> > +#endif
> > +#endif
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 8e71171fb7a17726ba7eb0705e293c41e8aa32ec..6c85acec3bac134e85cfcee0d32057134f5af8d1 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -29,6 +29,7 @@
> > #include "qemu-file.h"
> > #include "trace.h"
> > #include "multifd.h"
> > +#include "multifd-colo.h"
> > #include "threadinfo.h"
> > #include "options.h"
> > #include "qemu/yank.h"
> > @@ -1269,7 +1270,18 @@ static int multifd_ram_state_recv(MultiFDRecvParams *p, Error **errp)
> > {
> > int ret;
> >
> > + if (migrate_colo()) {
> > + multifd_colo_prepare_recv(p);
> > + }
> > +
> > ret = multifd_recv_state->ops->recv(p, errp);
> > + if (ret != 0) {
> > + return ret;
> > + }
> > +
> > + if (migrate_colo()) {
> > + multifd_colo_process_recv(p);
> > + }
> >
> > return ret;
> > }
> > diff --git a/migration/multifd.h b/migration/multifd.h
> > index 9b6d81e7ede024f05d4cd235de95e73840d0bbc4..7036f438fade1baed2442bfdcf8b5d6397c4a448 100644
> > --- a/migration/multifd.h
> > +++ b/migration/multifd.h
> > @@ -280,6 +280,7 @@ typedef struct {
> > /* ramblock */
> > RAMBlock *block;
> > /* ramblock host address */
> > + /* or points to the corresponding address in the colo cache */
>
> Nit: we can merge it with /* ... */, and some wording change:
>
> /*
> * Normally, it points to ramblock's host address. When COLO
> * enabled, it points to the mirror cache for the ramblock.
> */
Okay, will fix this.
>
> > uint8_t *host;
> > /* buffers to recv */
> > struct iovec *iov;
> >
> > --
> > 2.39.5
> >
>
* Re: [PATCH v2 5/8] migration-test: Add COLO migration unit test
2026-01-20 17:23 ` Peter Xu
@ 2026-01-21 19:37 ` Lukas Straub
2026-01-25 17:18 ` Lukas Straub
0 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-21 19:37 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Tue, 20 Jan 2026 12:23:08 -0500
Peter Xu <peterx@redhat.com> wrote:
> On Sat, Jan 17, 2026 at 03:09:12PM +0100, Lukas Straub wrote:
> > Add a COLO migration test for COLO migration and failover.
> >
> > COLO does not support q35 machine at this time.
> >
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > ---
> > MAINTAINERS | 1 +
> > tests/qtest/meson.build | 7 ++-
> > tests/qtest/migration-test.c | 1 +
> > tests/qtest/migration/colo-tests.c | 113 +++++++++++++++++++++++++++++++++++++
> > tests/qtest/migration/framework.c | 87 +++++++++++++++++++++++++++-
> > tests/qtest/migration/framework.h | 10 ++++
> > 6 files changed, 217 insertions(+), 2 deletions(-)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index dbb217255c2cf35dc0ce971c2021b130fac5469b..92ca20c9d4186a08519d15bfe8cbd583ab061a8b 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -3840,6 +3840,7 @@ F: migration/colo*
> > F: migration/multifd-colo.*
> > F: include/migration/colo.h
> > F: include/migration/failover.h
> > +F: tests/qtest/migration/colo-tests.c
> > F: docs/COLO-FT.txt
> >
> > COLO Proxy
> > diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> > index 0f053fb56de5806d3c213e3a26c0b19998ae151a..d0129af4431bb08a94a918a1e40a8f657059d764 100644
> > --- a/tests/qtest/meson.build
> > +++ b/tests/qtest/meson.build
> > @@ -367,6 +367,11 @@ if gnutls.found()
> > endif
> > endif
> >
> > +migration_colo_files = []
> > +if get_option('replication').allowed()
> > + migration_colo_files = [files('migration/colo-tests.c')]
> > +endif
> > +
> > qtests = {
> > 'aspeed_hace-test': files('aspeed-hace-utils.c', 'aspeed_hace-test.c'),
> > 'aspeed_smc-test': files('aspeed-smc-utils.c', 'aspeed_smc-test.c'),
> > @@ -378,7 +383,7 @@ qtests = {
> > 'migration/migration-util.c') + dbus_vmstate1,
> > 'erst-test': files('erst-test.c'),
> > 'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
> > - 'migration-test': test_migration_files + migration_tls_files,
> > + 'migration-test': test_migration_files + migration_tls_files + migration_colo_files,
> > 'pxe-test': files('boot-sector.c'),
> > 'pnv-xive2-test': files('pnv-xive2-common.c', 'pnv-xive2-flush-sync.c',
> > 'pnv-xive2-nvpg_bar.c'),
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index 08936871741535c926eeac40a7d7c3f461c72fd0..e582f05c7dc2673dbd05a936df8feb6c964b5bbc 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -55,6 +55,7 @@ int main(int argc, char **argv)
> > migration_test_add_precopy(env);
> > migration_test_add_cpr(env);
> > migration_test_add_misc(env);
> > + migration_test_add_colo(env);
> >
> > ret = g_test_run();
> >
> > diff --git a/tests/qtest/migration/colo-tests.c b/tests/qtest/migration/colo-tests.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..5004f581e4d9e4e6f54eee6d70a9307b7fd123be
> > --- /dev/null
> > +++ b/tests/qtest/migration/colo-tests.c
> > @@ -0,0 +1,113 @@
> > +/*
> > + * SPDX-License-Identifier: GPL-2.0-or-later
> > + *
> > + * QTest testcases for COLO migration
> > + *
> > + * Copyright (c) 2025 Lukas Straub <lukasstraub2@web.de>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "libqtest.h"
> > +#include "migration/framework.h"
> > +#include "migration/migration-qmp.h"
> > +#include "migration/migration-util.h"
> > +#include "qemu/module.h"
> > +
> > +static void test_colo_plain_common(MigrateCommon *args,
> > + bool failover_during_checkpoint,
> > + bool primary_failover)
> > +{
> > + args->listen_uri = "tcp:127.0.0.1:0";
> > + test_colo_common(args, failover_during_checkpoint, primary_failover);
> > +}
> > +
> > +static void *hook_start_multifd(QTestState *from, QTestState *to)
> > +{
> > + return migrate_hook_start_precopy_tcp_multifd_common(from, to, "none");
> > +}
> > +
> > +static void test_colo_multifd_common(MigrateCommon *args,
> > + bool failover_during_checkpoint,
> > + bool primary_failover)
> > +{
> > + args->listen_uri = "defer";
> > + args->start_hook = hook_start_multifd;
> > + args->start.caps[MIGRATION_CAPABILITY_MULTIFD] = true;
> > + test_colo_common(args, failover_during_checkpoint, primary_failover);
> > +}
> > +
> > +static void test_colo_plain_primary_failover(char *name, MigrateCommon *args)
> > +{
> > + test_colo_plain_common(args, false, true);
> > +}
> > +
> > +static void test_colo_plain_secondary_failover(char *name, MigrateCommon *args)
> > +{
> > + test_colo_plain_common(args, false, false);
> > +}
> > +
> > +static void test_colo_multifd_primary_failover(char *name, MigrateCommon *args)
> > +{
> > + test_colo_multifd_common(args, false, true);
> > +}
> > +
> > +static void test_colo_multifd_secondary_failover(char *name,
> > + MigrateCommon *args)
> > +{
> > + test_colo_multifd_common(args, false, false);
> > +}
> > +
> > +static void test_colo_plain_primary_failover_checkpoint(char *name,
> > + MigrateCommon *args)
> > +{
> > + test_colo_plain_common(args, true, true);
> > +}
> > +
> > +static void test_colo_plain_secondary_failover_checkpoint(char *name,
> > + MigrateCommon *args)
> > +{
> > + test_colo_plain_common(args, true, false);
> > +}
> > +
> > +static void test_colo_multifd_primary_failover_checkpoint(char *name,
> > + MigrateCommon *args)
> > +{
> > + test_colo_multifd_common(args, true, true);
> > +}
> > +
> > +static void test_colo_multifd_secondary_failover_checkpoint(char *name,
> > + MigrateCommon *args)
> > +{
> > + test_colo_multifd_common(args, true, false);
> > +}
> > +
> > +void migration_test_add_colo(MigrationTestEnv *env)
> > +{
> > + if (!env->full_set) {
> > + return;
> > + }
> > +
> > + migration_test_add("/migration/colo/plain/primary_failover",
> > + test_colo_plain_primary_failover);
> > + migration_test_add("/migration/colo/plain/secondary_failover",
> > + test_colo_plain_secondary_failover);
> > +
> > + migration_test_add("/migration/colo/multifd/primary_failover",
> > + test_colo_multifd_primary_failover);
> > + migration_test_add("/migration/colo/multifd/secondary_failover",
> > + test_colo_multifd_secondary_failover);
> > +
> > + migration_test_add("/migration/colo/plain/primary_failover_checkpoint",
> > + test_colo_plain_primary_failover_checkpoint);
> > + migration_test_add("/migration/colo/plain/secondary_failover_checkpoint",
> > + test_colo_plain_secondary_failover_checkpoint);
> > +
> > + migration_test_add("/migration/colo/multifd/primary_failover_checkpoint",
> > + test_colo_multifd_primary_failover_checkpoint);
> > + migration_test_add("/migration/colo/multifd/secondary_failover_checkpoint",
> > + test_colo_multifd_secondary_failover_checkpoint);
> > +}
> > diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
> > index 57d3b9b7c5a269d31659971e308367bd916d28f6..fe34e7cc7a1a4eeb8d5219f54733bbd8446b0e4e 100644
> > --- a/tests/qtest/migration/framework.c
> > +++ b/tests/qtest/migration/framework.c
> > @@ -315,7 +315,7 @@ int migrate_args(char **from, char **to, const char *uri, MigrateStart *args)
> > if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
> > memory_size = "150M";
> >
> > - if (g_str_equal(arch, "i386")) {
> > + if (g_str_equal(arch, "i386") || args->force_pc_machine) {
>
> The naming is better, thanks. That said, force_pc_machine is still
> unwanted if we can drop it. I asked this in v1:
>
> https://lore.kernel.org/qemu-devel/aWltRH6Nra-Tji7w@x1.local/
>
> Can we explore that possibility?
Never mind. I found the issue and will remove this in the next version.
>
> > machine_alias = "pc";
> > } else {
> > machine_alias = "q35";
> > @@ -1066,6 +1066,91 @@ void *migrate_hook_start_precopy_tcp_multifd_common(QTestState *from,
> > return NULL;
> > }
> >
> > +int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
> > + bool primary_failover)
> > +{
> > + QTestState *from, *to;
> > + void *data_hook = NULL;
> > +
> > + /*
> > + * For the COLO test, both VMs will run in parallel. Thus both VMs want to
> > + * open the image read/write at the same time. Using read-only=on is not
> > + * possible here, because ide-hd does not support read-only backing image.
> > + *
> > + * So use -snapshot, where each qemu instance creates its own writable
> > + * snapshot internally while leaving the real image read-only.
> > + */
> > + args->start.opts_source = "-snapshot";
> > + args->start.opts_target = "-snapshot";
> > +
> > + /*
> > + * COLO migration code logs many errors when the migration socket
> > + * is shut down, these are expected so we hide them here.
> > + */
> > + args->start.hide_stderr = true;
> > +
> > + /*
> > + * COLO currently does not work with Q35 machine
> > + */
> > + args->start.force_pc_machine = true;
> > +
> > + args->start.oob = true;
>
> Just curious: is OOB required in COLO for some reason? I understand yank
> you used below uses OOB, so the question is behind that, on what can be
> blocked in main thread, and special in COLO.
>
> > + args->start.caps[MIGRATION_CAPABILITY_X_COLO] = true;
> > +
> > + if (migrate_start(&from, &to, args->listen_uri, &args->start)) {
> > + return -1;
> > + }
> > +
> > + migrate_set_parameter_int(from, "x-checkpoint-delay", 300);
> > +
> > + if (args->start_hook) {
> > + data_hook = args->start_hook(from, to);
> > + }
> > +
> > + migrate_ensure_converge(from);
> > + wait_for_serial("src_serial");
> > +
> > + migrate_qmp(from, to, args->connect_uri, NULL, "{}");
> > +
> > + wait_for_migration_status(from, "colo", NULL);
> > + wait_for_resume(to, &dst_state);
>
> We can move this whole function into colo-tests.c. Here you may want to
> use get_dst() instead.
Okay, will do that.
>
> > +
> > + wait_for_serial("src_serial");
> > + wait_for_serial("dest_serial");
> > +
> > + /* wait for 3 checkpoints */
> > + for (int i = 0; i < 3; i++) {
> > + qtest_qmp_eventwait(to, "RESUME");
> > + wait_for_serial("src_serial");
> > + wait_for_serial("dest_serial");
> > + }
> > +
> > + if (failover_during_checkpoint) {
> > + qtest_qmp_eventwait(to, "STOP");
> > + }
> > + if (primary_failover) {
> > + qtest_qmp_assert_success(from, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
> > + "'arguments': {'instances':"
> > + "[{'type': 'migration'}]}}");
> > + qtest_qmp_assert_success(from, "{'execute': 'x-colo-lost-heartbeat'}");
> > + wait_for_serial("src_serial");
> > + } else {
> > + qtest_qmp_assert_success(to, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
> > + "'arguments': {'instances':"
> > + "[{'type': 'migration'}]}}");
> > + qtest_qmp_assert_success(to, "{'execute': 'x-colo-lost-heartbeat'}");
> > + wait_for_serial("dest_serial");
> > + }
> > +
> > + if (args->end_hook) {
> > + args->end_hook(from, to, data_hook);
> > + }
> > +
> > + migrate_end(from, to, !primary_failover);
> > +
> > + return 0;
> > +}
> > +
> > QTestMigrationState *get_src(void)
> > {
> > return &src_state;
> > diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
> > index 2ef0f57962605c9e3bc7b7de48e52351e5389138..75088c5fb098a0f95acb1e23585d3b6e8307451e 100644
> > --- a/tests/qtest/migration/framework.h
> > +++ b/tests/qtest/migration/framework.h
> > @@ -139,6 +139,9 @@ typedef struct {
> > /* Do not connect to target monitor and qtest sockets in qtest_init */
> > bool defer_target_connect;
> >
> > + /* Use pc machine for x86_64 */
> > + bool force_pc_machine;
> > +
> > /*
> > * Migration capabilities to be set in both source and
> > * destination. For unilateral capabilities, use
> > @@ -248,6 +251,8 @@ void test_postcopy_common(MigrateCommon *args);
> > void test_postcopy_recovery_common(MigrateCommon *args);
> > int test_precopy_common(MigrateCommon *args);
> > void test_file_common(MigrateCommon *args, bool stop_src);
> > +int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
> > + bool colo_primary_failover);
> > void *migrate_hook_start_precopy_tcp_multifd_common(QTestState *from,
> > QTestState *to,
> > const char *method);
> > @@ -267,5 +272,10 @@ void migration_test_add_file(MigrationTestEnv *env);
> > void migration_test_add_precopy(MigrationTestEnv *env);
> > void migration_test_add_cpr(MigrationTestEnv *env);
> > void migration_test_add_misc(MigrationTestEnv *env);
> > +#ifdef CONFIG_REPLICATION
> > +void migration_test_add_colo(MigrationTestEnv *env);
> > +#else
> > +static inline void migration_test_add_colo(MigrationTestEnv *env) {};
> > +#endif
> >
> > #endif /* TEST_FRAMEWORK_H */
> >
> > --
> > 2.39.5
> >
>
* Re: [PATCH v2 6/8] Convert colo main documentation to restructuredText
2026-01-20 17:26 ` Peter Xu
@ 2026-01-21 19:44 ` Lukas Straub
0 siblings, 0 replies; 26+ messages in thread
From: Lukas Straub @ 2026-01-21 19:44 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Tue, 20 Jan 2026 12:26:17 -0500
Peter Xu <peterx@redhat.com> wrote:
> On Sat, Jan 17, 2026 at 03:09:13PM +0100, Lukas Straub wrote:
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > ---
> > MAINTAINERS | 2 +-
> > docs/COLO-FT.txt | 334 ------------------------------------------
> > docs/system/index.rst | 1 +
> > docs/system/qemu-colo.rst | 361 ++++++++++++++++++++++++++++++++++++++++++++++
> > 4 files changed, 363 insertions(+), 335 deletions(-)
>
> Thank you.
>
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 92ca20c9d4186a08519d15bfe8cbd583ab061a8b..4c30dc50d15c74b317443e43920e01b4560b03a5 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -3841,7 +3841,7 @@ F: migration/multifd-colo.*
> > F: include/migration/colo.h
> > F: include/migration/failover.h
> > F: tests/qtest/migration/colo-tests.c
> > -F: docs/COLO-FT.txt
> > +F: docs/devel/qemu-colo.rst
>
> Should we still put it under docs/devel/migration?
>
> COLO framework is under migration/. COLO tests will be under
> tests/qtest/migration/. I still think we should keep doc under migration/
> too, IOW when someone touches that we want to get copied too.
Whoops, this should be docs/system/qemu-colo.rst, i.e. user documentation.
I think this document is aimed more at users, so that they can test COLO
with just standalone qemu.
I also want to replace the wiki page with this document so the two don't
go out of sync.
>
> I also raised some other requests while discussing with you on the COLO
> details. If you want, feel free to attach one more patch to add those
> contents into the doc (after this conversion patch), on either doubled mem
> consumption of SVM, or RAM mgmt details.
Will do.
I think I will write an extra doc for devel/migration.
>
> Thanks,
>
* Re: [PATCH v2 1/8] MAINTAINERS: Add myself as maintainer for COLO migration framework
2026-01-20 17:32 ` Peter Xu
@ 2026-01-22 9:54 ` Zhang Chen
0 siblings, 0 replies; 26+ messages in thread
From: Zhang Chen @ 2026-01-22 9:54 UTC (permalink / raw)
To: Peter Xu
Cc: Lukas Straub, qemu-devel, Fabiano Rosas, Laurent Vivier,
Paolo Bonzini, Hailiang Zhang, Markus Armbruster
On Wed, Jan 21, 2026 at 1:32 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Sat, Jan 17, 2026 at 03:09:08PM +0100, Lukas Straub wrote:
> > I am ready to maintain it.
> >
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
Reviewed-by: Zhang Chen <zhangckid@gmail.com>
> --
> Peter Xu
>
* Re: [PATCH v2 2/8] MAINTAINERS: Remove Hailiang Zhang from COLO migration framework
2026-01-20 17:32 ` Peter Xu
@ 2026-01-22 9:54 ` Zhang Chen
0 siblings, 0 replies; 26+ messages in thread
From: Zhang Chen @ 2026-01-22 9:54 UTC (permalink / raw)
To: Peter Xu
Cc: Lukas Straub, qemu-devel, Fabiano Rosas, Laurent Vivier,
Paolo Bonzini, Hailiang Zhang, Markus Armbruster
On Wed, Jan 21, 2026 at 1:33 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Sat, Jan 17, 2026 at 03:09:09PM +0100, Lukas Straub wrote:
> > His last email to the mailing list is from December 2021:
> > https://lore.kernel.org/qemu-devel/20211214075424.6920-1-zhanghailiang@xfusion.com/
> >
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
Reviewed-by: Zhang Chen <zhangckid@gmail.com>
> --
> Peter Xu
>
* Re: [PATCH v2 5/8] migration-test: Add COLO migration unit test
2026-01-21 19:37 ` Lukas Straub
@ 2026-01-25 17:18 ` Lukas Straub
2026-01-26 15:28 ` Peter Xu
0 siblings, 1 reply; 26+ messages in thread
From: Lukas Straub @ 2026-01-25 17:18 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Wed, 21 Jan 2026 20:37:51 +0100
Lukas Straub <lukasstraub2@web.de> wrote:
> On Tue, 20 Jan 2026 12:23:08 -0500
> Peter Xu <peterx@redhat.com> wrote:
>
> > On Sat, Jan 17, 2026 at 03:09:12PM +0100, Lukas Straub wrote:
> > > Add a COLO migration test for COLO migration and failover.
> > >
> > > COLO does not support q35 machine at this time.
> > >
> > > [...]
> > >
> > > +int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
> > > + bool primary_failover)
> > > +{
> > > + QTestState *from, *to;
> > > + void *data_hook = NULL;
> > > +
> > > + /*
> > > + * For the COLO test, both VMs will run in parallel. Thus both VMs want to
> > > + * open the image read/write at the same time. Using read-only=on is not
> > > + * possible here, because ide-hd does not support read-only backing image.
> > > + *
> > > + * So use -snapshot, where each qemu instance creates its own writable
> > > + * snapshot internally while leaving the real image read-only.
> > > + */
> > > + args->start.opts_source = "-snapshot";
> > > + args->start.opts_target = "-snapshot";
> > > +
> > > + /*
> > > + * COLO migration code logs many errors when the migration socket
> > > + * is shut down, these are expected so we hide them here.
> > > + */
> > > + args->start.hide_stderr = true;
> > > +
> > > + /*
> > > + * COLO currently does not work with Q35 machine
> > > + */
> > > + args->start.force_pc_machine = true;
> > > +
> > > + args->start.oob = true;
> >
> > Just curious: is OOB required in COLO for some reason? I understand yank
> > you used below uses OOB, so the question is behind that, on what can be
> > blocked in main thread, and special in COLO.
There is a lot that can hang:
The netfilters all run on the main loop and use blocking writes.
filter-mirror on the primary side mirrors packets to the secondary and
can hang.
filter-redirect on the secondary side redirects packets to the primary's
colo-compare and can hang.
The nbd client on the primary side, which is connected to the nbd server
on the secondary side, can hang, especially during vm_stop(), which
flushes all in-flight block I/O with the BQL held.
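
In any of these cases the failover path needs yank to be handled even
though the main loop is stuck, which is why the test sends it with
'exec-oob'. As a rough sketch of the request that goes over the wire
(the helper name and buffer handling are made up for illustration, only
the command itself is taken from the test):

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Illustrative helper, not part of the patch: format the out-of-band
 * yank request for the migration instance into buf.
 * Assumes id contains no characters that need JSON escaping.
 * Returns 0 on success, -1 if buf is too small.
 */
static int build_oob_yank_cmd(char *buf, size_t len, const char *id)
{
    int n = snprintf(buf, len,
                     "{\"exec-oob\": \"yank\", \"id\": \"%s\", "
                     "\"arguments\": {\"instances\": "
                     "[{\"type\": \"migration\"}]}}",
                     id);

    /* snprintf reports the untruncated length, so n >= len means overflow */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The follow-up x-colo-lost-heartbeat is still sent with plain 'execute',
as in the test.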
Regards,
Lukas Straub
> >
> > > + args->start.caps[MIGRATION_CAPABILITY_X_COLO] = true;
> > > +
> > > + if (migrate_start(&from, &to, args->listen_uri, &args->start)) {
> > > + return -1;
> > > + }
> > > +
> > > + migrate_set_parameter_int(from, "x-checkpoint-delay", 300);
> > > +
> > > + if (args->start_hook) {
> > > + data_hook = args->start_hook(from, to);
> > > + }
> > > +
> > > + migrate_ensure_converge(from);
> > > + wait_for_serial("src_serial");
> > > +
> > > + migrate_qmp(from, to, args->connect_uri, NULL, "{}");
> > > +
> > > + wait_for_migration_status(from, "colo", NULL);
> > > + wait_for_resume(to, &dst_state);
> >
> > We can move this whole function into colo-tests.c. Here you may want to
> > use get_dst() instead.
>
> Okay, will do that.
>
> >
> > > +
> > > + wait_for_serial("src_serial");
> > > + wait_for_serial("dest_serial");
> > > +
> > > + /* wait for 3 checkpoints */
> > > + for (int i = 0; i < 3; i++) {
> > > + qtest_qmp_eventwait(to, "RESUME");
> > > + wait_for_serial("src_serial");
> > > + wait_for_serial("dest_serial");
> > > + }
> > > +
> > > + if (failover_during_checkpoint) {
> > > + qtest_qmp_eventwait(to, "STOP");
> > > + }
> > > + if (primary_failover) {
> > > + qtest_qmp_assert_success(from, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
> > > + "'arguments': {'instances':"
> > > + "[{'type': 'migration'}]}}");
> > > + qtest_qmp_assert_success(from, "{'execute': 'x-colo-lost-heartbeat'}");
> > > + wait_for_serial("src_serial");
> > > + } else {
> > > + qtest_qmp_assert_success(to, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
> > > + "'arguments': {'instances':"
> > > + "[{'type': 'migration'}]}}");
> > > + qtest_qmp_assert_success(to, "{'execute': 'x-colo-lost-heartbeat'}");
> > > + wait_for_serial("dest_serial");
> > > + }
> > > +
> > > + if (args->end_hook) {
> > > + args->end_hook(from, to, data_hook);
> > > + }
> > > +
> > > + migrate_end(from, to, !primary_failover);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > QTestMigrationState *get_src(void)
> > > {
> > > return &src_state;
> > > [...]
* Re: [PATCH v2 5/8] migration-test: Add COLO migration unit test
2026-01-25 17:18 ` Lukas Straub
@ 2026-01-26 15:28 ` Peter Xu
0 siblings, 0 replies; 26+ messages in thread
From: Peter Xu @ 2026-01-26 15:28 UTC (permalink / raw)
To: Lukas Straub
Cc: qemu-devel, Fabiano Rosas, Laurent Vivier, Paolo Bonzini,
Zhang Chen, Hailiang Zhang, Markus Armbruster
On Sun, Jan 25, 2026 at 06:18:36PM +0100, Lukas Straub wrote:
> On Wed, 21 Jan 2026 20:37:51 +0100
> Lukas Straub <lukasstraub2@web.de> wrote:
>
> > On Tue, 20 Jan 2026 12:23:08 -0500
> > Peter Xu <peterx@redhat.com> wrote:
> >
> > > On Sat, Jan 17, 2026 at 03:09:12PM +0100, Lukas Straub wrote:
> > > > Add a COLO migration test for COLO migration and failover.
> > > >
> > > > COLO does not support q35 machine at this time.
> > > >
> > > > [...]
> > > >
> > > > +int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
> > > > + bool primary_failover)
> > > > +{
> > > > + QTestState *from, *to;
> > > > + void *data_hook = NULL;
> > > > +
> > > > + /*
> > > > + * For the COLO test, both VMs will run in parallel. Thus both VMs want to
> > > > + * open the image read/write at the same time. Using read-only=on is not
> > > > + * possible here, because ide-hd does not support read-only backing image.
> > > > + *
> > > > + * So use -snapshot, where each qemu instance creates its own writable
> > > > + * snapshot internally while leaving the real image read-only.
> > > > + */
> > > > + args->start.opts_source = "-snapshot";
> > > > + args->start.opts_target = "-snapshot";
> > > > +
> > > > + /*
> > > > + * COLO migration code logs many errors when the migration socket
> > > > + * is shut down, these are expected so we hide them here.
> > > > + */
> > > > + args->start.hide_stderr = true;
> > > > +
> > > > + /*
> > > > + * COLO currently does not work with Q35 machine
> > > > + */
> > > > + args->start.force_pc_machine = true;
> > > > +
> > > > + args->start.oob = true;
> > >
> > > Just curious: is OOB required in COLO for some reason? I understand yank
> > > you used below uses OOB, so the question is behind that, on what can be
> > > blocked in main thread, and special in COLO.
>
> There is a lot that can hang:
> The netfilters all run on the main loop and use blocking write.
> filter-mirror on the primary side mirrors packets to the secondary and
> can hang.
> filter-redirect on the secondary side redirects packets to primary's
> colo-compare and can hang.
> The nbd client on the primary side that is connected to the nbd server
> on the secondary side can hang. Especially during vm_stop() which flushes
> all inflight block io with BQL held.
None of them are used in this unit test, right?
I agree that if OOB is needed in production we should also enable it in
the unit tests. That said, would you please add a comment to the test case
explaining this? E.g. what can fail in reality, and why we still test with
OOB (because we want to get as close to the production COLO use case as
possible).
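
Something along these lines would do. This is only a rough sketch: the
wording is a suggestion based on your list above, and ColoStartSketch and
colo_enable_oob() are made-up stand-ins for MigrateStart and the spot in
test_colo_common() where oob gets set.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant field of MigrateStart. */
typedef struct {
    bool oob;
} ColoStartSketch;

static void colo_enable_oob(ColoStartSketch *start)
{
    /*
     * In a production COLO setup the main loop can block, e.g. in
     * filter-mirror/filter-redirect (blocking writes on the main loop)
     * or in the NBD client while vm_stop() flushes in-flight block I/O
     * with the BQL held. Failover then relies on sending yank
     * out-of-band to recover.
     *
     * None of these are configured in this test, but enable OOB anyway
     * to stay as close to the production COLO use case as possible.
     */
    start->oob = true;
}
```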
Thanks,
--
Peter Xu
end of thread, other threads:[~2026-01-26 15:28 UTC | newest]
Thread overview: 26+ messages
2026-01-17 14:09 [PATCH v2 0/8] migration: Add COLO multifd support and COLO migration unit test Lukas Straub
2026-01-17 14:09 ` [PATCH v2 1/8] MAINTAINERS: Add myself as maintainer for COLO migration framework Lukas Straub
2026-01-20 17:32 ` Peter Xu
2026-01-22 9:54 ` Zhang Chen
2026-01-17 14:09 ` [PATCH v2 2/8] MAINTAINERS: Remove Hailiang Zhang from " Lukas Straub
2026-01-20 17:32 ` Peter Xu
2026-01-22 9:54 ` Zhang Chen
2026-01-17 14:09 ` [PATCH v2 3/8] Move ram state receive into multifd_ram_state_recv() Lukas Straub
2026-01-20 17:14 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 4/8] multifd: Add COLO support Lukas Straub
2026-01-20 17:13 ` Peter Xu
2026-01-20 18:05 ` Daniel P. Berrangé
2026-01-20 19:18 ` Peter Xu
2026-01-21 19:00 ` Lukas Straub
2026-01-17 14:09 ` [PATCH v2 5/8] migration-test: Add COLO migration unit test Lukas Straub
2026-01-20 17:23 ` Peter Xu
2026-01-21 19:37 ` Lukas Straub
2026-01-25 17:18 ` Lukas Straub
2026-01-26 15:28 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 6/8] Convert colo main documentation to restructuredText Lukas Straub
2026-01-20 17:26 ` Peter Xu
2026-01-21 19:44 ` Lukas Straub
2026-01-17 14:09 ` [PATCH v2 7/8] qemu-colo.rst: Miscellaneous changes Lukas Straub
2026-01-20 17:30 ` Peter Xu
2026-01-17 14:09 ` [PATCH v2 8/8] qemu-colo.rst: Simplify the block replication setup Lukas Straub
2026-01-20 17:32 ` Peter Xu