From: Lukas Straub <lukasstraub2@web.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
Laurent Vivier <lvivier@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Zhang Chen <zhangckid@gmail.com>,
Hailiang Zhang <zhanghailiang@xfusion.com>,
Markus Armbruster <armbru@redhat.com>
Subject: Re: [PATCH v2 5/8] migration-test: Add COLO migration unit test
Date: Sun, 25 Jan 2026 18:18:36 +0100
Message-ID: <20260125181836.3dd2d026@penguin>
In-Reply-To: <20260121203751.6bc9027d@penguin>
On Wed, 21 Jan 2026 20:37:51 +0100
Lukas Straub <lukasstraub2@web.de> wrote:
> On Tue, 20 Jan 2026 12:23:08 -0500
> Peter Xu <peterx@redhat.com> wrote:
>
> > On Sat, Jan 17, 2026 at 03:09:12PM +0100, Lukas Straub wrote:
> > > Add a COLO migration test for COLO migration and failover.
> > >
> > > COLO does not support q35 machine at this time.
> > >
> > > [...]
> > >
> > > +int test_colo_common(MigrateCommon *args, bool failover_during_checkpoint,
> > > + bool primary_failover)
> > > +{
> > > + QTestState *from, *to;
> > > + void *data_hook = NULL;
> > > +
> > > + /*
> > > + * For the COLO test, both VMs will run in parallel. Thus both VMs want to
> > > + * open the image read/write at the same time. Using read-only=on is not
> > > + * possible here, because ide-hd does not support read-only backing image.
> > > + *
> > > + * So use -snapshot, where each qemu instance creates its own writable
> > > + * snapshot internally while leaving the real image read-only.
> > > + */
> > > + args->start.opts_source = "-snapshot";
> > > + args->start.opts_target = "-snapshot";
> > > +
> > > + /*
> > > + * COLO migration code logs many errors when the migration socket
> > > + * is shut down, these are expected so we hide them here.
> > > + */
> > > + args->start.hide_stderr = true;
> > > +
> > > + /*
> > > + * COLO currently does not work with Q35 machine
> > > + */
> > > + args->start.force_pc_machine = true;
> > > +
> > > + args->start.oob = true;
> >
> > Just curious: is OOB required in COLO for some reason? I understand yank
> > you used below uses OOB, so the question is behind that, on what can be
> > blocked in main thread, and special in COLO.
There is a lot that can hang:

The netfilters all run on the main loop and use blocking writes.
filter-mirror on the primary side mirrors packets to the secondary and
can hang.
filter-redirect on the secondary side redirects packets to the primary's
colo-compare and can hang.

The NBD client on the primary side, which is connected to the NBD server
on the secondary side, can hang as well, especially during vm_stop(),
which flushes all in-flight block I/O with the BQL held.
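
To illustrate the blocking-write case in isolation, here is a minimal
sketch in plain POSIX C. It is not QEMU code, and the helper name
peer_stall_blocks_writer() is made up for this example. It fills a pipe
whose read end is never drained, standing in for a peer that has stopped
reading. O_NONBLOCK is set only so the sketch terminates with EAGAIN; a
blocking write() at that same point would not return, which is the kind
of hang that yank via OOB is meant to break.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only (not part of QEMU).
 *
 * Writes into a pipe that nobody reads from until the kernel buffer is
 * full. Returns true if the writer reached the point where a blocking
 * write() would have stalled, and stores the number of bytes that were
 * queued before that point in *bytes_queued.
 */
static bool peer_stall_blocks_writer(size_t *bytes_queued)
{
    int fds[2];
    int flags;
    char buf[4096];
    bool would_block = false;

    *bytes_queued = 0;

    if (pipe(fds) < 0) {
        return false;
    }

    /* Non-blocking mode turns the would-be hang into EAGAIN. */
    flags = fcntl(fds[1], F_GETFL);
    if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }

    memset(buf, 0, sizeof(buf));

    for (;;) {
        ssize_t n = write(fds[1], buf, sizeof(buf));

        if (n > 0) {
            *bytes_queued += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR) {
            continue;
        }
        /* Buffer full and the peer is not reading: no progress possible. */
        would_block = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
        break;
    }

    close(fds[0]);
    close(fds[1]);
    return would_block;
}
```

Calling peer_stall_blocks_writer() should return true after some data
has been queued. The amount depends on the pipe capacity of the host,
so the sketch does not assume a particular number.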
Regards,
Lukas Straub
> >
> > > + args->start.caps[MIGRATION_CAPABILITY_X_COLO] = true;
> > > +
> > > + if (migrate_start(&from, &to, args->listen_uri, &args->start)) {
> > > + return -1;
> > > + }
> > > +
> > > + migrate_set_parameter_int(from, "x-checkpoint-delay", 300);
> > > +
> > > + if (args->start_hook) {
> > > + data_hook = args->start_hook(from, to);
> > > + }
> > > +
> > > + migrate_ensure_converge(from);
> > > + wait_for_serial("src_serial");
> > > +
> > > + migrate_qmp(from, to, args->connect_uri, NULL, "{}");
> > > +
> > > + wait_for_migration_status(from, "colo", NULL);
> > > + wait_for_resume(to, &dst_state);
> >
> > We can move this whole function into colo-tests.c. Here you may want to
> > use get_dst() instead.
>
> Okay, will do that.
>
> >
> > > +
> > > + wait_for_serial("src_serial");
> > > + wait_for_serial("dest_serial");
> > > +
> > > + /* wait for 3 checkpoints */
> > > + for (int i = 0; i < 3; i++) {
> > > + qtest_qmp_eventwait(to, "RESUME");
> > > + wait_for_serial("src_serial");
> > > + wait_for_serial("dest_serial");
> > > + }
> > > +
> > > + if (failover_during_checkpoint) {
> > > + qtest_qmp_eventwait(to, "STOP");
> > > + }
> > > + if (primary_failover) {
> > > + qtest_qmp_assert_success(from, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
> > > + "'arguments': {'instances':"
> > > + "[{'type': 'migration'}]}}");
> > > + qtest_qmp_assert_success(from, "{'execute': 'x-colo-lost-heartbeat'}");
> > > + wait_for_serial("src_serial");
> > > + } else {
> > > + qtest_qmp_assert_success(to, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
> > > + "'arguments': {'instances':"
> > > + "[{'type': 'migration'}]}}");
> > > + qtest_qmp_assert_success(to, "{'execute': 'x-colo-lost-heartbeat'}");
> > > + wait_for_serial("dest_serial");
> > > + }
> > > +
> > > + if (args->end_hook) {
> > > + args->end_hook(from, to, data_hook);
> > > + }
> > > +
> > > + migrate_end(from, to, !primary_failover);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > QTestMigrationState *get_src(void)
> > > {
> > > return &src_state;
> > > [...]