From: Fabiano Rosas <farosas@suse.de>
To: Lukas Straub <lukasstraub2@web.de>, qemu-devel@nongnu.org
Cc: Peter Xu <peterx@redhat.com>, Zhang Chen <zhangckid@gmail.com>,
Hailiang Zhang <zhanghailiang@xfusion.com>,
Li Zhijian <lizhijian@fujitsu.com>,
"Dr. David Alan Gilbert" <dave@treblig.org>,
stefanha@redhat.com
Subject: COLO concurrency issues
Date: Fri, 13 Feb 2026 09:13:49 -0300
Message-ID: <87ms1cn8n6.fsf@suse.de>
Hi, I've been following the steps in qemu-colo.rst to test COLO and
encountered a couple of issues. Unfortunately, I don't have cycles to
investigate further. This happens with QEMU master (I also tested some
older versions, back to the COLO fix 0b5bf4ea76).
1) Deadlock at fdmon_io_uring_wait:
(steps from qemu-colo.rst)
- Secondary Failover
- Secondary resume replication
- Start the new Secondary
- Sync
- Wait until disk is synced, then:
{"execute": "stop"}
{"execute": "block-job-cancel", "arguments":{ "device": "resync" } }
The above results in the old secondary hanging indefinitely at:
    do {
        ret = io_uring_submit_and_wait(&ctx->fdmon_io_uring, wait_nr);
    } while (ret == -EINTR);
(gdb) bt
#0 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1 0x00007f5519e0204e in ??? () at //usr/lib64/liburing.so.2
#2 0x00007f5519e01b00 in ??? () at //usr/lib64/liburing.so.2
#3 0x0000563c2dc06cc9 in fdmon_io_uring_wait (ctx=0x563c30411b00, ready_list=0x7ffd0bad8f58, timeout=575708467831) at ../util/fdmon-io_uring.c:416
#4 0x0000563c2dc00976 in aio_poll (ctx=0x563c30411b00, blocking=true) at ../util/aio-posix.c:699
#5 0x0000563c2daa01c6 in bdrv_drain_all_begin () at ../block/io.c:529
#6 0x0000563c2daa03d8 in bdrv_drain_all () at ../block/io.c:574
#7 0x0000563c2d764aae in do_vm_stop (state=RUN_STATE_PAUSED, send_stop=true) at ../system/cpus.c:312
#8 0x0000563c2d765964 in vm_stop (state=RUN_STATE_PAUSED) at ../system/cpus.c:754
#9 0x0000563c2d7f3378 in qmp_stop (errp=0x7ffd0bad9080) at ../monitor/qmp-cmds.c:62
#10 0x0000563c2dba7a72 in qmp_marshal_stop (args=0x563c306ac070, ret=0x7f5518dffda8, errp=0x7f5518dffda0) at qapi/qapi-commands-misc.c:197
#11 0x0000563c2dbf1316 in do_qmp_dispatch_bh (opaque=0x7f5518dffe40) at ../qapi/qmp-dispatch.c:128
#12 0x0000563c2dc1de48 in aio_bh_call (bh=0x563c3040fef0) at ../util/async.c:173
#13 0x0000563c2dc1df64 in aio_bh_poll (ctx=0x563c3040c070) at ../util/async.c:220
#14 0x0000563c2dbffff0 in aio_dispatch (ctx=0x563c3040c070) at ../util/aio-posix.c:389
#15 0x0000563c2dc1e3cd in aio_ctx_dispatch (source=0x563c3040c070, callback=0x0, user_data=0x0) at ../util/async.c:365
#16 0x00007f551b114f4c in g_main_dispatch (context=0x563c304120f0) at ../glib/gmain.c:3476
#17 g_main_context_dispatch_unlocked (context=context@entry=0x563c304120f0) at ../glib/gmain.c:4284
#18 0x00007f551b1170c9 in g_main_context_dispatch (context=0x563c304120f0) at ../glib/gmain.c:4272
#19 0x0000563c2dc1fa0b in glib_pollfds_poll () at ../util/main-loop.c:290
#20 0x0000563c2dc1fa85 in os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:313
#21 0x0000563c2dc1fb8a in main_loop_wait (nonblocking=0) at ../util/main-loop.c:592
#22 0x0000563c2d78eb60 in qemu_main_loop () at ../system/runstate.c:903
#23 0x0000563c2db412fc in qemu_default_main (opaque=0x0) at ../system/main.c:50
#24 0x0000563c2db413ab in main (argc=40, argv=0x7ffd0bad94d8) at ../system/main.c:93
---
2) Race at colo_process_checkpoint
The following pattern seems to be inherently racy. Whether the switch
statement sees the state as COMPLETED varies between runs:
    colo_process_checkpoint()
    {
        ...
    out:
        ...
        /*
         * There are only two reasons we can get here, some error happened
         * or the user triggered failover.
         */
--> switch (failover_get_state()) {
        case FAILOVER_STATUS_COMPLETED:
            qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
                                      COLO_EXIT_REASON_REQUEST);
            break;
        default:
            qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
                                      COLO_EXIT_REASON_ERROR);
        }
        /* Hope this not to be too long to wait here */
--> qemu_event_wait(&s->colo_exit_event);
        ...
    }
This results in what seems like a spurious COLO_EXIT event with reason "error":
{"timestamp": {"seconds": 1770984655, "microseconds": 216464}, "event":
"COLO_EXIT", "data": {"mode": "primary", "reason": "error"}}
I'm not sure whether the intention is to just ignore it, but it seems that
moving the qemu_event_wait() call before the state check would eliminate
the race.
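To illustrate the reordering, here is a minimal standalone sketch using plain
pthreads. It is not QEMU code: struct colo_sketch, failover_thread(),
wait_then_classify() and run_once() are hypothetical names, and the mutex plus
condition variable only stand in for QemuEvent. It also assumes that the
failover path publishes COMPLETED before it sets colo_exit_event, which I have
not verified, and it only models the user-triggered failover case.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the failover state and the exit reason. */
enum failover_state { FAILOVER_ACTIVE, FAILOVER_COMPLETED };
enum exit_reason { EXIT_REASON_ERROR, EXIT_REASON_REQUEST };

struct colo_sketch {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool exit_event_set;        /* models s->colo_exit_event */
    enum failover_state state;  /* models failover_get_state() */
};

/* Failover side (assumption): publish COMPLETED, then set the event. */
static void *failover_thread(void *opaque)
{
    struct colo_sketch *s = opaque;

    pthread_mutex_lock(&s->lock);
    s->state = FAILOVER_COMPLETED;
    s->exit_event_set = true;
    pthread_cond_broadcast(&s->cond);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Checkpoint side, reordered: wait for the event, then read the state. */
static enum exit_reason wait_then_classify(struct colo_sketch *s)
{
    enum failover_state st;

    pthread_mutex_lock(&s->lock);
    while (!s->exit_event_set) {
        pthread_cond_wait(&s->cond, &s->lock);
    }
    st = s->state;
    pthread_mutex_unlock(&s->lock);

    return st == FAILOVER_COMPLETED ? EXIT_REASON_REQUEST
                                    : EXIT_REASON_ERROR;
}

/* Run one simulated failover against the reordered exit path. */
static enum exit_reason run_once(void)
{
    struct colo_sketch s;
    pthread_t tid;
    enum exit_reason reason;
    int rc;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    s.exit_event_set = false;
    s.state = FAILOVER_ACTIVE;

    rc = pthread_create(&tid, NULL, failover_thread, &s);
    assert(rc == 0);
    (void)rc;

    reason = wait_then_classify(&s);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return reason;
}
```

Under those assumptions, run_once() classifies the exit as a request no matter
how the two threads are scheduled, because the state is only read after the
event has been observed. With the original order (read the state, then wait)
the result depends on which thread runs first.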