From: Lukas Straub <lukasstraub2@web.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, "Li Zhijian" <lizhijian@fujitsu.com>,
"Hailiang Zhang" <zhanghailiang@xfusion.com>,
"Kevin Wolf" <kwolf@redhat.com>,
"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"Fabiano Rosas" <farosas@suse.de>,
"Zhang Chen" <zhangckid@gmail.com>,
"Dr . David Alan Gilbert" <dave@treblig.org>,
"Prasad Pandit" <ppandit@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Yury Kotov" <yury-kotov@yandex-team.ru>,
"Juraj Marcin" <jmarcin@redhat.com>
Subject: Re: [PATCH 00/13] migration: Threadify loadvm process
Date: Sat, 17 Jan 2026 15:00:37 +0100
Message-ID: <20260117150037.605c9744@penguin>
In-Reply-To: <20251022192612.2737648-1-peterx@redhat.com>
On Wed, 22 Oct 2025 15:25:59 -0400
Peter Xu <peterx@redhat.com> wrote:
> This is v1, however not 10.2 material. The earliest I see fit would still
> be 11.0+ even if everything goes extremely smooth.
>
> Removal of RFC is only about that I'm more confident this should be able to
> land without breaking something too easily, as I smoked it slightly more
> cross-archs this time. AFAIU the best (and possibly only..) way to prove
> it solid is to merge it.. likely in the early phase of a dev cycle.
>
> The plan is we'll try to get to more device setups too soon, before it
> could land.
>
> Background
> ==========
>
> Nowadays, live migration heavily depends on threads. For example, most of
> the major features that will be used nowadays in live migration (multifd,
> postcopy, mapped-ram, vfio, etc.) all work with threads internally.
>
> But still, from time to time, we'll see some coroutines floating around the
> migration context. The major one is precopy's loadvm, which is internally
> a coroutine. It is still a critical path that any live migration depends on.
>
> A mixture of using both coroutines and threads is prone to issues. Some
> examples can refer to commit e65cec5e5d ("migration/ram: Yield periodically
> to the main loop") or commit 7afbdada7e ("migration/postcopy: ensure
> preempt channel is ready before loading states").
>
> It was a coroutine since this work (thanks to Fabiano, the archeologist,
> digging the link):
>
> https://lists.gnu.org/archive/html/qemu-devel/2012-08/msg01136.html
>
> [...]
>
> Tests
> =====
>
> Default CI passes.
>
> RDMA unit tests pass as usual. I also tried out cancellation / failure
> tests over RDMA channels, making sure nothing is stuck.
>
> I also roughly measured how long it takes to run the whole 80+ migration
> qtest suite, and see no measurable difference before / after this series.
>
> I didn't test COLO, I wanted to but the doc example didn't work.
>
> Risks
> =====
>
> This series has the risk of breaking things. I would be surprised if it
> didn't..
>
> The current way of taking BQL during FULL section load may cause issues, it
> means when the IOs are unstable we could be waiting for IO (in the new
> migration incoming thread) with BQL held. This is low possibility, though,
> only happens when the network halts during flushing the device states.
> However still possible. One solution is to further breakdown the BQL
> critical sections to smaller sections, as mentioned in TODO.
>
> Anything more than welcomed: suggestions, questions, objections, tests..
>
> TODO
> ====
>
> - Finer grained BQL breakdown
>
> Peter Xu (13):
> io: Add qio_channel_wait_cond() helper
> migration: Properly wait on G_IO_IN when peeking messages
> migration/rdma: Fix wrong context in qio_channel_rdma_shutdown()
> migration/rdma: Allow qemu_rdma_wait_comp_channel work with thread
> migration/rdma: Change io_create_watch() to return immediately
> migration: Introduce WITH_BQL_HELD() / WITH_BQL_RELEASED()
> migration: Pass in bql_held information from qemu_loadvm_state()
> migration: Thread-ify precopy vmstate load process
> migration/rdma: Remove coroutine path in qemu_rdma_wait_comp_channel
> migration/postcopy: Remove workaround on wait preempt channel
> migration/ram: Remove workaround on ram yield during load
> migration: Allow blocking mode for incoming live migration
> migration/vfio: Drop BQL dependency for loadvm SWITCHOVER_START
>
> include/io/channel.h | 15 +++
> include/migration/colo.h | 6 +-
> migration/migration.h | 109 +++++++++++++++++--
> migration/savevm.h | 4 +-
> hw/vfio/migration-multifd.c | 3 -
> io/channel.c | 21 ++--
> migration/channel.c | 7 +-
> migration/colo-stubs.c | 2 +-
> migration/colo.c | 26 ++---
> migration/migration.c | 81 ++++++++------
> migration/qemu-file.c | 6 +-
> migration/ram.c | 13 +--
> migration/rdma.c | 204 ++++++++----------------------------
> migration/savevm.c | 98 +++++++++--------
> migration/trace-events | 4 +-
> 15 files changed, 291 insertions(+), 308 deletions(-)
>
Works well in my COLO testing. For the whole series:
Tested-by: Lukas Straub <lukasstraub2@web.de>