From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>, qemu-devel@nongnu.org
Cc: "Dr . David Alan Gilbert" <dave@treblig.org>,
	peterx@redhat.com, "Kevin Wolf" <kwolf@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Hailiang Zhang" <zhanghailiang@xfusion.com>,
	"Yury Kotov" <yury-kotov@yandex-team.ru>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
	"Prasad Pandit" <ppandit@redhat.com>,
	"Zhang Chen" <zhangckid@gmail.com>,
	"Li Zhijian" <lizhijian@fujitsu.com>,
	"Juraj Marcin" <jmarcin@redhat.com>
Subject: Re: [PATCH RFC 0/9] migration: Threadify loadvm process
Date: Tue, 16 Sep 2025 18:32:59 -0300
Message-ID: <87zfau13sk.fsf@suse.de>
In-Reply-To: <20250827205949.364606-1-peterx@redhat.com>

Peter Xu <peterx@redhat.com> writes:

> [this is an early RFC, not for merge, but to collect initial feedback]
>
> Background
> ==========
>
> Nowadays, live migration heavily depends on threads. For example, most of
> the major features that will be used nowadays in live migration (multifd,
> postcopy, mapped-ram, vfio, etc.) all work with threads internally.
>
> But still, from time to time, we'll see some coroutines floating around the
> migration context.  The major one is precopy's loadvm, which is internally
> a coroutine.  It is still a critical path that any live migration depends on.
>

I always wanted to be an archaeologist:

https://lists.gnu.org/archive/html/qemu-devel//2012-08/msg01136.html

I was expecting to find some complicated chain of events leading to the
choice of using a coroutine, but no.

> Mixing coroutines and threads is prone to issues.  For examples, see
> commit e65cec5e5d ("migration/ram: Yield periodically to the main loop")
> or commit 7afbdada7e ("migration/postcopy: ensure preempt channel is
> ready before loading states").
>
> Overview
> ========
>
> This series tries to move migration further into the thread-based model, by
> allowing the loadvm process to happen in a thread rather than in the main
> thread with a coroutine.
>
> Luckily, since the qio channel code is always ready for both cases, IO
> paths should all be fine.
>
> Note that loadvm for postcopy already happens in a separate ram load
> thread.  However, RAM is the simple case here: even though it has its own
> challenges (atomically updating the page tables), its complexity lies in
> the kernel.
>
> For precopy, loadvm has quite a few operations that need the BQL.  The
> problem is that we can't take the BQL for the whole loadvm process,
> because that would block the main thread from running (e.g. QMP hangs).
> Here, the finer the granularity at which we take the BQL, the better.
> This series so far chose somewhere in the middle, taking the BQL mainly
> in these two places:
>
>   - CPU synchronizations
>   - Device START/FULL sections
>
> After this series is applied, most of the remaining loadvm path will run
> without the BQL.  There is a more detailed discussion / todo in the commit
> message of patch "migration: Thread-ify precopy vmstate load process"
> explaining how to further split the BQL critical sections.
>
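
To make the granularity point concrete, here is a minimal sketch of the
"lock only where needed" pattern. It is written in Python rather than QEMU's
C, and everything in it is an assumption made for illustration: `bql` is a
plain `threading.Lock` standing in for the BQL, and `load_stream`,
`run_in_thread` and the section kind names are hypothetical, not QEMU
identifiers.

```python
import threading

# Stand-in for the BQL; a plain mutex, purely illustrative.
bql = threading.Lock()

# Hypothetical section kinds assumed to need the lock, mirroring the
# two places listed above (CPU synchronization, device START/FULL).
NEEDS_LOCK = ("cpu_sync", "device_start", "device_full")


def load_stream(sections):
    """Handle (kind, payload) tuples, locking only where assumed necessary."""
    log = []
    for kind, payload in sections:
        if kind in NEEDS_LOCK:
            with bql:
                # Lock held only for the duration of this section.
                log.append((kind, payload, bql.locked()))
        else:
            # Everything else (e.g. RAM pages) is handled lock-free.
            log.append((kind, payload, bql.locked()))
    return log


def run_in_thread(sections):
    """Run the loader in its own thread, as a load thread would."""
    result = {}
    t = threading.Thread(target=lambda: result.update(log=load_stream(sections)))
    t.start()
    t.join()
    return result["log"]
```

Calling `run_in_thread([("ram", 1), ("device_full", 2)])` records, for each
section, whether the lock was held while it was handled. The real split in
the series is of course decided per call site in savevm.c, not by a lookup
table like this.
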
> I was trying to split the patches into smaller ones if possible, but it's
> still quite challenging so there's one major patch that does the work.
>
> After the series is applied, the only leftover pieces in migration/ that
> use a coroutine are the snapshot save/load/delete jobs.
>

Which are then fine because the work itself runs on the main loop,
right? So the bottom-half scheduling could be left as a coroutine.

> Tests
> =====
>
> Default CI passes.
>
> RDMA unit tests pass as usual. I also tried out cancellation / failure
> tests over RDMA channels, making sure nothing is stuck.
>
> I also roughly measured how long it takes to run the whole 80+ migration
> qtest suite, and saw no measurable difference before / after this series.
>
> Risks
> =====
>
> This series has the risk of breaking things.  I would be surprised if it
> didn't.
>
> I confess I didn't test anything on COLO; the COLO changes are based only
> on code observations and analysis.  COLO maintainers: could you add some
> unit tests to QEMU's qtests?
>
> The current way of taking the BQL during FULL section load may cause
> issues: when IO is unstable, we could be waiting for IO (in the new
> migration incoming thread) with the BQL held.  This is unlikely, since it
> only happens when the network halts while flushing the device states, but
> it is still possible.  One solution is to further break down the BQL
> critical sections into smaller ones, as mentioned in the TODO.
>
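
One possible shape for that breakdown, sketched under assumptions: do the
blocking read without the lock, then take the lock only to apply the
already-buffered state. Again this is Python for brevity, not QEMU code;
`bql`, `channel`, `device_state`, `load_full_section` and `demo` are all
hypothetical names, and I am not claiming this is what the TODO in the
patch proposes.

```python
import queue
import threading

bql = threading.Lock()      # stand-in for the BQL
channel = queue.Queue()     # stand-in for a migration channel
device_state = {}           # stand-in for device state guarded by the lock


def load_full_section():
    # The blocking read happens WITHOUT the lock, so a stalled network
    # cannot stall other users of the lock.
    name, blob = channel.get()
    # Only applying the buffered state needs the lock.
    with bql:
        device_state[name] = blob


def demo():
    t = threading.Thread(target=load_full_section)
    t.start()
    # While the loader waits for data, the "main thread" can still
    # take the lock (think: servicing a QMP command).
    main_got_lock = bql.acquire(timeout=1)
    if main_got_lock:
        bql.release()
    # Now let the data arrive and the loader finish.
    channel.put(("virtio-net", b"\x01\x02"))
    t.join()
    return main_got_lock, dict(device_state)
```

In `demo()` the loader cannot hold the lock before data arrives, so the
main thread's acquire succeeds even though the "network" is idle. The cost
of this approach is buffering a whole section before applying it.
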
> Anything is more than welcome: suggestions, questions, objections, tests.
>
> Todo
> ====
>
> - Test COLO?
> - Finer grained BQL breakdown
> - More..
>
> Thanks,
>
> Peter Xu (9):
>   migration/vfio: Remove BQL implication in
>     vfio_multifd_switchover_start()
>   migration/rdma: Fix wrong context in qio_channel_rdma_shutdown()
>   migration/rdma: Allow qemu_rdma_wait_comp_channel work with thread
>   migration/rdma: Change io_create_watch() to return immediately
>   migration: Thread-ify precopy vmstate load process
>   migration/rdma: Remove coroutine path in qemu_rdma_wait_comp_channel
>   migration/postcopy: Remove workaround on wait preempt channel
>   migration/ram: Remove workaround on ram yield during load
>   migration/rdma: Remove rdma_cm_poll_handler
>
>  include/migration/colo.h    |   6 +-
>  migration/migration.h       |  52 +++++++--
>  migration/savevm.h          |   5 +-
>  hw/vfio/migration-multifd.c |   9 +-
>  migration/channel.c         |   7 +-
>  migration/colo-stubs.c      |   2 +-
>  migration/colo.c            |  23 +---
>  migration/migration.c       |  62 ++++++++---
>  migration/ram.c             |  13 +--
>  migration/rdma.c            | 206 ++++++++----------------------------
>  migration/savevm.c          |  85 +++++++--------
>  migration/trace-events      |   4 +-
>  12 files changed, 196 insertions(+), 278 deletions(-)


Thread overview: 51+ messages
2025-08-27 20:59 [PATCH RFC 0/9] migration: Threadify loadvm process Peter Xu
2025-08-27 20:59 ` [PATCH RFC 1/9] migration/vfio: Remove BQL implication in vfio_multifd_switchover_start() Peter Xu
2025-08-28 18:05   ` Maciej S. Szmigiero
2025-10-21 20:36     ` Peter Xu
2025-09-16 21:34   ` Fabiano Rosas
2025-08-27 20:59 ` [PATCH RFC 2/9] migration/rdma: Fix wrong context in qio_channel_rdma_shutdown() Peter Xu
2025-09-16 21:41   ` Fabiano Rosas
2025-09-26  1:01   ` Zhijian Li (Fujitsu)
2025-08-27 20:59 ` [PATCH RFC 3/9] migration/rdma: Allow qemu_rdma_wait_comp_channel work with thread Peter Xu
2025-09-16 21:50   ` Fabiano Rosas
2025-09-26  1:02   ` Zhijian Li (Fujitsu)
2025-08-27 20:59 ` [PATCH RFC 4/9] migration/rdma: Change io_create_watch() to return immediately Peter Xu
2025-09-16 22:35   ` Fabiano Rosas
2025-10-08 20:34     ` Peter Xu
2025-09-26  2:39   ` Zhijian Li (Fujitsu)
2025-10-08 20:42     ` Peter Xu
2025-08-27 20:59 ` [PATCH RFC 5/9] migration: Thread-ify precopy vmstate load process Peter Xu
2025-08-27 23:51   ` Dr. David Alan Gilbert
2025-08-29 16:37     ` Peter Xu
2025-09-04  1:38       ` Dr. David Alan Gilbert
2025-10-08 21:02         ` Peter Xu
2025-08-29  8:29   ` Vladimir Sementsov-Ogievskiy
2025-08-29 17:17     ` Peter Xu
2025-09-01  9:35       ` Vladimir Sementsov-Ogievskiy
2025-10-21 18:49         ` Peter Xu
2025-09-17 18:23   ` Fabiano Rosas
2025-10-09 21:41     ` Peter Xu
2025-09-26  3:41   ` Zhijian Li (Fujitsu)
2025-10-08 21:10     ` Peter Xu
2025-08-27 20:59 ` [PATCH RFC 6/9] migration/rdma: Remove coroutine path in qemu_rdma_wait_comp_channel Peter Xu
2025-09-16 22:39   ` Fabiano Rosas
2025-10-08 21:18     ` Peter Xu
2025-09-26  2:44   ` Zhijian Li (Fujitsu)
2025-08-27 20:59 ` [PATCH RFC 7/9] migration/postcopy: Remove workaround on wait preempt channel Peter Xu
2025-09-17 18:30   ` Fabiano Rosas
2025-08-27 20:59 ` [PATCH RFC 8/9] migration/ram: Remove workaround on ram yield during load Peter Xu
2025-09-17 18:31   ` Fabiano Rosas
2025-08-27 20:59 ` [PATCH RFC 9/9] migration/rdma: Remove rdma_cm_poll_handler Peter Xu
2025-09-17 18:38   ` Fabiano Rosas
2025-10-08 21:22     ` Peter Xu
2025-09-26  3:38   ` Zhijian Li (Fujitsu)
2025-08-29  8:29 ` [PATCH RFC 0/9] migration: Threadify loadvm process Vladimir Sementsov-Ogievskiy
2025-08-29 17:18   ` Peter Xu
2025-09-04  8:27 ` Zhang Chen
2025-10-08 21:26   ` Peter Xu
2025-10-20 21:41     ` Peter Xu
2025-10-20 22:08       ` Lukas Straub
2025-10-21  2:31         ` Zhang Chen
2025-10-21 13:58           ` Peter Xu
2025-09-16 21:32 ` Fabiano Rosas [this message]
2025-10-09 16:58   ` Peter Xu
