From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>, qemu-devel@nongnu.org
Cc: "Dr . David Alan Gilbert" <dave@treblig.org>,
	peterx@redhat.com, "Kevin Wolf" <kwolf@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Hailiang Zhang" <zhanghailiang@xfusion.com>,
	"Yury Kotov" <yury-kotov@yandex-team.ru>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
	"Prasad Pandit" <ppandit@redhat.com>,
	"Zhang Chen" <zhangckid@gmail.com>,
	"Li Zhijian" <lizhijian@fujitsu.com>,
	"Juraj Marcin" <jmarcin@redhat.com>
Subject: Re: [PATCH RFC 0/9] migration: Threadify loadvm process
Date: Tue, 16 Sep 2025 18:32:59 -0300
Message-ID: <87zfau13sk.fsf@suse.de>
In-Reply-To: <20250827205949.364606-1-peterx@redhat.com>

Peter Xu <peterx@redhat.com> writes:

> [this is an early RFC, not for merge, but to collect initial feedback]
>
> Background
> ==========
>
> Nowadays, live migration heavily depends on threads. For example, most of
> the major features used in live migration today (multifd, postcopy,
> mapped-ram, vfio, etc.) work with threads internally.
>
> But still, from time to time, we'll see some coroutines floating around the
> migration context.  The major one is precopy's loadvm, which is internally
> a coroutine.  It is still a critical path that any live migration depends on.
>

I always wanted to be an archaeologist:

https://lists.gnu.org/archive/html/qemu-devel//2012-08/msg01136.html

I was expecting to find some complicated chain of events leading to the
choice of using a coroutine, but no.

> Mixing coroutines and threads is prone to issues.  For examples, see
> commit e65cec5e5d ("migration/ram: Yield periodically to the main loop")
> or commit 7afbdada7e ("migration/postcopy: ensure preempt channel is ready
> before loading states").
>
> Overview
> ========
>
> This series tries to move migration further into the thread-based model, by
> allowing the loadvm process to happen in a thread rather than in the main
> thread with a coroutine.
>
> Luckily, since the qio channel code is always ready for both cases, IO
> paths should all be fine.
>
> Note that loadvm for postcopy already happens in a separate ram load
> thread.  However, RAM is the simple case here: even though it has its own
> challenges (atomically updating the page tables), its complexity lies in
> the kernel.
>
> For precopy, loadvm has quite a few operations that need the BQL.  The
> problem is that we can't hold the BQL for the whole loadvm process, because
> that would block the main thread from running (e.g. QMP hangs).  Here, the
> finer-grained we can make the BQL sections, the better.  This series so far
> chose somewhere in the middle, taking the BQL mainly in these two places:
>
>   - CPU synchronizations
>   - Device START/FULL sections
>
> After this series is applied, most of the rest of the loadvm path will run
> without the BQL.  There is a more detailed discussion / todo in the commit
> message of patch "migration: Thread-ify precopy vmstate load process"
> explaining how to further split the BQL critical sections.
>
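
For readers following the BQL discussion above, here is a minimal sketch of
the locking pattern being described: a load thread that takes a global lock
only around the sections that need it. This is not QEMU code. The mutex
fake_bql stands in for the BQL, and the section kinds, struct load_job,
needs_bql(), load_thread() and run_load() are all hypothetical names made up
for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical section kinds; these are not QEMU's real section types. */
enum section_kind { SEC_RAM_PART, SEC_DEVICE_FULL, SEC_CPU_SYNC };

/* Stand-in for the BQL: a plain mutex that a "main loop" could also take. */
static pthread_mutex_t fake_bql = PTHREAD_MUTEX_INITIALIZER;

struct load_job {
    const enum section_kind *sections;
    size_t count;
    size_t locked_sections;    /* sections handled with fake_bql held */
    size_t unlocked_sections;  /* sections handled without the lock */
};

/* Assumption for the sketch: only device state and CPU sync need the lock. */
static int needs_bql(enum section_kind kind)
{
    return kind == SEC_DEVICE_FULL || kind == SEC_CPU_SYNC;
}

/* Thread body: hold the lock per section, never for the whole load. */
static void *load_thread(void *opaque)
{
    struct load_job *job = opaque;

    assert(job != NULL);
    for (size_t i = 0; i < job->count; i++) {
        if (needs_bql(job->sections[i])) {
            pthread_mutex_lock(&fake_bql);
            job->locked_sections++;      /* device/CPU work would go here */
            pthread_mutex_unlock(&fake_bql);
        } else {
            job->unlocked_sections++;    /* e.g. RAM pages, no lock needed */
        }
    }
    return NULL;
}

/* Run the load in its own thread and wait for it; returns 0 on success. */
static int run_load(struct load_job *job)
{
    pthread_t tid;
    int ret = pthread_create(&tid, NULL, load_thread, job);

    if (ret != 0) {
        return ret;
    }
    return pthread_join(tid, NULL);
}
```

Filling a struct load_job with a mix of section kinds and calling run_load()
leaves the mutex free for any other thread (the main loop, in QEMU's case)
while the RAM-like sections are being processed.
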
> I was trying to split the patches into smaller ones if possible, but it's
> still quite challenging so there's one major patch that does the work.
>
> After the series is applied, the only leftover pieces in migration/ that
> would use a coroutine are the snapshot save/load/delete jobs.
>

Which are then fine because the work itself runs on the main loop,
right? So the bottom-half scheduling could be left as a coroutine.

> Tests
> =====
>
> Default CI passes.
>
> RDMA unit tests pass as usual. I also tried out cancellation / failure
> tests over RDMA channels, making sure nothing is stuck.
>
> I also roughly measured how long it takes to run the whole 80+ migration
> qtest suite, and saw no measurable difference before / after this series.
>
> Risks
> =====
>
> This series has the risk of breaking things.  I would be surprised if it
> didn't..
>
> I confess I didn't test anything on COLO; the changes there are based only
> on code reading and analysis.  COLO maintainers: could you add some unit
> tests to QEMU's qtests?
>
> The current way of taking the BQL during FULL section load may cause
> issues: when IO is unstable, we could be waiting for IO (in the new
> migration incoming thread) with the BQL held.  The probability is low,
> since it only happens when the network halts while flushing the device
> states, but it is still possible.  One solution is to further break down
> the BQL critical sections into smaller ones, as mentioned in the TODO.
>
> Anything is more than welcome: suggestions, questions, objections, tests..
>
> Todo
> ====
>
> - Test COLO?
> - Finer grained BQL breakdown
> - More..
>
> Thanks,
>
> Peter Xu (9):
>   migration/vfio: Remove BQL implication in
>     vfio_multifd_switchover_start()
>   migration/rdma: Fix wrong context in qio_channel_rdma_shutdown()
>   migration/rdma: Allow qemu_rdma_wait_comp_channel work with thread
>   migration/rdma: Change io_create_watch() to return immediately
>   migration: Thread-ify precopy vmstate load process
>   migration/rdma: Remove coroutine path in qemu_rdma_wait_comp_channel
>   migration/postcopy: Remove workaround on wait preempt channel
>   migration/ram: Remove workaround on ram yield during load
>   migration/rdma: Remove rdma_cm_poll_handler
>
>  include/migration/colo.h    |   6 +-
>  migration/migration.h       |  52 +++++++--
>  migration/savevm.h          |   5 +-
>  hw/vfio/migration-multifd.c |   9 +-
>  migration/channel.c         |   7 +-
>  migration/colo-stubs.c      |   2 +-
>  migration/colo.c            |  23 +---
>  migration/migration.c       |  62 ++++++++---
>  migration/ram.c             |  13 +--
>  migration/rdma.c            | 206 ++++++++----------------------------
>  migration/savevm.c          |  85 +++++++--------
>  migration/trace-events      |   4 +-
>  12 files changed, 196 insertions(+), 278 deletions(-)

