All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukas Straub <lukasstraub2@web.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, "Li Zhijian" <lizhijian@fujitsu.com>,
	"Hailiang Zhang" <zhanghailiang@xfusion.com>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Fabiano Rosas" <farosas@suse.de>,
	"Zhang Chen" <zhangckid@gmail.com>,
	"Dr . David Alan Gilbert" <dave@treblig.org>,
	"Prasad Pandit" <ppandit@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Yury Kotov" <yury-kotov@yandex-team.ru>,
	"Juraj Marcin" <jmarcin@redhat.com>
Subject: Re: [PATCH 08/13] migration: Thread-ify precopy vmstate load process
Date: Sat, 17 Jan 2026 14:57:51 +0100	[thread overview]
Message-ID: <20260117145751.2985158f@penguin> (raw)
In-Reply-To: <20251022192612.2737648-9-peterx@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 5704 bytes --]

On Wed, 22 Oct 2025 15:26:07 -0400
Peter Xu <peterx@redhat.com> wrote:

> Migration module was there for 10+ years.  Initially, it was in most cases
> based on coroutines.  As more features were added into the framework, like
> postcopy, multifd, etc.. it became a mixture of threads and coroutines.
> 
> I'm guessing coroutines just can't fix all issues that migration want to
> resolve.
> 
> After all these years, migration is now heavily based on a threaded model.
> 
> Now there's still a major part of migration framework that is still not
> thread-based, which is precopy load.  We do load in a separate thread in
> postcopy since the 1st day postcopy was introduced, however that requires a
> separate state transition from precopy loading all devices first, which
> still happens in the main thread of a coroutine.
> 
> This patch tries to move the migration incoming side to be run inside a
> separate thread (mig/dst/main) just like the src (mig/src/main).  The
> entrance to be migration_incoming_thread().
> 
> Quite a few things are needed to make it fly..  One note here is we need to
> change all these things in one patch to not break anything.  The other way
> to do this is add code to make all paths (that this patch touched) be ready
> for either coroutine or thread.  That may cause confusions in another way.
> So reviewers, please take my sincere apology on the hardness of reviewing
> this patch: it covers a few modules at the same time, and with some risky
> changes.
> 
> BQL Analysis
> ============
> 
> Firstly, when moving it over to the thread, it means the thread cannot take
> BQL during the whole process of loading anymore, because otherwise it can
> block main thread from using the BQL for all kinds of other concurrent
> tasks (for example, processing QMP / HMP commands).
> 
> Here the first question to ask is: what needs BQL during precopy load, and
> what doesn't?
> 
> Most of the load process shouldn't need BQL, especially when it's about
> RAM.  After all, RAM is still the major chunk of data to move for a live
> migration process.  VFIO started to change that, though, but still, VFIO is
> per-device so that shouldn't need BQL either in most cases.
> 
> Generic device loads will need BQL, likely not when receiving VMSDs, but
> when applying them.  One example is any post_load() could potentially
> inject memory regions causing memory transactions to happen.  That'll need
> to update the global address spaces, hence requires BQL.  The other one is
> CPU sync operations, even if the sync alone may not need BQL (which is
> still to be further justified), run_on_cpu() will need it.
> 
> For that, qemu_loadvm_state() and qemu_loadvm_state_main() functions need
> to now take a "bql_held" parameter saying whether bql is held.  We could
> use things like BQL_LOCK_GUARD(), but this patch goes with explicit
> lockings rather than relying on bql_locked TLS variable.  In case of
> migration, we always know whether BQL is held in different context as long
> as we can still pass that information downwards.
> 
> COLO
> ====
> 
> COLO assumed the dest VM load happens in a coroutine.  After this patch,
> it's not anymore.  Change that by invoking colo_incoming_co() directly from
> the migration_incoming_thread().
> 
> The name (colo_incoming_co()) isn't proper anymore.  Change it to
> colo_incoming_wait(), removing the coroutine annotation alongside.
> 
> Remove all the bql_lock() implications in COLO, e.g., colo_incoming_co()
> used to release the lock for a short period while join().  Now it's not
> needed.  Instead, taking BQL but only when needed (colo_release_ram_cache).
> 
> At the meantime, there's colo_incoming_co variable that used to store the
> COLO incoming coroutine, only to be kicked off when a secondary failover
> happens.
> 
> To recap, what should happen for such failover should be (taking example of
> a QMP command x-colo-lost-heartbeat triggering on dest QEMU):
> 
>   - The QMP command will kick off both the coroutine and the COLO
>     thread (colo_process_incoming_thread()), with something like:
> 
>     /* Notify COLO incoming thread that failover work is finished */
>     qemu_event_set(&mis->colo_incoming_event);
> 
>     qemu_coroutine_enter(mis->colo_incoming_co);
> 
>   - The coroutine, which yielded itself before, now resumes after enter(),
>     then it'll wait for the join():
> 
>     mis->colo_incoming_co = qemu_coroutine_self();
>     qemu_coroutine_yield();
>     mis->colo_incoming_co = NULL;
> 
>     /* Wait checkpoint incoming thread exit before free resource */
>     qemu_thread_join(&th);
> 
> Here, when switching to a thread model, it should be fine removing
> colo_incoming_co variable completely, because if so, the incoming thread
> will (instead of yielding the coroutine) wait at qemu_thread_join() until
> the colo thread completes execution (after receiving colo_incoming_event).
> 
> [...]
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  include/migration/colo.h |  6 ++--
>  migration/migration.h    | 14 +++-----
>  migration/colo-stubs.c   |  2 +-
>  migration/colo.c         | 24 ++++---------
>  migration/migration.c    | 77 +++++++++++++++++++++++++---------------
>  migration/rdma.c         | 34 +-----------------
>  migration/savevm.c       |  8 ++---
>  migration/trace-events   |  4 +--
>  8 files changed, 69 insertions(+), 100 deletions(-)
> 

Looks good to me, you got the COLO parts exactly right. Thank you for
putting so much effort in this.

Reviewed-by: Lukas Straub <lukasstraub2@web.de>

Best Regards,
Lukas Straub


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2026-01-17 13:59 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-22 19:25 [PATCH 00/13] migration: Threadify loadvm process Peter Xu
2025-10-22 19:26 ` [PATCH 01/13] io: Add qio_channel_wait_cond() helper Peter Xu
2025-10-23 13:02   ` Vladimir Sementsov-Ogievskiy
2025-10-24 12:00   ` Daniel P. Berrangé
2025-10-22 19:26 ` [PATCH 02/13] migration: Properly wait on G_IO_IN when peeking messages Peter Xu
2025-10-23 13:07   ` Vladimir Sementsov-Ogievskiy
2025-10-24 12:02   ` Daniel P. Berrangé
2025-10-28 18:16     ` Peter Xu
2025-10-22 19:26 ` [PATCH 03/13] migration/rdma: Fix wrong context in qio_channel_rdma_shutdown() Peter Xu
2025-10-22 19:26 ` [PATCH 04/13] migration/rdma: Allow qemu_rdma_wait_comp_channel work with thread Peter Xu
2025-10-23 13:41   ` Vladimir Sementsov-Ogievskiy
2025-11-03  7:26   ` Zhijian Li (Fujitsu)
2025-10-22 19:26 ` [PATCH 05/13] migration/rdma: Change io_create_watch() to return immediately Peter Xu
2025-11-03  7:32   ` Zhijian Li (Fujitsu)
2025-10-22 19:26 ` [PATCH 06/13] migration: Introduce WITH_BQL_HELD() / WITH_BQL_RELEASED() Peter Xu
2025-10-28 13:27   ` Vladimir Sementsov-Ogievskiy
2025-10-22 19:26 ` [PATCH 07/13] migration: Pass in bql_held information from qemu_loadvm_state() Peter Xu
2025-10-28 14:22   ` Vladimir Sementsov-Ogievskiy
2025-12-10 22:01     ` Peter Xu
2025-10-22 19:26 ` [PATCH 08/13] migration: Thread-ify precopy vmstate load process Peter Xu
2025-11-04  2:40   ` Zhijian Li (Fujitsu)
2025-12-10 22:06     ` Peter Xu
2026-01-08 20:27   ` Fabiano Rosas
2026-01-12 15:50     ` Peter Xu
2026-01-12 19:04       ` Fabiano Rosas
2026-01-12 21:07         ` Peter Xu
2026-01-13 13:04           ` Fabiano Rosas
2026-01-13 16:49             ` Peter Xu
2026-01-16 21:48               ` Fabiano Rosas
2026-01-20 16:40                 ` Peter Xu
2026-01-20 18:54                   ` Fabiano Rosas
2026-01-20 20:12                     ` Peter Xu
2026-01-17 13:57   ` Lukas Straub [this message]
2025-10-22 19:26 ` [PATCH 09/13] migration/rdma: Remove coroutine path in qemu_rdma_wait_comp_channel Peter Xu
2025-10-22 19:26 ` [PATCH 10/13] migration/postcopy: Remove workaround on wait preempt channel Peter Xu
2025-10-22 19:26 ` [PATCH 11/13] migration/ram: Remove workaround on ram yield during load Peter Xu
2025-10-22 19:26 ` [PATCH 12/13] migration: Allow blocking mode for incoming live migration Peter Xu
2025-10-22 19:26 ` [PATCH 13/13] migration/vfio: Drop BQL dependency for loadvm SWITCHOVER_START Peter Xu
2025-10-22 19:29 ` [PATCH 00/13] migration: Threadify loadvm process Peter Xu
2026-01-17 14:00 ` Lukas Straub
2026-01-20 16:43   ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260117145751.2985158f@penguin \
    --to=lukasstraub2@web.de \
    --cc=berrange@redhat.com \
    --cc=dave@treblig.org \
    --cc=farosas@suse.de \
    --cc=jmarcin@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=lizhijian@fujitsu.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=ppandit@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@yandex-team.ru \
    --cc=yury-kotov@yandex-team.ru \
    --cc=zhangckid@gmail.com \
    --cc=zhanghailiang@xfusion.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.