All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org, "Li Zhijian" <lizhijian@fujitsu.com>,
	"Hailiang Zhang" <zhanghailiang@xfusion.com>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Zhang Chen" <zhangckid@gmail.com>,
	"Dr . David Alan Gilbert" <dave@treblig.org>,
	"Prasad Pandit" <ppandit@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Yury Kotov" <yury-kotov@yandex-team.ru>,
	"Juraj Marcin" <jmarcin@redhat.com>
Subject: Re: [PATCH 08/13] migration: Thread-ify precopy vmstate load process
Date: Tue, 13 Jan 2026 11:49:47 -0500	[thread overview]
Message-ID: <aWZ3q3NBBCl5KTYr@x1.local> (raw)
In-Reply-To: <87fr89psu5.fsf@suse.de>

On Tue, Jan 13, 2026 at 10:04:02AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, Jan 12, 2026 at 04:04:12PM -0300, Fabiano Rosas wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > On Thu, Jan 08, 2026 at 05:27:37PM -0300, Fabiano Rosas wrote:
> >> >> Peter Xu <peterx@redhat.com> writes:
> >> >> 
> >> >> > Migration module was there for 10+ years.  Initially, it was in most cases
> >> >> > based on coroutines.  As more features were added into the framework, like
> >> >> > postcopy, multifd, etc.. it became a mixture of threads and coroutines.
> >> >> >
> >> >> > I'm guessing coroutines just can't fix all issues that migration want to
> >> >> > resolve.
> >> >> >
> >> >> > After all these years, migration is now heavily based on a threaded model.
> >> >> >
> >> >> > Now there's still a major part of migration framework that is still not
> >> >> > thread-based, which is precopy load.  We do load in a separate thread in
> >> >> > postcopy since the 1st day postcopy was introduced, however that requires a
> >> >> > separate state transition from precopy loading all devices first, which
> >> >> > still happens in the main thread of a coroutine.
> >> >> >
> >> >> > This patch tries to move the migration incoming side to be run inside a
> >> >> > separate thread (mig/dst/main) just like the src (mig/src/main).  The
> >> >> > entrance to be migration_incoming_thread().
> >> >> >
> >> >> > Quite a few things are needed to make it fly..  One note here is we need to
> >> >> > change all these things in one patch to not break anything.  The other way
> >> >> > to do this is add code to make all paths (that this patch touched) be ready
> >> >> > for either coroutine or thread.  That may cause confusions in another way.
> >> >> > So reviewers, please take my sincere apology on the hardness of reviewing
> >> >> > this patch: it covers a few modules at the same time, and with some risky
> >> >> > changes.
> >> >> >
> >> >> > BQL Analysis
> >> >> > ============
> >> >> >
> >> >> > Firstly, when moving it over to the thread, it means the thread cannot take
> >> >> > BQL during the whole process of loading anymore, because otherwise it can
> >> >> > block main thread from using the BQL for all kinds of other concurrent
> >> >> > tasks (for example, processing QMP / HMP commands).
> >> >> >
> >> >> > Here the first question to ask is: what needs BQL during precopy load, and
> >> >> > what doesn't?
> >> >> >
> >> >> 
> >> >> I just noticed that the BQL held at process_incoming_migration_co is
> >> >> also responsible for stopping qmp_migrate_set_capabilities from being
> >> >> dispatched.
> >> >
> >> > I don't know if it is by design, or even if it will be guaranteed to work..
> >> >
> >> 
> >> Regardless, we shouldn't rely on the BQL for this. The BQL should be
> >> left as last resort for things that interact across subsystems. If
> >> someone is issuing a migration command during a migration, the migration
> >> code is exquisitely positioned to handle that itself.
> >
> > Yes I agree.
> >
> >> 
> >> > Consider the migration incoming rocoutine runs into qemu_get_byte(), and
> >> > then proactively yield the migration coroutine (qemu_coroutine_yield())
> >> > when the incoming port is blocked on read..
> >> >
> >> > AFAIU, a proper fix for that (note, this will currently break tests) is:
> >> >
> >> > bool migration_is_running(void)
> >> >  {
> >> > -    MigrationState *s = current_migration;
> >> > +    MigrationStatus state;
> >> >  
> >> > -    if (!s) {
> >> > -        return false;
> >> > +    if (runstate_check(RUN_STATE_INMIGRATE)) {
> >> > +        MigrationIncomingState *mis = migration_incoming_get_current();
> >> > +
> >> > +        if (!mis) {
> >> > +            return false;
> >> > +        }
> >> > +
> >> > +        state = mis->state;
> >> > +    } else {
> >> > +        MigrationState *s = migrate_get_current();
> >> > +
> >> > +        if (!s) {
> >> > +            return false;
> >> > +        }
> >> > +
> >> > +        state = s->state;
> >> >      }
> >> >  
> >> > -    switch (s->state) {
> >> > +    switch (state) {
> >> >      case MIGRATION_STATUS_ACTIVE:
> >> >      case MIGRATION_STATUS_POSTCOPY_DEVICE:
> >> >      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
> >> >
> >> 
> >> LGTM
> >> 
> >> >> 
> >> >> Any point during incoming migration when BQL is unlocked we have a
> >> >> window where a capability could be changed. Same for parameters, for
> >> >> that matter.
> >> >> 
> >> >> To make matters worse, the -incoming cmdline will trigger
> >> >> qmp_migrate_incoming->...->migration_transport_compatible early on, but
> >> >> until the channels finally connect and process_incoming_migration_co
> >> >> starts it's possible to just change a capability in an incompatible way
> >> >> and the transport will never be validated again.
> >> >
> >> > Right.  Above should fix it, but I believe it also means after "-incoming
> >> > tcp:xxx" (or anything not "defer") we should forbid changing migration caps
> >> > or params on destination.
> >> >
> >> 
> >> Parameters are never forbidden, right? And we cannot forbid them with
> >> is_running because some parameters are allowed to be changed while
> >> running.
> >
> > Right, my above statement was definitely inaccurate.
> >
> > After merging caps and params we only have params.  We should only allow
> > some params to be changed anytime.  Most of the params (including caps)
> > should not allow changing during migration live on either src/dst.
> >
> >> 
> >> I feel we should have a more fine grained way of saying "this option
> >> cannot be set at this moment", instead of just using the state as a
> >> proxy. States can change, while the fact that from a certain point on,
> >> certain options should not be touched anymore doesn't change.
> >
> > IIUC that's what migration_is_running() about?
> >
> 
> At a high level, yes, but I think that's actually a downside of
> migration_is_running. It's not explicit.
> 
> E.g.: qmp_migrate -> migration_transport_compatible is the first time
> capabilities are checked in the code. Guess what migration_is_running
> returns at that point? (it's false)
> 
> If there's ever a change in BQL behavior, there'll be a bug and we'll
> see another patch extending the scope of migration_is_running, either
> adding existing states to it, or as Prasad proposed, adding a new state
> that is set a little later/earlier.
> 
> What we technically want is to stop accepting new capabilities as soon
> as we're about to use them for the first time. Not as soon as (state ==
> ACTIVE || SETUP || ...) which is a more high level definition.

Isn't migration_incoming_state_setup() almost immediately invoked (after
migration_transport_compatible() check passed), which will start to make
the retval of migration_is_running() reflecting the fact?

IMHO as long as QMP handlers are run with any lock (currently, BQL), then
such layout should keep working?  That will make sure QMP "migrate" and
"migrate_set_parameters" be serialized, that's, AFAICT, what we only need.

> 
> > It's a matter of if such finer granularity is necessary for us. Let's
> > assume the simple scheme is:
> >
> >   (a) Some params are allowed to be modified anytime, examples,
> >       downtime-limit, max-bandwidth, max-postcopy-bandwidth, etc.
> >
> 
> What about during migration_cleanup() or migrate-cancel? Won't there
> ever be something in the path of setting these params that would require
> a "healthy" migration?

If we'll introduce FAILING, then plus CANCELLING I think we should cover
all race against migration_cleanup()?  As long as we'll have both FAILING
and CANCELLING to return true for migration_is_running().

> 
> >   (b) All the rest params and all capabilities are not allowed to be modified
> >       when migration_is_running() returns true (my fixed version above)
> >
> > The whitelisted (a) should really be the smallest set of params, and
> > justified one by one or it should fall into (b).
> >
> > My hope is the simple scheme should work for us already.  After merging
> > caps, it's only about some params that can be set anytime, rest params can
> > only be set when migration is not running.
> >
> 
> Sure, I'd just like to avoid adding another set of "if
> (s->parameters->has_foobar)" if possible.
> 
> >> 
> >> Maybe a little infra like bdrv_op_is_blocked, i.e, a list of blocked
> >> operations. It could be set in qmp_migrate and checked in
> >> qmp_set_parameters/caps.
> >
> > Any example you can think of, that above simple scheme won't work for us?
> >
> 
> It works. I think it's fragile due to:
> 
> - the reliance on BQL to avoid concurrent invocations;
> 
> - the possible addition of new states for unrelated reasons
>   risks regressions and it's just extra work to make sure everything is
>   still blocked at the time it needs to be blocked.

Yes, we'll need to make sure when new states introduced it needs to report
correctly in migration_is_running().  I think it's not an issue because we
need to do it right with/without relying on the function to ban updates on
caps/params.. It always must be done right as migration_is_running() is
also used in other use cases.

> 
> - any bugs need to be fixed by moving states around, which is dubious
>   since the migration flow itself (which the states should represent)
>   has not changed in a while.
> 
>   We're still seeing patches such as dc487044d5 ("migration: Make
>   migration_has_failed() work even for CANCELLING") and the recent patch
>   from Prasad. So there is something changing elsewhere and exposing
>   these bugs and it's not immediately clear what that something is (or
>   else we would have avoided the bug).
> 
> To be clear, I'm not asking we solve these issues at this moment, it's
> fine if we go forward with the simple scheme. I'm just trying to clarify
> what I think the problems are.
> 
> >> 
> >> > As discussed above, that'll at least break our qtests.  But frankly
> >> > speaking I think that's the right thing to do..  I hope libvirt always
> >> > works with "defer" and never update any caps/params after QMP
> >> > migrate_incoming.
> >> >
> >> > So I wonder if I should continue with above patch, and then fix our qtests.
> >> > Your work from the other "merge caps+params" might also work here,
> >> > actually, if we make sure everything will be set alone with the QMP
> >> > migrate_incoming single command.
> >> >
> >> 
> >> For incoming, yes. And this is maybe a point in favor of adding the
> >> 'config'.
> >> 
> >> For outgoing, there's still the point I mentioned above about how to
> >> restrict _some_ options to be allowed at runtime and others not.
> >> 
> >> > Let me know your initial thoughts, then I'll see what I can do..
> >> >
> >> 
> >> We should fix the bug, I think your patch is good for that.
> >> 
> >> Although this kind of overlaps with some things we've been discussing
> >> with Prasad. I'd be super happy if the code magically stopped using
> >> QAPI's MigrationStatus for internal tracking of migration state and
> >> blocking of commands and so on.
> >> 
> >> Whatever comes first =)
> >
> > I didn't follow as closely on the discussion there.  I don't know if
> > changing MigrationStatus is a good idea..  we should have some really good
> > reason to make libvirt and all mgmt need a change.. but maybe I misread the
> > discussion.  I can wait until I read something solid if Prasad is going to
> > propose something.
> >
> 
> We're moving towards decoupling MigrationStatus from these ordinary code
> checks. MigrationStatus should be used for informing management about a
> relevant change in the migration stage of progression.
> 
> As it stands, we're either "exposing implementation details" or
> "inadvertently changing the API". Both are undesirable.
> 
> Of course, we might introduce the "internal state is off-sync with
> externally visible states" issue, but hopefully we'll catch that during
> review. =)

I need to confess I'm still a bit lost on what it is about, and what is the
problem we're solving here.. But I can always wait and read the patches
when they come up.. :)

> 
> >> 
> >> ---
> >> Side note, did we ever discuss something like this?
> >> 
> >> struct MigrationState {
> >>    <state>
> >>    union {
> >>      <outgoing>
> >>      <incoming>
> >>    }
> >> }
> >> 
> >> there's so much stuff in these structs...
> >
> > Yeah..  When merging state, we'll also need to be careful on overlapped
> > fields, e.g. expecting a COMPLETED state (from a completed incoming
> > migration) when starting a new outgoing migration.
> >
> > We can likely leave this for later.
> 
> No worries, I'm just collecting your opinions on it.
> 

-- 
Peter Xu



  reply	other threads:[~2026-01-13 16:50 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-22 19:25 [PATCH 00/13] migration: Threadify loadvm process Peter Xu
2025-10-22 19:26 ` [PATCH 01/13] io: Add qio_channel_wait_cond() helper Peter Xu
2025-10-23 13:02   ` Vladimir Sementsov-Ogievskiy
2025-10-24 12:00   ` Daniel P. Berrangé
2025-10-22 19:26 ` [PATCH 02/13] migration: Properly wait on G_IO_IN when peeking messages Peter Xu
2025-10-23 13:07   ` Vladimir Sementsov-Ogievskiy
2025-10-24 12:02   ` Daniel P. Berrangé
2025-10-28 18:16     ` Peter Xu
2025-10-22 19:26 ` [PATCH 03/13] migration/rdma: Fix wrong context in qio_channel_rdma_shutdown() Peter Xu
2025-10-22 19:26 ` [PATCH 04/13] migration/rdma: Allow qemu_rdma_wait_comp_channel work with thread Peter Xu
2025-10-23 13:41   ` Vladimir Sementsov-Ogievskiy
2025-11-03  7:26   ` Zhijian Li (Fujitsu)
2025-10-22 19:26 ` [PATCH 05/13] migration/rdma: Change io_create_watch() to return immediately Peter Xu
2025-11-03  7:32   ` Zhijian Li (Fujitsu)
2025-10-22 19:26 ` [PATCH 06/13] migration: Introduce WITH_BQL_HELD() / WITH_BQL_RELEASED() Peter Xu
2025-10-28 13:27   ` Vladimir Sementsov-Ogievskiy
2025-10-22 19:26 ` [PATCH 07/13] migration: Pass in bql_held information from qemu_loadvm_state() Peter Xu
2025-10-28 14:22   ` Vladimir Sementsov-Ogievskiy
2025-12-10 22:01     ` Peter Xu
2025-10-22 19:26 ` [PATCH 08/13] migration: Thread-ify precopy vmstate load process Peter Xu
2025-11-04  2:40   ` Zhijian Li (Fujitsu)
2025-12-10 22:06     ` Peter Xu
2026-01-08 20:27   ` Fabiano Rosas
2026-01-12 15:50     ` Peter Xu
2026-01-12 19:04       ` Fabiano Rosas
2026-01-12 21:07         ` Peter Xu
2026-01-13 13:04           ` Fabiano Rosas
2026-01-13 16:49             ` Peter Xu [this message]
2026-01-16 21:48               ` Fabiano Rosas
2026-01-20 16:40                 ` Peter Xu
2026-01-20 18:54                   ` Fabiano Rosas
2026-01-20 20:12                     ` Peter Xu
2026-01-17 13:57   ` Lukas Straub
2025-10-22 19:26 ` [PATCH 09/13] migration/rdma: Remove coroutine path in qemu_rdma_wait_comp_channel Peter Xu
2025-10-22 19:26 ` [PATCH 10/13] migration/postcopy: Remove workaround on wait preempt channel Peter Xu
2025-10-22 19:26 ` [PATCH 11/13] migration/ram: Remove workaround on ram yield during load Peter Xu
2025-10-22 19:26 ` [PATCH 12/13] migration: Allow blocking mode for incoming live migration Peter Xu
2025-10-22 19:26 ` [PATCH 13/13] migration/vfio: Drop BQL dependency for loadvm SWITCHOVER_START Peter Xu
2025-10-22 19:29 ` [PATCH 00/13] migration: Threadify loadvm process Peter Xu
2026-01-17 14:00 ` Lukas Straub
2026-01-20 16:43   ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aWZ3q3NBBCl5KTYr@x1.local \
    --to=peterx@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dave@treblig.org \
    --cc=farosas@suse.de \
    --cc=jmarcin@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=lizhijian@fujitsu.com \
    --cc=pbonzini@redhat.com \
    --cc=ppandit@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@yandex-team.ru \
    --cc=yury-kotov@yandex-team.ru \
    --cc=zhangckid@gmail.com \
    --cc=zhanghailiang@xfusion.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.