qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Juraj Marcin <jmarcin@redhat.com>
Cc: qemu-devel@nongnu.org, Jiri Denemark <jdenemar@redhat.com>,
	"Dr. David Alan Gilbert" <dave@treblig.org>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH 4/4] migration: Introduce POSTCOPY_DEVICE state
Date: Fri, 19 Sep 2025 13:50:48 -0400	[thread overview]
Message-ID: <aM2X-N9gXvFxxdvI@x1.local> (raw)
In-Reply-To: <aM2LoGDh5WsVnEi8@x1.local>

On Fri, Sep 19, 2025 at 12:58:08PM -0400, Peter Xu wrote:
> > @@ -2564,6 +2569,11 @@ static void *source_return_path_thread(void *opaque)
> >              tmp32 = ldl_be_p(buf);
> >              trace_source_return_path_thread_pong(tmp32);
> >              qemu_sem_post(&ms->rp_state.rp_pong_acks);
> > +            if (tmp32 == QEMU_VM_PING_PACKAGED_LOADED) {
> > +                trace_source_return_path_thread_dst_started();
> > +                migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_DEVICE,
> > +                                  MIGRATION_STATUS_POSTCOPY_ACTIVE);
> 
> Could this race with the migration thread modifying the state concurrently?
> 
> To avoid it, we could have a bool, set it here once, and in the iterations
> do something like:
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 10c216d25d..55230e10ee 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3449,6 +3449,16 @@ static MigIterateState migration_iteration_run(MigrationState *s)
>      trace_migrate_pending_estimate(pending_size, must_precopy, can_postcopy);
>  
>      if (in_postcopy) {
> +        if (s->postcopy_package_loaded) {
> +            assert(s->state == MIGRATION_STATUS_POSTCOPY_DEVICE);
> +            migrate_set_state(s->state, MIGRATION_STATUS_POSTCOPY_DEVICE,
> +                              MIGRATION_STATUS_POSTCOPY_ACTIVE);
> +            /*
> +             * Since postcopy cannot be re-initiated, this flag will only
> +             * be set at most once for QEMU's whole lifecyce.
> +             */
> +            s->postcopy_package_loaded = false;
> +        }
>          /*
>           * Iterate in postcopy until all pending data flushed.  Note that
>           * postcopy completion doesn't rely on can_switchover, because when

[...]

> > @@ -2871,7 +2882,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
> >  
> >      /* Now, switchover looks all fine, switching to postcopy-active */
> >      migrate_set_state(&ms->state, MIGRATION_STATUS_DEVICE,
> > -                      MIGRATION_STATUS_POSTCOPY_ACTIVE);
> > +                      MIGRATION_STATUS_POSTCOPY_DEVICE);
> >  
> >      bql_unlock();
> >  
> > @@ -3035,7 +3046,8 @@ static void migration_completion(MigrationState *s)
> >  
> >      if (s->state == MIGRATION_STATUS_ACTIVE) {
> >          ret = migration_completion_precopy(s);
> > -    } else if (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> > +    } else if (s->state == MIGRATION_STATUS_POSTCOPY_DEVICE ||
> > +               s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> 
> Exactly.  We need to be prepared that src sending too fast so when device
> loading on dest we finished.

One thing more to mention here.. which may void some previous comments I
left.  Let's discuss.

I think it may also make sense to only allow a COMPLETE after
POSTCOPY_ACTIVE.

That is, if src sends too fast to have finished sending everything,
reaching COMPLETE during POSTCOPY_DEVICE, that is, while before it receives
the new PONG you defined, then.. I _think_ it is better to wait for that.

If it finally arrives, then it's perfect, we switch to POSTCOPY_ACTIVE,
then continue the completion.

If the channel is broken before its arrival, logically we should handle
this case as a FAILURE and restart the VM on src.

It's only relevant in a corner case, but does that sound better?

-- 
Peter Xu



  reply	other threads:[~2025-09-19 17:52 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-15 11:59 [PATCH 0/4] migration: Introduce POSTCOPY_DEVICE state Juraj Marcin
2025-09-15 11:59 ` [PATCH 1/4] migration: Do not try to start VM if disk activation fails Juraj Marcin
2025-09-19 16:12   ` Fabiano Rosas
2025-09-15 11:59 ` [PATCH 2/4] migration: Accept MigrationStatus in migration_has_failed() Juraj Marcin
2025-09-19 14:57   ` Peter Xu
2025-09-22 11:26     ` Juraj Marcin
2025-09-15 11:59 ` [PATCH 3/4] migration: Refactor incoming cleanup into migration_incoming_finish() Juraj Marcin
2025-09-19 15:53   ` Peter Xu
2025-09-19 16:46   ` Fabiano Rosas
2025-09-22 12:58     ` Juraj Marcin
2025-09-22 15:51       ` Peter Xu
2025-09-22 17:40         ` Fabiano Rosas
2025-09-22 17:48           ` Peter Xu
2025-09-23 14:58         ` Juraj Marcin
2025-09-23 16:17           ` Peter Xu
2025-09-15 11:59 ` [PATCH 4/4] migration: Introduce POSTCOPY_DEVICE state Juraj Marcin
2025-09-19 16:58   ` Peter Xu
2025-09-19 17:50     ` Peter Xu [this message]
2025-09-22 13:34       ` Juraj Marcin
2025-09-22 16:16         ` Peter Xu
2025-09-23 14:23           ` Juraj Marcin
2025-09-25 11:54   ` Jiří Denemark
2025-09-25 18:22     ` Peter Xu
2025-09-30  7:53       ` Jiří Denemark
2025-09-30 20:04         ` Peter Xu
2025-10-01  8:43           ` Jiří Denemark
2025-10-01 11:05             ` Dr. David Alan Gilbert
2025-10-01 14:26               ` Jiří Denemark
2025-10-01 15:53                 ` Dr. David Alan Gilbert
2025-10-01 15:10               ` Daniel P. Berrangé
2025-10-02 12:17                 ` Jiří Denemark
2025-10-02 13:12                   ` Dr. David Alan Gilbert
2025-10-01 10:09           ` Juraj Marcin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aM2X-N9gXvFxxdvI@x1.local \
    --to=peterx@redhat.com \
    --cc=dave@treblig.org \
    --cc=farosas@suse.de \
    --cc=jdenemar@redhat.com \
    --cc=jmarcin@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).