From: "Dr. David Alan Gilbert" <dave@treblig.org>
To: Juraj Marcin <jmarcin@redhat.com>
Cc: Peter Xu <peterx@redhat.com>, Jiri Denemark <jdenemar@redhat.com>,
	qemu-devel@nongnu.org, Stefan Weil <sw@weilnetz.de>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [RFC PATCH 0/4] migration: Introduce postcopy-setup capability and state
Date: Wed, 3 Sep 2025 12:00:11 +0000
Message-ID: <aLgtyy_UAfsmOLET@gallifrey>
In-Reply-To: <w6qkokuof6ge2gwajhcwul5boaqf57w6m4yzsbyljpgpnigc64@pw2unqceumjn>

* Juraj Marcin (jmarcin@redhat.com) wrote:
> Hi Dave,
> 
> On 2025-09-01 17:57, Dr. David Alan Gilbert wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > On Thu, Aug 14, 2025 at 05:42:23PM +0200, Juraj Marcin wrote:
> > > > Fair point, I'll then continue with the PING/PONG solution, the first
> > > > implementation I have seems to be working to resolve Issue 1.
> > > > 
> > > > For rarer split brain, we'll rely on block device locks/mgmt to resolve
> > > > and change the failure handling, so it registers errors from disk
> > > > activation.
> > > > 
> > > > As tested, there should be no problems with the destination
> > > > transitioning to POSTCOPY_PAUSED, since the VM was not started yet.
> > > > 
> > > > However, to prevent the source side from transitioning to
> > > > POSTCOPY_PAUSED, I think adding a new state is still the best option.
> > > > 
> > > > I tried keeping the migration states as they are now and just rely on an
> > > > attribute of MigrationState if 3rd PONG was received, however, this
> > > > collides with (at least) migrate_pause tests, that are waiting for
> > > > POSTCOPY_ACTIVE, and then pause the migration triggering the source to
> > > > resume. We could maybe work around it by waiting for the 3rd pong
> > > > instead, but I am not sure if it is possible from tests, or by not
> > > > resuming if migrate_pause command is executed?
> > > > 
> > > > I also tried extending the span of the DEVICE state, but some functions
> > > > behave differently depending on if they are in postcopy or not, using
> > > > the migration_in_postcopy() function, but adding the DEVICE there isn't
> > > > working either. And treating the DEVICE state sometimes as postcopy and
> > > > sometimes as not seems just too messy, if it would even be possible.
> > > 
> > > Yeah, it might indeed be a bit messy.
> > > 
> > > Is it possible to find a middle ground?  E.g. add postcopy-setup status,
> > > but without any new knob to enable it?  Just to describe the period of time
> > > where dest QEMU hasn't started running but has started loading device states.
> > > 
> > > The hope is libvirt (which, AFAIU, always enables the "events" capability)
> > > can ignore the new postcopy-setup status transition, then maybe we can also
> > > introduce the postcopy-setup and make it always appear.
> > 
> > When the destination is started with '-S' (autostart=false), which is what
> > I think libvirt does, doesn't management only start the destination
> > after a certain useful event?
> > In other words, is there an event we already emit to say that the destination
> > has finished loading the postcopy devices, or could we just add that
> > event, so that management could just wait for that before issuing
> > the continue?
> 
> I am not aware of any such event on the destination side. When postcopy
> (and its switchover) starts, the destination transitions from ACTIVE
> directly to POSTCOPY_ACTIVE in the listen thread while devices are
> loaded concurrently by the main thread.
> 
> There is DEVICE state on the source side, but that is used only on the
> source side when device state is being collected. When device state is
> being loaded on the destination, the source side is also already in
> POSTCOPY_ACTIVE state.

So I wonder what libvirt uses as the trigger for starting the destination
in the postcopy case?  It has to come after the device state has loaded.
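
To make the wait-then-continue idea concrete, here is a rough sketch of
what the management side might do. The MIGRATION event (with a "status"
field) and the "cont" command are existing QMP; the status value used as
the readiness signal is purely hypothetical, since nobody in this thread
has yet identified an event that means "destination finished loading
device state". The function name is also just for illustration.

```python
import json

# HYPOTHETICAL status name: no such status exists in QEMU today; it stands
# in for "destination has finished loading the postcopy device state".
DEVICE_STATE_LOADED = "postcopy-device-loaded"


def commands_for_events(lines, ready_status=DEVICE_STATE_LOADED):
    """Yield QMP commands (JSON strings) in response to QMP event lines.

    Emits exactly one `cont` the first time a MIGRATION event reports
    `ready_status`. All other messages are ignored.
    """
    sent = False
    for line in lines:
        msg = json.loads(line)
        if msg.get("event") != "MIGRATION":
            continue  # not a migration status change
        status = msg.get("data", {}).get("status")
        if status == ready_status and not sent:
            sent = True
            yield json.dumps({"execute": "cont"})


if __name__ == "__main__":
    stream = [
        '{"event": "MIGRATION", "data": {"status": "active"}}',
        '{"event": "MIGRATION", "data": {"status": "postcopy-device-loaded"}}',
    ]
    for cmd in commands_for_events(stream):
        print(cmd)
```

The point is only that management never issues the continue on ACTIVE or
on any earlier status; whichever event ends up being used would replace
the placeholder status.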

Dave

> Best regards,
> 
> Juraj Marcin
> 
> > 
> > Dave
> > 
> > > Thanks,
> > > 
> > > -- 
> > > Peter Xu
> > > 
> > > 
> > -- 
> >  -----Open up your eyes, open up your mind, open up your code -------   
> > / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
> > \        dave @ treblig.org |                               | In Hex /
> >  \ _________________________|_____ http://www.treblig.org   |_______/
> > 
> 
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/
