From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: Time to introduce a migration protocol negotiation (Re: [PATCH v2 00/25] migration: Postcopy Preemption)
Date: Tue, 15 Mar 2022 11:15:41 +0000
Message-ID: <YjB1XXzIsJWtSR4E@redhat.com>
In-Reply-To: <YjAul3GIWmB3+v0P@xz-m1.local>

On Tue, Mar 15, 2022 at 02:13:43PM +0800, Peter Xu wrote:
> On Mon, Mar 14, 2022 at 06:49:25PM +0000, Daniel P. Berrangé wrote:
> > Taking a step back here and looking at the bigger picture of
> > migration protocol configuration....
> > 
> > Almost every time we add a new feature to migration, we end up
> > having to define at least one new migration parameter, then wire
> > it up in libvirt, and then the mgmt app too, often needing to
> > ensure it is turned on for both client and server at the same time.
> > 
> > 
> > For some features, requiring an explicit opt-in could make sense,
> > because we don't know for sure that the feature is always a benefit.
> > These are things that can be thought of as workload sensitive
> > tunables.
> > 
> > 
> > For other features though, it feels like we would be better off if
> > we could turn it on by default with no config. These are things
> > that can be thought of as migration infrastructure / transport
> > architectural designs.
> 
> Thanks for raising this discussion.  That's something I wanted to raise too
> but I just haven't, at least formally.
> 
> Actually I think I raised this question once or twice, but I just didn't
> insist trying. :)
> 
> > 
> > 
> > eg it would be nice to be able to use multifd by default for
> > migration. We would still want a tunable to control the number
> > of channels, but we ought to be able to just start with a default
> > number of channels automatically, so the tunable is only needed
> > for special cases.
> 
> I still remember you mentioned that upper-layer software can assume
> only one socket pair is used for migration; I think that makes
> postcopy-preempt by default impossible.
> 
> Why is multifd different here?

It isn't different. We went through the pain of extending libvirt
to know how to open many channels for multifd. We'll have to do
the same for postcopy-preempt. To this day though, management
apps above libvirt largely don't enable multifd, which is a real
shame. This is the key reason I think we need to handle this at
the QEMU level automatically.

> > This post-copy is another case.  We should start off knowing
> > we can switch to post-copy at any time.
> 
> This one is kind of special and it'll be harder, IMHO.
> 
> AFAIU, postcopy users will always initiate the migration with at least a
> full round of precopy, with the hope that all the static guest pages will
> be migrated.

I think I didn't explain myself properly here. Today there are
two parts to postcopy usage in libvirt:

  - Pass the VIR_MIGRATE_POSTCOPY flag when starting the migration.
    The migration still runs in pre-copy mode. This merely ensures
    we configure a bi-directional socket, so the app has the option
    to switch to postcopy later.

  - Invoke virDomainMigrateStartPostCopy to flip from the pre-copy
    to the post-copy phase. This requires that you previously passed
    VIR_MIGRATE_POSTCOPY to enable its use.

The first step, passing VIR_MIGRATE_POSTCOPY, should not need to
exist. That should be automatically negotiated and handled by QEMU.

Libvirt and mgmt apps should only need to care about whether
or not they call virDomainMigrateStartPostCopy to flip to
post-copy mode.

> > We should further be able to add pre-emption if we find it available.
> 
> Yeah here I have the same question per multifd above.  I just have no idea
> whether QEMU has such knowledge on making this decision.  E.g., how could
> QEMU know whether upper app is not tunneling the migration stream?  How
> could QEMU know whether the upper app could handle multiple tcp sockets
> well?

It can't do this today - that's why we need the new migration protocol
feature negotiation I describe below.

> > So rather than following our historical practice, and adding
> > yet another migration parameter for a specific feature, I'd
> > really encourage us to put a stop to it and future proof
> > ourselves.
> > 
> > 
> > Introduce one *final-no-more-never-again-after-this* migration
> > capability called "protocol-negotiation".
> 
> Let's see how Juan/Dave/others think.. anyway, that's something I always
> wanted.
> 
> IMHO an even simpler term can be as simple as:
> 
>   -global migration.handshake=on

This is just inventing a new migration capability framework. We
can just use existing QMP for this.
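
For illustration only, a small Python sketch of what "existing QMP"
could look like here. migrate-set-capabilities is the QMP command
already used to toggle migration capabilities; the capability name
"protocol-negotiation" is the one proposed in this thread and is an
assumption, since it does not exist in QEMU today. The helper name
set_capability_cmd is likewise hypothetical.

```python
import json


def set_capability_cmd(name, enabled=True):
    """Build the QMP command that toggles one migration capability."""
    return json.dumps({
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [{"capability": name, "state": enabled}],
        },
    })


# "protocol-negotiation" is the capability name proposed in this
# thread, not one that QEMU currently accepts.
print(set_capability_cmd("protocol-negotiation"))
```

The same command would be sent to both the source and destination
QEMU before starting the migration, as with any other capability.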

> > When that capability is set, first declare that henceforth the
> > migration transport is REQUIRED to support **multiple**,
> > **bi-directional** channels.
> 
> This new capability will simply need to depend on the return-path
> capability we already have.  E.g. exec-typed migration won't be able to
> enable return-path, so not applicable to this one too.

'exec' can be made to work if desired. Currently we only create
a unidirectional pipe and wire it up to stdin for outgoing
migration. Nothing stops us declaring that 'exec' uses a socketpair
wired to stdin + stdout, and supporting invoking 'exec' multiple
times to get many sockets.
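
A rough sketch of that idea, in Python rather than QEMU's C and not
taken from any existing QEMU code: one end of a socketpair becomes
both stdin and stdout of the spawned command, and the command is run
once per channel. The function names spawn_exec_channel and
spawn_exec_channels are hypothetical.

```python
import socket
import subprocess


def spawn_exec_channel(argv):
    """Run argv with one end of a socketpair as both stdin and stdout.

    Returns (process, socket); the socket is the bi-directional
    channel to the child, unlike today's one-way pipe to stdin.
    """
    parent, child = socket.socketpair()
    try:
        proc = subprocess.Popen(argv, stdin=child, stdout=child)
    except Exception:
        parent.close()
        raise
    finally:
        # The child process holds its own duplicate of this end.
        child.close()
    return proc, parent


def spawn_exec_channels(argv, count):
    """Invoke the same command several times to get many channels."""
    return [spawn_exec_channel(argv) for _ in range(count)]
```

Each returned socket can carry traffic in both directions, which is
what a return path and any handshake would need.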

> > Now define a protocol handshake. A 5 minute thought experiment
> > starts off with something simple:
> > 
> >    dst -> src:  Greeting Message:
> >                   Magic: "QEMU-MIGRATE"  12 bytes
> >                   Num Versions: 1 byte
> >                   Version list: 1 byte * num versions
> >                   Num features: 4 bytes
> >                   Feature list: string * num features
> > 
> >    src -> dst:  Greeting Reply:
> >                   Magic: "QEMU-MIGRATE" 12 bytes
> >                   Select version: 1 byte
> >                   Num select features: 4 bytes
> >                   Selected features: string * num features   
> > 
> >    .... possibly more src <-> dst messages depending on
> >         features negotiated....
> > 
> >    src -> dst:  start migration
> >  
> >     ...traditional migration stream runs now for the remainder
> >        of this connection ...
> > 
> > 
> > 
> > I suggest "dst" starts first, so that connecting to a dst lets you
> > easily debug whether QEMU is speaking v2 or just waiting for the
> > client to send something, as is traditionally the case.
> 
> No strong opinion on which QEMU should start the conversation, just to
> mention that we may not be able to use this to identify whether it's an old
> or new QEMU, afaiu, because of network delays?
> 
> We can never tell whether the dest QEMU stays silent because it's an
> old binary or because it's a new binary on a high-latency network.

Sure, you wouldn't want to functionally rely on it. It is mostly
just a debugging aid, so you can port scan and show that a host is
speaking the new QEMU migration protocol, not the old one.
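
To make the greeting exchange quoted above concrete, here is a
minimal Python sketch of encoding and decoding it. The field order
and sizes follow the 5 minute thought experiment. Everything the
proposal leaves open is an assumption: the 4-byte counts are taken
as big-endian, each feature string is encoded as a 1-byte length
followed by ASCII bytes, and the source picks the highest common
version. All function names are hypothetical.

```python
import struct

MAGIC = b"QEMU-MIGRATE"  # 12 bytes, as in the proposal


def _pack_features(features):
    # Assumption: 4-byte big-endian count, then per feature a
    # 1-byte length followed by the ASCII name.
    out = struct.pack(">I", len(features))
    for name in features:
        raw = name.encode("ascii")
        if len(raw) > 255:
            raise ValueError("feature name too long")
        out += struct.pack("B", len(raw)) + raw
    return out


def _unpack_features(buf, off):
    (count,) = struct.unpack_from(">I", buf, off)
    off += 4
    features = []
    for _ in range(count):
        n = buf[off]
        off += 1
        features.append(buf[off:off + n].decode("ascii"))
        off += n
    return features, off


def encode_greeting(versions, features):
    """dst -> src: magic, version count, version list, feature list."""
    return (MAGIC + struct.pack("B", len(versions)) + bytes(versions)
            + _pack_features(features))


def decode_greeting(buf):
    if buf[:len(MAGIC)] != MAGIC:
        raise ValueError("peer did not send the negotiation magic")
    off = len(MAGIC)
    nver = buf[off]
    off += 1
    versions = list(buf[off:off + nver])
    features, _ = _unpack_features(buf, off + nver)
    return versions, features


def encode_reply(version, features):
    """src -> dst: magic, selected version, selected feature list."""
    return MAGIC + struct.pack("B", version) + _pack_features(features)


def choose_reply(greeting, my_versions, my_features):
    """Select the highest common version and the shared features."""
    versions, features = decode_greeting(greeting)
    common = set(versions) & set(my_versions)
    if not common:
        raise ValueError("no common protocol version")
    selected = [f for f in features if f in my_features]
    return encode_reply(max(common), selected)
```

For the debugging use case, a probe only needs to connect and check
whether the first 12 bytes it receives equal the magic.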

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



