Time to introduce a migration protocol negotiation (Re: [PATCH v2 00/25] migration: Postcopy Preemption)

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
	qemu-devel@nongnu.org, Peter Xu <peterx@redhat.com>,
	Juan Quintela <quintela@redhat.com>
Subject: Time to introduce a migration protocol negotiation (Re: [PATCH v2 00/25] migration: Postcopy Preemption)
Date: Mon, 14 Mar 2022 18:49:25 +0000	[thread overview]
Message-ID: <Yi+ONfiZlQD2LoHX@redhat.com> (raw)
In-Reply-To: <Yh5O/eq4If4MYpTq@work-vm>

On Tue, Mar 01, 2022 at 04:51:09PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Tue, Mar 01, 2022 at 10:27:10AM +0000, Daniel P. Berrangé wrote:
> > > > I also didn't know whether there's other limitations of it.  For example,
> > > > will a new socket pair be a problem for any VM environment (either a
> > > > limitation from the management app, container, and so on)?  I think it's
> > > > the same to multifd in that aspect, but I never explored.
> > > 
> > > If it needs extra sockets that is something apps will need to be aware
> > > of unfortunately and explicitly opt-in to :-( Migration is often
> > > tunnelled/proxied over other channels, so whatever does that needs to
> > > be aware of possibility of seeing extra sockets.
> > 
> > Ah, then probably it can never be the default.  But for sure it could be
> > nice that higher level can opt-in and make it a default at some point as
> > long as it knows the network topology is safe to do so.
> > 
> > > 
> > > > > > TODO List
> > > > > > =========
> > > > > > 
> > > > > > TLS support
> > > > > > -----------
> > > > > > 
> > > > > > I only noticed its missing very recently.  Since soft freeze is coming, and
> > > > > > obviously I'm still growing this series, so I tend to have the existing
> > > > > > material discussed. Let's see if it can still catch the train for QEMU 7.0
> > > > > > release (soft freeze on 2022-03-08)..
> > > > > 
> > > > > I don't like the idea of shipping something that is only half finished.
> > > > > It means that when apps probe for the feature, they'll see preempt
> > > > > capability present, but have no idea whether they're using a QEMU that
> > > > > is broken when combined with TLS or not. We shouldn't merge something
> > > > > just to meet the soft freeze deadline if we know key features are broken.
> > > > 
> > > > IMHO merging and declaring support are two problems.
> > > > 
> > > > To me, it's always fine to merge the code that implemented the fundation of a
> > > > feature.  The feature can be worked upon in the future.
> > > > 
> > > > Requiring a feature to be "complete" sometimes can cause burden to not only
> > > > the author of the series but also reviewers.  It's IMHO not necessary to
> > > > bind these two ideas.
> > > > 
> > > > It's sometimes also hard to define "complete": take the TLS as example, no
> > > > one probably even noticed that it won't work with TLS and I just noticed it
> > > > merely these two days..  We obviously can't merge partial patchset, but if
> > > > the patchset is well isolated, then it's not a blocker for merging, imho.
> > > > 
> > > > Per my understanding, what you worried is when we declare it supported but
> > > > later we never know when TLS will be ready for it.  One solution is I can
> > > > rename the capability as x-, then after the TLS side ready I drop the x-
> > > > prefix.  Then Libvirt or any mgmt software doesn't need to support this
> > > > until we drop the x-, so there's no risk of compatibility.
> > > > 
> > > > Would that sound okay to you?
> > > 
> > > If it has an x- prefix then we can basically ignore it from a mgmt app
> > > POV until it is actually finished.
> > > 
> > > > I can always step back and work on TLS first before it's merged, but again
> > > > I don't think it's required.
> > > 
> > > Apps increasingly consider use of TLS to be a mandatory feature for
> > > migration, so until that works, this preempt has to be considered
> > > unsupported & unfinished IMHO. So either TLS should be ready when
> > > it merges, or it should be clearly marked unsupported at the QAPI
> > > level.
> > 
> > Yes, I fully agree with it, and for huge vm migrations I think TLS is in
> > many cases mandatory.
> > 
> > I do plan to work on it right afterwards if this series land, but as the
> > series grows I just noticed maybe we should start landing some codes that's
> > already solid.  Landing the code as another benefit that I want to make
> > sure the code merged at least won't affect the existing features.
> > 
> > So what I'm curious is why TLS is getting quite some attentions in the past
> > few years but I didn't even see any selftests included in migration-test on
> > tls.  That's something I wanted to look into, maybe even before adding the
> > preempt+tls support. But maybe I just missed something, as I didn't use tls
> > a lot in the past.
> 
> Hmm, I think it's worth getting TLS working before putting the full
> series in, because it might impact the way you wire the channels up -
> it's going to take some care; but lets see which parts we can/should
> take.

Taking a step back here and looking at the bigger picture of
migration protocol configuration....

Almost every time we add a new feature to migration, we end up
having to define at least one new migration parameter, then wire
it up in libvirt, and then the mgmt app too, often needing to
ensure it is turn on for both client and server at the same time.

For some features, requiring an explicit opt-in could make sense,
because we don't know for sure that the feature is always a benefit.
These are things that can be thought of as workload sensitive
tunables.

For other features though, it feels like we would be better off if
we could turn it on by default with no config. These are things
that can be thought of as migration infrastructre / transport
architectural designs.

eg it would be nice to be able to use multifd by default for
migration. We would still want a tunable to control the number
of channels, but we ought to be able to just start with a default
number of channels automatically, so the tunable is only needed
for special cases.

This post-copy is another case.  We should start off knowing
we can switch to post-copy at any time. We should further be
able to add pre-emption if we find it available. IOW, we should
not have required anything more than 'switch to post-copy' to
be exposed to mgmtm apps.

Or enabling zero copy on either send or receive side.

Or enabling kernel-TLS offload

Or ..insert other interesting protocol feature...

All this stems from our current migration protocol that started
as a single unidirectional channel, which goes straight into
the migration data stream, with no protocol handshake  and
thus no feature negotiation either.

We've offloaded feature negotiation to libvirt and in turn to
the mgmt app and this is awful, for thue layers above, but
also awful for QEMU. Because multifd requires mgmt app opt-in,
we can wait 10 years and there will still be countless apps
using single-fd mode because they've not been updated to
opt-in.  If we negotiated features at QEMU level we could
have everything using multifd in a few years, and have dropped
single-fd mode a few years later.

So rather than following our historical practice, anjd adding
yet another migration parameter for a specific feature, I'd
really encourage us to put a stop to it and future proof
ourselves.

Introduce one *final-no-more-never-again-after-this* migration
capability called "protocol-negotiation".

When that capability is set, first declare that henceforth the
migration transport is REQUIRED to support **multiple**,
**bi-directional** channels. We might only use 1 TCP channel
in some cases, but it declares our intent that we expect to be
able to use as many channels as we see fit henceforth.

Now define a protocol handshake. A 5 minute thought experiment
starts off with something simple:

   dst -> src:  Greeting Message:
                  Magic: "QEMU-MIGRATE"  12 bytes
                  Num Versions: 1 byte
                  Version list: 1 byte * num versions
                  Num features: 4 bytes
                  Feature list: string * num features

   src -> dst:  Greeting Reply:
                  Magic: "QEMU-MIGRATE" 12 bytes
                  Select version: 1 byte
                  Num select features: 4 bytes
                  Selected features: string * num features   

   .... possibly more src <-> dst messages depending on
        features negotiated....

   src -> dst:  start migration

    ...traditional migration stream runs now for the remainder
       of this connection ...

I suggest "dst" starts first, so that connecting to a dst lets you
easily debug whether QEMU is speaking v2 or just waiting for the
client to send something as traditionally the case.

This shouldn't need very much code, and it gives us flexibility
to do all sorts of interesting things going forward with less
overhead for everyone involved.

We can layer in a real authentication system like SASL after
the greeting without any libvirt / mgmt app support

We can enable zero-copy at will. We can enable kernel-TLS at
will. We can add new TCP connections for clever feature XYZ.

We get a back channel every time, so dst can pass info back
to the src to optimize behaviour.

We can experiment with features and throw them away again
later without involving the mgmt app, since we negotiate
their use.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

next prev parent reply	other threads:[~2022-03-14 18:52 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-01  8:39 [PATCH v2 00/25] migration: Postcopy Preemption Peter Xu
2022-03-01  8:39 ` [PATCH v2 01/25] migration: Dump sub-cmd name in loadvm_process_command tp Peter Xu
2022-03-01  8:39 ` [PATCH v2 02/25] migration: Finer grained tracepoints for POSTCOPY_LISTEN Peter Xu
2022-03-01  8:39 ` [PATCH v2 03/25] migration: Tracepoint change in postcopy-run bottom half Peter Xu
2022-03-01  8:39 ` [PATCH v2 04/25] migration: Introduce postcopy channels on dest node Peter Xu
2022-03-01  8:39 ` [PATCH v2 05/25] migration: Dump ramblock and offset too when non-same-page detected Peter Xu
2022-03-01  8:39 ` [PATCH v2 06/25] migration: Add postcopy_thread_create() Peter Xu
2022-03-01  8:39 ` [PATCH v2 07/25] migration: Move static var in ram_block_from_stream() into global Peter Xu
2022-03-01  8:39 ` [PATCH v2 08/25] migration: Add pss.postcopy_requested status Peter Xu
2022-03-01  8:39 ` [PATCH v2 09/25] migration: Move migrate_allow_multifd and helpers into migration.c Peter Xu
2022-03-01  8:39 ` [PATCH v2 10/25] migration: Enlarge postcopy recovery to capture !-EIO too Peter Xu
2022-03-01  8:39 ` [PATCH v2 11/25] migration: postcopy_pause_fault_thread() never fails Peter Xu
2022-03-01  8:39 ` [PATCH v2 12/25] migration: Export ram_load_postcopy() Peter Xu
2022-03-01  8:39 ` [PATCH v2 13/25] migration: Move channel setup out of postcopy_try_recover() Peter Xu
2022-03-01  8:39 ` [PATCH v2 14/25] migration: Add migration_incoming_transport_cleanup() Peter Xu
2022-03-01  8:39 ` [PATCH v2 15/25] migration: Allow migrate-recover to run multiple times Peter Xu
2022-03-01  8:39 ` [PATCH v2 16/25] migration: Add postcopy-preempt capability Peter Xu
2022-03-01  8:39 ` [PATCH v2 17/25] migration: Postcopy preemption preparation on channel creation Peter Xu
2022-03-01  8:39 ` [PATCH v2 18/25] migration: Postcopy preemption enablement Peter Xu
2022-03-01  8:39 ` [PATCH v2 19/25] migration: Postcopy recover with preempt enabled Peter Xu
2022-03-01  8:39 ` [PATCH v2 20/25] migration: Create the postcopy preempt channel asynchronously Peter Xu
2022-03-01  8:39 ` [PATCH v2 21/25] migration: Parameter x-postcopy-preempt-break-huge Peter Xu
2022-03-01  8:39 ` [PATCH v2 22/25] migration: Add helpers to detect TLS capability Peter Xu
2022-03-01  8:39 ` [PATCH v2 23/25] migration: Fail postcopy preempt with TLS for now Peter Xu
2022-03-01  8:39 ` [PATCH v2 24/25] tests: Add postcopy preempt test Peter Xu
2022-03-01  8:39 ` [PATCH v2 25/25] tests: Pass in MigrateStart** into test_migrate_start() Peter Xu
2022-03-02 12:11   ` Dr. David Alan Gilbert
2022-03-01  9:25 ` [PATCH v2 00/25] migration: Postcopy Preemption Daniel P. Berrangé
2022-03-01 10:17   ` Peter Xu
2022-03-01 10:27     ` Daniel P. Berrangé
2022-03-01 10:55       ` Peter Xu
2022-03-01 16:51         ` Dr. David Alan Gilbert
2022-03-02  1:46           ` Peter Xu
2022-03-14 18:49           ` Daniel P. Berrangé [this message]
2022-03-15  6:13             ` Time to introduce a migration protocol negotiation (Re: [PATCH v2 00/25] migration: Postcopy Preemption) Peter Xu
2022-03-15 11:15               ` Daniel P. Berrangé
2022-03-16  3:30                 ` Peter Xu
2022-03-16  9:59                   ` Daniel P. Berrangé
2022-03-16 10:40                     ` Peter Xu
2022-03-16 11:00                       ` Daniel P. Berrangé
2022-03-18  7:08                         ` Peter Xu
2022-03-15 10:43             ` Dr. David Alan Gilbert
2022-03-15 11:05               ` Daniel P. Berrangé
2022-03-01 18:05         ` [PATCH v2 00/25] migration: Postcopy Preemption Daniel P. Berrangé
2022-03-02  1:48           ` Peter Xu
2022-03-02 12:14 ` Dr. David Alan Gilbert
2022-03-02 12:34   ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yi+ONfiZlQD2LoHX@redhat.com \
    --to=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=lsoaresp@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).