From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Manish Mishra <manish.mishra@nutanix.com>,
qemu-devel <qemu-devel@nongnu.org>,
Juan Quintela <quintela@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: MultiFD and default channel out of order mapping on receive side.
Date: Tue, 18 Oct 2022 09:18:28 +0100
Message-ID: <Y05hVC7AXdc0Ak4z@redhat.com>
In-Reply-To: <Y03F97gmi7N4cyMM@x1n>

On Mon, Oct 17, 2022 at 05:15:35PM -0400, Peter Xu wrote:
> On Mon, Oct 17, 2022 at 12:38:30PM +0100, Daniel P. Berrangé wrote:
> > On Mon, Oct 17, 2022 at 01:06:00PM +0530, manish.mishra wrote:
> > > Hi Daniel,
> > >
> > > I have been thinking about some solutions for this and wanted to discuss them before going ahead. I have also added Juan and Peter to the loop.
> > >
> > > 1. Earlier I was thinking that, on the destination side, the first
> > > data sent on both the default and multiFD channels is currently
> > > MAGIC_NUMBER and VERSION, so maybe we can decide the mapping based on
> > > that. But that does not work for the newly added post-copy preempt
> > > channel, as it does not send any MAGIC number. Also, even for multiFD,
> > > the MAGIC number alone does not tell which multifd channel number it
> > > is, though as far as I can tell that does not matter. So the MAGIC
> > > number should be good enough for identifying default vs multiFD channel?
> >
> > Yep, you don't need to know more than the MAGIC value.
> >
> > In migration_io_process_incoming, we need to use MSG_PEEK to look at
> > the first 4 bytes pending on the wire. If those bytes are 'QEVM' that's
> > the primary channel, if those bytes are big endian 0x11223344, that's
> > a multifd channel. Using MSG_PEEK avoids the need to modify the later
> > code that actually reads this data.
> >
> > The challenge is how long to wait with the MSG_PEEK. If we do it
> > in a blocking mode, it's fine for main channel and multifd, but
> > IIUC for the post-copy pre-empt channel we'd be waiting for
> > something that will never arrive.
> >
> > Having suggested MSG_PEEK though, this may well not work if the
> > channel has TLS present. In fact it almost definitely won't work.
> >
> > To cope with TLS, migration_io_process_incoming would need to
> > actually read the data off the wire, and later methods be
> > taught to skip reading the magic.
> >
> > > 2. For post-copy preempt, maybe we can initiate this channel only
> > > after we have received a request from the remote, e.g. a remote page
> > > fault. This looks safest to me, considering the post-copy recovery
> > > case too. I cannot think of any dependency on the post-copy preempt
> > > channel which requires it to be initialised very early. Maybe Peter
> > > can confirm this.
> >
> > I guess that could work
>
> Currently all preempt code still assumes that once postcopy is activated
> it is in preempt mode. IIUC such a change will bring an extra phase of
> postcopy with no preempt before preempt is enabled. We may need to teach
> qemu to understand that if it's needed.
>
> Meanwhile the initial page requests will not be able to benefit from the
> new preempt channel either.
>
> >
> > > 3. Another thing we can do is to have a 2-way handshake on every
> > > channel creation with some additional metadata. This looks to me
> > > like the cleanest and most durable approach. I understand that it can
> > > break migration to/from old qemu, but then it could come as a
> > > migration capability?
> >
> > The benefit of (1) is that the fix can be deployed for all existing
> > QEMU releases by backporting it. (3) will meanwhile need mgmt app
> > updates to make it work, which is much more work to deploy.
> >
> > We really should have had a more formal handshake, and I've described
> > ways to achieve this in the past, but it is quite a lot of work.
>
> I don't know whether (1) is a valid option if there are use cases that it
> cannot cover (on either tls or preempt). The handshake is definitely the
> clean approach.
>
> What's the outcome of such wrongly ordered connections? Will migration
> fail immediately and safely?
>
> For multifd, I think it should fail immediately after the connection
> is established.
>
> For preempt, I'd also expect the same thing because the only wrong order
> that can happen right now is the preempt channel being taken as the
> migration channel, and then it should also fail immediately on the first
> qemu_get_byte().
>
> Hopefully that's still not too bad - I mean, if we can fail consistently
> and safely (never fail during postcopy), we can always retry, and as long
> as the connections are created successfully we can start the migration
> safely. But please correct me if it's not the case.

It should typically fail as the magic bytes are different, which will not
pass validation. The exception is the postcopy pre-empt channel, which
may well cause migration to stall as nothing will be sent initially by
the src.
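
For reference, the MSG_PEEK idea quoted above could be sketched roughly as
follows. This is only an illustration and not QEMU code: enum chan_kind,
classify_magic() and peek_channel_kind() are hypothetical names, and it
uses a plain recv() on a raw socket fd rather than QEMU's I/O channel
layer. It assumes no TLS on the channel, and the blocking peek would wait
forever on a channel that never sends anything, such as the postcopy
pre-empt channel.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical classification result, not a QEMU type. */
enum chan_kind { CHAN_UNKNOWN = 0, CHAN_MAIN, CHAN_MULTIFD };

/* Compare the first 4 wire bytes against the two magic values
 * mentioned in this thread. */
static enum chan_kind classify_magic(const uint8_t buf[4])
{
    static const uint8_t main_magic[4] = { 'Q', 'E', 'V', 'M' };
    /* 0x11223344 in big endian byte order */
    static const uint8_t multifd_magic[4] = { 0x11, 0x22, 0x33, 0x44 };

    if (memcmp(buf, main_magic, sizeof(main_magic)) == 0) {
        return CHAN_MAIN;
    }
    if (memcmp(buf, multifd_magic, sizeof(multifd_magic)) == 0) {
        return CHAN_MULTIFD;
    }
    return CHAN_UNKNOWN;
}

/* Look at the first 4 pending bytes without consuming them, so the
 * later code that reads and validates the magic is left unchanged.
 * Blocks until 4 bytes are available or the peer closes. */
static enum chan_kind peek_channel_kind(int fd)
{
    uint8_t buf[4];
    ssize_t n = recv(fd, buf, sizeof(buf), MSG_PEEK | MSG_WAITALL);

    if (n != (ssize_t)sizeof(buf)) {
        return CHAN_UNKNOWN; /* error, EOF, or short read */
    }
    return classify_magic(buf);
}
```

Since the bytes are only peeked, a subsequent normal read on the same fd
still returns the magic, which is the property that avoids touching the
later readers.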
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|