From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, amit.shah@redhat.com
Subject: Re: [Qemu-devel] [RFC 00/13] Multiple fd migration support
Date: Tue, 26 Apr 2016 13:38:04 +0100
Message-ID: <20160426123803.GB2228@work-vm>
In-Reply-To: <87wpnl8wab.fsf@emacs.mitica>
* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> Hi
> >>
> >> This patch series is "an" initial implementation of multiple fd migration.
> >> This is to get something out for others to comment on; it is not finished at all.
> >
> > I've had a quick skim:
> > a) I think mst is right about the risk of getting stale pages out of order.
>
> I have been thinking about this. We just need to send a "we have
> finished this round" packet. And reception has to wait for all threads
> to finish before continuing. It is easy and not expensive. We never
> resend the same page during the same round.
Yes.
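Something along these lines on the send side should be enough (untested
sketch only; the type and field names below are invented for illustration,
they're not from your series):

    #include "qemu/osdep.h"
    #include "qemu/thread.h"

    /* Invented for illustration -- one of these per send thread. */
    typedef struct {
        QemuMutex mutex;
        QemuCond cond;        /* kicked when there is work (or a flush) */
        QemuCond done;        /* kicked by the thread when its queue drains */
        bool flush;
        int pending_pages;
    } MultiFDSendChannel;

    static void multifd_send_sync(MultiFDSendChannel *c, int nchannels)
    {
        int i;

        /* ask every send thread to drain whatever it has queued */
        for (i = 0; i < nchannels; i++) {
            qemu_mutex_lock(&c[i].mutex);
            c[i].flush = true;
            qemu_cond_signal(&c[i].cond);
            qemu_mutex_unlock(&c[i].mutex);
        }
        /* wait for all of them before the main channel emits the
         * "round finished" marker; the destination does the mirror
         * image before it carries on */
        for (i = 0; i < nchannels; i++) {
            qemu_mutex_lock(&c[i].mutex);
            while (c[i].pending_pages > 0) {
                qemu_cond_wait(&c[i].done, &c[i].mutex);
            }
            qemu_mutex_unlock(&c[i].mutex);
        }
    }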
> > b) Since you don't change the URI at all, it's a bit restricted; for example,
> > it means I can't run separate sessions over different NICs unless I've
> > done something clever at the routing/or bonded them.
> > One thing I liked the sound of multi-fd for is NUMA; get a BIG box
> > and give each numa node a separate NIC and run a separate thread on each
> > node.
>
> If we want this, _how_ do we want to configure it? This was part of the
> reason to post the patch. It only works for tcp; I didn't even try the
> others, just to see what people want.
I was thinking this would work even for TCP; you'd just need a way to pass
different URIs (with address/port) for each connection.
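e.g. something like this (migrate_add_channel is a made-up command here,
purely to illustrate the shape of it):

    (qemu) migrate -d tcp:10.0.1.1:4444
    (qemu) migrate_add_channel tcp:10.0.2.1:4444
    (qemu) migrate_add_channel tcp:10.0.3.1:4444

where the first URI is today's main channel and each extra one could point
at a different NIC (or be pinned to a different NUMA node).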
> > c) Hmm we do still have a single thread doing all the bitmap syncing and scanning,
> > we'll have to watch out if that is the bottleneck at all.
>
> Yeap. My idea here was to still maintain the bitmap scanning on the
> main thread, but send work to the "worker threads" in batches, not in
> single pages. But I haven't really profiled how long we spend there.
Yeh, it would be interesting to see what this profile looked like; if we
suddenly found the main thread had spare cycles, perhaps we could do some
more interesting types of scanning.
> > d) All the zero testing is still done in the main thread which we know is
> > expensive.
>
> Not trivial if we don't want to send control information over the
> "other" channels. One solution would be to split the main memory
> between different "main" threads. No performance profiles yet.
Yes, and it's tricky because the order is:
1) Send control information
2) Farm it out to an individual thread
By the time we're at '2' it's too late to say 'it's zero'.
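i.e. roughly this shape (just a sketch; save_page_header is the existing
helper, but RAM_SAVE_FLAG_MULTIFD and multifd_queue_page are names I've
invented here):

    static void multifd_send_page(QEMUFile *f, RAMBlock *block,
                                  ram_addr_t offset, int fd_index)
    {
        /* 1: control information goes out on the main channel first ... */
        save_page_header(f, block, offset | RAM_SAVE_FLAG_MULTIFD);
        qemu_put_be16(f, fd_index);     /* ... including which fd to expect */
        /* 2: ... then the page itself is farmed out to a worker thread.
         * By now it's too late for that worker to notice the page is all
         * zeroes and turn step 1 into a zero-page record instead.
         * (RAM_SAVE_FLAG_MULTIFD/multifd_queue_page are invented names.) */
        multifd_queue_page(fd_index, block, offset);
    }

So either the zero test stays on the main thread, or the control
information has to move onto the per-fd channels as well.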
> > e) Do we need to do something for security with having multiple ports? How
> > do we check that nothing snuck in on one of our extra ports, have we got
> > sanity checks to make sure it's actually the right stream.
>
>
> We only have a single port. We opened it several times. It shouldn't
> require changes in either libvirt or the firewall. (Famous last words.)
True I guess.
>
> > f) You're handing out pages to the sending threads on the basis of which one
> > is free (in the same way as the multi threaded compression); but I think
> > it needs some sanity adding to only hand out whole host pages - it feels
> > like receiving all the chunks of one host page down separate FDs would
> > be horrible.
>
> A trivial optimization would be to send _whole_ huge pages in one go. I
> wanted comments about what people wanted here. My idea was really to
> add multipage support, or send several pages in one go; that would reduce
> synchronization a lot. I hand out to the 1st thread that becomes free
> because ...... I don't know how long a specific transmission is going to
> take. TCP for you :-(
Sending huge pages would be very nice; the tricky thing is you don't want to send
a huge page unless it's all marked dirty.
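i.e. something like this check before handing a huge page to a thread
(rough sketch, the helper is made up):

    #include "qemu/osdep.h"
    #include "qemu/bitops.h"

    /* Invented helper: only worth sending the huge page as one unit if
     * every target page inside it is dirty in the migration bitmap;
     * otherwise fall back to sending the dirty target pages one by one. */
    static bool host_page_all_dirty(unsigned long *bitmap,
                                    unsigned long first_tp,
                                    unsigned long tps_per_host_page)
    {
        unsigned long i;

        for (i = 0; i < tps_per_host_page; i++) {
            if (!test_bit(first_tp + i, bitmap)) {
                return false;
            }
        }
        return true;
    }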
> > g) I think you might be able to combine the compression into the same threads;
> > so that if multi-fd + multi-threaded-compresison is set you don't end
> > up with 2 sets of threads and it might be the simplest way to make them
> > work together.
>
> Yeap, I thought about that. But I didn't want to merge them in a first
> stage. It makes much more sense to _not_ send the compressed data
> through the main channel. But that would be v2 (or 3, or 4 ...)
Right.
> > h) You've used the last free RAM_SAVE_FLAG! And the person who takes the last
> > slice^Wbit has to get some more.
> > Since arm, ppc, and 68k have variants with TARGET_PAGE_BITS 10, that
> > means we're full; I suggest what you do is use that flag to mean that we
> > send another 64bit word; and in that word you use the bottom 7 bits for
> > the fd index and set bit 7 to indicate it's an fd. The other bits are sent
> > as zero and available for the next use.
> > Either that or start combining with some other flags.
> > (I may have a use for some more bits in mind!)
>
> Ok. I can look at that.
>
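To be concrete about what I meant in (h), something like this (rough
sketch; the constant and function names are invented):

    #define RAM_EXT_FD_MASK  0x7fULL      /* bits 0-6: fd index             */
    #define RAM_EXT_IS_FD    (1ULL << 7)  /* bit 7: this word carries an fd */
                                          /* bits 8-63: spare for later use */

    /* The last RAM_SAVE_FLAG_* bit would just mean "a 64bit extension
     * word follows the page header". */
    static void send_fd_extension(QEMUFile *f, unsigned int fd_index)
    {
        qemu_put_be64(f, RAM_EXT_IS_FD | (fd_index & RAM_EXT_FD_MASK));
    }

    /* Returns the fd index, or -1 if the word didn't carry one. */
    static int recv_fd_extension(QEMUFile *f)
    {
        uint64_t word = qemu_get_be64(f);

        return (word & RAM_EXT_IS_FD) ? (int)(word & RAM_EXT_FD_MASK) : -1;
    }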
> > i) Is this safe for xbzrle - what happens to the cache (or is it all
> > still the main thread?)
>
> Nope. Only way to use xbzrle is:
>
> if (zero(page)) {
>     ...
> } else if (xbzrle(page)) {
>     ...
> } else {
>     multifd(page)
> }
>
> Otherwise we would have to make xbzrle multithreaded, or split memory
> between fd's. The problem with splitting memory between fd's is that we
> need to know where the hot spots are.
OK, that makes sense. So does that mean that some pages can get xbzrle sent?
> > j) For postcopy I could do with a separate fd for the requested pages
> > (but again that comes back to needing an easy solution to the ordering)
>
> The ordering is easy, as I said. We can just use that command with each
> postcopy-requested page. Or something similar, no?
>
> I think that just forgetting about those pages and, each time we
> receive a requested page, first waiting for the main thread to finish
> its pages should be enough, no?
Actually, I realised it's simpler; once we're in postcopy mode we never
send the same page again; so we never have any ordering problems as long
as we perform a sync across the fd's at postcopy entry.
Dave
>
> > Dave
>
> Thanks very much, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK