Re: [Qemu-devel] [RFC 00/13] Multiple fd migration support

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, amit.shah@redhat.com
Subject: Re: [Qemu-devel] [RFC 00/13] Multiple fd migration support
Date: Tue, 26 Apr 2016 13:38:04 +0100
Message-ID: <20160426123803.GB2228@work-vm>
In-Reply-To: <87wpnl8wab.fsf@emacs.mitica>

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> Hi
> >> 
> >> This patch series is "an" initial implementation of multiple fd migration.
> >> This is to get something out for others to comment, it is not finished at all.
> >
> > I've had a quick skim:
> >   a) I think mst is right about the risk of getting stale pages out of order.
> 
> I have been thinking about this.  We just need to send a "we have
> finished this round" packet, and the receiving side has to wait for all
> threads to finish before continuing.  It is easy and not expensive.  We
> never resend the same page during the same round.

Yes.
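
A minimal sketch of the receive-side bookkeeping this implies, assuming each
channel delivers one end-of-round marker per round. The names RoundSync,
round_sync_init and round_sync_mark_done are hypothetical and not taken from
the patch series; a real implementation would also need locking or a
condition variable around this state, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CHANNELS 64  /* hypothetical limit for this sketch */

/* Hypothetical bookkeeping: one bit per channel that has delivered its
 * "round finished" marker for the current round. */
typedef struct {
    unsigned nchannels;
    uint64_t done_mask;
    unsigned round;
} RoundSync;

static void round_sync_init(RoundSync *rs, unsigned nchannels)
{
    assert(nchannels > 0 && nchannels <= MAX_CHANNELS);
    rs->nchannels = nchannels;
    rs->done_mask = 0;
    rs->round = 0;
}

/* Record that 'channel' has seen the end-of-round marker.  Returns true
 * once every channel has reported, i.e. when it is safe to accept pages
 * belonging to the next round. */
static bool round_sync_mark_done(RoundSync *rs, unsigned channel)
{
    assert(channel < rs->nchannels);
    rs->done_mask |= UINT64_C(1) << channel;

    uint64_t all = rs->nchannels == 64
                 ? UINT64_MAX
                 : (UINT64_C(1) << rs->nchannels) - 1;
    if (rs->done_mask == all) {
        rs->done_mask = 0;   /* start collecting markers for the next round */
        rs->round++;
        return true;
    }
    return false;
}
```

Because a page is never resent within one round, waiting for all markers
before moving on is enough to stop a stale copy from an earlier round
overwriting a newer one.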

> >   b) Since you don't change the URI at all, it's a bit restricted; for example,
> >      it means I can't run separate sessions over different NICs unless I've
> >      done something clever at the routing/or bonded them.
> >      One thing I liked the sound of multi-fd for is NUMA; get a BIG box
> >      and give each numa node a separate NIC and run a separate thread on each
> >      node.
> 
> If we want this, _how_ do we want to configure it?  This was part of the
> reason to post the patch.  It works only for tcp; I haven't even tried
> the others, just to see what people want.

I was thinking this would work even for TCP; you'd just need a way to pass
different URIs (with address/port) for each connection.

> >   c) Hmm we do still have a single thread doing all the bitmap syncing and scanning,
> >      we'll have to watch out if that is the bottleneck at all.
> 
> Yeap.  My idea here was to still maintain the bitmap scanning on the
> main thread, but send work to the "worker threads" in batches, not in
> single pages.  But I haven't really profiled how long we spend there.

Yeh, it would be interesting to see what this profile looked like; if we
suddenly found that main thread had spare cycles perhaps we could do some
more interesting types of scanning.

> >   d) All the zero testing is still done in the main thread which we know is
> >      expensive.
> 
> Not trivial if we don't want to send control information over the
> "other" channels.  One solution would be split the main memory in
> different "main" threads.  No performance profiles.

Yes, and it's tricky because the order is:
   1) Send control information
   2) Farm it out to individual thread

  It's too late for '2' to say 'it's zero'.

> >   e) Do we need to do something for security with having multiple ports? How
> >      do we check that nothing snuck in on one of our extra ports, have we got
> >      sanity checks to make sure it's actually the right stream.
> 
> 
> We only have a single port.  We opened it several times.  It shouldn't
> require changes in either libvirt/firewall.  (Famous last words)

True I guess.

> 
> >   f) You're handing out pages to the sending threads on the basis of which one
> >      is free (in the same way as the multi threaded compression); but I think
> >      it needs some sanity adding to only hand out whole host pages - it feels
> >      like receiving all the chunks of one host page down separate FDs would
> >      be horrible.
> 
> Trivial optimization would be to send _whole_ huge pages in one go.  I
> wanted comments about what people wanted here.  My idea was really to
> add multipage or several pages in one go.  Would reduce synchronization
> a lot.   I hand the page to the 1st thread that becomes free because ...
> I don't know how long a specific transmission is going to take.  TCP for
> you :-(

Sending huge pages would be very nice; the tricky thing is you don't want to send
a huge page unless it's all marked dirty.
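
A minimal sketch of that check, assuming a dirty bitmap with one bit per
small page stored in an array of unsigned long. The helper names and the
bitmap layout are assumptions for illustration, not QEMU's actual bitmap
API.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Bits held by one bitmap word in this sketch. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static bool test_dirty(const unsigned long *bitmap, size_t page)
{
    return (bitmap[page / SKETCH_BITS_PER_LONG] >>
            (page % SKETCH_BITS_PER_LONG)) & 1UL;
}

static void set_dirty(unsigned long *bitmap, size_t page)
{
    bitmap[page / SKETCH_BITS_PER_LONG] |= 1UL << (page % SKETCH_BITS_PER_LONG);
}

/* Returns true only if every small page inside the huge page that
 * contains 'page' is marked dirty.  'pages_per_huge' is the number of
 * small pages per huge page (e.g. 512 for 2MB over 4KB). */
static bool huge_page_all_dirty(const unsigned long *bitmap, size_t page,
                                size_t pages_per_huge)
{
    assert(pages_per_huge > 0);
    size_t start = page - (page % pages_per_huge);

    for (size_t i = 0; i < pages_per_huge; i++) {
        if (!test_dirty(bitmap, start + i)) {
            return false;   /* a clean page: fall back to small pages */
        }
    }
    return true;
}
```

If the check fails, the sender would fall back to sending only the dirty
small pages, so clean data is not retransmitted.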

> >   g) I think you might be able to combine the compression into the same threads;
> >      so that if multi-fd + multi-threaded-compression is set you don't end
> >      up with 2 sets of threads and it might be the simplest way to make them
> >      work together.
> 
> Yeap, I thought that.  But I didn't want to merge them in a first
> stage.  It makes much more sense to _not_ send the compressed data
> through the main channel.  But that would be v2 (or 3, or 4 ...)

Right.

> >   h) You've used the last free RAM_SAVE_FLAG!  And the person who takes the last
> >      slice^Wbit has to get some more.
> >      Since arm, ppc, and 68k have variants that have TARGET_PAGE_BITS 10  that
> >      means we're full; I suggest what you do is use that flag to mean that we
> >      send another 64bit word; and in that word you use the bottom 7 bits for
> >      the fd index and bit 7 is set to indicate it's fd.  The other bits are sent
> >      as zero and available for the next use.
> >      Either that or start combining with some other flags.
> >      (I may have a use for some more bits in mind!)
> 
> Ok.  I can look at that.
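
A minimal sketch of the extra 64-bit word suggested in (h): bits 0-6 carry
the fd index, bit 7 marks the word as carrying an fd, and the remaining
bits are sent as zero. The macro and function names are hypothetical; only
the bit layout comes from the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXT_FD_INDEX_MASK UINT64_C(0x7f)  /* bits 0-6: fd index */
#define EXT_FD_VALID      UINT64_C(0x80)  /* bit 7: word carries an fd index */

/* Build the extension word for a given fd index (0..127). */
static uint64_t ext_word_encode_fd(unsigned fd_index)
{
    assert(fd_index <= EXT_FD_INDEX_MASK);
    return EXT_FD_VALID | (uint64_t)fd_index;
}

/* Extract the fd index.  Returns false if bit 7 is clear, meaning the
 * word does not carry an fd index at all. */
static bool ext_word_decode_fd(uint64_t word, unsigned *fd_index)
{
    if (!(word & EXT_FD_VALID)) {
        return false;
    }
    *fd_index = (unsigned)(word & EXT_FD_INDEX_MASK);
    return true;
}

/* The remaining bits are reserved and sent as zero; a receiver can use
 * this to reject words from a sender that uses bits it does not know. */
static bool ext_word_reserved_clear(uint64_t word)
{
    return (word & ~(EXT_FD_VALID | EXT_FD_INDEX_MASK)) == 0;
}
```

With this layout fd index 5 encodes to 0x85, and 56 bits stay free for
later users of the same word.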
> 
> >   i) Is this safe for xbzrle - what happens to the cache (or is it all
> >      still the main thread?)
> 
> Nope.  Only way to use xbzrle is:
> 
> if (zero(page)) {
>    ...
> } else if (xbzrle(page)) {
> 
> } else {
>     multifd(page)
> }
> 
> Otherwise we would have to make xbzrle multithread, or split memory
> between fd's.  Problem to split memory between fd's is that we need to
> know where the hot spots are.

OK, that makes sense.  So does that mean that some pages can get xbzrle sent?
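
The ordering in Juan's pseudocode can be sketched as a small classifier,
assuming the zero test and the xbzrle decision both stay on the main
thread. SendPath, page_is_zero and choose_send_path are hypothetical names,
and the xbzrle cache lookup is reduced to a boolean supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SEND_ZERO, SEND_XBZRLE, SEND_MULTIFD } SendPath;

/* Plain byte-wise zero test; a real implementation would use an
 * optimised routine. */
static bool page_is_zero(const uint8_t *page, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Decide how a page is sent.  'in_xbzrle_cache' stands in for the real
 * cache lookup, which stays single-threaded in this scheme. */
static SendPath choose_send_path(const uint8_t *page, size_t len,
                                 bool in_xbzrle_cache)
{
    if (page_is_zero(page, len)) {
        return SEND_ZERO;        /* main channel, no payload */
    }
    if (in_xbzrle_cache) {
        return SEND_XBZRLE;      /* main channel, delta-encoded */
    }
    return SEND_MULTIFD;         /* handed to a worker fd */
}
```

Under this ordering only pages that are neither zero nor xbzrle candidates
reach the worker fds.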

> >   j) For postcopy I could do with a separate fd for the requested pages
> >      (but again that comes back to needing an easy solution to the ordering)
> 
> The ordering was easy, as said.  You can just use that command with each
> postcopy requested page.  Or something similar, no?
> 
> I think that just forgetting about those pages, and each time that we
> receive a requested page, we first wait for the main thread to finish
> its pages should be enough, no?

Actually, I realised it's simpler;  once we're in postcopy mode we never
send the same page again; so we never have any ordering problems as long
as we perform a sync across the fd's at postcopy entry.

Dave

> 
> > Dave
> 
> Thanks very much, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 21+ messages
2016-04-20 14:44 [Qemu-devel] [RFC 00/13] Multiple fd migration support Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 01/13] migration: create Migration Incoming State at init time Juan Quintela
2016-04-22 11:27   ` Dr. David Alan Gilbert
2016-04-20 14:44 ` [Qemu-devel] [PATCH 02/13] migration: Pass TCP args in an struct Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 03/13] migration: [HACK] Don't create decompression threads if not enabled Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 04/13] migration: Add multifd capability Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 05/13] migration: Create x-multifd-threads parameter Juan Quintela
2016-04-22 11:37   ` Dr. David Alan Gilbert
2016-04-20 14:44 ` [Qemu-devel] [PATCH 06/13] migration: create multifd migration threads Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 07/13] migration: Start of multiple fd work Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 08/13] migration: create ram_multifd_page Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 09/13] migration: Create thread infrastructure for multifd send side Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 10/13] migration: Send the fd number which we are going to use for this page Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 11/13] migration: Create thread infrastructure for multifd recv side Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 12/13] migration: Test new fd infrastructure Juan Quintela
2016-04-20 14:44 ` [Qemu-devel] [PATCH 13/13] migration: [HACK]Transfer pages over new channels Juan Quintela
2016-04-22 12:09   ` Dr. David Alan Gilbert
2016-04-20 15:46 ` [Qemu-devel] [RFC 00/13] Multiple fd migration support Michael S. Tsirkin
2016-04-22 12:26 ` Dr. David Alan Gilbert
2016-04-25 16:53   ` Juan Quintela
2016-04-26 12:38     ` Dr. David Alan Gilbert [this message]
