From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org, amit.shah@redhat.com
Subject: Re: [Qemu-devel] [RFC 00/13] Multiple fd migration support
Date: Tue, 26 Apr 2016 13:38:04 +0100
Message-ID: <20160426123803.GB2228@work-vm>
In-Reply-To: <87wpnl8wab.fsf@emacs.mitica>
* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Juan Quintela (quintela@redhat.com) wrote:
> >> Hi
> >>
> >> This patch series is "an" initial implementation of multiple fd migration.
> >> This is to get something out for others to comment on; it is not finished at all.
> >
> > I've had a quick skim:
> > a) I think mst is right about the risk of getting stale pages out of order.
>
> I have been thinking about this. We just need to send a "we have
> finished this round" packet. And reception has to wait for all threads
> to finish before continuing. It is easy and not expensive. We never
> resend the same page during the same round.
Yes.
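Something along these lines on the send side should be enough (untested
sketch only; the type and field names below are invented for illustration,
they're not from your series):

    #include "qemu/osdep.h"
    #include "qemu/thread.h"

    /* Invented for illustration -- one of these per send thread. */
    typedef struct {
        QemuMutex mutex;
        QemuCond cond;        /* kicked when there is work (or a flush) */
        QemuCond done;        /* kicked by the thread when its queue drains */
        bool flush;
        int pending_pages;
    } MultiFDSendChannel;

    static void multifd_send_sync(MultiFDSendChannel *c, int nchannels)
    {
        int i;

        /* ask every send thread to drain whatever it has queued */
        for (i = 0; i < nchannels; i++) {
            qemu_mutex_lock(&c[i].mutex);
            c[i].flush = true;
            qemu_cond_signal(&c[i].cond);
            qemu_mutex_unlock(&c[i].mutex);
        }
        /* wait for all of them before the main channel emits the
         * "round finished" marker; the destination does the mirror
         * image before it carries on */
        for (i = 0; i < nchannels; i++) {
            qemu_mutex_lock(&c[i].mutex);
            while (c[i].pending_pages > 0) {
                qemu_cond_wait(&c[i].done, &c[i].mutex);
            }
            qemu_mutex_unlock(&c[i].mutex);
        }
    }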
> > b) Since you don't change the URI at all, it's a bit restricted; for example,
> > it means I can't run separate sessions over different NICs unless I've
> > done something clever at the routing/or bonded them.
> > One thing I liked the sound of multi-fd for is NUMA; get a BIG box
> > and give each numa node a separate NIC and run a separate thread on each
> > node.
>
> If we want this, _how_ do we want to configure it? This was part of the
> reason to post the patch. It only works for tcp; I didn't even try the
> others, just to see what people want.
I was thinking this would work even for TCP; you'd just need a way to pass
different URIs (with address/port) for each connection.
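e.g. something like this (migrate_add_channel is a made-up command here,
purely to illustrate the shape of it):

    (qemu) migrate -d tcp:10.0.1.1:4444
    (qemu) migrate_add_channel tcp:10.0.2.1:4444
    (qemu) migrate_add_channel tcp:10.0.3.1:4444

where the first URI is today's main channel and each extra one could point
at a different NIC (or be pinned to a different NUMA node).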
> > c) Hmm we do still have a single thread doing all the bitmap syncing and scanning,
> > we'll have to watch out if that is the bottleneck at all.
>
> Yeap. My idea here was to still maintain the bitmap scanning on the
> main thread, but send work to the "worker threads" in batches, not in
> single pages. But I haven't really profiled how long we spend there.
Yeh, it would be interesting to see what this profile looked like; if we
suddenly found the main thread had spare cycles, perhaps we could do some
more interesting types of scanning.
> > d) All the zero testing is still done in the main thread which we know is
> > expensive.
>
> Not trivial if we don't want to send control information over the
> "other" channels. One solution would be to split the main memory
> between different "main" threads. No performance profiles yet.
Yes, and it's tricky because the order is:
1) Send control information
2) Farm it out to an individual thread
By the time we're at '2' it's too late to say 'it's zero'.
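i.e. roughly this shape (just a sketch; save_page_header is the existing
helper, but RAM_SAVE_FLAG_MULTIFD and multifd_queue_page are names I've
invented here):

    static void multifd_send_page(QEMUFile *f, RAMBlock *block,
                                  ram_addr_t offset, int fd_index)
    {
        /* 1: control information goes out on the main channel first ... */
        save_page_header(f, block, offset | RAM_SAVE_FLAG_MULTIFD);
        qemu_put_be16(f, fd_index);     /* ... including which fd to expect */
        /* 2: ... then the page itself is farmed out to a worker thread.
         * By now it's too late for that worker to notice the page is all
         * zeroes and turn step 1 into a zero-page record instead.
         * (RAM_SAVE_FLAG_MULTIFD/multifd_queue_page are invented names.) */
        multifd_queue_page(fd_index, block, offset);
    }

So either the zero test stays on the main thread, or the control
information has to move onto the per-fd channels as well.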
> > e) Do we need to do something for security with having multiple ports? How
> > do we check that nothing snuck in on one of our extra ports, have we got
> > sanity checks to make sure it's actually the right stream.
>
>
> We only have a single port. We opened it several times. It shouldn't
> require changes in either libvirt or the firewall. (Famous last words.)
True I guess.
>
> > f) You're handing out pages to the sending threads on the basis of which one
> > is free (in the same way as the multi threaded compression); but I think
> > it needs some sanity adding to only hand out whole host pages - it feels
> > like receiving all the chunks of one host page down separate FDs would
> > be horrible.
>
> A trivial optimization would be to send _whole_ huge pages in one go. I
> wanted comments about what people wanted here. My idea was really to
> add multipage support, or send several pages in one go; that would reduce
> synchronization a lot. I hand out to the 1st thread that becomes free
> because ...... I don't know how long a specific transmission is going to
> take. TCP for you :-(
Sending huge pages would be very nice; the tricky thing is you don't want to send
a huge page unless it's all marked dirty.
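i.e. something like this check before handing a huge page to a thread
(rough sketch, the helper is made up):

    #include "qemu/osdep.h"
    #include "qemu/bitops.h"

    /* Invented helper: only worth sending the huge page as one unit if
     * every target page inside it is dirty in the migration bitmap;
     * otherwise fall back to sending the dirty target pages one by one. */
    static bool host_page_all_dirty(unsigned long *bitmap,
                                    unsigned long first_tp,
                                    unsigned long tps_per_host_page)
    {
        unsigned long i;

        for (i = 0; i < tps_per_host_page; i++) {
            if (!test_bit(first_tp + i, bitmap)) {
                return false;
            }
        }
        return true;
    }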
> > g) I think you might be able to combine the compression into the same threads;
> > so that if multi-fd + multi-threaded-compresison is set you don't end
> > up with 2 sets of threads and it might be the simplest way to make them
> > work together.
>
> Yeap, I thought about that. But I didn't want to merge them in a first
> stage. It makes much more sense to _not_ send the compressed data
> through the main channel. But that would be v2 (or 3, or 4 ...)
Right.
> > h) You've used the last free RAM_SAVE_FLAG! And the person who takes the last
> > slice^Wbit has to get some more.
> > Since arm, ppc, and 68k have variants with TARGET_PAGE_BITS 10, that
> > means we're full; I suggest what you do is use that flag to mean that we
> > send another 64bit word; and in that word you use the bottom 7 bits for
> > the fd index and set bit 7 to indicate it's an fd. The other bits are sent
> > as zero and available for the next use.
> > Either that or start combining with some other flags.
> > (I may have a use for some more bits in mind!)
>
> Ok. I can look at that.
>
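To be concrete about what I meant in (h), something like this (rough
sketch; the constant and function names are invented):

    #define RAM_EXT_FD_MASK  0x7fULL      /* bits 0-6: fd index             */
    #define RAM_EXT_IS_FD    (1ULL << 7)  /* bit 7: this word carries an fd */
                                          /* bits 8-63: spare for later use */

    /* The last RAM_SAVE_FLAG_* bit would just mean "a 64bit extension
     * word follows the page header". */
    static void send_fd_extension(QEMUFile *f, unsigned int fd_index)
    {
        qemu_put_be64(f, RAM_EXT_IS_FD | (fd_index & RAM_EXT_FD_MASK));
    }

    /* Returns the fd index, or -1 if the word didn't carry one. */
    static int recv_fd_extension(QEMUFile *f)
    {
        uint64_t word = qemu_get_be64(f);

        return (word & RAM_EXT_IS_FD) ? (int)(word & RAM_EXT_FD_MASK) : -1;
    }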
> > i) Is this safe for xbzrle - what happens to the cache (or is it all
> > still the main thread?)
>
> Nope. Only way to use xbzrle is:
>
> if (zero(page)) {
>     ...
> } else if (xbzrle(page)) {
>     ...
> } else {
>     multifd(page)
> }
>
> Otherwise we would have to make xbzrle multithreaded, or split memory
> between fd's. The problem with splitting memory between fd's is that we
> need to know where the hot spots are.
OK, that makes sense. So does that mean that some pages can get xbzrle sent?
> > j) For postcopy I could do with a separate fd for the requested pages
> > (but again that comes back to needing an easy solution to the ordering)
>
> The ordering is easy, as I said. We can just use that command with each
> postcopy-requested page. Or something similar, no?
>
> I think that just forgetting about those pages and, each time we
> receive a requested page, first waiting for the main thread to finish
> its pages should be enough, no?
Actually, I realised it's simpler; once we're in postcopy mode we never
send the same page again; so we never have any ordering problems as long
as we perform a sync across the fd's at postcopy entry.
Dave
>
> > Dave
>
> Thanks very much, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK