Date: Tue, 14 Mar 2017 12:34:20 +0000
From: "Daniel P. Berrange" <berrange@redhat.com>
To: "Dr. David Alan Gilbert"
Cc: Juan Quintela <quintela@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 00/16] Multifd v4
Message-ID: <20170314123420.GN2652@redhat.com>
In-Reply-To: <20170314122222.GH2445@work-vm>
References: <20170313124434.1043-1-quintela@redhat.com>
 <20170314102142.GC2445@work-vm> <20170314114704.GJ2652@redhat.com>
 <20170314122222.GH2445@work-vm>

On Tue, Mar 14, 2017 at 12:22:23PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Tue, Mar 14, 2017 at 10:21:43AM +0000, Dr. David Alan Gilbert wrote:
> > > * Juan Quintela (quintela@redhat.com) wrote:
> > > > Hi
> > > >
> > > > This is the 4th version of multifd. Changes:
> > > > - XBZRLE doesn't need to be checked for
> > > > - Documentation and defaults are consistent
> > > > - split socketArgs
> > > > - use iovec instead of creating something similar
> > > > - We now use the exported size of the target page (another HACK removal)
> > > > - created qio_channel_{writev,readv}_all functions; the _full() name
> > > >   was already taken. They do the same as the functions without _all(),
> > > >   but if the call returns because it would block, they redo the call.
> > > > - it is checkpatch.pl clean now.
> > > >
> > > > Please comment, Juan.
> > >
> > > High level things:
> > > a) I think you probably need to do some bandwidth measurements to show
> > >    that multifd is managing to have some benefit - it would be good
> > >    for the cover letter.
> >
> > Presumably this would be a building block to solving the latency problems
> > with post-copy, by reserving one channel for transferring out-of-band
> > pages required by target host page faults.
>
> Right, it's on my list to look at; there are some interesting questions
> about the way in which the main fd carrying the headers interacts, and
> also about what happens to the pages immediately after the requested page.
> For example, let's say we're currently streaming at address 'S' and a
> postcopy request (P) comes in; what we currently have on one FD is:
>
>   S, S+1, ..., S+n, P, P+1, P+2, ..., P+n
>
> Note that when a request comes in we flip location, so we start sending
> background pages from P+1 on the assumption that they'll be wanted soon.
>
> With 3 FDs this would go initially as:
>
>   S    S+3  P+1  P+4
>   S+1  S+4  P+2  ..
>   S+2  P    P+3  ..
> Now if we had a spare FD for postcopy we'd do:
>
>   S    S+3  P+1  P+4
>   S+1  S+4  P+2  ..
>   S+2  S+5  P+3  ..
>   -    P    -    -
>
> So 'P' got there quickly - but P+1 is stuck behind the S's; is that what
> we want? An interesting alternative would be to switch which fd we keep
> free:
>
>   S    S+3  -    -    -
>   S+1  S+4  P+2  P+4
>   S+2  S+5  P+3  P+5
>   -    P    P+1  P+6
>
> So depending on your buffering P+1 might also now be pretty fast; but
> that's starting to get into heuristics about guessing how much you should
> put on your previously low-queued fd.

Ah, I see, so you're essentially trying to do read-ahead when post-copy
faults. It becomes even more fun when you have multiple page faults coming
in (quite likely with multi-vCPU guests): you get P, Q, R, S arriving, all
of which want servicing quickly. So if you queue up too many P+n pages for
read-ahead, you'd delay Q, R and S:

  S    S+3  -    -    -
  S+1  S+4  P+2  P+4  Q    R    ...
  S+2  S+5  P+3  P+5  Q+1  R+1  ...
  -    P    P+1  P+6  Q+2  ...  ...

This tends to argue for overcommitting threads vs CPUs, e.g. even if QEMU
is confined to only use 2 host CPUs, it would be worth having 4 migration
threads. They would contend for CPU time for AES encryption, but you would
reduce the chance of getting stuck behind large send buffers.

Regards,
Daniel

-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
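
As a rough illustration of the channel-selection policy being debated above
(a sketch only, not code from the series; MultiFDSched, pick_channel and
NUM_CHANNELS are hypothetical names), keeping one channel lightly loaded for
out-of-band postcopy requests while background pages are round-robined over
the remaining channels could look like this:

/*
 * Hypothetical sketch: one channel is kept reserved (almost empty) for
 * out-of-band postcopy requests, and background pages are spread
 * round-robin over the other channels, so an urgent page is never
 * queued behind a long run of background pages.
 */
#include <stdbool.h>
#include <stddef.h>

enum { NUM_CHANNELS = 4 };      /* e.g. 3 background channels + 1 reserved */

typedef struct {
    size_t next_background;     /* round-robin cursor over the channels */
    size_t reserved;            /* index of the channel kept almost empty */
} MultiFDSched;

static size_t pick_channel(MultiFDSched *s, bool urgent)
{
    size_t ch;

    if (urgent) {
        /* Postcopy fault: send on the reserved channel straight away. */
        return s->reserved;
    }

    /* Background page: round-robin, skipping the reserved channel. */
    do {
        ch = s->next_background;
        s->next_background = (s->next_background + 1) % NUM_CHANNELS;
    } while (ch == s->reserved);
    return ch;
}

The refinement of switching which fd is kept free would then amount to
moving s->reserved to whichever channel currently has the shortest send
queue once the urgent page has gone out - which is exactly where the
guessing about send-buffer depth comes in.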
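
Similarly, the _all() behaviour mentioned in the cover letter ("if it
returns due to blocking, redo the call") boils down to a loop of roughly
the following shape. This is a sketch of the semantics only, assuming the
existing QIOChannel helpers qio_channel_writev(), qio_channel_wait() and
QIO_CHANNEL_ERR_BLOCK; it is not the code added by the series, and the
error handling is simplified:

#include "qemu/osdep.h"
#include "io/channel.h"

/*
 * Sketch of a "writev_all": keep calling the plain writev until every
 * byte of the iovec has been sent, waiting whenever the channel would
 * block instead of returning early.
 */
static int writev_all_sketch(QIOChannel *ioc, struct iovec *iov,
                             size_t niov, Error **errp)
{
    while (niov > 0) {
        ssize_t len = qio_channel_writev(ioc, iov, niov, errp);

        if (len == QIO_CHANNEL_ERR_BLOCK) {
            qio_channel_wait(ioc, G_IO_OUT);  /* wait until writable, retry */
            continue;
        }
        if (len < 0) {
            return -1;                        /* real error, reported via errp */
        }

        /* Advance the iovec past the bytes that were written. */
        while (len > 0 && niov > 0) {
            if ((size_t)len >= iov->iov_len) {
                len -= iov->iov_len;
                iov++;
                niov--;
            } else {
                iov->iov_base = (char *)iov->iov_base + len;
                iov->iov_len -= len;
                len = 0;
            }
        }
    }
    return 0;
}

The readv variant of the sketch would be symmetric, waiting on G_IO_IN
instead of G_IO_OUT.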