From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:44766)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1UQK6G-0001mj-Ly for qemu-devel@nongnu.org;
	Thu, 11 Apr 2013 12:11:51 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1UQK6B-00046y-Sr for qemu-devel@nongnu.org;
	Thu, 11 Apr 2013 12:11:48 -0400
Received: from e8.ny.us.ibm.com ([32.97.182.138]:46096)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1UQK6B-00046k-NK for qemu-devel@nongnu.org;
	Thu, 11 Apr 2013 12:11:43 -0400
Received: from /spool/local by e8.ny.us.ibm.com
	with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
	for from ; Thu, 11 Apr 2013 12:11:43 -0400
Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234])
	by d01dlp01.pok.ibm.com (Postfix) with ESMTP id 2DD8C38C81FF for ;
	Thu, 11 Apr 2013 12:09:51 -0400 (EDT)
Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217])
	by d01relay02.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id r3BG9ohh271720 for ;
	Thu, 11 Apr 2013 12:09:50 -0400
Received: from d01av03.pok.ibm.com (loopback [127.0.0.1])
	by d01av03.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id r3BG9nsU029344 for ;
	Thu, 11 Apr 2013 13:09:49 -0300
Message-ID: <5166E048.4090008@linux.vnet.ibm.com>
Date: Thu, 11 Apr 2013 12:09:44 -0400
From: "Michael R. Hines"
MIME-Version: 1.0
References: <20130410133448.GA18128@redhat.com>
	<51658554.2000909@linux.vnet.ibm.com>
	<20130410174107.GB32247@redhat.com>
	<5165C60E.20006@linux.vnet.ibm.com>
	<20130411071927.GA17063@redhat.com>
	<5166B6B1.2030003@linux.vnet.ibm.com>
	<20130411134820.GA24942@redhat.com>
	<5166C19A.1040402@linux.vnet.ibm.com>
	<20130411143718.GC24942@redhat.com>
	<5166D460.2070106@linux.vnet.ibm.com>
	<20130411154424.GB22779@redhat.com>
In-Reply-To: <20130411154424.GB22779@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v5: 03/12] comprehensive protocol documentation
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Michael S. Tsirkin"
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, owasserm@redhat.com,
	abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com,
	pbonzini@redhat.com

On 04/11/2013 11:44 AM, Michael S. Tsirkin wrote:
> On Thu, Apr 11, 2013 at 11:18:56AM -0400, Michael R. Hines wrote:
>> First of all,
> I know it's a hard habit to break but could you
> please stop top-posting?

Acknowledged.

>
>> this whole argument should not even exist for the
>> following reason:
>>
>> Page registrations are supposed to be *rare* - once a page is registered, it
>> is registered for life. There is nothing in the design that says a page must
>> be "unregistered" and I do not believe anybody is proposing that.
> Hmm proposing what? Of course you need to unregister pages
> eventually otherwise your pinned memory on the destination
> will just grow indefinitely. People are often doing
> registration caches to help reduce the overhead,
> but never unregistering seems too aggressive.
>
> You mean the chunk-based thing just delays the agony
> until all guest memory is pinned for RDMA anyway?
> Wait, is it registered for life on the source too?
>
> Well this kind of explains why qemu was dying on OOM,
> doesn't it?

Yes, that's correct. The agony is just delayed. The right thing to do
in a future patch would be to pin as much as possible in advance,
before the bulk phase round even begins (using the pagemap).

In the meantime, chunk registration performance is still very good,
so long as total migration time is not the metric you are optimizing for.

>> Second, this means that my previous analysis showing that
>> performance was reduced
>> was also incorrect because most of the RDMA transfers were against
>> pages during
>> the bulk phase round, which incorrectly makes dynamic page
>> registration look bad.
>> I should have done more testing *after* the bulk phase round,
>> and I apologize for not doing that.
>>
>> Indeed when I do such a test (with the 'stress' command) the cost of
>> page registration disappears
>> because most of the registrations have already completed a long time ago.
>>
>> Thanks, Paolo for reminding us about the bulk-phase behavior to begin with.
>>
>> Third, this means that optimizing this protocol would not be helpful
>> and that we should
>> follow the "keep it simple" approach because during steady-state
>> phase of the migration
>> most of the pages should have already been registered.
>>
>> - Michael
> If you mean that registering all memory is a requirement,
> then I am not sure I agree: you wrote one slow protocol, this
> does not mean that there can't be a fast one.
>
> But if you mean to say that the current chunk based code
> is useless, then I'd have to agree.

Answer above.
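
To make the bulk-phase point concrete, here is a minimal toy model in
Python. It is only a sketch, not the QEMU code: the 1 MiB chunk size, the
ChunkRegistry class, and the simulate() helper are all hypothetical names
and values chosen for illustration, and registration is modeled as a
counter rather than a real pin-and-register call. It assumes a chunk is
registered the first time one of its pages is sent and is never
unregistered.

```python
import random

PAGE_SIZE = 4096
CHUNK_SIZE = 1 << 20                      # hypothetical 1 MiB chunk
PAGES_PER_CHUNK = CHUNK_SIZE // PAGE_SIZE


class ChunkRegistry:
    """Toy model: register a chunk on first use, never unregister it."""

    def __init__(self):
        self.registered = set()           # chunk indexes currently pinned
        self.registrations = 0            # count of (expensive) registrations

    def send_page(self, page):
        chunk = page // PAGES_PER_CHUNK
        if chunk not in self.registered:
            # Stand-in for the real pin + register step.
            self.registered.add(chunk)
            self.registrations += 1
        # The RDMA write of the page itself would happen here.

    def pinned_bytes(self):
        return len(self.registered) * CHUNK_SIZE


def simulate(guest_pages, dirty_rounds, dirty_per_round, seed=0):
    """Return (bulk registrations, steady-state registrations, pinned bytes)."""
    rng = random.Random(seed)
    reg = ChunkRegistry()

    # Bulk phase round: every guest page is sent once.
    for page in range(guest_pages):
        reg.send_page(page)
    bulk = reg.registrations

    # Steady state: only re-dirtied pages are sent again.
    for _ in range(dirty_rounds):
        for _ in range(dirty_per_round):
            reg.send_page(rng.randrange(guest_pages))
    steady = reg.registrations - bulk

    return bulk, steady, reg.pinned_bytes()


if __name__ == "__main__":
    print(simulate(guest_pages=4096, dirty_rounds=10, dirty_per_round=1000))
```

In this model every registration lands in the bulk phase round, the
steady-state rounds perform none, and the pinned total ends up equal to
the whole (toy) guest. That is the "agony is just delayed" behavior
discussed above: the cost is hidden after the bulk round, but the pinned
memory never shrinks unless something unregisters chunks.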