From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:35681)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1URPSz-0003nB-Q9 for qemu-devel@nongnu.org;
	Sun, 14 Apr 2013 12:07:49 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1URPSx-0008Uu-7q for qemu-devel@nongnu.org;
	Sun, 14 Apr 2013 12:07:45 -0400
Received: from e39.co.us.ibm.com ([32.97.110.160]:41856)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1URPSx-0008Ug-1i for qemu-devel@nongnu.org;
	Sun, 14 Apr 2013 12:07:43 -0400
Received: from /spool/local by e39.co.us.ibm.com with IBM ESMTP SMTP Gateway:
	Authorized Use Only! Violators will be prosecuted for from ;
	Sun, 14 Apr 2013 10:07:42 -0600
Received: from d01relay05.pok.ibm.com (d01relay05.pok.ibm.com [9.56.227.237])
	by d01dlp01.pok.ibm.com (Postfix) with ESMTP id 33B8838C8047 for ;
	Sun, 14 Apr 2013 12:07:39 -0400 (EDT)
Received: from d01av05.pok.ibm.com (d01av05.pok.ibm.com [9.56.224.195])
	by d01relay05.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP
	id r3EG7d0c333324 for ; Sun, 14 Apr 2013 12:07:39 -0400
Received: from d01av05.pok.ibm.com (loopback [127.0.0.1])
	by d01av05.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP
	id r3EG7d3x011945 for ; Sun, 14 Apr 2013 12:07:39 -0400
Message-ID: <516AD44A.4040902@linux.vnet.ibm.com>
Date: Sun, 14 Apr 2013 12:07:38 -0400
From: "Michael R. Hines"
MIME-Version: 1.0
References: <20130411145632.GA2280@redhat.com>
	<5166F7AE.8070209@linux.vnet.ibm.com>
	<20130411191533.GA25515@redhat.com>
	<51671DFF.80904@linux.vnet.ibm.com>
	<20130412104802.GA23467@redhat.com>
	<5167E797.2050103@redhat.com>
	<20130412112553.GB23467@redhat.com>
	<51681DAA.3000503@redhat.com>
	<20130414115911.GA4923@redhat.com>
	<516ABCCC.207@linux.vnet.ibm.com>
	<20130414160327.GB7165@redhat.com>
In-Reply-To: <20130414160327.GB7165@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v5: 03/12] comprehensive protocol documentation
To: "Michael S. Tsirkin"
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, owasserm@redhat.com,
	abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com, Paolo Bonzini

On 04/14/2013 12:03 PM, Michael S. Tsirkin wrote:
> On Sun, Apr 14, 2013 at 10:27:24AM -0400, Michael R. Hines wrote:
>> On 04/14/2013 07:59 AM, Michael S. Tsirkin wrote:
>>> On Fri, Apr 12, 2013 at 04:43:54PM +0200, Paolo Bonzini wrote:
>>>> On 12/04/2013 13:25, Michael S. Tsirkin wrote:
>>>>> On Fri, Apr 12, 2013 at 12:53:11PM +0200, Paolo Bonzini wrote:
>>>>>> On 12/04/2013 12:48, Michael S. Tsirkin wrote:
>>>>>>> 1. You have two protocols already and this does not make sense in
>>>>>>> version 1 of the patch.
>>>>>> It makes sense if we consider it experimental (add x- in front of
>>>>>> transport and capability) and would like people to play with it.
>>>>>>
>>>>>> Paolo
>>>>> But it's not testable yet. I see problems just reading the
>>>>> documentation. Author thinks "ulimit -l 10000000000" on both source and
>>>>> destination is just fine. This can easily crash host or cause OOM
>>>>> killer to kill QEMU. So why is there any need for extra testers? Fix
>>>>> the major bugs first.
>>>>>
>>>>> There's a similar issue with device assignment - we can't fix it there,
>>>>> and despite being available for years, this was one of two reasons that
>>>>> has kept this feature out of hands of lots of users (and assuming guest
>>>>> has lots of zero pages won't work: balloon is not widely used either
>>>>> since it depends on a well-behaved guest to work correctly).
>>>> I agree assuming guest has lots of zero pages won't work, but I think
>>>> you are overstating the importance of overcommit. Let's mark the damn
>>>> thing as experimental, and stop making perfect the enemy of good.
>>>>
>>>> Paolo
>>> It looks like we have to decide, before merging, whether migration with
>>> rdma that breaks overcommit is worth it or not. Since the author made
>>> it very clear he does not intend to make it work with overcommit, ever.
>>>
>> That depends entirely on what you define as overcommit.
> You don't get to define your own terms. Look it up in Wikipedia or
> something.
>
>> The pages do get unregistered at the end of the migration =)
>>
>> - Michael
> The limitations are pretty clear, and you really should document them:
>
> 1. run qemu as root, or under ulimit -l on both source and
> destination
>
> 2. expect that as much as that amount of memory is pinned
> and unavailable to host kernel and applications for
> arbitrarily long time.
> Make sure you have much more RAM in host or QEMU will get killed.
>
> To me, especially 1 is an unacceptable security tradeoff.
> It is entirely fixable but we both have other priorities,
> so it'll stay broken.
>

Agreed, the documentation should be clear. So, if you define that
scenario as broken, then yes, it's broken.

- Michael
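
For anyone documenting limitation 1 above, a minimal sketch of how the
locked-memory limit could be checked before starting a migration might
look like the following. This is only an illustration in Python, not
code from the patch series: the helper names `can_pin` and
`current_memlock_limit` and the 4 GiB guest size are hypothetical. It
reads the same limit that "ulimit -l" sets (RLIMIT_MEMLOCK), which
getrlimit reports in bytes, and it deliberately ignores the fact that a
process with CAP_IPC_LOCK (e.g. root) is not bound by this limit.

```python
import resource


def can_pin(bytes_needed, soft_limit):
    """Return True if a soft RLIMIT_MEMLOCK of soft_limit bytes would
    allow locking bytes_needed bytes (privileges are not considered)."""
    # RLIM_INFINITY is a sentinel value, so test for it before comparing.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return bytes_needed <= soft_limit


def current_memlock_limit():
    """Return this process's soft locked-memory limit in bytes."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return soft


if __name__ == "__main__":
    guest_ram = 4 * 1024**3  # hypothetical 4 GiB guest, for illustration
    soft = current_memlock_limit()
    shown = "unlimited" if soft == resource.RLIM_INFINITY else f"{soft} bytes"
    print("soft RLIMIT_MEMLOCK:", shown)
    print("could pin whole guest:", can_pin(guest_ram, soft))
```

A check like this only answers whether pinning is permitted. It says
nothing about limitation 2, i.e. whether the host actually has enough
free RAM to survive having that much memory pinned.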