From: "Michael R. Hines"
Date: Sun, 14 Apr 2013 21:10:36 -0400
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v5: 03/12] comprehensive protocol documentation
To: "Michael S. Tsirkin"
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com, Paolo Bonzini
Message-ID: <516B538C.5060008@linux.vnet.ibm.com>
In-Reply-To: <20130414211647.GG7165@redhat.com>
References: <20130411145632.GA2280@redhat.com> <5166F7AE.8070209@linux.vnet.ibm.com>
 <20130411191533.GA25515@redhat.com> <51671DFF.80904@linux.vnet.ibm.com>
 <20130412104802.GA23467@redhat.com> <5168105C.5040605@linux.vnet.ibm.com>
 <20130414082827.GA1548@redhat.com> <516ABDB8.1090100@linux.vnet.ibm.com>
 <20130414185116.GE7165@redhat.com> <516B06E0.9040804@linux.vnet.ibm.com>
 <20130414211647.GG7165@redhat.com>

On 04/14/2013 05:16 PM, Michael S. Tsirkin wrote:
> On Sun, Apr 14, 2013 at 03:43:28PM -0400, Michael R. Hines wrote:
>> On 04/14/2013 02:51 PM, Michael S. Tsirkin wrote:
>>> On Sun, Apr 14, 2013 at 10:31:20AM -0400, Michael R. Hines wrote:
>>>> On 04/14/2013 04:28 AM, Michael S. Tsirkin wrote:
>>>>> On Fri, Apr 12, 2013 at 09:47:08AM -0400, Michael R. Hines wrote:
>>>>>> Second, as I've explained, I strongly, strongly disagree with unregistering
>>>>>> memory for all of the aforementioned reasons - workloads do not
>>>>>> operate in such a manner that they can tolerate memory to be
>>>>>> pulled out from underneath them at such fine-grained time scales
>>>>>> in the *middle* of a relocation and I will not commit to writing a solution
>>>>>> for a problem that doesn't exist.
>>>>> Exactly same thing happens with swap, doesn't it?
>>>>> You are saying workloads simply can not tolerate swap.
>>>>>
>>>>>> If you can prove (through some kind of analysis) that workloads
>>>>>> would benefit from this kind of fine-grained memory overcommit
>>>>>> by having cgroups swap out memory to disk underneath them
>>>>>> without their permission, I would happily reconsider my position.
>>>>>>
>>>>>> - Michael
>>>>> This has nothing to do with cgroups directly, it's just a way to
>>>>> demonstrate you have a bug.
>>>>>
>>>> If your datacenter or your cloud or your product does not want to
>>>> tolerate page registration, then don't use RDMA!
>>>>
>>>> The bottom line is: RDMA is useless without page registration. Without
>>>> it, the performance of it will be crippled. If you define that as a bug,
>>>> then so be it.
>>>>
>>>> - Michael
>>> No one cares if you do page registration or not. ulimit -l 10g is the
>>> problem. You should limit the amount of locked memory.
>>> Lots of good research went into making RDMA go fast with limited locked
>>> memory, with some success. Search for "registration cache" for example.
>>>
>> Patches using such a cache would be welcome.
>>
>> - Michael
>>
> And when someone writes them one day, we'll have to carry the old code
> around for interoperability as well. Not pretty. To avoid that, you
> need to explicitly say in the documentation that it's experimental and
> unsupported.
>

That's what protocols are for.

As I've already said, I've incorporated this into the design of the
protocol already.

The protocol already has a field called "repeat" which allows a user to
request multiple chunk registrations at the same time.

If you insist, I can add a capability / command to the protocol called
"unregister chunk", but I'm not volunteering to implement that command
as I don't have any data showing it to be of any value.

That would insulate the protocol against any such future "registration
cache" design.

- Michael
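
To make the "repeat" field and the optional "unregister chunk" command concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions, not the wire format from the patch series: the header layout (length, type, repeat as 32-bit integers), the command numbers, the 64-bit chunk index payload, and the CAP_UNREGISTER capability bit are all hypothetical names invented for this example.

```python
import struct
from enum import IntEnum


class Command(IntEnum):
    # Hypothetical command numbers, chosen for illustration only.
    REGISTER_REQUEST = 1
    UNREGISTER_REQUEST = 2  # the optional "unregister chunk" command


# Hypothetical capability bit advertising support for UNREGISTER_REQUEST.
CAP_UNREGISTER = 0x01

# Assumed header layout: payload length, command type, repeat count,
# each an unsigned 32-bit integer in network byte order.
HEADER = struct.Struct("!III")
# Assumed per-chunk payload: one unsigned 64-bit chunk index.
CHUNK = struct.Struct("!Q")


def pack_request(command, chunks):
    """Batch several chunk requests into one message via 'repeat'."""
    payload = b"".join(CHUNK.pack(c) for c in chunks)
    return HEADER.pack(len(payload), command, len(chunks)) + payload


def unpack_request(message):
    """Return (command, [chunk indices]) from a packed message."""
    length, command, repeat = HEADER.unpack_from(message, 0)
    if length != repeat * CHUNK.size or len(message) != HEADER.size + length:
        raise ValueError("malformed request")
    chunks = [
        CHUNK.unpack_from(message, HEADER.size + i * CHUNK.size)[0]
        for i in range(repeat)
    ]
    return Command(command), chunks


def handle(message, negotiated_caps):
    """Refuse commands that were never negotiated with the peer."""
    command, chunks = unpack_request(message)
    if command == Command.UNREGISTER_REQUEST and not (
        negotiated_caps & CAP_UNREGISTER
    ):
        raise ValueError("peer did not negotiate unregister support")
    return command, chunks
```

With a layout like this, a single message carries any number of chunk requests, and a peer that never advertises the capability bit simply never sees the new command, which is the kind of insulation against a later "registration cache" design described above.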