From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:60260)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1URjOf-0003JL-Dn for qemu-devel@nongnu.org;
 Mon, 15 Apr 2013 09:24:40 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1URjOW-0003FG-9u for qemu-devel@nongnu.org;
 Mon, 15 Apr 2013 09:24:37 -0400
Received: from e7.ny.us.ibm.com ([32.97.182.137]:33671)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1URjOW-0003FC-5J for qemu-devel@nongnu.org;
 Mon, 15 Apr 2013 09:24:28 -0400
Received: from /spool/local by e7.ny.us.ibm.com
 with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be
 prosecuted for from ; Mon, 15 Apr 2013 09:24:27 -0400
Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236])
 by d01dlp03.pok.ibm.com (Postfix) with ESMTP id 6F9A3C90025 for ;
 Mon, 15 Apr 2013 09:24:24 -0400 (EDT)
Received: from d03av06.boulder.ibm.com (d03av06.boulder.ibm.com [9.17.195.245])
 by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP
 id r3FDOMBl273432 for ; Mon, 15 Apr 2013 09:24:23 -0400
Received: from d03av06.boulder.ibm.com (loopback [127.0.0.1])
 by d03av06.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP
 id r3FDRAY9030916 for ; Mon, 15 Apr 2013 07:27:10 -0600
Message-ID: <516BFF83.40505@linux.vnet.ibm.com>
Date: Mon, 15 Apr 2013 09:24:19 -0400
From: "Michael R.
 Hines"
MIME-Version: 1.0
References: <20130411145632.GA2280@redhat.com>
 <5166F7AE.8070209@linux.vnet.ibm.com>
 <20130411191533.GA25515@redhat.com>
 <51671DFF.80904@linux.vnet.ibm.com>
 <20130412104802.GA23467@redhat.com>
 <5168105C.5040605@linux.vnet.ibm.com>
 <20130414082827.GA1548@redhat.com>
 <516ABDB8.1090100@linux.vnet.ibm.com>
 <20130414185116.GE7165@redhat.com>
 <516B06E0.9040804@linux.vnet.ibm.com>
 <20130414211647.GG7165@redhat.com>
 <516B538C.5060008@linux.vnet.ibm.com>
 <516BBB99.5040302@redhat.com>
In-Reply-To: <516BBB99.5040302@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v5: 03/12]
 comprehensive protocol documentation
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Paolo Bonzini
Cc: aliguori@us.ibm.com, "Michael S. Tsirkin" , qemu-devel@nongnu.org,
 owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com,
 gokul@us.ibm.com

On 04/15/2013 04:34 AM, Paolo Bonzini wrote:
> On 15/04/2013 03:10, Michael R. Hines wrote:
>>> And when someone writes them one day, we'll have to carry the old code
>>> around for interoperability as well. Not pretty. To avoid that, you
>>> need to explicitly say in the documentation that it's experimental and
>>> unsupported.
>>>
>> That's what protocols are for.
>>
>> As I've already said, I've incorporated this into the design of the
>> protocol already.
>>
>> The protocol already has a field called "repeat" which allows a user to
>> request multiple chunk registrations at the same time.
>>
>> If you insist, I can add a capability / command to the protocol called
>> "unregister chunk", but I'm not volunteering to implement that command
>> as I don't have any data showing it to be of any value.
> Implementing it on the destination side would be of value because it
> would make the implementation interoperable.
>
> A very basic implementation would be "during the bulk phase, unregister
> the previous chunk every time you register a chunk". It would work
> great when migrating an idle guest, for example. It would probably be
> faster than TCP (which is now at 4.2 Gbps).
>
> On one hand this should not block merging the patches; on the other
> hand, "agreeing to disagree" without having done any test is not very
> fruitful. You can disagree on the priorities (and I agree with you on
> this), but what mst is proposing is absolutely reasonable.
>
> Paolo

Ok, I think I understand the disconnect here. Let's continue with the
example you described above, and let me ask another question.

Let's say that idle VM is chosen, for whatever reason, *not* to use TCP
migration and to use RDMA instead. (In the current docs I recommend
against choosing RDMA, but let's stick with this example for the sake
of argument.)

Now let's say the migration starts up, the hypervisor runs out of
physical memory, and it starts swapping during the migration (also for
the sake of argument).

The next thing to happen would be the next IB verbs function call,
"ibv_reg_mr()". That call would probably fail, because there is nothing
left to pin, and it would return an error.

So my question is: is it not sufficient to send a message back to the
primary-VM side of the connection which says "Your migration cannot
proceed anymore, please resume the VM and try again somewhere else"?

In this case, both the system administrator and the virtual machine are
safe: nothing has been killed, nothing has crashed, and the management
software can proceed to make a new management decision.

Is there something wrong with this sequence of events?

- Michael
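
The sequence of events discussed above can be sketched as a small
simulation. This is only an illustration under stated assumptions, not
the code from the patch series: the names Destination, Source, migrate,
pin_limit and RegistrationError are all hypothetical, and the pinning
limit is a stand-in for a memory-registration call failing when nothing
more can be pinned. The unregister_previous flag models the "very basic
implementation" suggested in the quoted mail.

```python
from dataclasses import dataclass, field


class RegistrationError(Exception):
    """Hypothetical stand-in for a failed memory-registration call."""


@dataclass
class Destination:
    pin_limit: int                     # assumed: max chunks pinnable at once
    unregister_previous: bool = False  # unpin the previous chunk on each register
    pinned: list = field(default_factory=list)

    def register_chunk(self, chunk: int) -> None:
        if self.unregister_previous and self.pinned:
            self.pinned.pop()          # release the previously pinned chunk
        if len(self.pinned) >= self.pin_limit:
            # Nothing left to pin: the registration call reports an error.
            raise RegistrationError(f"cannot pin chunk {chunk}")
        self.pinned.append(chunk)


@dataclass
class Source:
    vm_resumed: bool = False
    messages: list = field(default_factory=list)

    def on_migration_error(self, message: str) -> None:
        # The error travels back to the source side, which resumes the VM.
        self.messages.append(message)
        self.vm_resumed = True


def migrate(src: Source, dst: Destination, num_chunks: int) -> str:
    """Register chunks one by one; abort cleanly on the first failure."""
    for chunk in range(num_chunks):
        try:
            dst.register_chunk(chunk)
        except RegistrationError as err:
            src.on_migration_error(f"migration cannot proceed: {err}")
            return "aborted"
    return "completed"


if __name__ == "__main__":
    # Pinning every chunk hits the limit; the source resumes the VM.
    src, dst = Source(), Destination(pin_limit=2)
    print(migrate(src, dst, 5), src.vm_resumed, src.messages)

    # Unregistering the previous chunk keeps at most one chunk pinned.
    src, dst = Source(), Destination(pin_limit=1, unregister_previous=True)
    print(migrate(src, dst, 5), src.vm_resumed, dst.pinned)
```

In the first run the failure is reported back and the VM is resumed,
which is the outcome the question above asks about. In the second run
the same kind of limit never triggers, because only one chunk stays
pinned at any time.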