From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <51CAE086.506@linux.vnet.ibm.com>
Date: Wed, 26 Jun 2013 08:37:26 -0400
From: "Michael R. Hines"
MIME-Version: 1.0
References: <1372125485-11795-1-git-send-email-mrhines@linux.vnet.ibm.com>
 <1372125485-11795-15-git-send-email-mrhines@linux.vnet.ibm.com>
 <8761x21pvx.fsf@elfo.elfo> <51C96D39.2020603@redhat.com>
 <51C99EC6.2050008@linux.vnet.ibm.com> <51C9A0D8.6050800@redhat.com>
 <51C9AF15.8000404@linux.vnet.ibm.com> <51C9AF56.9030800@redhat.com>
 <51CA03FF.1000806@linux.vnet.ibm.com> <51CA0640.1040507@redhat.com>
 <51CA3646.8040409@linux.vnet.ibm.com> <51CA8C15.6040501@redhat.com>
In-Reply-To: <51CA8C15.6040501@redhat.com>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v11 14/15] rdma: introduce MIG_STATE_NONE and change MIG_STATE_SETUP state transition
To: Paolo Bonzini
Cc: aliguori@us.ibm.com, quintela@redhat.com, qemu-devel@nongnu.org, owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com, chegu_vinod@hp.com, knoel@redhat.com

On 06/26/2013 02:37 AM, Paolo Bonzini wrote:
> On 26/06/2013 02:31, Michael R. Hines wrote:
>> On 06/25/2013 05:06 PM, Paolo Bonzini wrote:
>>> On 25/06/2013 22:56, Michael R. Hines wrote:
>>>> I was wrong - this does require a protocol extension.
>>>>
>>>> This is because the RDMA transfers are asynchronous, and thus
>>>> we cannot know in advance that it is safe to unregister the memory
>>>> associated with each individual transfer before the transfer actually
>>>> completes.
>>>>
>>>> While the destination currently uses the protocol to participate in
>>>> *registering* the page, the destination does not participate in the
>>>> RDMA transfers themselves, only the source does, and thus would
>>>> require a new exchange of messages to block and instruct the
>>>> destination to unpin the memory.
>>> Yes, that's what I recalled too (really what mst told me :)). Does it
>>> need to be blocking though? As long as the pinning is blocking, and
>>> messages are processed in order, the source can proceed immediately
>>> after sending an unpin message. This assumes of course that the chunk
>>> is not being transmitted, and I am not sure how easy the source can
>>> determine that.
>> No, they're not processed in order. In fact, not only does the device
>> write out of order, but also the PCI bus writes out of order.
>> This was such a problem in fact, that I fixed several bugs as a result
>> a few weeks ago (v7 of the patch with an in-depth description).
>>
>> The destination simply cannot assume whatsoever what the ordering
>> of the writes are - that's really the whole point of using RDMA in the
>> first place so that the software can get out of the way of the transfer
>> process to lower the latency of each transfer.
> The memory is processed out of order, but what about the messages?
> Those must be in order.
>
> Note that I wrote above "This assumes of course that the chunk is not
> being transmitted". Can the source know when an asynchronous transfer
> finished, and delay the unpinning until that time?
>
> Paolo

Yes, the source does know. There's no problem unpinning on the source.

But both sides must do the unpinning, not just the source.

Did I misunderstand you? Are you suggesting *only* unpinning on the source?

- Michael
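
For illustration only, the scheme under discussion can be modelled as a toy simulation. This is a sketch under assumptions, not code from the patch series and not the libibverbs API: the `Source` and `Destination` classes, the `"UNPIN"` message and the per-chunk counters are all hypothetical names. It assumes that data writes may complete in any order, that control messages are handled in the order they were sent, and that the source learns about each completed write. Under those assumptions the source can hold back the unpin message for a chunk until no write to it is outstanding, and then send it without blocking, while the destination still does its own unpinning when the message arrives.

```python
from collections import deque


class Source:
    """Toy model of the migration source (hypothetical, not QEMU code)."""

    def __init__(self):
        self.in_flight = {}      # chunk id -> writes posted but not yet completed
        self.want_unpin = set()  # chunks to unpin as soon as they are idle
        self.control = deque()   # ordered control channel to the destination

    def post_write(self, chunk):
        # An asynchronous write was posted for this chunk.
        self.in_flight[chunk] = self.in_flight.get(chunk, 0) + 1

    def request_unpin(self, chunk):
        # Remember the request; it is only acted on once the chunk is idle.
        self.want_unpin.add(chunk)
        self._flush(chunk)

    def on_completion(self, chunk):
        # A write completed; completions may arrive in any order.
        self.in_flight[chunk] -= 1
        self._flush(chunk)

    def _flush(self, chunk):
        if chunk in self.want_unpin and self.in_flight.get(chunk, 0) == 0:
            self.want_unpin.discard(chunk)
            # Send the message and carry on; the source does not wait for a reply.
            self.control.append(("UNPIN", chunk))


class Destination:
    """Toy model of the destination, which must also unpin its side."""

    def __init__(self, pinned):
        self.pinned = set(pinned)

    def drain(self, control):
        # Control messages are handled strictly in the order they were sent.
        while control:
            op, chunk = control.popleft()
            if op == "UNPIN":
                self.pinned.discard(chunk)


src = Source()
dst = Destination(pinned={0, 1})

src.post_write(0)
src.post_write(1)
src.post_write(0)
src.request_unpin(0)
src.request_unpin(1)

src.on_completion(1)          # chunk 1 finishes before chunk 0
dst.drain(src.control)
pinned_mid = set(dst.pinned)  # chunk 0 still has writes outstanding

src.on_completion(0)
src.on_completion(0)
dst.drain(src.control)
pinned_end = set(dst.pinned)
```

In this model chunk 0 stays pinned on the destination until both of its writes have completed, even though its unpin was requested first. Whether the real control channel gives the ordering guarantee assumed here is exactly the open question in the thread.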