From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:34724)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1UrjLx-00084G-Lz for qemu-devel@nongnu.org;
	Wed, 26 Jun 2013 02:37:19 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1UrjLw-0008JW-JC for qemu-devel@nongnu.org;
	Wed, 26 Jun 2013 02:37:17 -0400
Received: from mail-bk0-x232.google.com ([2a00:1450:4008:c01::232]:36388)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1UrjLw-0008JQ-CM for qemu-devel@nongnu.org;
	Wed, 26 Jun 2013 02:37:16 -0400
Received: by mail-bk0-f50.google.com with SMTP id ik8so4765309bkc.37
	for ; Tue, 25 Jun 2013 23:37:15 -0700 (PDT)
Sender: Paolo Bonzini
Message-ID: <51CA8C15.6040501@redhat.com>
Date: Wed, 26 Jun 2013 08:37:09 +0200
From: Paolo Bonzini
MIME-Version: 1.0
References: <1372125485-11795-1-git-send-email-mrhines@linux.vnet.ibm.com>
	<1372125485-11795-15-git-send-email-mrhines@linux.vnet.ibm.com>
	<8761x21pvx.fsf@elfo.elfo> <51C96D39.2020603@redhat.com>
	<51C99EC6.2050008@linux.vnet.ibm.com> <51C9A0D8.6050800@redhat.com>
	<51C9AF15.8000404@linux.vnet.ibm.com> <51C9AF56.9030800@redhat.com>
	<51CA03FF.1000806@linux.vnet.ibm.com> <51CA0640.1040507@redhat.com>
	<51CA3646.8040409@linux.vnet.ibm.com>
In-Reply-To: <51CA3646.8040409@linux.vnet.ibm.com>
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v11 14/15] rdma: introduce MIG_STATE_NONE
	and change MIG_STATE_SETUP state transition
To: "Michael R. Hines"
Cc: aliguori@us.ibm.com, quintela@redhat.com, qemu-devel@nongnu.org,
	owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com,
	gokul@us.ibm.com, chegu_vinod@hp.com, knoel@redhat.com

On 26/06/2013 02:31, Michael R. Hines wrote:
> On 06/25/2013 05:06 PM, Paolo Bonzini wrote:
>> On 25/06/2013 22:56, Michael R. Hines wrote:
>>> I was wrong - this does require a protocol extension.
>>>
>>> This is because the RDMA transfers are asynchronous, and thus
>>> we cannot know in advance that it is safe to unregister the memory
>>> associated with each individual transfer before the transfer actually
>>> completes.
>>>
>>> While the destination currently uses the protocol to participate in
>>> *registering* the page, the destination does not participate in the
>>> RDMA transfers themselves, only the source does, and thus would
>>> require a new exchange of messages to block and instruct the
>>> destination to unpin the memory.
>> Yes, that's what I recalled too (really what mst told me :)).  Does it
>> need to be blocking though?  As long as the pinning is blocking, and
>> messages are processed in order, the source can proceed immediately
>> after sending an unpin message.  This assumes of course that the chunk
>> is not being transmitted, and I am not sure how easy the source can
>> determine that.
>
> No, they're not processed in order. In fact, not only does the device
> write out of order, but also the PCI bus writes out of order.
> This was such a problem in fact, that I fixed several bugs as a result
> a few weeks ago (v7 of the patch with an in-depth description).
>
> The destination simply cannot assume whatsoever what the ordering
> of the writes are - that's really the whole point of using RDMA in the
> first place so that the software can get out of the way of the transfer
> process to lower the latency of each transfer.

The memory is written out of order, but what about the messages?
Those must be in order.

Note that I wrote above "This assumes of course that the chunk is not
being transmitted".  Can the source know when an asynchronous transfer
has finished, and delay the unpinning until that time?
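To make the question concrete, here is a rough sketch of the bookkeeping
I have in mind.  It is only an illustration under assumptions: it assumes
the source does get a completion event for every posted write and can map
it back to a chunk, and that control messages travel over an ordered
channel.  ChunkTracker and all of its method names are hypothetical; none
of this is QEMU or libibverbs code.

```python
from collections import deque


class ChunkTracker:
    """Hypothetical source-side bookkeeping for deferred unpinning."""

    def __init__(self):
        self.inflight = {}              # chunk id -> posted, uncompleted writes
        self.unpin_wanted = set()       # chunks waiting for their writes to end
        self.control_channel = deque()  # stand-in for the ordered message stream

    def post_write(self, chunk):
        # An asynchronous write for this chunk has been posted.
        self.inflight[chunk] = self.inflight.get(chunk, 0) + 1

    def request_unpin(self, chunk):
        # Send the unpin message now if the chunk is idle, else defer it.
        if self.inflight.get(chunk, 0) == 0:
            self._send_unpin(chunk)
        else:
            self.unpin_wanted.add(chunk)

    def on_completion(self, chunk):
        # Assumed to be called once per completed write on this chunk.
        self.inflight[chunk] -= 1
        if self.inflight[chunk] == 0 and chunk in self.unpin_wanted:
            self.unpin_wanted.remove(chunk)
            self._send_unpin(chunk)

    def _send_unpin(self, chunk):
        # Non-blocking: queue the message and carry on.
        self.control_channel.append(("UNPIN", chunk))


tracker = ChunkTracker()
tracker.post_write(7)
tracker.post_write(7)
tracker.request_unpin(7)   # deferred: two writes still in flight
tracker.on_completion(7)
tracker.on_completion(7)   # last completion releases the unpin message
tracker.request_unpin(3)   # idle chunk: message is queued immediately
print(list(tracker.control_channel))
```

The point of the sketch is that the source never blocks: the unpin
message is simply held back until no write to that chunk is outstanding,
and after that only the ordering of the messages matters, not the
ordering of the memory writes.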
Paolo

>
> The only option is to send a blocking message to the other side to
> request the unpinning (in addition to unpinning on the source first upon
> completion of the original transfer).
>
> As you can expect, this would be very expensive and we must ensure
> that we have *very* good a-priori information that this memory will
> not need to be re-registered anytime in the near future.
>
> - Michael