Message-ID: <5579DC91.70002@linux.vnet.ibm.com>
Date: Thu, 11 Jun 2015 14:08:01 -0500
From: "Michael R. Hines"
MIME-Version: 1.0
References: <1434043048-4444-1-git-send-email-dgilbert@redhat.com> <1434043048-4444-7-git-send-email-dgilbert@redhat.com> <5579CF9C.50707@linux.vnet.ibm.com> <20150611185825.GO2123@work-vm>
In-Reply-To: <20150611185825.GO2123@work-vm>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v2 06/12] Translate offsets to destination address space
To: "Dr. David Alan Gilbert"
Cc: amit.shah@redhat.com, quintela@redhat.com, arei.gonglei@huawei.com, qemu-devel@nongnu.org, mrhines@us.ibm.com

On 06/11/2015 01:58 PM, Dr. David Alan Gilbert wrote:
> * Michael R. Hines (mrhines@linux.vnet.ibm.com) wrote:
>> On 06/11/2015 12:17 PM, Dr. David Alan Gilbert (git) wrote:
>>> From: "Dr. David Alan Gilbert"
>>>
>>> The 'offset' field in RDMACompress and 'current_addr' field
>>> in RDMARegister are commented as being offsets within a particular
>>> RAMBlock, however they appear to actually be offsets within the
>>> ram_addr_t space.
>>>
>>> The code currently assumes that the offsets on the source/destination
>>> match, this change removes the need for the assumption for these
>>> structures by translating the addresses into the ram_addr_t space of
>>> the destination host.
>>>
>>> Note: An alternative would be to change the fields to actually
>>> take the data they're commented for; this would potentially be
>>> simpler but would break stream compatibility for those cases
>>> that currently work.
>>>
>>> Signed-off-by: Dr. David Alan Gilbert
>>> ---
>>>  migration/rdma.c | 31 ++++++++++++++++++++++++-------
>>>  1 file changed, 24 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/migration/rdma.c b/migration/rdma.c
>>> index 9532461..cb66721 100644
>>> --- a/migration/rdma.c
>>> +++ b/migration/rdma.c
>>> @@ -411,7 +411,7 @@ static void network_to_control(RDMAControlHeader *control)
>>>   */
>>>  typedef struct QEMU_PACKED {
>>>      union QEMU_PACKED {
>>> -        uint64_t current_addr;  /* offset into the ramblock of the chunk */
>>> +        uint64_t current_addr;  /* offset into the ram_addr_t space */
>>>          uint64_t chunk;         /* chunk to lookup if unregistering */
>>>      } key;
>>>      uint32_t current_index; /* which ramblock the chunk belongs to */
>>> @@ -419,8 +419,19 @@ typedef struct QEMU_PACKED {
>>>      uint64_t chunks;            /* how many sequential chunks to register */
>>>  } RDMARegister;
>>>
>>> -static void register_to_network(RDMARegister *reg)
>>> +static void register_to_network(RDMAContext *rdma, RDMARegister *reg)
>>>  {
>>> +    RDMALocalBlock *local_block;
>>> +    local_block = &rdma->local_ram_blocks.block[reg->current_index];
>>> +
>>> +    if (local_block->is_ram_block) {
>>> +        /*
>>> +         * current_addr as passed in is an address in the local ram_addr_t
>>> +         * space, we need to translate this for the destination
>>> +         */
>>> +        reg->key.current_addr -= local_block->offset;
>>> +        reg->key.current_addr += rdma->dest_blocks[reg->current_index].offset;
>>> +    }
>>>      reg->key.current_addr = htonll(reg->key.current_addr);
>>>      reg->current_index = htonl(reg->current_index);
>>>      reg->chunks = htonll(reg->chunks);
>>> @@ -436,13 +447,19 @@ static void network_to_register(RDMARegister *reg)
>>>  typedef struct QEMU_PACKED {
>>>      uint32_t value;     /* if zero, we will madvise() */
>>>      uint32_t block_idx; /* which ram block index */
>>> -    uint64_t offset;    /* where in the remote ramblock this chunk */
>>> +    uint64_t offset;    /* Address in remote ram_addr_t space */
>>>      uint64_t length;    /* length of the chunk */
>>>  } RDMACompress;
>>>
>>> -static void compress_to_network(RDMACompress *comp)
>>> +static void compress_to_network(RDMAContext *rdma, RDMACompress *comp)
>>>  {
>>>      comp->value = htonl(comp->value);
>>> +    /*
>>> +     * comp->offset as passed in is an address in the local ram_addr_t
>>> +     * space, we need to translate this for the destination
>>> +     */
>>> +    comp->offset -= rdma->local_ram_blocks.block[comp->block_idx].offset;
>>> +    comp->offset += rdma->dest_blocks[comp->block_idx].offset;
>>>      comp->block_idx = htonl(comp->block_idx);
>>>      comp->offset = htonll(comp->offset);
>>>      comp->length = htonll(comp->length);
>> So, why add the destination block's offset on the source side
>> just for it to be re-adjusted again when it gets to the destination side?
>>
>> Can you just stop at this:
>>
>> + reg->key.current_addr -= local_block->offset;
>>
>> Without this:
>>
>> + reg->key.current_addr +=
>> rdma->dest_blocks[reg->current_index].offset;
>>
>> ... on the source, followed by this on the destination:
>>
>> + comp->offset -= rdma->local_ram_blocks.block[comp->block_idx].offset;
>>
>> Without this:
>>
>> + comp->offset += rdma->dest_blocks[comp->block_idx].offset;
>>
>> Did I follow correctly?
> Aren't both of those conversions happening on the source?
> Anyway, I think what you're saying is that we change the value sent over
> the network to be an offset within the block instead of an offset in
> the whole ram_addr_t space (i.e. that's what happens if you don't
> add back on the dest_blocks[].offset).

Yes, right. Can you skip adding/subtracting the local block offset on
each side?

- Michael
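
To make the alternative under discussion concrete (sending an offset within the block rather than an address in the whole ram_addr_t space), here is a minimal standalone sketch. It is not code from the patch: `Block`, `to_wire_offset` and `from_wire_offset` are hypothetical names, the struct is a simplified stand-in for RDMALocalBlock, and the htonll() byte-order conversion is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a RAM block descriptor: only the
 * fields needed to show the address arithmetic. */
typedef struct {
    uint64_t offset;  /* start of the block in this host's ram_addr_t space */
    uint64_t length;  /* size of the block in bytes */
} Block;

/* Source side: turn a local ram_addr_t address into an offset within the
 * block. The result does not depend on where the destination placed its
 * copy of the block. */
static uint64_t to_wire_offset(const Block *local, uint64_t ram_addr)
{
    assert(ram_addr >= local->offset);
    assert(ram_addr - local->offset < local->length);
    return ram_addr - local->offset;
}

/* Destination side: turn the block-relative offset back into an address
 * in the destination's own ram_addr_t space. */
static uint64_t from_wire_offset(const Block *local, uint64_t wire_offset)
{
    assert(wire_offset < local->length);
    return local->offset + wire_offset;
}
```

With the block at 0x40000000 on the source and at 0x80000000 on the destination, address 0x40001000 would travel as 0x1000 and resolve to 0x80001000, so neither side needs the other's block offsets. As the patch description notes, changing the value on the wire this way would break stream compatibility for the cases that currently work.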