From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50806) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z3USD-00027m-6t for qemu-devel@nongnu.org; Fri, 12 Jun 2015 15:17:26 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z3US8-0007KM-6m for qemu-devel@nongnu.org; Fri, 12 Jun 2015 15:17:25 -0400
Received: from e32.co.us.ibm.com ([32.97.110.150]:35629) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z3US8-0007Jw-0B for qemu-devel@nongnu.org; Fri, 12 Jun 2015 15:17:20 -0400
Received: from /spool/local by e32.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 12 Jun 2015 13:17:18 -0600
Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id 577451FF0049 for ; Fri, 12 Jun 2015 13:08:25 -0600 (MDT)
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t5CJEvso27918344 for ; Fri, 12 Jun 2015 12:14:57 -0700
Received: from d03av03.boulder.ibm.com (localhost [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t5CJHFLS011081 for ; Fri, 12 Jun 2015 13:17:15 -0600
Message-ID: <557B303A.4000006@linux.vnet.ibm.com>
Date: Fri, 12 Jun 2015 14:17:14 -0500
From: "Michael R. Hines"
MIME-Version: 1.0
References: <1434043048-4444-1-git-send-email-dgilbert@redhat.com> <1434043048-4444-7-git-send-email-dgilbert@redhat.com> <5579CF9C.50707@linux.vnet.ibm.com> <20150611185825.GO2123@work-vm> <5579DC91.70002@linux.vnet.ibm.com> <20150612185034.GF2141@work-vm>
In-Reply-To: <20150612185034.GF2141@work-vm>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v2 06/12] Translate offsets to destination address space
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Dr. David Alan Gilbert"
Cc: amit.shah@redhat.com, quintela@redhat.com, arei.gonglei@huawei.com, qemu-devel@nongnu.org, mrhines@us.ibm.com

On 06/12/2015 01:50 PM, Dr. David Alan Gilbert wrote:
> * Michael R. Hines (mrhines@linux.vnet.ibm.com) wrote:
>> On 06/11/2015 01:58 PM, Dr. David Alan Gilbert wrote:
>>> * Michael R. Hines (mrhines@linux.vnet.ibm.com) wrote:
>>>> On 06/11/2015 12:17 PM, Dr. David Alan Gilbert (git) wrote:
>>>>> From: "Dr. David Alan Gilbert"
>>>>>
>>>>> The 'offset' field in RDMACompress and 'current_addr' field
>>>>> in RDMARegister are commented as being offsets within a particular
>>>>> RAMBlock, however they appear to actually be offsets within the
>>>>> ram_addr_t space.
>>>>>
>>>>> The code currently assumes that the offsets on the source/destination
>>>>> match, this change removes the need for the assumption for these
>>>>> structures by translating the addresses into the ram_addr_t space of
>>>>> the destination host.
>>>>>
>>>>> Note: An alternative would be to change the fields to actually
>>>>> take the data they're commented for; this would potentially be
>>>>> simpler but would break stream compatibility for those cases
>>>>> that currently work.
>>>>>
>>>>> Signed-off-by: Dr. David Alan Gilbert
>>>>> ---
>>>>> migration/rdma.c | 31 ++++++++++++++++++++++++-------
>>>>> 1 file changed, 24 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/migration/rdma.c b/migration/rdma.c
>>>>> index 9532461..cb66721 100644
>>>>> --- a/migration/rdma.c
>>>>> +++ b/migration/rdma.c
>>>>> @@ -411,7 +411,7 @@ static void network_to_control(RDMAControlHeader *control)
>>>>>   */
>>>>>  typedef struct QEMU_PACKED {
>>>>>      union QEMU_PACKED {
>>>>> -        uint64_t current_addr; /* offset into the ramblock of the chunk */
>>>>> +        uint64_t current_addr; /* offset into the ram_addr_t space */
>>>>>          uint64_t chunk; /* chunk to lookup if unregistering */
>>>>>      } key;
>>>>>      uint32_t current_index; /* which ramblock the chunk belongs to */
>>>>> @@ -419,8 +419,19 @@ typedef struct QEMU_PACKED {
>>>>>      uint64_t chunks; /* how many sequential chunks to register */
>>>>>  } RDMARegister;
>>>>>
>>>>> -static void register_to_network(RDMARegister *reg)
>>>>> +static void register_to_network(RDMAContext *rdma, RDMARegister *reg)
>>>>>  {
>>>>> +    RDMALocalBlock *local_block;
>>>>> +    local_block = &rdma->local_ram_blocks.block[reg->current_index];
>>>>> +
>>>>> +    if (local_block->is_ram_block) {
>>>>> +        /*
>>>>> +         * current_addr as passed in is an address in the local ram_addr_t
>>>>> +         * space, we need to translate this for the destination
>>>>> +         */
>>>>> +        reg->key.current_addr -= local_block->offset;
>>>>> +        reg->key.current_addr += rdma->dest_blocks[reg->current_index].offset;
>>>>> +    }
>>>>>      reg->key.current_addr = htonll(reg->key.current_addr);
>>>>>      reg->current_index = htonl(reg->current_index);
>>>>>      reg->chunks = htonll(reg->chunks);
>>>>> @@ -436,13 +447,19 @@ static void network_to_register(RDMARegister *reg)
>>>>>  typedef struct QEMU_PACKED {
>>>>>      uint32_t value; /* if zero, we will madvise() */
>>>>>      uint32_t block_idx; /* which ram block index */
>>>>> -    uint64_t offset; /* where in the remote ramblock this chunk */
>>>>> +    uint64_t offset; /* Address in remote ram_addr_t space */
>>>>>      uint64_t length; /* length of the chunk */
>>>>>  } RDMACompress;
>>>>>
>>>>> -static void compress_to_network(RDMACompress *comp)
>>>>> +static void compress_to_network(RDMAContext *rdma, RDMACompress *comp)
>>>>>  {
>>>>>      comp->value = htonl(comp->value);
>>>>> +    /*
>>>>> +     * comp->offset as passed in is an address in the local ram_addr_t
>>>>> +     * space, we need to translate this for the destination
>>>>> +     */
>>>>> +    comp->offset -= rdma->local_ram_blocks.block[comp->block_idx].offset;
>>>>> +    comp->offset += rdma->dest_blocks[comp->block_idx].offset;
>>>>>      comp->block_idx = htonl(comp->block_idx);
>>>>>      comp->offset = htonll(comp->offset);
>>>>>      comp->length = htonll(comp->length);
>>>> So, why add the destination block's offset on the source side
>>>> just for it to be re-adjusted again when it gets to the destination side?
>>>>
>>>> Can you just stop at this:
>>>>
>>>> + reg->key.current_addr -= local_block->offset;
>>>>
>>>> Without this:
>>>>
>>>> + reg->key.current_addr +=
>>>> rdma->dest_blocks[reg->current_index].offset;
>>>>
>>>> ... on the source, followed by this on the destination:
>>>>
>>>> + comp->offset -= rdma->local_ram_blocks.block[comp->block_idx].offset;
>>>>
>>>> Without this:
>>>>
>>>> + comp->offset += rdma->dest_blocks[comp->block_idx].offset;
>>>>
>>>> Did I follow correctly?
>>> Aren't both of those conversions happening on the source?
>>> Anyway, I think what you're saying is that we change the value sent over
>>> the network to be an offset within the block instead of an offset in
>>> the whole ram_addr_t space (i.e. that's what happens if you don't
>>> add back on the dest_blocks[].offset).
>> Yes, right. Can you skip adding/subtracting the local block offset on each
>> side?
> I don't understand how I can do that without changing the wire format so
> that it would be subtly incompatible, and I'd like to get 2.1ish migrating to 2.3ish.
> If I didn't add the local_block->offset on the source, the value on the wire
> would now be the offset within the RAMBlock rather than the offset in ram_addr_t.
>
> Except for compatibility I'd agree it would be simpler.
>
> Dave
>
>
>> - Michael
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>

Acknowledged.
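
For anyone reading the thread later, here is a minimal sketch of the two wire values being compared above. It is not the QEMU code: SketchBlock, to_dest_ram_addr() and to_block_offset() are hypothetical names made up for illustration, and a block is reduced to nothing more than its start offset in the ram_addr_t space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM block: only its start offset within
 * the ram_addr_t space is modelled here. */
typedef struct {
    uint64_t offset;
} SketchBlock;

/* What the patch sends: the source ram_addr_t address rebased onto the
 * destination's layout, so the wire still carries a ram_addr_t value. */
static uint64_t to_dest_ram_addr(uint64_t src_addr,
                                 const SketchBlock *src,
                                 const SketchBlock *dst)
{
    assert(src_addr >= src->offset); /* address must lie in this block */
    return src_addr - src->offset + dst->offset;
}

/* The alternative raised in review: send only the offset within the
 * block. Simpler, but it changes what the field on the wire means. */
static uint64_t to_block_offset(uint64_t src_addr, const SketchBlock *src)
{
    assert(src_addr >= src->offset);
    return src_addr - src->offset;
}
```

When source and destination place the block at the same offset, to_dest_ram_addr() returns its input unchanged, which is why the cases that work today would keep the same bytes on the wire. to_block_offset() yields a different value whenever the block does not start at 0, which is the incompatibility Dave is describing.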