qemu-devel.nongnu.org archive mirror
From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: aliguori@us.ibm.com, mst@redhat.com,
	michael.r.hines.mrhines@linux.vnet.ibm.com,
	qemu-devel@nongnu.org, owasserm@redhat.com, abali@us.ibm.com,
	mrhines@us.ibm.com, gokul@us.ibm.com
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v3: 07/10] Send the actual pages over RDMA.
Date: Mon, 11 Mar 2013 12:31:36 -0400
Message-ID: <513E06E8.1080706@linux.vnet.ibm.com>
In-Reply-To: <513DE341.80209@redhat.com>

Acknowledged all...

On 03/11/2013 09:59 AM, Paolo Bonzini wrote:
> On 11/03/2013 05:33, Michael.R.Hines.mrhines@linux.vnet.ibm.com wrote:
>> From: "Michael R. Hines" <mrhines@us.ibm.com>
>>
>> For performance reasons, dup_page() and xbzrle() are skipped because
>> they are too expensive for zero-copy RDMA.
>>
>> Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
>> ---
>>   arch_init.c |   57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 56 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch_init.c b/arch_init.c
>> index 8daeafa..437cb47 100644
>> --- a/arch_init.c
>> +++ b/arch_init.c
>> @@ -45,6 +45,7 @@
>>   #include "exec/address-spaces.h"
>>   #include "hw/pcspk.h"
>>   #include "migration/page_cache.h"
>> +#include "migration/rdma.h"
>>   #include "qemu/config-file.h"
>>   #include "qmp-commands.h"
>>   #include "trace.h"
>> @@ -245,6 +246,18 @@ uint64_t norm_mig_pages_transferred(void)
>>       return acct_info.norm_pages;
>>   }
>>   
>> +/*
>> + * RDMA does not use the buffered_file,
>> + * but we still need a way to do accounting...
>> + */
>> +uint64_t delta_norm_mig_bytes_transferred(void)
>> +{
>> +    static uint64_t last_norm_pages = 0;
>> +    uint64_t delta_bytes = (acct_info.norm_pages - last_norm_pages) * TARGET_PAGE_SIZE;
>> +    last_norm_pages = acct_info.norm_pages;
>> +    return delta_bytes;
>> +}
>> +
>>   uint64_t xbzrle_mig_bytes_transferred(void)
>>   {
>>       return acct_info.xbzrle_bytes;
>> @@ -282,6 +295,45 @@ static size_t save_block_hdr(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
>>       return size;
>>   }
>>   
>> +static size_t save_rdma_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
>> +                             int cont)
>> +{
>> +    int ret;
>> +    size_t bytes_sent = 0;
>> +    ram_addr_t current_addr;
>> +    RDMAData * rdma = &migrate_get_current()->rdma;
>> +
>> +    acct_info.norm_pages++;
>> +
>> +    /*
>> +     * use RDMA to send page
>> +     */
> Not quite true, the page is added to the current chunk.  Please make the
> comments a quick-and-dirty reference of the protocol, or leave them out
> altogether.
>
>> +    current_addr = block->offset + offset;
>> +    if ((ret = qemu_rdma_write(rdma, current_addr, TARGET_PAGE_SIZE)) < 0) {
>> +        fprintf(stderr, "rdma migration: write error! %d\n", ret);
>> +        qemu_file_set_error(f, ret);
>> +        return ret;
>> +    }
>> +
>> +    /*
>> +     * do some polling
>> +     */
> Again, that's quite self-evident.  Poll for what though? :)
>
>> +    while (1) {
>> +        int ret = qemu_rdma_poll(rdma);
>> +        if (ret == RDMA_WRID_NONE) {
>> +            break;
>> +        }
>> +        if (ret < 0) {
>> +            fprintf(stderr, "rdma migration: polling error! %d\n", ret);
>> +            qemu_file_set_error(f, ret);
>> +            return ret;
>> +        }
>> +    }
>> +
>> +    bytes_sent += TARGET_PAGE_SIZE;
>> +    return bytes_sent;
>> +}
> As written in the other message, I think this should be an additional
> QEMUFile operation, hopefully the same that Orit is introducing in her
> patches.
>
>>   #define ENCODING_FLAG_XBZRLE 0x1
>>   
>>   static int save_xbzrle_page(QEMUFile *f, uint8_t *current_data,
>> @@ -462,7 +514,10 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
>>   
>>               /* In doubt sent page as normal */
>>               bytes_sent = -1;
>> -            if (is_dup_page(p)) {
>> +            if (migrate_use_rdma()) {
>> +                /* searching for zeros is still too expensive for RDMA */
>> +                bytes_sent = save_rdma_page(f, block, offset, cont);
> Again as written in the other message, this is not really an RDMA thing,
> it's mostly the effect of a fast link.  Of course to some extent it
> depends on the CPU and RAM speed, but we can fake that it isn't.
>
>> +            } else if (is_dup_page(p)) {
>>                   acct_info.dup_pages++;
>>                   bytes_sent = save_block_hdr(f, block, offset, cont,
>>                                               RAM_SAVE_FLAG_COMPRESS);
>>
> Thanks,
>
> Paolo
>
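
For reference, the send path quoted above (count the page, post the write,
drain completions, then report the byte delta for accounting) can be
sketched in isolation roughly as follows. This is only an illustration,
not the patch's code: struct sketch_sender, sketch_poll() and the two
SKETCH_* constants are hypothetical stand-ins for RDMAData,
qemu_rdma_poll(), TARGET_PAGE_SIZE and RDMA_WRID_NONE, whose real
definitions are not shown in this message. The "last seen" counter is kept
in the struct here rather than in a function-local static, purely so the
sketch is easy to exercise.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: stand-in for TARGET_PAGE_SIZE */
#define SKETCH_WRID_NONE 0      /* assumption: stand-in for RDMA_WRID_NONE */

/* Hypothetical sender state; this is not QEMU's RDMAData. */
struct sketch_sender {
    int pending;               /* posted writes whose completion is not reaped yet */
    uint64_t norm_pages;       /* pages handed to the transport so far */
    uint64_t last_norm_pages;  /* value of norm_pages at the previous delta query */
};

/*
 * Mock poll: reap one completion and return a positive work-request id,
 * or SKETCH_WRID_NONE when nothing is outstanding. A real poll could also
 * return a negative error code; this mock never does.
 */
static int sketch_poll(struct sketch_sender *s)
{
    if (s->pending == 0) {
        return SKETCH_WRID_NONE;
    }
    s->pending--;
    return 1;
}

/*
 * Count one page, "post" it, then poll until no completions remain.
 * Returns the number of bytes sent, or a negative value on a poll error.
 */
static int64_t sketch_save_page(struct sketch_sender *s)
{
    s->norm_pages++;
    s->pending++;   /* stands in for posting the RDMA write */

    for (;;) {
        int ret = sketch_poll(s);
        if (ret == SKETCH_WRID_NONE) {
            break;  /* nothing left to reap */
        }
        if (ret < 0) {
            return ret;
        }
    }
    assert(s->pending == 0);
    return SKETCH_PAGE_SIZE;
}

/* Bytes transferred since the previous call to this function. */
static uint64_t sketch_delta_bytes(struct sketch_sender *s)
{
    uint64_t delta = (s->norm_pages - s->last_norm_pages) * SKETCH_PAGE_SIZE;
    s->last_norm_pages = s->norm_pages;
    return delta;
}
```

Calling sketch_delta_bytes() twice in a row with no page sent in between
yields zero on the second call, which is the property the accounting
relies on when there is no buffered file to count bytes.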
