qemu-devel.nongnu.org archive mirror
From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: Eric Blake <eblake@redhat.com>
Cc: aliguori@us.ibm.com, mst@redhat.com, qemu-devel@nongnu.org,
	owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com,
	gokul@us.ibm.com, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v1: 12/13] updated protocol documentation
Date: Wed, 10 Apr 2013 22:47:10 -0400
Message-ID: <5166242E.1030904@linux.vnet.ibm.com>
In-Reply-To: <51662342.8090802@redhat.com>

Great comments, thanks.

On 04/10/2013 10:43 PM, Eric Blake wrote:
> On 04/10/2013 04:28 PM, mrhines@linux.vnet.ibm.com wrote:
>> From: "Michael R. Hines" <mrhines@us.ibm.com>
>>
>> Full documentation on the rdma protocol: docs/rdma.txt
>>
>> Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
>> ---
>>   docs/rdma.txt |  331 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 331 insertions(+)
>>   create mode 100644 docs/rdma.txt
>>
>> diff --git a/docs/rdma.txt b/docs/rdma.txt
>> new file mode 100644
>> index 0000000..ae68d2f
>> --- /dev/null
>> +++ b/docs/rdma.txt
>> @@ -0,0 +1,331 @@
>> +Changes since v6:
>> +
>> +(Thanks, Paolo - things look much cleaner now.)
>> +
>> +- Try to get patch-ordering correct =)
>> +- Much cleaner use of QEMUFileOps
>> +- Much fewer header files changes
>> +- Convert zero check capability to QMP command instead
>> +- Updated documentation
> The above text probably shouldn't be in the file.
>
>> +
>> +Wiki: http://wiki.qemu.org/Features/RDMALiveMigration
>> +Github: git@github.com:hinesmr/qemu.git
>> +Contact: Michael R. Hines, mrhines@us.ibm.com
> Missing a copyright statement, but that's just following the example of
> other docs, so I guess it's okay?
>
>> +
>> +RDMA Live Migration Specification, Version # 1
>> +
>> +Contents:
>> +=================================
>> +* Running
>> +* RDMA Protocol Description
>> +* Versioning and Capabilities
>> +* QEMUFileRDMA Interface
>> +* Migration of pc.ram
>> +* Error handling
>> +* TODO
>> +* Performance
>> +
> No high-level overview of what the acronym RDMA even stands for?
>
>> +RUNNING:
>> +===============================
>> +
>> +First, decide if you want dynamic page registration on the server-side.
>> +This always happens on the primary-VM side, but is optional on the server.
>> +Doing this allows you to support overcommit (such as cgroups or ballooning)
>> +with a smaller footprint on the server-side without having to register the
>> +entire VM memory footprint.
>> +NOTE: This significantly slows down RDMA throughput (about 30% slower).
>> +
>> +$ virsh qemu-monitor-command --hmp \
>> +    --cmd "migrate_set_capability chunk_register_destination off" # enabled by default
> 'virsh qemu-monitor-command' is documented as unsupported by libvirt
> (it's intended solely as a development/debugging aid); but I guess until
> libvirt learns to expose RDMA support by default, this is okay for a
> first cut of documentation.  Furthermore, you are missing a domain argument.
>
> Do you really want to be requiring the user to do everything through
> libvirt?  This is qemu documentation, so you should document how things
> work without needing libvirt in the picture.
>
>> +
>> +Next, if you decided *not* to use chunked registration on the server,
>> +it is recommended to also disable zero page detection. While this is not
>> +strictly necessary, zero page detection also significantly slows down
>> +throughput on higher-performance links (by about 50%), like 40 gbps infiniband cards:
>> +
>> +$ virsh qemu-monitor-command --hmp \
>> +    --cmd "migrate_check_for_zero off" # enabled by default
> Missing a domain argument.
>
>> +
>> +Finally, set the migration speed to match your hardware's capabilities:
>> +
>> +$ virsh qemu-monitor-command --hmp \
>> +    --cmd "migrate_set_speed 40g" # or whatever is the MAX of your RDMA device
> This modifies qemu state behind libvirt's back, and won't necessarily do
> what you want if libvirt tries to change things back to the speed it
> thought it was managing.  Instead, use 'virsh migrate-setspeed $dom 40'.
>
>> +
>> +Finally, perform the actual migration:
>> +
>> +$ virsh migrate domain rdma:xx.xx.xx.xx:port
> That's not quite valid syntax for 'virsh migrate'.  Again, do you really
> want to be documenting libvirt's interface, or qemu's interface?
>
>> +
>> +RDMA Protocol Description:
>> +=================================
> Aesthetics: match the length of === to the line above it.
>
> <snip> I'm not reviewing technical content, just face value...
>
>> +
>> +These two functions are very short and simply used the protocol
>> +describe above to deliver bytes without changing the upper-level
>> +users of QEMUFile that depend on a bytstream abstraction.
> s/bytstream/bytestream/
>
> ...
>> +
>> +After pinning, an RDMA Write is generated and tramsmitted
>> +for the entire chunk.
> s/tramsmitted/transmitted/
>
>> +5. Also, some form of balloon-device usage tracking would also
>> +   help aleviate some of these issues.
> s/aleviate/alleviate/
>
>> +
>> +PERFORMANCE
>> +===================
>> +
>> +Using a 40gbps infinband link performing a worst-case stress test:
> s/infinband/infiniband/
>
>> +
>> +RDMA Throughput With $ stress --vm-bytes 1024M --vm 1 --vm-keep
>> +Approximately 30 gpbs (little better than the paper)
> which paper? Call that out in your high-level summary
>
> ...
>> +
>> +An *exhaustive* paper (2010) shows additional performance details
>> +linked on the QEMU wiki:
> Missing the actual reference?  And it would help to mention it at the
> beginning of the file.
>
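
For reference, the invocation fixes raised in the review above (missing
domain argument, libvirt-managed speed, plain QEMU monitor usage) might
look roughly like the sketch below. It is a dry run that only prints the
commands. DOM, DEST and PORT are placeholder values, not taken from the
patch, and the capability and command names are the ones proposed in this
series, so they may differ from what is eventually merged.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# DOM, DEST and PORT are hypothetical placeholders.
DOM=${DOM:-mydomain}
DEST=${DEST:-192.0.2.10}
PORT=${PORT:-4444}

# virsh qemu-monitor-command needs the domain before the monitor command.
hmp_via_virsh() {
    printf 'virsh qemu-monitor-command %s --hmp "%s"\n' "$DOM" "$1"
}

# Names below come from this patch series (assumption: unchanged on merge).
hmp_via_virsh "migrate_set_capability chunk_register_destination off"
hmp_via_virsh "migrate_check_for_zero off"

# Speed set through libvirt, exactly as written in the review.
printf 'virsh migrate-setspeed %s 40\n' "$DOM"

# Without libvirt, typed directly at the QEMU monitor (HMP) prompt:
printf '(qemu) migrate_set_speed 40g\n'
printf '(qemu) migrate -d rdma:%s:%s\n' "$DEST" "$PORT"
```

Printing rather than executing keeps the sketch safe to run on a machine
with no guest defined; drop the printf wrappers to issue the commands for
real.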
