From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, owasserm@redhat.com,
abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v5: 03/12] comprehensive protocol documentation
Date: Thu, 11 Apr 2013 16:33:03 -0400 [thread overview]
Message-ID: <51671DFF.80904@linux.vnet.ibm.com> (raw)
In-Reply-To: <20130411191533.GA25515@redhat.com>
On 04/11/2013 03:15 PM, Michael S. Tsirkin wrote:
> On Thu, Apr 11, 2013 at 01:49:34PM -0400, Michael R. Hines wrote:
>> On 04/11/2013 10:56 AM, Michael S. Tsirkin wrote:
>>> On Thu, Apr 11, 2013 at 04:50:21PM +0200, Paolo Bonzini wrote:
>>>> Il 11/04/2013 16:37, Michael S. Tsirkin ha scritto:
>>>>> pg1 -> pin -> req -> res -> rdma -> done
>>>>> pg2 -> pin -> req -> res -> rdma -> done
>>>>> pg3 -> pin -> req -> res -> rdma -> done
>>>>> pg4 -> pin -> req -> res -> rdma -> done
>>>>> pg5 -> pin -> req -> res -> rdma -> done
>>>>>
>>>>> It's like an assembly line, see? So while software does the registration
>>>>> roundtrip dance, hardware is processing rdma requests for previous
>>>>> chunks.
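The overlap being described here can be sketched with a toy timing model (the stage costs below are invented for illustration only; this is not QEMU code):

```python
# Toy model of the pipelined registration idea: while software waits on
# the pin/req/res round-trip for chunk N+1, the HCA is still moving
# chunk N.  All costs are in arbitrary, made-up ticks.

CONTROL_LATENCY = 3   # pin -> req -> res round-trip per chunk
RDMA_TIME = 5         # hardware transfer time per chunk

def serialized(chunks):
    """Each chunk waits for its own control round-trip, then its RDMA."""
    return chunks * (CONTROL_LATENCY + RDMA_TIME)

def pipelined(chunks):
    """Control for chunk N+1 overlaps the RDMA for chunk N."""
    if chunks == 0:
        return 0
    # Only the first chunk pays visible control latency; afterwards the
    # transfers dominate because CONTROL_LATENCY <= RDMA_TIME here.
    return CONTROL_LATENCY + chunks * RDMA_TIME

print(serialized(4), pipelined(4))  # 32 23
```

In this model, pipelining hides all but the first control round-trip, which is the "assembly line" effect being argued for.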
>>>> Does this only affect the implementation, or also the wire protocol?
>>> It affects the wire protocol.
>> I *do* believe chunked registration was a *very* useful request by
>> the community, and I want to thank you for convincing me to implement it.
>>
>> But, with all due respect, pipelining is a "solution looking for a problem".
> The problem is bad performance, isn't it?
> If it wasn't we'd use chunk based all the time.
>
>> Improving the protocol does not help the behavior of any well-known
>> workloads, because it is based on the idea that the memory footprint of
>> a VM would *rapidly* expand and contract during the steady-state
>> iteration rounds while the migration is taking place.
> What gave you that idea? Not at all. It is based on the idea
> of doing control actions in parallel with data transfers,
> so that control latency does not degrade performance.
Again, this parallelization is trying to solve a problem that doesn't
exist. As I've described before, I re-executed the worst-case memory
stress-hog tests with RDMA *after* the bulk-phase round completed and
determined that RDMA throughput remains unaffected, because most of the
memory was already registered in advance.
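The effect of that bulk-phase behavior can be sketched with a toy registration cache (the chunk counts and dirty-page pattern are hypothetical, not measurements):

```python
# Toy registration cache: chunks registered once during the bulk round
# stay pinned, so steady-state iterations mostly skip the control
# round-trip.  The workload numbers below are invented for illustration.

registered = set()

def send_chunk(chunk):
    """Send one chunk; return True if a registration round-trip was needed."""
    if chunk in registered:
        return False
    registered.add(chunk)   # stand-in for the real pin/register exchange
    return True

# Bulk phase: every chunk of a 1000-chunk guest goes over once.
bulk_roundtrips = sum(send_chunk(c) for c in range(1000))

# Steady state: dirtied pages fall inside chunks that were already sent.
steady_roundtrips = sum(send_chunk(c) for c in [7, 42, 42, 900, 7])

print(bulk_roundtrips, steady_roundtrips)  # 1000 0
```

Under this assumption the steady-state rounds pay no registration latency at all, which is the basis of the "non-issue" argument.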
>> This simply does not happen - workloads don't behave that way - they
>> either grow really big or they shrink really small, and they settle
>> that way for a reasonable amount of time before the load on the
>> application changes at some future point.
>>
>> - Michael
> What is the bottleneck for chunk-based? Can you tell me that? Find out,
> and you will maybe see pipelining will help.
>
> Basically to me, when you describe the protocol in detail the problems
> become apparent.
>
> I think you worry too much about what the guest does, what APIs are
> exposed from the migration core and the specifics of the workload. Build
> a sane protocol for data transfers and layer the workload on top.
>
What is the point in enhancing a protocol to solve a problem that will
never manifest?
We're trying to conflate two *completely different*, unrelated use cases:
1. Static overcommit
2. Dynamic, fine-grained overcommit (at small time scales... seconds or
minutes)
#1 happens all the time: cram a bunch of virtual machines with fixed
workloads and fixed writable working sets into the same place, and
you're good to go.
#2 never happens. Ever. It just doesn't happen, and the enhancements
you've described are trying to protect against #2, when we should really
be focused on #1.
It is not standard practice for a workload to expect high overcommit
performance in the *middle* of a relocation, and nobody in the industry
whom I have met over the years has expressed any desire to do so.
Workloads just don't behave that way.
Dynamic registration does an excellent job at overcommitment for #1,
because most of the registrations are done at the very beginning and can
be further optimized to cause little or no performance loss by simply
issuing the registrations before the migration ever begins.
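That pre-registration optimization amounts to moving the control cost off the migration's critical path, which a toy accounting makes concrete (all costs are invented ticks, not measurements):

```python
# Toy cost accounting for registering chunks before vs. during migration.
# All numbers are arbitrary ticks chosen for illustration.

REG_COST = 3      # per-chunk registration round-trip
XFER_COST = 5     # per-chunk RDMA transfer
CHUNKS = 1000

# Registration on the critical path: every chunk pays both costs while
# the migration is running.
inline_migration_time = CHUNKS * (REG_COST + XFER_COST)

# Pre-registration: all pinning happens before the migration starts, so
# the migration itself only sees the transfer cost.
prereg_setup_time = CHUNKS * REG_COST
prereg_migration_time = CHUNKS * XFER_COST

print(inline_migration_time, prereg_migration_time)  # 8000 5000
```

The same total work is done either way; the point is that the registration ticks land before the migration clock starts.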
Performance for #2, even with dynamic registration, is excellent, and I
am not experiencing any problems associated with it.
So, we're discussing a non-issue.
- Michael