Re: [Qemu-devel] [RFC PATCH RDMA support v5: 03/12] comprehensive protocol documentation

From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, owasserm@redhat.com,
	abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v5: 03/12] comprehensive protocol documentation
Date: Sun, 14 Apr 2013 21:06:36 -0400
Message-ID: <516B529C.5090503@linux.vnet.ibm.com>
In-Reply-To: <20130414211055.GF7165@redhat.com>

On 04/14/2013 05:10 PM, Michael S. Tsirkin wrote:
> On Sun, Apr 14, 2013 at 03:06:11PM -0400, Michael R. Hines wrote:
>> On 04/14/2013 02:30 PM, Michael S. Tsirkin wrote:
>>> On Sun, Apr 14, 2013 at 12:40:10PM -0400, Michael R. Hines wrote:
>>>> On 04/14/2013 12:03 PM, Michael S. Tsirkin wrote:
>>>>> On Sun, Apr 14, 2013 at 10:27:24AM -0400, Michael R. Hines wrote:
>>>>>> On 04/14/2013 07:59 AM, Michael S. Tsirkin wrote:
>>>>>>> On Fri, Apr 12, 2013 at 04:43:54PM +0200, Paolo Bonzini wrote:
>>>>>>>> Il 12/04/2013 13:25, Michael S. Tsirkin ha scritto:
>>>>>>>>> On Fri, Apr 12, 2013 at 12:53:11PM +0200, Paolo Bonzini wrote:
>>>>>>>>>> Il 12/04/2013 12:48, Michael S. Tsirkin ha scritto:
>>>>>>>>>>> 1.  You have two protocols already and this does not make sense in
>>>>>>>>>>> version 1 of the patch.
>>>>>>>>>> It makes sense if we consider it experimental (add x- in front of
>>>>>>>>>> transport and capability) and would like people to play with it.
>>>>>>>>>>
>>>>>>>>>> Paolo
>>>>>>>>> But it's not testable yet.  I see problems just reading the
>>>>>>>>> documentation.  Author thinks "ulimit -l 10000000000" on both source and
>>>>>>>>> destination is just fine.  This can easily crash host or cause OOM
>>>>>>>>> killer to kill QEMU.  So why is there any need for extra testers?  Fix
>>>>>>>>> the major bugs first.
>>>>>>>>>
>>>>>>>>> There's a similar issue with device assignment - we can't fix it there,
>>>>>>>>> and despite being available for years, this was one of two reasons that
>>>>>>>>> has kept this feature out of hands of lots of users (and assuming guest
>>>>>>>>> has lots of zero pages won't work: balloon is not widely used either
>>>>>>>>> since it depends on a well-behaved guest to work correctly).
>>>>>>>> I agree assuming guest has lots of zero pages won't work, but I think
>>>>>>>> you are overstating the importance of overcommit.  Let's mark the damn
>>>>>>>> thing as experimental, and stop making perfect the enemy of good.
>>>>>>>>
>>>>>>>> Paolo
>>>>>>> It looks like we have to decide, before merging, whether migration with
>>>>>>> rdma that breaks overcommit is worth it or not.  Since the author made
>>>>>>> it very clear he does not intend to make it work with overcommit, ever.
>>>>>>>
>>>>>> That depends entirely on how you define overcommit.
>>>>> You don't get to define your own terms.  Look it up in wikipedia or
>>>>> something.
>>>>>
>>>>>> The pages do get unregistered at the end of the migration =)
>>>>>>
>>>>>> - Michael
>>>>> The limitations are pretty clear, and you really should document them:
>>>>>
>>>>> 1. run qemu as root, or under ulimit -l <total guest memory> on both source and
>>>>>    destination
>>>>>
>>>>> 2. expect that as much as that amount of memory is pinned
>>>>>    and unavailable to host kernel and applications for
>>>>>    an arbitrarily long time.
>>>>>    Make sure you have much more RAM in host or QEMU will get killed.
>>>>>
>>>>> To me, especially 1 is an unacceptable security tradeoff.
>>>>> It is entirely fixable but we both have other priorities,
>>>>> so it'll stay broken.
>>>>>
>>>> I've modified the beginning of docs/rdma.txt to say the following:
>>> It really should say this, in a very prominent place:
>>>
>>> BUGS:
>> Not a bug. We'll have to agree to disagree. Please drop this.
> It's not a feature, it makes management harder and
> will bite some users who are not careful enough
> to read documentation and know what to expect.

Something that does not exist cannot be a bug. That's called a 
non-existent optimization.

>>> 1. You must run qemu as root, or under
>>>     ulimit -l <total guest memory> on both source and destination
>> Good, will update the documentation now.
>>> 2. Expect as much as that amount of memory to be locked
>>>     and unavailable to host kernel and applications for
>>>     an arbitrarily long time.
>>>     Make sure you have much more RAM in host otherwise QEMU,
>>>     or some other arbitrary application on same host, will get killed.
>> This is implied already. The docs say "If you don't want pinning,
>> then use TCP".
>> That's enough warning.
> No it's not. Pinning is jargon, and does not mean locking
> up gigabytes.  Why are you using jargon?
> Explain the limitation in plain English so people know
> when to expect things to work.

Already done.
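
To make limitation 1 above concrete, here is a minimal sketch (not part of
the patch series) of checking whether the locked-memory limit of the current
process would cover a guest of a given size. The function name
memlock_sufficient and its parameters are hypothetical. It assumes that
roughly the whole guest RAM may end up pinned, as discussed above, and it
ignores any extra memory the RDMA layer itself may register. Note that
getrlimit() reports bytes, while bash's "ulimit -l" takes units of 1024 bytes.

```python
import resource


def memlock_sufficient(guest_ram_bytes, limit=None):
    """Return True if the soft RLIMIT_MEMLOCK covers guest_ram_bytes.

    `limit` may be passed explicitly (in bytes) for testing; by default
    the soft limit of the current process is queried.
    """
    if limit is None:
        limit, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    # An unlimited setting always suffices.
    if limit == resource.RLIM_INFINITY:
        return True
    return limit >= guest_ram_bytes


if __name__ == "__main__":
    guest_ram = 8 * 1024 ** 3  # the 8GB example guest from docs/rdma.txt
    if memlock_sufficient(guest_ram):
        print("locked-memory limit covers the guest")
    else:
        print("locked-memory limit is too low for this guest")
```

On Linux a process with CAP_IPC_LOCK (for example one running as root) is
not subject to this limit; the sketch does not check for that case.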

>>> 3. Migration with RDMA support is experimental and unsupported.
>>>     In particular, please do not expect it to work across qemu versions,
>>>     and do not expect the management interface to be stable.
>> The only correct statement here is that it's experimental.
>>
>> I will update the docs to reflect that.
>>
>>>> $ cat docs/rdma.txt
>>>>
>>>> ... snip ..
>>>>
>>>> BEFORE RUNNING:
>>>> ===============
>>>>
>>>> Use of RDMA requires pinning and registering memory with the
>>>> hardware. If this is not acceptable for your application or
>>>> product, then the use of RDMA is strongly discouraged and you
>>>> should revert back to standard TCP-based migration.
>>> No one knows or should know what "pinning and registering" means.
>> I will define it in the docs, then.
> Keep it simple. Just tell people what they need to know.
> It's silly to expect users to understand internals of
> the product before they even try it for the first time.

Agreed.

>>> For which applications and products is it appropriate?
>> That's up to the vendor or user to decide, not us.
> With zero information so far, no one will be
> able to decide.

There is plenty of information. Including this email thread.


>>> Also, you are talking about current QEMU
>>> code using RDMA for migration but say "RDMA" generally.
>> Sure, I will fix the docs.
>>
>>>> Next, decide if you want dynamic page registration on the server-side.
>>>> For example, if you have an 8GB RAM virtual machine, but only 1GB
>>>> is in active use, then disabling this feature will cause all 8GB to
>>>> be pinned and resident in memory. This feature mostly affects the
>>>> bulk-phase round of the migration and can be disabled for extremely
>>>> high-performance RDMA hardware using the following command:
>>>> QEMU Monitor Command:
>>>> $ migrate_set_capability chunk_register_destination off # enabled by default
>>>>
>>>> Performing this action will cause all 8GB to be pinned, so if that's
>>>> not what you want, then please ignore this step altogether.
>>> This does not make it clear what is the benefit of disabling this
>>> capability. I think it's best to avoid options, just use chunk
>>> based always.
>>> If it's here "so people can play with it" then please rename
>>> it to something like "x-unsupported-chunk_register_destination"
>>> so people know this is unsupported and not to be used for production.
>> Again, please drop the request for removing chunking.
>>
>> Paolo already told me to use "x-rdma" - so that's enough for now.
>>
>> - Michael
> You are adding a new command that's also experimental, so you must tag
> it explicitly too.

The entire migration is experimental - which by extension makes the 
capability experimental.
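
For readers driving QEMU through QMP rather than the human monitor, the
monitor command quoted above would correspond to a migrate-set-capabilities
message. The sketch below only builds that JSON. The helper name
set_capability_cmd is hypothetical, and the capability name is taken verbatim
from the quoted docs/rdma.txt of this RFC, so it may not match what any
released QEMU accepts.

```python
import json


def set_capability_cmd(name, enabled):
    """Build a QMP migrate-set-capabilities message as a JSON string."""
    return json.dumps({
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [{"capability": name, "state": enabled}],
        },
    })


if __name__ == "__main__":
    # Capability name as quoted in this thread; an assumption, not a
    # guaranteed part of any released QEMU.
    print(set_capability_cmd("chunk_register_destination", False))
```

Sending the message still requires a QMP connection that has completed the
qmp_capabilities handshake, which is outside the scope of this sketch.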
