From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:33789)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <chegu_vinod@hp.com>) id 1UYPOt-0006YV-5e
	for qemu-devel@nongnu.org; Fri, 03 May 2013 19:28:28 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <chegu_vinod@hp.com>) id 1UYPOr-0005by-ME
	for qemu-devel@nongnu.org; Fri, 03 May 2013 19:28:27 -0400
Received: from g4t0017.houston.hp.com ([15.201.24.20]:14950)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <chegu_vinod@hp.com>) id 1UYPOr-0005bQ-Cv
	for qemu-devel@nongnu.org; Fri, 03 May 2013 19:28:25 -0400
Message-ID: <51844811.4030001@hp.com>
Date: Fri, 03 May 2013 16:28:17 -0700
From: Chegu Vinod <chegu_vinod@hp.com>
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="------------010008030706000106020600"
Subject: Re: [Qemu-devel] [PATCH v6 00/11] rdma: migration support
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
Cc: Karen Noel <knoel@redhat.com>, Juan Jose Quintela Carreira <quintela@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, qemu-devel qemu-devel <qemu-devel@nongnu.org>, Orit Wasserman <owasserm@redhat.com>, Anthony Liguori <anthony@codemonkey.ws>, Paolo Bonzini <pbonzini@redhat.com>

This is a multi-part message in MIME format.
--------------010008030706000106020600
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit


Hi Michael,

I picked up the qemu bits from your github branch and gave them a try.
(BTW, the setup I was given temporary access to has a pair of Mellanox IB
QDR cards connected back to back via QSFP cables.)

I observed a couple of things and wanted to share them. Perhaps you are
already aware of them, or perhaps they are unrelated to your specific
changes? (Note: I still haven't finished reviewing your changes.)

a) x-rdma-pin-all off case

This seems to work only some of the time and fails at other times. Here is
an example of a failure on the destination host:

(qemu) rdma: Accepting rdma connection...
rdma: Memory pin all: disabled
rdma: verbs context after listen: 0x555556757d50
rdma: dest_connect Source GID: fe80::2:c903:9:53a5, Dest GID: fe80::2:c903:9:5855
rdma: Accepted migration
qemu-system-x86_64: VQ 1 size 0x100 Guest index 0x4d2 inconsistent with Host index 0x4ec: delta 0xffe6
qemu: warning: error while loading state for instance 0x0 of device 'virtio-net'
load of migration failed
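
For what it's worth, the delta in that message is consistent with 16-bit
ring-index arithmetic: 0x4d2 - 0x4ec wraps around to 0xffe6, which is far
larger than the queue size of 0x100. Below is a minimal sketch of that
arithmetic. The function names are hypothetical and this is not the actual
qemu code; I am only assuming that the check compares a wrapped 16-bit
difference against the ring size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: difference between two free-running 16-bit ring
 * indices. The cast makes the subtraction wrap modulo 65536. */
static uint16_t ring_delta(uint16_t guest_idx, uint16_t host_idx)
{
    return (uint16_t)(guest_idx - host_idx);
}

/* Hypothetical check (assumption, not qemu's code): the guest index may
 * be ahead of the host index by at most ring_size entries. */
static bool ring_state_consistent(uint16_t guest_idx, uint16_t host_idx,
                                  uint16_t ring_size)
{
    return ring_delta(guest_idx, host_idx) <= ring_size;
}
```

With the values from the log, ring_delta(0x4d2, 0x4ec) is 0xffe6, so the
guest index is 26 entries behind the host index rather than ahead of it.
That might mean the ring contents seen on the destination are older than
the device state, but that is only a guess on my part.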


b) x-rdma-pin-all on case

The guest does not resume on the target host, i.e. the source host's
qemu states that the migration is complete, but the guest is no longer
responsive (it doesn't seem to have crashed, but it's stuck somewhere).
Have you seen this behavior before? Any tips on how I could extract
additional info?

Besides the noted restrictions/issues around having to pin all of guest
memory: if the pinning is done when the migration starts, it takes a
noticeably long time for larger guests. I wonder whether that should be
counted as part of the total migration time?

Also, the act of pinning all the memory seems to "freeze" the guest. For
larger enterprise-sized guests (say 128GB and higher), the guest is
"frozen" for anywhere from nearly a minute (~50 seconds) to multiple
minutes as the guest size increases, which IMO kind of defeats the
purpose of live guest migration.
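
To put the ~50 seconds for a 128GB guest in perspective, here is a rough
back-of-the-envelope sketch. It assumes 4 KiB pages and treats 128GB as
128 GiB; both are my assumptions, not measured values, and the function
names are just for illustration.

```c
#include <stdint.h>

/* Average cost per page, in nanoseconds, of an operation that walks the
 * whole guest in the given number of seconds. */
static double ns_per_page(uint64_t guest_bytes, uint64_t page_bytes,
                          double seconds)
{
    uint64_t pages = guest_bytes / page_bytes;
    return seconds * 1e9 / (double)pages;
}

/* Effective rate in GiB per second for the same operation. */
static double gib_per_second(uint64_t guest_bytes, double seconds)
{
    return (double)guest_bytes / (1024.0 * 1024.0 * 1024.0) / seconds;
}
```

For example, ns_per_page(128ULL << 30, 4096, 50.0) comes to roughly
1.5 microseconds per page, and gib_per_second(128ULL << 30, 50.0) to about
2.56 GiB/s. Since the cost is per page, the freeze should grow roughly
linearly with guest size under these assumptions.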

I would like to hear whether you have already thought about any
alternatives to address this issue. E.g. would it be better to pin all
of the guest's memory when the guest itself is started? Yes, there are
restrictions when we do pinning, but it can help with performance.
---
BTW, a different (yet sort of related) topic: a patch recently went
upstream that provides a qemu option to mlock all of guest memory:

https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg03947.html

However, when attempting the mlock for larger guests, a lot of time is
spent bringing each page into cache, clearing/zeroing it, etc.:

https://lists.gnu.org/archive/html/qemu-devel/2013-04/msg04161.html
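
If anyone wants to see the mlock cost outside of qemu, a minimal
standalone sketch along these lines should do. The function name
time_mlock is made up for illustration; it just times mlock() on a fresh
anonymous mapping, where every page has to be faulted in and zeroed by the
kernel. It is not the code from the patch above.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/* Returns the seconds spent inside mlock() for a fresh anonymous mapping
 * of the given size, or -1.0 if mmap() or mlock() fails (for example
 * because RLIMIT_MEMLOCK is too low). */
static double time_mlock(size_t bytes)
{
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1.0;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int rc = mlock(p, bytes);        /* faults in and locks every page */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    munmap(p, bytes);                /* unmapping also drops the lock */
    if (rc != 0)
        return -1.0;

    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling it with increasing sizes (and a high enough memlock limit) should
show how the time scales with the amount of memory being locked.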


----

Note: The basic TCP-based live guest migration in the same qemu version
still works fine on the same hosts over a pair of non-RDMA 10Gb NICs
connected back to back.

Thanks
Vinod


--------------010008030706000106020600--