From: Chegu Vinod
Date: Fri, 08 Nov 2013 07:18:32 -0800
Subject: Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
To: Juan Quintela, qemu-devel@nongnu.org, Paolo Bonzini, Karen Noel, Orit Wasserman, Eric Blake
Message-ID: <527D00C8.5080707@hp.com>
In-Reply-To: <1383743088-8139-1-git-send-email-quintela@redhat.com>
On 11/6/2013 5:04 AM, Juan Quintela wrote:
Hi

[v2]
In this version:
- fixed all the comments from the last version (thanks, Eric)
- the kvm migration bitmap is synchronized using bitmap operations
- the qemu bitmap -> migration bitmap sync is done using bitmap
  operations; if the bitmaps are not properly aligned, we fall back to
  the old code
The code survives virt-tests, so it should be in quite good shape.
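
For a concrete picture of the word-at-a-time synchronization, here is a
minimal sketch in C (the helper name and the popcount builtin are mine,
not the patch's; QEMU itself has ctpopl in qemu/host-utils.h): when both
bitmaps cover a long-aligned range, dirty bits are merged a whole word
at a time and newly dirtied pages are counted with a popcount, instead
of testing page by page.

/* Hypothetical helper: OR 'nr_words' words of dirty bits from src into
 * dest and return how many pages became newly dirty.  Both bitmaps are
 * assumed to cover the same long-aligned range. */
static long sync_dirty_words(unsigned long *dest, const unsigned long *src,
                             long nr_words)
{
    long k, num_dirty = 0;

    for (k = 0; k < nr_words; k++) {
        if (src[k]) {
            unsigned long new_dirty = src[k] & ~dest[k]; /* not yet set */
            dest[k] |= src[k];             /* merge a whole word at once */
            num_dirty += __builtin_popcountl(new_dirty); /* new pages */
        }
    }
    return num_dirty;
}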

ToDo list:

- vga ram by default is not aligned at a page number that is a multiple
  of 64; it could be optimized.  Kraxel?  It syncs the kvm bitmap at
  least once a second or so, and the bitmap is only 2048 pages (16MB by
  default).  We only need to change the ram_addr.

- vga: still more: after we finish migration, the vga code continues
  synchronizing the kvm bitmap on the source machine, even though there
  is no graphics client connected to the VGA.  Worth investigating?

- I haven't yet measured speed differences on big hosts.  Vinod?

- Depending on performance, there are more optimizations to do.

- debugging printf's are still in the code, just to see whether (or
  not) we are taking the optimized paths.

And that is all.  Please test & comment.

Thanks, Juan.

[v1]
This series splits the dirty bitmap (8 bits per page, only three used)
into 3 individual bitmaps.  Once the conversion is done, updates are
handled by bitmap operations, not bit by bit.
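
As a sketch of the new layout (the client indices mirror memory.h; the
declarations below are illustrative, the real bitmaps hang off QEMU's
ram_list, and qemu/bitmap.h is QEMU's own header):

#include "qemu/bitmap.h"           /* find_next_bit() and friends */

#define DIRTY_MEMORY_VGA       0   /* client indices, as in memory.h */
#define DIRTY_MEMORY_CODE      1
#define DIRTY_MEMORY_MIGRATION 2
#define DIRTY_MEMORY_NUM       3

/* One bitmap per client instead of one byte of flags per page. */
static unsigned long *dirty_memory[DIRTY_MEMORY_NUM];

/* Finding the next dirty page becomes a word-wide scan rather than a
 * per-page flag test. */
static unsigned long next_dirty_page(int client, unsigned long start,
                                     unsigned long npages)
{
    return find_next_bit(dirty_memory[client], npages, start);
}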

- The *_DIRTY_FLAG flags are gone; now we use memory.h's DIRTY_MEMORY_*
  everywhere.

- We set/reset each flag individually; calls like
  set_dirty_flags(0xff & ~CODE_DIRTY_FLAG) are gone.

- Renamed several functions to make things clearer and more consistent.

- I know it doesn't pass checkpatch because of long lines; a proper
  submission should pass it.  We have to pick between long lines, short
  variable names, or ugly line splitting :p

- DIRTY_MEMORY_NUM: how can one include exec/memory.h from cpu-all.h?
  A plain #include doesn't work; as a workaround I have copied its
  value, but is there a better idea?  I can always create
  "exec/migration-flags.h", though.
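
One way to keep such a copied value honest (just a sketch, not
necessarily what the series does; the mirrored macro name is mine) is
to duplicate the constant where the include is impossible and let a
compile-time check in a file that sees both headers catch any drift:

/* In cpu-all.h, where exec/memory.h cannot be included: */
#define CPU_DIRTY_MEMORY_NUM 3  /* mirror of memory.h's DIRTY_MEMORY_NUM */

/* In any .c file that already includes both headers, fail the build if
 * the copy ever drifts (QEMU_BUILD_BUG_ON is QEMU's compile-time
 * assert): */
QEMU_BUILD_BUG_ON(CPU_DIRTY_MEMORY_NUM != DIRTY_MEMORY_NUM);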

- The meat of the code is patch 19 (the bitmap split).  The rest of the
  patches are quite easy (and even that one is not too complex).

The only optimizations done so far are in
set_dirty_range()/clear_dirty_range(), which now operate with
bitmap_set()/bitmap_clear().
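
To make the before/after concrete, a sketch assuming QEMU's
bitmap_set()/bitmap_clear() from qemu/bitmap.h (the wrapper shapes are
mine; the real functions live elsewhere in the tree):

#include "qemu/bitmap.h"   /* bitmap_set() / bitmap_clear() */

/* Old shape: one flag-byte update per page, i.e. a loop of npages
 * iterations.  New shape: a single range operation that works a word
 * at a time. */
static void sketch_set_dirty_range(unsigned long *bits, unsigned long page,
                                   unsigned long npages)
{
    bitmap_set(bits, page, npages);     /* replaces the per-page loop */
}

static void sketch_clear_dirty_range(unsigned long *bits, unsigned long page,
                                     unsigned long npages)
{
    bitmap_clear(bits, page, npages);
}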

Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for Xen;
see the comment on the patch.

It passes virt-test migration tests, so it should be perfect.

I am posting it to ask for comments.

ToDo list:

- create a lock for the bitmaps and fold the migration bitmap into this
  one.  This would avoid a copy and make things easier?

- As this code uses/abuses bitmaps, we need to change the type of the
  index from int to long.  With an int index over 4KB pages, we can
  only address a maximum of an 8TB guest (2^31 pages * 4KB/page = 8TB;
  yes, this is not urgent, we have a couple of years to do it).

- merge the KVM <-> QEMU bitmap with bitmap operations, not bit by bit.

- split the KVM bitmap synchronization into chunks, i.e. not
  synchronizing all memory, just enough to continue with migration.

Any further ideas/needs?

Thanks, Juan.

PS.  Why did it take so long?

     Because I was trying to integrate the bitmap into the MemoryRegion
     abstraction.  It would have made the code cleaner, but I hit dead
     end after dead end.  In practical terms, TCG doesn't know about
     MemoryRegions; it has been ported to run on top of them, but
     doesn't use them effectively.


The following changes since commit c2d30667760e3d7b81290d801e567d4f758825ca:

  rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)

are available in the git repository at:

  git://github.com/juanquintela/qemu.git bitmap-v2.next

for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:

  migration: synchronize memory bitmap 64bits at a time (2013-11-06 13:54:56 +0100)

----------------------------------------------------------------
Juan Quintela (39):
      Move prototypes to memory.h
      memory: cpu_physical_memory_set_dirty_flags() result is never used
      memory: cpu_physical_memory_set_dirty_range() return void
      exec: use accessor function to know if memory is dirty
      memory: create function to set a single dirty bit
      exec: create function to get a single dirty bit
      memory: make cpu_physical_memory_is_dirty return bool
      exec: simplify notdirty_mem_write()
      memory: all users of cpu_physical_memory_get_dirty used only one flag
      memory: set single dirty flags when possible
      memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
      memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
      memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
      memory: use bit 2 for migration
      memory: make sure that client is always inside range
      memory: only resize dirty bitmap when memory size increases
      memory: cpu_physical_memory_clear_dirty_flag() result is never used
      bitmap: Add bitmap_zero_extend operation
      memory: split dirty bitmap into three
      memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
      memory: unfold cpu_physical_memory_set_dirty() in its only user
      memory: unfold cpu_physical_memory_set_dirty_flag()
      memory: make cpu_physical_memory_get_dirty() the main function
      memory: cpu_physical_memory_get_dirty() is used as returning a bool
      memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
      memory: use find_next_bit() to find dirty bits
      memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
      memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
      memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
      memory: make cpu_physical_memory_reset_dirty() take a length parameter
      memory: cpu_physical_memory_set_dirty_tracking() should return void
      memory: split cpu_physical_memory_* functions to its own include
      memory: unfold memory_region_test_and_clear()
      kvm: use directly cpu_physical_memory_* api for tracking dirty pages
      kvm: refactor start address calculation
      memory: move bitmap synchronization to its own function
      memory: syncronize kvm bitmap using bitmaps operations
      ram: split function that synchronizes a range
      migration: synchronize memory bitmap 64bits at a time

 arch_init.c                    |  57 ++++++++++++----
 cputlb.c                       |  10 +--
 exec.c                         |  75 ++++++++++-----------
 include/exec/cpu-all.h         |   4 +-
 include/exec/cpu-common.h      |   4 --
 include/exec/memory-internal.h |  84 ------------------------
 include/exec/memory-physical.h | 143 +++++++++++++++++++++++++++++++++++++++++
 include/exec/memory.h          |  10 +--
 include/qemu/bitmap.h          |   9 +++
 kvm-all.c                      |  28 ++------
 memory.c                       |  17 ++---
 11 files changed, 260 insertions(+), 181 deletions(-)
 create mode 100644 include/exec/memory-physical.h



Tested-by: Chegu Vinod <chegu_vinod@hp.com>

-------

Hi Juan,


Here are some results from migrating a couple of *big fat* guests using
TCP migration and RDMA migration; the last run also had a workload
running in the guest.  As one would expect, there were noticeable
improvements.

Please see below.

FYI
Vinod


------

Migration speed limit    : 20G
Migration downtime limit : 2s
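
For reference, these settings correspond to QEMU monitor commands along
the lines of the following (the destination address is a placeholder):

(qemu) migrate_set_speed 20G
(qemu) migrate_set_downtime 2
(qemu) migrate -d tcp:<dest-host>:<port>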

I) Without Juan's bitmap optimization patches (i.e. current upstream):

Freezes were observed at the start of migration and at times during the
pre-copy phase, along with longer than expected downtime.

a) 20VCPU/256GB (TCP migration)

A freeze of ~1 second in the guest (as measured by Juan's timer script).

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: completed
total time: 97048 milliseconds
downtime: 3740 milliseconds
setup: 6912 milliseconds
transferred ram: 5734321 kbytes
throughput: 4243.94 mbps
remaining ram: 0 kbytes
total ram: 268444252 kbytes
duplicate: 65856255 pages
skipped: 0 pages
normal: 1286361 pages
normal bytes: 5145444 kbytes


b) 40VCPU/512GB (TCP migration)

A freeze of ~7 seconds in the guest (as measured by Juan's timer script).

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: completed
total time: 238957 milliseconds
downtime: 5700 milliseconds
setup: 14062 milliseconds
transferred ram: 10461990 kbytes
throughput: 4223.74 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131953694 pages
skipped: 0 pages
normal: 2321019 pages
normal bytes: 9284076 kbytes


------

II) With Juan's v2 bitmap optimization patches:

The actual downtime is lower and closer to the expected value... and
that's good!

No multi-second freezes were seen inside the guest at the start of
migration or during the pre-copy phase!

a) 20VCPU/256GB (TCP migration)

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: completed
total time: 84626 milliseconds
downtime: 1893 milliseconds
setup: 296 milliseconds
transferred ram: 5791133 kbytes
throughput: 4841.76 mbps
remaining ram: 0 kbytes
total ram: 268444252 kbytes
duplicate: 65841383 pages
skipped: 0 pages
normal: 1300569 pages
normal bytes: 5202276 kbytes


b) 40VCPU/512GB (TCP migration)


(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: completed
total time: 239477 milliseconds
downtime: 1508 milliseconds
setup: 1171 milliseconds
transferred ram: 10584740 kbytes
throughput: 3570.72 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131934489 pages
skipped: 0 pages
normal: 2351688 pages
normal bytes: 9406752 kbytes


c) 40VCPU/512GB (RDMA migration)

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: completed
total time: 174542 milliseconds
downtime: 1697 milliseconds
setup: 1140 milliseconds
transferred ram: 11739842 kbytes
throughput: 9987.07 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131722040 pages
skipped: 0 pages
normal: 2902087 pages
normal bytes: 11608348 kbytes


d) 40VCPU/512GB (RDMA migration)
 (Guest running SPECjbb2005 with 24 warehouse threads; the guest was
 ~60% loaded.)

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off 
Migration status: completed
total time: 154236 milliseconds
downtime: 2314 milliseconds
setup: 575 milliseconds
transferred ram: 19198318 kbytes
throughput: 19404.17 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 130647356 pages
skipped: 0 pages
normal: 4766514 pages
normal bytes: 19066056 kbytes