From: Chegu Vinod <chegu_vinod@hp.com>
To: Juan Quintela <quintela@redhat.com>,
qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
Karen Noel <knoel@redhat.com>,
Orit Wasserman <owasserm@redhat.com>,
Eric Blake <eblake@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
Date: Fri, 08 Nov 2013 07:18:32 -0800
Message-ID: <527D00C8.5080707@hp.com>
In-Reply-To: <1383743088-8139-1-git-send-email-quintela@redhat.com>
On 11/6/2013 5:04 AM, Juan Quintela wrote:
> Hi
>
> [v2]
> In this version:
> - fixed all the comments from last versions (thanks Eric)
> - kvm migration bitmap is synchronized using bitmap operations
> - qemu bitmap -> migration bitmap is synchronized using bitmap operations
> If bitmaps are not properly aligned, we fall back to old code.
> Code survives virt-tests, so should be in quite good shape.
>
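A minimal sketch of the "bitmap operations when aligned, old code otherwise" idea described above. This is not the code from the series: `sync_range`, `sync_bit` and `popcount_long` are hypothetical names, and the sketch assumes both bitmaps are indexed by the same page number.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Portable population count (hypothetical helper, not a QEMU function). */
static size_t popcount_long(unsigned long w)
{
    size_t n = 0;
    while (w) {
        w &= w - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Old-style path: copy one dirty bit, report whether it was newly set. */
static size_t sync_bit(unsigned long *dst, const unsigned long *src, size_t nr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_LONG);
    size_t word = nr / BITS_PER_LONG;

    if ((src[word] & mask) && !(dst[word] & mask)) {
        dst[word] |= mask;
        return 1;
    }
    return 0;
}

/* Merge dirty bits for pages [start, start + npages) from src into dst.
 * Returns how many pages became dirty in dst. */
static size_t sync_range(unsigned long *dst, const unsigned long *src,
                         size_t start, size_t npages)
{
    size_t end = start + npages, page = start, newly = 0;

    if (start % BITS_PER_LONG == 0) {
        /* Fast path: aligned start, handle one whole word per iteration. */
        for (; page + BITS_PER_LONG <= end; page += BITS_PER_LONG) {
            size_t word = page / BITS_PER_LONG;
            unsigned long fresh = src[word] & ~dst[word];

            dst[word] |= fresh;
            newly += popcount_long(fresh);
        }
    }
    /* Fallback: unaligned range, or the tail of an aligned one. */
    for (; page < end; page++) {
        newly += sync_bit(dst, src, page);
    }
    return newly;
}
```

With a 64-bit `unsigned long`, the fast path touches one word per 64 pages, which is why a region whose first page number is not a multiple of 64 ends up on the slow path in this sketch.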
> ToDo list:
>
> - vga ram by default is not aligned to a page number that is a multiple
> of 64, so it could be optimized. Kraxel? It syncs the kvm bitmap at
> least once a second or so? The bitmap is only 2048 pages (16MB by default).
> We need to change the ram_addr only.
>
> - vga: still more, after we finish migration, vga code continues
> synchronizing the kvm bitmap on source machine. Notice that there
> is no graphics client connected to the VGA. Worth investigating?
>
> - I haven't yet measured speed differences on big hosts. Vinod?
>
> - Depending on performance, more optimizations to do.
>
> - debugging printf's are still in the code, just to see if we are taking
> (or not) the optimized paths.
>
> And that is all. Please test & comment.
>
> Thanks, Juan.
>
> [v1]
> This series splits the dirty bitmap (8 bits per page, only three used)
> into 3 individual bitmaps. Once the conversion is done, operations
> are handled by bitmap operations, not bit by bit.
>
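To make the layout change concrete, here is a small sketch of going from one flag byte per page to one bitmap per client, plus a range set that fills whole words. It is an illustration under assumptions, not the code in patch 19: `split_flags`, `range_set`, `bit_is_set` and `NUM_CLIENTS` are made-up names, and the mapping of flag bit c to bitmap c is assumed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define NUM_CLIENTS 3   /* illustrative: one bitmap per dirty-memory client */

static int bit_is_set(const unsigned long *map, size_t nr)
{
    return (int)((map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
}

/* Old layout -> new layout: bit c of flags[page] becomes bit 'page'
 * of maps[c]. The caller provides zeroed bitmaps that are large enough. */
static void split_flags(const unsigned char *flags, size_t npages,
                        unsigned long *maps[NUM_CLIENTS])
{
    for (size_t page = 0; page < npages; page++) {
        for (int c = 0; c < NUM_CLIENTS; c++) {
            if (flags[page] & (1u << c)) {
                maps[c][page / BITS_PER_LONG] |=
                    1UL << (page % BITS_PER_LONG);
            }
        }
    }
}

/* Mark pages [start, start + len) dirty, one word at a time. */
static void range_set(unsigned long *map, size_t start, size_t len)
{
    size_t end = start + len;

    while (start < end) {
        size_t off = start % BITS_PER_LONG;
        size_t n = BITS_PER_LONG - off;       /* bits left in this word */
        unsigned long mask;

        if (n > end - start) {
            n = end - start;
        }
        /* n == BITS_PER_LONG only happens with off == 0: a full word. */
        mask = (n == BITS_PER_LONG) ? ~0UL : (((1UL << n) - 1) << off);
        map[start / BITS_PER_LONG] |= mask;
        start += n;
    }
}
```

Once each client has its own bitmap, setting or clearing a range only touches that client's words, which is what lets a range operation replace the per-page flag updates.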
> - *_DIRTY_FLAG flags are gone, now we use memory.h DIRTY_MEMORY_*
> everywhere.
>
> - We set/reset each flag individually; calls like
> set_dirty_flags(0xff & ~CODE_DIRTY_FLAG) are gone.
>
> - Rename several functions to clarify/make consistent things.
>
> - I know it doesn't pass checkpatch because of long lines; a proper
> submission should pass it. We have to have long lines, short variable
> names, or ugly line splitting :p
>
> - DIRTY_MEMORY_NUM: how can one include exec/memory.h into cpu-all.h?
> #include-ing it doesn't work; as a workaround, I have copied its value,
> but any better idea? I can always create "exec/migration-flags.h", though.
>
> - The meat of the code is patch 19. Rest of patches are quite easy
> (even that one is not too complex).
>
> The only optimizations done so far are in
> set_dirty_range()/clear_dirty_range(), which now operate with
> bitmap_set/clear.
>
> Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for xen,
> see comment on patch.
>
> It passes virt-test migration tests, so it should be perfect.
>
> I post it to ask for comments.
>
> ToDo list:
>
> - create a lock for the bitmaps and fold migration bitmap into this
> one. This would avoid a copy and make things easier?
>
> - As this code uses/abuses bitmaps, we need to change the type of the
> index from int to long. With an int index, we can only access a
> maximum of 8TB guest (yes, this is not urgent, we have a couple of
> years to do it).
>
> - merging KVM <-> QEMU bitmap as a bitmap and not bit-by-bit.
>
> - splitting the KVM bitmap synchronization into chunks, i.e. not
> synchronizing all memory, just enough to continue with migration.
>
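The 8TB figure in the int-vs-long item can be checked with a few lines of arithmetic. This sketch assumes 4 KiB target pages and a 32-bit signed `int`; `max_guest_bytes` and `PAGE_SHIFT` are names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB target pages */

/* Bytes of guest RAM whose page numbers fit in a signed index that is
 * index_bits wide. Valid for 1 <= index_bits <= 52 so that the result
 * still fits in 64 bits. */
static uint64_t max_guest_bytes(unsigned index_bits)
{
    assert(index_bits >= 1 && index_bits <= 52);

    /* A signed index of that width can number 2^(index_bits - 1) pages. */
    uint64_t pages = UINT64_C(1) << (index_bits - 1);

    return pages << PAGE_SHIFT;
}
```

For a 32-bit int that is 2^31 pages of 2^12 bytes each, i.e. 2^43 bytes, which matches the 8TB limit quoted above.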
> Any further ideas/needs?
>
> Thanks, Juan.
>
> PS. Why did it take so long?
>
> Because I was trying to integrate the bitmap into the MemoryRegion
> abstraction. It would have made the code cleaner, but I hit dead-end
> after dead-end. In practical terms, TCG doesn't know about
> MemoryRegions: it has been ported to run on top of them, but
> doesn't use them effectively.
>
>
> The following changes since commit c2d30667760e3d7b81290d801e567d4f758825ca:
>
> rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)
>
> are available in the git repository at:
>
> git://github.com/juanquintela/qemu.git bitmap-v2.next
>
> for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:
>
> migration: synchronize memory bitmap 64bits at a time (2013-11-06 13:54:56 +0100)
>
> ----------------------------------------------------------------
> Juan Quintela (39):
> Move prototypes to memory.h
> memory: cpu_physical_memory_set_dirty_flags() result is never used
> memory: cpu_physical_memory_set_dirty_range() return void
> exec: use accessor function to know if memory is dirty
> memory: create function to set a single dirty bit
> exec: create function to get a single dirty bit
> memory: make cpu_physical_memory_is_dirty return bool
> exec: simplify notdirty_mem_write()
> memory: all users of cpu_physical_memory_get_dirty used only one flag
> memory: set single dirty flags when possible
> memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
> memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
> memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
> memory: use bit 2 for migration
> memory: make sure that client is always inside range
> memory: only resize dirty bitmap when memory size increases
> memory: cpu_physical_memory_clear_dirty_flag() result is never used
> bitmap: Add bitmap_zero_extend operation
> memory: split dirty bitmap into three
> memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
> memory: unfold cpu_physical_memory_set_dirty() in its only user
> memory: unfold cpu_physical_memory_set_dirty_flag()
> memory: make cpu_physical_memory_get_dirty() the main function
> memory: cpu_physical_memory_get_dirty() is used as returning a bool
> memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
> memory: use find_next_bit() to find dirty bits
> memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
> memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
> memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
> memory: make cpu_physical_memory_reset_dirty() take a length parameter
> memory: cpu_physical_memory_set_dirty_tracking() should return void
> memory: split cpu_physical_memory_* functions to its own include
> memory: unfold memory_region_test_and_clear()
> kvm: use directly cpu_physical_memory_* api for tracking dirty pages
> kvm: refactor start address calculation
> memory: move bitmap synchronization to its own function
> memory: syncronize kvm bitmap using bitmaps operations
> ram: split function that synchronizes a range
> migration: synchronize memory bitmap 64bits at a time
>
> arch_init.c | 57 ++++++++++++----
> cputlb.c | 10 +--
> exec.c | 75 ++++++++++-----------
> include/exec/cpu-all.h | 4 +-
> include/exec/cpu-common.h | 4 --
> include/exec/memory-internal.h | 84 ------------------------
> include/exec/memory-physical.h | 143 +++++++++++++++++++++++++++++++++++++++++
> include/exec/memory.h | 10 +--
> include/qemu/bitmap.h | 9 +++
> kvm-all.c | 28 ++------
> memory.c | 17 ++---
> 11 files changed, 260 insertions(+), 181 deletions(-)
> create mode 100644 include/exec/memory-physical.h
> .
>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
-------
Hi Juan,
Here are some results from migrating a couple of *big fat* guests using TCP migration and RDMA migration; the last run was with a workload. As one would expect, there were noticeable improvements.
Please see below.
FYI
Vinod
------
Migrate speed : 20G
Migrate downtime : 2s
I) Without Juan's bitmap optimization patches : (i.e. current upstream)
Freezes observed during the start and at times during the pre-copy phase.
Longer than expected downtime.
a) 20VCPU/256GB (TCP migration)
A freeze of ~1 second in the guest. (as measured by Juan's timer script)
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 97048 milliseconds
downtime: 3740 milliseconds
setup: 6912 milliseconds
transferred ram: 5734321 kbytes
throughput: 4243.94 mbps
remaining ram: 0 kbytes
total ram: 268444252 kbytes
duplicate: 65856255 pages
skipped: 0 pages
normal: 1286361 pages
normal bytes: 5145444 kbytes
b) 40VCPU/512GB (TCP migration)
A freeze of ~7 seconds in the guest. (as measured by Juan's timer script)
info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 238957 milliseconds
downtime: 5700 milliseconds
setup: 14062 milliseconds
transferred ram: 10461990 kbytes
throughput: 4223.74 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131953694 pages
skipped: 0 pages
normal: 2321019 pages
normal bytes: 9284076 kbytes
------
II) With Juan's v2 bitmap optimization patches :
The actual downtime is lower and closer to the expected value, and that's good!
No multi-second freezes inside the guest during the start of migration or
during the pre-copy phase!
a) 20VCPU/256GB (TCP migration)
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 84626 milliseconds
downtime: 1893 milliseconds
setup: 296 milliseconds
transferred ram: 5791133 kbytes
throughput: 4841.76 mbps
remaining ram: 0 kbytes
total ram: 268444252 kbytes
duplicate: 65841383 pages
skipped: 0 pages
normal: 1300569 pages
normal bytes: 5202276 kbytes
b) 40VCPU/512GB (TCP migration)
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 239477 milliseconds
downtime: 1508 milliseconds
setup: 1171 milliseconds
transferred ram: 10584740 kbytes
throughput: 3570.72 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131934489 pages
skipped: 0 pages
normal: 2351688 pages
normal bytes: 9406752 kbytes
c) 40VCPU/512GB (RDMA migration)
(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 174542 milliseconds
downtime: 1697 milliseconds
setup: 1140 milliseconds
transferred ram: 11739842 kbytes
throughput: 9987.07 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 131722040 pages
skipped: 0 pages
normal: 2902087 pages
normal bytes: 11608348 kbytes
d) 40VCPU/512GB (RDMA migration)
(Guest running SpecJBB2005 24 warehouse threads (guest was ~60% loaded)).
info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 154236 milliseconds
downtime: 2314 milliseconds
setup: 575 milliseconds
transferred ram: 19198318 kbytes
throughput: 19404.17 mbps
remaining ram: 0 kbytes
total ram: 536879712 kbytes
duplicate: 130647356 pages
skipped: 0 pages
normal: 4766514 pages
normal bytes: 19066056 kbytes
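For anyone who wants to compare the TCP runs in I) and II) side by side, a small helper like the one below can be used. It is only a convenience sketch: `percent_reduction` and `pages_to_kbytes` are hypothetical names, and the 4 KiB page size is an assumption that happens to agree with the "normal" and "normal bytes" rows above.

```c
#include <assert.h>
#include <stdint.h>

/* Relative reduction, in percent, from 'before' to 'after'
 * (e.g. downtime or setup time in milliseconds). */
static double percent_reduction(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}

/* Convert a page count to kbytes, assuming 4 KiB pages. */
static uint64_t pages_to_kbytes(uint64_t pages)
{
    return pages * 4;
}
```

Fed with the numbers above, `percent_reduction(3740, 1893)` gives roughly 49% less downtime for the 20VCPU/256GB guest and `percent_reduction(5700, 1508)` roughly 74% for the 40VCPU/512GB guest; the setup times drop by more than 90% in both cases.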