qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: Juan Quintela <quintela@redhat.com>
Cc: chegu_vinod@hp.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
Date: Mon, 25 Nov 2013 14:15:44 +0800	[thread overview]
Message-ID: <5292EB10.4030705@linux.vnet.ibm.com> (raw)
In-Reply-To: <1383743088-8139-1-git-send-email-quintela@redhat.com>

On 11/06/2013 09:04 PM, Juan Quintela wrote:
> Hi
>
> [v2]
> In this version:
> - fixed all the comments from last versions (thanks Eric)
> - kvm migration bitmap is synchronized using bitmap operations
> - qemu bitmap -> migration bitmap is synchronized using bitmap operations
> If bitmaps are not properly aligned, we fall back to old code.
> Code survives virt-tests, so should be in quite good shape.
>
> ToDo list:
>
> - vga ram by default is not aligned in a page number multiple of 64,
>
>    it could be optimized.  Kraxel?  It syncs the kvm bitmap at least 1
>    a second or so? bitmap is only 2048 pages (16MB by default).
>    We need to change the ram_addr only
>
> - vga: still more, after we finish migration, vga code continues
>    synchronizing the kvm bitmap on source machine.  Notice that there
>    is no graphics client connected to the VGA.  Worth investigating?
>
> - I haven't yet meassure speed differences on big hosts.  Vinod?
>
> - Depending of performance, more optimizations to do.
>
> - debugging printf's still on the code, just to see if we are taking
>    (or not) the optimized paths.
>
> And that is all.  Please test & comment.
>
> Thanks, Juan.
>
> [v1]
> This series split the dirty bitmap (8 bits per page, only three used)
> into 3 individual bitmaps.  Once the conversion is done, operations
> are handled by bitmap operations, not bit by bit.
>
> - *_DIRTY_FLAG flags are gone, now we use memory.h DIRTY_MEMORY_*
>     everywhere.
>
> - We set/reset each flag individually
>    (set_dirty_flags(0xff&~CODE_DIRTY_FLAG)) are gone.
>
> - Rename several functions to clarify/make consistent things.
>
> - I know it dont't pass checkpatch for long lines, propper submission
>    should pass it. We have to have long lines, short variable names, or
>    ugly line splitting :p
>
> - DIRTY_MEMORY_NUM: how can one include exec/memory.h into cpu-all.h?
>    #include it don't work, as a workaround, I have copied its value, but
>    any better idea?  I can always create "exec/migration-flags.h", though.
>
> - The meat of the code is patch 19.  Rest of patches are quite easy
> (even that one is not too complex).
>
> Only optimizations done so far are
> set_dirty_range()/clear_dirty_range() that now operates with
> bitmap_set/clear.
>
> Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for xen,
> see comment on patch.
>
> It passes virt-test migration tests, so it should be perfect.
>
> I post it to ask for comments.
>
> ToDo list:
>
> - create a lock for the bitmaps and fold migration bitmap into this
>    one.  This would avoid a copy and make things easier?
>
> - As this code uses/abuses bitmaps, we need to change the type of the
>    index from int to long.  With an int index, we can only access a
>    maximum of 8TB guest (yes, this is not urgent, we have a couple of
>    years to do it).
>
> - merging KVM <-> QEMU bitmap as a bitmap and not bit-by-bit.
>
> - spliting the KVM bitmap synchronization into chunks, i.e. not
>    synchronize all memory, just enough to continue with migration.
>
> Any further ideas/needs?
>
> Thanks, Juan.
>
> PD.  Why it took so long?
>
>       Because I was trying to integrate the bitmap on the MemoryRegion
>       abstraction.  Would have make the code cleaner, but hit dead-end
>       after dead-end.  As practical terms, TCG don't know about
>       MemoryRegions, it has been ported to run on top of them, but
>       don't use them effective
>
>
> The following changes since commit c2d30667760e3d7b81290d801e567d4f758825ca:
>
>    rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)
>
> are available in the git repository at:
>
>    git://github.com/juanquintela/qemu.git bitmap-v2.next
>
> for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:
>
>    migration: synchronize memory bitmap 64bits at a time (2013-11-06 13:54:56 +0100)
>
> ----------------------------------------------------------------
> Juan Quintela (39):
>        Move prototypes to memory.h
>        memory: cpu_physical_memory_set_dirty_flags() result is never used
>        memory: cpu_physical_memory_set_dirty_range() return void
>        exec: use accessor function to know if memory is dirty
>        memory: create function to set a single dirty bit
>        exec: create function to get a single dirty bit
>        memory: make cpu_physical_memory_is_dirty return bool
>        exec: simplify notdirty_mem_write()
>        memory: all users of cpu_physical_memory_get_dirty used only one flag
>        memory: set single dirty flags when possible
>        memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
>        memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
>        memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
>        memory: use bit 2 for migration
>        memory: make sure that client is always inside range
>        memory: only resize dirty bitmap when memory size increases
>        memory: cpu_physical_memory_clear_dirty_flag() result is never used
>        bitmap: Add bitmap_zero_extend operation
>        memory: split dirty bitmap into three
>        memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
>        memory: unfold cpu_physical_memory_set_dirty() in its only user
>        memory: unfold cpu_physical_memory_set_dirty_flag()
>        memory: make cpu_physical_memory_get_dirty() the main function
>        memory: cpu_physical_memory_get_dirty() is used as returning a bool
>        memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
>        memory: use find_next_bit() to find dirty bits
>        memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
>        memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
>        memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
>        memory: make cpu_physical_memory_reset_dirty() take a length parameter
>        memory: cpu_physical_memory_set_dirty_tracking() should return void
>        memory: split cpu_physical_memory_* functions to its own include
>        memory: unfold memory_region_test_and_clear()
>        kvm: use directly cpu_physical_memory_* api for tracking dirty pages
>        kvm: refactor start address calculation
>        memory: move bitmap synchronization to its own function
>        memory: syncronize kvm bitmap using bitmaps operations
>        ram: split function that synchronizes a range
>        migration: synchronize memory bitmap 64bits at a time
>
>   arch_init.c                    |  57 ++++++++++++----
>   cputlb.c                       |  10 +--
>   exec.c                         |  75 ++++++++++-----------
>   include/exec/cpu-all.h         |   4 +-
>   include/exec/cpu-common.h      |   4 --
>   include/exec/memory-internal.h |  84 ------------------------
>   include/exec/memory-physical.h | 143 +++++++++++++++++++++++++++++++++++++++++
>   include/exec/memory.h          |  10 +--
>   include/qemu/bitmap.h          |   9 +++
>   kvm-all.c                      |  28 ++------
>   memory.c                       |  17 ++---
>   11 files changed, 260 insertions(+), 181 deletions(-)
>   create mode 100644 include/exec/memory-physical.h
>

Well done! The performance gains are very impressive.
I also just completed a measurement of this patch for the MC fault 
tolerance implementation that I posted last month:

Here is an example using a 12GB virtual machine (not as big as Vinod's, 
but not small either),
with most of the RAM being aggressively dirtied by a small C program.

With MC (micro checkpointing enabled, checkpointing 10 times per second),
here is the *per-checkpointing* overheads of the dirty bitmap:

1) BEFORE Juan's Patch:
a) Time to retrieve LOGDIRTY from KVM:  33 milliseconds
b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: 86 
milliseconds

2) AFTER Juan's Patch:
a) Time to retrieve LOGDIRTY from KVM:  17 milliseconds
b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: < 1 
millisecond

Enormous improvements - very nice.

- Michael

  parent reply	other threads:[~2013-11-25  6:16 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h Juan Quintela
2013-11-07  9:36   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used Juan Quintela
2013-11-07  9:42   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void Juan Quintela
2013-11-07  9:51   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty Juan Quintela
2013-11-07 10:00   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit Juan Quintela
2013-11-07 11:37   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 06/39] exec: create function to get " Juan Quintela
2013-11-06 23:11   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 07/39] memory: make cpu_physical_memory_is_dirty return bool Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 08/39] exec: simplify notdirty_mem_write() Juan Quintela
2013-11-06 23:12   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 09/39] memory: all users of cpu_physical_memory_get_dirty used only one flag Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 10/39] memory: set single dirty flags when possible Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 11/39] memory: cpu_physical_memory_set_dirty_range() allways dirty all flags Juan Quintela
2013-11-06 23:15   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 12/39] memory: cpu_physical_memory_mask_dirty_range() always clear a single flag Juan Quintela
2013-11-06 23:18   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 13/39] memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG Juan Quintela
2013-11-06 23:24   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 14/39] memory: use bit 2 for migration Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 15/39] memory: make sure that client is always inside range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 16/39] memory: only resize dirty bitmap when memory size increases Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 17/39] memory: cpu_physical_memory_clear_dirty_flag() result is never used Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 18/39] bitmap: Add bitmap_zero_extend operation Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 19/39] memory: split dirty bitmap into three Juan Quintela
2013-11-06 23:39   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 20/39] memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 21/39] memory: unfold cpu_physical_memory_set_dirty() " Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 22/39] memory: unfold cpu_physical_memory_set_dirty_flag() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 23/39] memory: make cpu_physical_memory_get_dirty() the main function Juan Quintela
2013-11-06 23:44   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 24/39] memory: cpu_physical_memory_get_dirty() is used as returning a bool Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 25/39] memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 26/39] memory: use find_next_bit() to find dirty bits Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 27/39] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations Juan Quintela
2013-11-06 23:49   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 28/39] memory: cpu_physical_memory_clear_dirty_range() " Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 29/39] memory: s/dirty/clean/ in cpu_physical_memory_is_dirty() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 30/39] memory: make cpu_physical_memory_reset_dirty() take a length parameter Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 31/39] memory: cpu_physical_memory_set_dirty_tracking() should return void Juan Quintela
2013-11-06 23:55   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 32/39] memory: split cpu_physical_memory_* functions to its own include Juan Quintela
2013-11-06 15:54   ` Paolo Bonzini
2013-11-06 13:04 ` [Qemu-devel] [PATCH 33/39] memory: unfold memory_region_test_and_clear() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 34/39] kvm: use directly cpu_physical_memory_* api for tracking dirty pages Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 35/39] kvm: refactor start address calculation Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function Juan Quintela
2013-11-06 15:56   ` Paolo Bonzini
2013-11-06 16:22     ` Juan Quintela
2013-11-06 16:23       ` Paolo Bonzini
2013-12-18  8:54       ` Alexander Graf
2013-11-06 13:04 ` [Qemu-devel] [PATCH 37/39] memory: syncronize kvm bitmap using bitmaps operations Juan Quintela
2013-11-06 15:58   ` Paolo Bonzini
2013-11-06 13:04 ` [Qemu-devel] [PATCH 38/39] ram: split function that synchronizes a range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 39/39] migration: synchronize memory bitmap 64bits at a time Juan Quintela
2013-11-06 14:37 ` [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Gerd Hoffmann
2013-11-06 15:38   ` Juan Quintela
2013-11-07 11:23     ` Gerd Hoffmann
2013-11-06 15:49   ` Paolo Bonzini
2013-11-25  6:39     ` Michael R. Hines
2013-11-25  9:45       ` Paolo Bonzini
2013-11-08 15:18 ` Chegu Vinod
2013-11-25  6:15 ` Michael R. Hines [this message]
2013-11-25  6:58   ` Michael R. Hines

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5292EB10.4030705@linux.vnet.ibm.com \
    --to=mrhines@linux.vnet.ibm.com \
    --cc=chegu_vinod@hp.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).