All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: Juan Quintela <quintela@redhat.com>
Cc: chegu_vinod@hp.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
Date: Mon, 25 Nov 2013 14:15:44 +0800	[thread overview]
Message-ID: <5292EB10.4030705@linux.vnet.ibm.com> (raw)
In-Reply-To: <1383743088-8139-1-git-send-email-quintela@redhat.com>

On 11/06/2013 09:04 PM, Juan Quintela wrote:
> Hi
>
> [v2]
> In this version:
> - fixed all the comments from last versions (thanks Eric)
> - kvm migration bitmap is synchronized using bitmap operations
> - qemu bitmap -> migration bitmap is synchronized using bitmap operations
> If bitmaps are not properly aligned, we fall back to old code.
> Code survives virt-tests, so should be in quite good shape.
>
> ToDo list:
>
> - vga ram by default is not aligned in a page number multiple of 64,
>
>    it could be optimized.  Kraxel?  It syncs the kvm bitmap at least 1
>    a second or so? bitmap is only 2048 pages (16MB by default).
>    We need to change the ram_addr only
>
> - vga: still more, after we finish migration, vga code continues
>    synchronizing the kvm bitmap on source machine.  Notice that there
>    is no graphics client connected to the VGA.  Worth investigating?
>
> - I haven't yet meassure speed differences on big hosts.  Vinod?
>
> - Depending of performance, more optimizations to do.
>
> - debugging printf's still on the code, just to see if we are taking
>    (or not) the optimized paths.
>
> And that is all.  Please test & comment.
>
> Thanks, Juan.
>
> [v1]
> This series split the dirty bitmap (8 bits per page, only three used)
> into 3 individual bitmaps.  Once the conversion is done, operations
> are handled by bitmap operations, not bit by bit.
>
> - *_DIRTY_FLAG flags are gone, now we use memory.h DIRTY_MEMORY_*
>     everywhere.
>
> - We set/reset each flag individually
>    (set_dirty_flags(0xff&~CODE_DIRTY_FLAG)) are gone.
>
> - Rename several functions to clarify/make consistent things.
>
> - I know it dont't pass checkpatch for long lines, propper submission
>    should pass it. We have to have long lines, short variable names, or
>    ugly line splitting :p
>
> - DIRTY_MEMORY_NUM: how can one include exec/memory.h into cpu-all.h?
>    #include it don't work, as a workaround, I have copied its value, but
>    any better idea?  I can always create "exec/migration-flags.h", though.
>
> - The meat of the code is patch 19.  Rest of patches are quite easy
> (even that one is not too complex).
>
> Only optimizations done so far are
> set_dirty_range()/clear_dirty_range() that now operates with
> bitmap_set/clear.
>
> Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for xen,
> see comment on patch.
>
> It passes virt-test migration tests, so it should be perfect.
>
> I post it to ask for comments.
>
> ToDo list:
>
> - create a lock for the bitmaps and fold migration bitmap into this
>    one.  This would avoid a copy and make things easier?
>
> - As this code uses/abuses bitmaps, we need to change the type of the
>    index from int to long.  With an int index, we can only access a
>    maximum of 8TB guest (yes, this is not urgent, we have a couple of
>    years to do it).
>
> - merging KVM <-> QEMU bitmap as a bitmap and not bit-by-bit.
>
> - spliting the KVM bitmap synchronization into chunks, i.e. not
>    synchronize all memory, just enough to continue with migration.
>
> Any further ideas/needs?
>
> Thanks, Juan.
>
> PD.  Why it took so long?
>
>       Because I was trying to integrate the bitmap on the MemoryRegion
>       abstraction.  Would have make the code cleaner, but hit dead-end
>       after dead-end.  As practical terms, TCG don't know about
>       MemoryRegions, it has been ported to run on top of them, but
>       don't use them effective
>
>
> The following changes since commit c2d30667760e3d7b81290d801e567d4f758825ca:
>
>    rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)
>
> are available in the git repository at:
>
>    git://github.com/juanquintela/qemu.git bitmap-v2.next
>
> for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:
>
>    migration: synchronize memory bitmap 64bits at a time (2013-11-06 13:54:56 +0100)
>
> ----------------------------------------------------------------
> Juan Quintela (39):
>        Move prototypes to memory.h
>        memory: cpu_physical_memory_set_dirty_flags() result is never used
>        memory: cpu_physical_memory_set_dirty_range() return void
>        exec: use accessor function to know if memory is dirty
>        memory: create function to set a single dirty bit
>        exec: create function to get a single dirty bit
>        memory: make cpu_physical_memory_is_dirty return bool
>        exec: simplify notdirty_mem_write()
>        memory: all users of cpu_physical_memory_get_dirty used only one flag
>        memory: set single dirty flags when possible
>        memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
>        memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
>        memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
>        memory: use bit 2 for migration
>        memory: make sure that client is always inside range
>        memory: only resize dirty bitmap when memory size increases
>        memory: cpu_physical_memory_clear_dirty_flag() result is never used
>        bitmap: Add bitmap_zero_extend operation
>        memory: split dirty bitmap into three
>        memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
>        memory: unfold cpu_physical_memory_set_dirty() in its only user
>        memory: unfold cpu_physical_memory_set_dirty_flag()
>        memory: make cpu_physical_memory_get_dirty() the main function
>        memory: cpu_physical_memory_get_dirty() is used as returning a bool
>        memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
>        memory: use find_next_bit() to find dirty bits
>        memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
>        memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
>        memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
>        memory: make cpu_physical_memory_reset_dirty() take a length parameter
>        memory: cpu_physical_memory_set_dirty_tracking() should return void
>        memory: split cpu_physical_memory_* functions to its own include
>        memory: unfold memory_region_test_and_clear()
>        kvm: use directly cpu_physical_memory_* api for tracking dirty pages
>        kvm: refactor start address calculation
>        memory: move bitmap synchronization to its own function
>        memory: syncronize kvm bitmap using bitmaps operations
>        ram: split function that synchronizes a range
>        migration: synchronize memory bitmap 64bits at a time
>
>   arch_init.c                    |  57 ++++++++++++----
>   cputlb.c                       |  10 +--
>   exec.c                         |  75 ++++++++++-----------
>   include/exec/cpu-all.h         |   4 +-
>   include/exec/cpu-common.h      |   4 --
>   include/exec/memory-internal.h |  84 ------------------------
>   include/exec/memory-physical.h | 143 +++++++++++++++++++++++++++++++++++++++++
>   include/exec/memory.h          |  10 +--
>   include/qemu/bitmap.h          |   9 +++
>   kvm-all.c                      |  28 ++------
>   memory.c                       |  17 ++---
>   11 files changed, 260 insertions(+), 181 deletions(-)
>   create mode 100644 include/exec/memory-physical.h
>

Well done! The performance gains are very impressive.
I also just completed a measurement of this patch for the MC fault 
tolerance implementation that I posted last month:

Here is an example using a 12GB virtual machine (not as big as Vinod's, 
but not small either),
with most of the RAM being aggressively dirtied by a small C program.

With MC (micro checkpointing enabled, checkpointing 10 times per second),
here is the *per-checkpointing* overheads of the dirty bitmap:

1) BEFORE Juan's Patch:
a) Time to retrieve LOGDIRTY from KVM:  33 milliseconds
b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: 86 
milliseconds

2) AFTER Juan's Patch:
a) Time to retrieve LOGDIRTY from KVM:  17 milliseconds
b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: < 1 
millisecond

Enormous improvements - very nice.

- Michael

  parent reply	other threads:[~2013-11-25  6:16 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-06 13:04 [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 01/39] Move prototypes to memory.h Juan Quintela
2013-11-07  9:36   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 02/39] memory: cpu_physical_memory_set_dirty_flags() result is never used Juan Quintela
2013-11-07  9:42   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 03/39] memory: cpu_physical_memory_set_dirty_range() return void Juan Quintela
2013-11-07  9:51   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 04/39] exec: use accessor function to know if memory is dirty Juan Quintela
2013-11-07 10:00   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 05/39] memory: create function to set a single dirty bit Juan Quintela
2013-11-07 11:37   ` Orit Wasserman
2013-11-06 13:04 ` [Qemu-devel] [PATCH 06/39] exec: create function to get " Juan Quintela
2013-11-06 23:11   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 07/39] memory: make cpu_physical_memory_is_dirty return bool Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 08/39] exec: simplify notdirty_mem_write() Juan Quintela
2013-11-06 23:12   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 09/39] memory: all users of cpu_physical_memory_get_dirty used only one flag Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 10/39] memory: set single dirty flags when possible Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 11/39] memory: cpu_physical_memory_set_dirty_range() allways dirty all flags Juan Quintela
2013-11-06 23:15   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 12/39] memory: cpu_physical_memory_mask_dirty_range() always clear a single flag Juan Quintela
2013-11-06 23:18   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 13/39] memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG Juan Quintela
2013-11-06 23:24   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 14/39] memory: use bit 2 for migration Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 15/39] memory: make sure that client is always inside range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 16/39] memory: only resize dirty bitmap when memory size increases Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 17/39] memory: cpu_physical_memory_clear_dirty_flag() result is never used Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 18/39] bitmap: Add bitmap_zero_extend operation Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 19/39] memory: split dirty bitmap into three Juan Quintela
2013-11-06 23:39   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 20/39] memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 21/39] memory: unfold cpu_physical_memory_set_dirty() " Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 22/39] memory: unfold cpu_physical_memory_set_dirty_flag() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 23/39] memory: make cpu_physical_memory_get_dirty() the main function Juan Quintela
2013-11-06 23:44   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 24/39] memory: cpu_physical_memory_get_dirty() is used as returning a bool Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 25/39] memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 26/39] memory: use find_next_bit() to find dirty bits Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 27/39] memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations Juan Quintela
2013-11-06 23:49   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 28/39] memory: cpu_physical_memory_clear_dirty_range() " Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 29/39] memory: s/dirty/clean/ in cpu_physical_memory_is_dirty() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 30/39] memory: make cpu_physical_memory_reset_dirty() take a length parameter Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 31/39] memory: cpu_physical_memory_set_dirty_tracking() should return void Juan Quintela
2013-11-06 23:55   ` Eric Blake
2013-11-06 13:04 ` [Qemu-devel] [PATCH 32/39] memory: split cpu_physical_memory_* functions to its own include Juan Quintela
2013-11-06 15:54   ` Paolo Bonzini
2013-11-06 13:04 ` [Qemu-devel] [PATCH 33/39] memory: unfold memory_region_test_and_clear() Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 34/39] kvm: use directly cpu_physical_memory_* api for tracking dirty pages Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 35/39] kvm: refactor start address calculation Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 36/39] memory: move bitmap synchronization to its own function Juan Quintela
2013-11-06 15:56   ` Paolo Bonzini
2013-11-06 16:22     ` Juan Quintela
2013-11-06 16:23       ` Paolo Bonzini
2013-12-18  8:54       ` Alexander Graf
2013-11-06 13:04 ` [Qemu-devel] [PATCH 37/39] memory: syncronize kvm bitmap using bitmaps operations Juan Quintela
2013-11-06 15:58   ` Paolo Bonzini
2013-11-06 13:04 ` [Qemu-devel] [PATCH 38/39] ram: split function that synchronizes a range Juan Quintela
2013-11-06 13:04 ` [Qemu-devel] [PATCH 39/39] migration: synchronize memory bitmap 64bits at a time Juan Quintela
2013-11-06 14:37 ` [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization Gerd Hoffmann
2013-11-06 15:38   ` Juan Quintela
2013-11-07 11:23     ` Gerd Hoffmann
2013-11-06 15:49   ` Paolo Bonzini
2013-11-25  6:39     ` Michael R. Hines
2013-11-25  9:45       ` Paolo Bonzini
2013-11-08 15:18 ` Chegu Vinod
2013-11-25  6:15 ` Michael R. Hines [this message]
2013-11-25  6:58   ` Michael R. Hines

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5292EB10.4030705@linux.vnet.ibm.com \
    --to=mrhines@linux.vnet.ibm.com \
    --cc=chegu_vinod@hp.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.