From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5292F51A.3030101@linux.vnet.ibm.com>
Date: Mon, 25 Nov 2013 14:58:34 +0800
From: "Michael R.
 Hines"
MIME-Version: 1.0
References: <1383743088-8139-1-git-send-email-quintela@redhat.com> <5292EB10.4030705@linux.vnet.ibm.com>
In-Reply-To: <5292EB10.4030705@linux.vnet.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v2 00/39] bitmap handling optimization
To: Juan Quintela
Cc: chegu_vinod@hp.com, qemu-devel@nongnu.org

On 11/25/2013 02:15 PM, Michael R. Hines wrote:
> On 11/06/2013 09:04 PM, Juan Quintela wrote:
>> Hi
>>
>> [v2]
>> In this version:
>> - fixed all the comments from last versions (thanks Eric)
>> - kvm migration bitmap is synchronized using bitmap operations
>> - qemu bitmap -> migration bitmap is synchronized using bitmap
>>   operations
>>   If bitmaps are not properly aligned, we fall back to old code.
>>   Code survives virt-tests, so should be in quite good shape.
>>
>> ToDo list:
>>
>> - vga ram by default is not aligned in a page number multiple of 64,
>>   it could be optimized. Kraxel? It syncs the kvm bitmap at least
>>   once a second or so? bitmap is only 2048 pages (16MB by default).
>>   We need to change the ram_addr only
>>
>> - vga: still more, after we finish migration, vga code continues
>>   synchronizing the kvm bitmap on source machine. Notice that there
>>   is no graphics client connected to the VGA. Worth investigating?
>>
>> - I haven't yet measured speed differences on big hosts. Vinod?
>>
>> - Depending on performance, more optimizations to do.
>>
>> - debugging printf's still in the code, just to see if we are taking
>>   (or not) the optimized paths.
>>
>> And that is all. Please test & comment.
>>
>> Thanks, Juan.
>>
>> [v1]
>> This series splits the dirty bitmap (8 bits per page, only three used)
>> into 3 individual bitmaps. Once the conversion is done, operations
>> are handled by bitmap operations, not bit by bit.
>>
>> - *_DIRTY_FLAG flags are gone, now we use memory.h DIRTY_MEMORY_*
>>   everywhere.
>>
>> - We set/reset each flag individually; calls like
>>   set_dirty_flags(0xff&~CODE_DIRTY_FLAG) are gone.
>>
>> - Rename several functions to clarify/make consistent things.
>>
>> - I know it doesn't pass checkpatch for long lines; a proper submission
>>   should pass it. We have to have long lines, short variable names, or
>>   ugly line splitting :p
>>
>> - DIRTY_MEMORY_NUM: how can one include exec/memory.h into cpu-all.h?
>>   #including it doesn't work; as a workaround, I have copied its value,
>>   but any better idea? I can always create "exec/migration-flags.h",
>>   though.
>>
>> - The meat of the code is patch 19. Rest of patches are quite easy
>>   (even that one is not too complex).
>>
>> Only optimizations done so far are
>> set_dirty_range()/clear_dirty_range(), which now operate with
>> bitmap_set/clear.
>>
>> Note for Xen: cpu_physical_memory_set_dirty_range() was wrong for xen,
>> see comment on patch.
>>
>> It passes virt-test migration tests, so it should be perfect.
>>
>> I post it to ask for comments.
>>
>> ToDo list:
>>
>> - create a lock for the bitmaps and fold migration bitmap into this
>>   one. This would avoid a copy and make things easier?
>>
>> - As this code uses/abuses bitmaps, we need to change the type of the
>>   index from int to long. With an int index, we can only access a
>>   maximum of 8TB guest (yes, this is not urgent, we have a couple of
>>   years to do it).
>>
>> - merging KVM <-> QEMU bitmap as a bitmap and not bit-by-bit.
>>
>> - splitting the KVM bitmap synchronization into chunks, i.e. not
>>   synchronize all memory, just enough to continue with migration.
>>
>> Any further ideas/needs?
>>
>> Thanks, Juan.
>>
>> PS. Why did it take so long?
>>
>>   Because I was trying to integrate the bitmap on the MemoryRegion
>>   abstraction. It would have made the code cleaner, but hit dead-end
>>   after dead-end.
>>   In practical terms, TCG doesn't know about MemoryRegions; it has
>>   been ported to run on top of them, but doesn't use them effectively.
>>
>>
>> The following changes since commit
>> c2d30667760e3d7b81290d801e567d4f758825ca:
>>
>>   rtc: remove dead SQW IRQ code (2013-11-05 20:04:03 -0800)
>>
>> are available in the git repository at:
>>
>>   git://github.com/juanquintela/qemu.git bitmap-v2.next
>>
>> for you to fetch changes up to d91eff97e6f36612eb22d57c2b6c2623f73d3997:
>>
>>   migration: synchronize memory bitmap 64bits at a time (2013-11-06 13:54:56 +0100)
>>
>> ----------------------------------------------------------------
>> Juan Quintela (39):
>>       Move prototypes to memory.h
>>       memory: cpu_physical_memory_set_dirty_flags() result is never used
>>       memory: cpu_physical_memory_set_dirty_range() return void
>>       exec: use accessor function to know if memory is dirty
>>       memory: create function to set a single dirty bit
>>       exec: create function to get a single dirty bit
>>       memory: make cpu_physical_memory_is_dirty return bool
>>       exec: simplify notdirty_mem_write()
>>       memory: all users of cpu_physical_memory_get_dirty used only one flag
>>       memory: set single dirty flags when possible
>>       memory: cpu_physical_memory_set_dirty_range() allways dirty all flags
>>       memory: cpu_physical_memory_mask_dirty_range() always clear a single flag
>>       memory: use DIRTY_MEMORY_* instead of *_DIRTY_FLAG
>>       memory: use bit 2 for migration
>>       memory: make sure that client is always inside range
>>       memory: only resize dirty bitmap when memory size increases
>>       memory: cpu_physical_memory_clear_dirty_flag() result is never used
>>       bitmap: Add bitmap_zero_extend operation
>>       memory: split dirty bitmap into three
>>       memory: unfold cpu_physical_memory_clear_dirty_flag() in its only user
>>       memory: unfold cpu_physical_memory_set_dirty() in its only user
>>       memory: unfold cpu_physical_memory_set_dirty_flag()
>>       memory: make cpu_physical_memory_get_dirty() the main function
>>       memory: cpu_physical_memory_get_dirty() is used as returning a bool
>>       memory: s/mask/clear/ cpu_physical_memory_mask_dirty_range
>>       memory: use find_next_bit() to find dirty bits
>>       memory: cpu_physical_memory_set_dirty_range() now uses bitmap operations
>>       memory: cpu_physical_memory_clear_dirty_range() now uses bitmap operations
>>       memory: s/dirty/clean/ in cpu_physical_memory_is_dirty()
>>       memory: make cpu_physical_memory_reset_dirty() take a length parameter
>>       memory: cpu_physical_memory_set_dirty_tracking() should return void
>>       memory: split cpu_physical_memory_* functions to its own include
>>       memory: unfold memory_region_test_and_clear()
>>       kvm: use directly cpu_physical_memory_* api for tracking dirty pages
>>       kvm: refactor start address calculation
>>       memory: move bitmap synchronization to its own function
>>       memory: syncronize kvm bitmap using bitmaps operations
>>       ram: split function that synchronizes a range
>>       migration: synchronize memory bitmap 64bits at a time
>>
>>  arch_init.c                    |  57 ++++++++++++----
>>  cputlb.c                       |  10 +--
>>  exec.c                         |  75 ++++++++++-----------
>>  include/exec/cpu-all.h         |   4 +-
>>  include/exec/cpu-common.h      |   4 --
>>  include/exec/memory-internal.h |  84 ------------------------
>>  include/exec/memory-physical.h | 143 +++++++++++++++++++++++++++++++++++++++++
>>  include/exec/memory.h          |  10 +--
>>  include/qemu/bitmap.h          |   9 +++
>>  kvm-all.c                      |  28 ++------
>>  memory.c                       |  17 ++---
>>  11 files changed, 260 insertions(+), 181 deletions(-)
>>  create mode 100644 include/exec/memory-physical.h
>>
>
> Well done! The performance gains are very impressive.
> I also just completed a measurement of this patch for the MC fault
> tolerance implementation that I posted last month:
>
> Here is an example using a 12GB virtual machine (not as big as
> Vinod's, but not small either),
> with most of the RAM being aggressively dirtied by a small C program.
>
> With MC (micro checkpointing) enabled, checkpointing 10 times per second,
> here are the *per-checkpoint* overheads of the dirty bitmap:
>
> 1) BEFORE Juan's Patch:
>    a) Time to retrieve LOGDIRTY from KVM: 33 milliseconds
>    b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: 86 milliseconds
>
> 2) AFTER Juan's Patch:
>    a) Time to retrieve LOGDIRTY from KVM: 17 milliseconds
>    b) Time to synchronize with migration_bitmap from LOGDIRTY bitmap: < 1 millisecond
>
> Enormous improvements - very nice.
>
> - Michael

Tested-by: Michael R. Hines
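
For readers following the thread, here is a rough sketch of why step (b) can shrink so much when the synchronization moves from bit-by-bit to a word at a time. This is not the code from the series: the function names (sync_bit_by_bit, sync_word_at_a_time, count_bits) are hypothetical, and the sketch assumes both bitmaps are plain arrays of unsigned long whose length is a whole number of words, which corresponds to the aligned case the cover letter mentions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: count the set bits in one word (portable loop). */
static unsigned count_bits(unsigned long w)
{
    unsigned n = 0;
    while (w) {
        w &= w - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Old style: visit every page bit individually.
 * Returns the number of pages that became dirty in dest.
 */
static size_t sync_bit_by_bit(unsigned long *dest, const unsigned long *src,
                              size_t nbits)
{
    size_t newly_dirty = 0;
    for (size_t i = 0; i < nbits; i++) {
        size_t w = i / BITS_PER_WORD;
        unsigned long mask = 1UL << (i % BITS_PER_WORD);
        if ((src[w] & mask) && !(dest[w] & mask)) {
            dest[w] |= mask;
            newly_dirty++;
        }
    }
    return newly_dirty;
}

/*
 * New style: handle one whole word per iteration and skip words with
 * nothing new. Assumes nbits is a multiple of BITS_PER_WORD; an unaligned
 * range would have to fall back to the bit-by-bit path.
 */
static size_t sync_word_at_a_time(unsigned long *dest,
                                  const unsigned long *src, size_t nbits)
{
    assert(nbits % BITS_PER_WORD == 0);
    size_t newly_dirty = 0;
    for (size_t w = 0; w < nbits / BITS_PER_WORD; w++) {
        unsigned long fresh = src[w] & ~dest[w];  /* dirty in src only */
        if (fresh) {
            dest[w] |= fresh;
            newly_dirty += count_bits(fresh);
        }
    }
    return newly_dirty;
}
```

Both functions leave dest in the same state and return the same count; the second one simply does one AND/OR per word instead of one test per page, and it skips clean words entirely.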