public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] KVM: dirty logging optimization - double buffering
@ 2010-10-27  9:21 Takuya Yoshikawa
  2010-10-27  9:22 ` [PATCH 1/3 rebased] KVM: introduce wrapper functions for creating/destroying dirty bitmaps Takuya Yoshikawa
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Takuya Yoshikawa @ 2010-10-27  9:21 UTC (permalink / raw)
  To: avi, mtosatti; +Cc: kvm, takuya.yoshikawa, fernando

This patch series just change the way we allocate dirty bitmaps but don't
change timing related issues.

 - Changelog
  I have not changed anything about patch 1 and 2 since I got
  "looks good" comment from Marcelo. Just rebased.

  Patch 3 has become simpler by rebasing on top of this series.

Thanks,
  Takuya



[Test result for VGA updates]

 - without this patch series
               |  kvm_vm_ioctl_get_dirty_log() {
               |    mutex_lock() {
    0.229 us   |      _cond_resched();
    0.781 us   |    }
               |    vmalloc() {
    ...             ...
    ...             ...
  + 34.898 us  |    }
    0.229 us   |    memset();
               |    T.1684() {
               |      __kmalloc() {
    0.188 us   |        _cond_resched();
    1.075 us   |        memset();
    1.991 us   |      }
    2.450 us   |    }
               |    synchronize_srcu_expedited() {
    ...             ...
    ...             ...
  ! 108.113 us |    }
    0.259 us   |    kfree();
    0.215 us   |    _raw_spin_lock();
               |    kvm_mmu_slot_remove_write_access() {
               |      kvm_flush_remote_tlbs() {
               |        make_all_cpus_request() {
    0.202 us   |          _raw_spin_lock();
    0.643 us   |        }
    1.048 us   |      }
    2.081 us   |    }
               |    copy_to_user() {
    0.304 us   |      _cond_resched();
    0.733 us   |    }
               |    vfree() {
    ...             ...
    ...             ...
    6.456 us   |    }
    0.334 us   |    mutex_unlock();
  ! 159.649 us |  }


 - with this patch series
               |  kvm_vm_ioctl_get_dirty_log() {
               |    mutex_lock() {
    0.300 us   |      _cond_resched();
    0.789 us   |    }
    0.237 us   |    memset();
               |    T.1686() {
               |      __kmalloc() {
    0.207 us   |        _cond_resched();
    1.086 us   |        memset();
    1.981 us   |      }
    2.408 us   |    }
               |    synchronize_srcu_expedited() {
    ...             ...
    ...             ...
  + 88.786 us  |    }
    0.244 us   |    kfree();
    0.221 us   |    _raw_spin_lock();
               |    kvm_mmu_slot_remove_write_access() {
               |      kvm_flush_remote_tlbs() {
               |        make_all_cpus_request() {
    0.206 us   |          _raw_spin_lock();
    0.647 us   |        }
    1.048 us   |      }
    2.056 us   |    }
               |    copy_to_user() {
    0.263 us   |      _cond_resched();
    0.688 us   |    }
    0.202 us   |    mutex_unlock();
  + 98.356 us  |  }


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3 rebased] KVM: introduce wrapper functions for creating/destroying dirty bitmaps
  2010-10-27  9:21 [PATCH 0/3] KVM: dirty logging optimization - double buffering Takuya Yoshikawa
@ 2010-10-27  9:22 ` Takuya Yoshikawa
  2010-10-27  9:23 ` [PATCH 2/3 rebased] KVM: pre-allocate one more dirty bitmap to avoid vmalloc() Takuya Yoshikawa
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Takuya Yoshikawa @ 2010-10-27  9:22 UTC (permalink / raw)
  To: avi, mtosatti; +Cc: kvm, takuya.yoshikawa, fernando

This makes it easy to change the way of allocating/freeing dirty bitmaps.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---
 virt/kvm/kvm_main.c |   30 +++++++++++++++++++++++-------
 1 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 361ad1a..3e2d57e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -444,6 +444,15 @@ out_err_nodisable:
 	return ERR_PTR(r);
 }
 
+static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot)
+{
+	if (!memslot->dirty_bitmap)
+		return;
+
+	vfree(memslot->dirty_bitmap);
+	memslot->dirty_bitmap = NULL;
+}
+
 /*
  * Free any memory in @free but not in @dont.
  */
@@ -456,7 +465,7 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
 		vfree(free->rmap);
 
 	if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
-		vfree(free->dirty_bitmap);
+		kvm_destroy_dirty_bitmap(free);
 
 
 	for (i = 0; i < KVM_NR_PAGE_SIZES - 1; ++i) {
@@ -467,7 +476,6 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
 	}
 
 	free->npages = 0;
-	free->dirty_bitmap = NULL;
 	free->rmap = NULL;
 }
 
@@ -529,6 +537,18 @@ static int kvm_vm_release(struct inode *inode, struct file *filp)
 	return 0;
 }
 
+static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
+{
+	unsigned long dirty_bytes = kvm_dirty_bitmap_bytes(memslot);
+
+	memslot->dirty_bitmap = vmalloc(dirty_bytes);
+	if (!memslot->dirty_bitmap)
+		return -ENOMEM;
+
+	memset(memslot->dirty_bitmap, 0, dirty_bytes);
+	return 0;
+}
+
 /*
  * Allocate some memory and give it an address in the guest physical address
  * space.
@@ -663,12 +683,8 @@ skip_lpage:
 
 	/* Allocate page dirty bitmap if needed */
 	if ((new.flags & KVM_MEM_LOG_DIRTY_PAGES) && !new.dirty_bitmap) {
-		unsigned long dirty_bytes = kvm_dirty_bitmap_bytes(&new);
-
-		new.dirty_bitmap = vmalloc(dirty_bytes);
-		if (!new.dirty_bitmap)
+		if (kvm_create_dirty_bitmap(&new) < 0)
 			goto out_free;
-		memset(new.dirty_bitmap, 0, dirty_bytes);
 		/* destroy any largepage mappings for dirty tracking */
 		if (old.npages)
 			flush_shadow = 1;
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3 rebased] KVM: pre-allocate one more dirty bitmap to avoid vmalloc()
  2010-10-27  9:21 [PATCH 0/3] KVM: dirty logging optimization - double buffering Takuya Yoshikawa
  2010-10-27  9:22 ` [PATCH 1/3 rebased] KVM: introduce wrapper functions for creating/destroying dirty bitmaps Takuya Yoshikawa
@ 2010-10-27  9:23 ` Takuya Yoshikawa
  2010-10-27  9:25 ` [PATCH 3/3 rebased] KVM: use kmalloc() for small dirty bitmaps Takuya Yoshikawa
  2010-11-01 16:42 ` [PATCH 0/3] KVM: dirty logging optimization - double buffering Marcelo Tosatti
  3 siblings, 0 replies; 6+ messages in thread
From: Takuya Yoshikawa @ 2010-10-27  9:23 UTC (permalink / raw)
  To: avi, mtosatti; +Cc: kvm, takuya.yoshikawa, fernando

Currently x86's kvm_vm_ioctl_get_dirty_log() needs to allocate a bitmap by
vmalloc() which will be used in the next logging and this has been causing
bad effect to VGA and live-migration: vmalloc() consumes extra systime,
triggers tlb flush, etc.

This patch resolves this issue by pre-allocating one more bitmap and switching
between two bitmaps during dirty logging.

Performance improvement:
  I measured performance for the case of VGA update by trace-cmd.
  The result was 1.5 times faster than the original one.

  In the case of live migration, the improvement ratio depends on the workload
  and the guest memory size. In general, the larger the memory size is the more
  benefits we get.

Note:
  This does not change other architectures's logic but the allocation size
  becomes twice. This will increase the actual memory consumption only when
  the new size changes the number of pages allocated by vmalloc().

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---
 arch/x86/kvm/x86.c       |   16 +++++-----------
 include/linux/kvm_host.h |    1 +
 virt/kvm/kvm_main.c      |   11 +++++++++--
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7876ec8..18d3bc0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3208,18 +3208,15 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 		struct kvm_memslots *slots, *old_slots;
 		unsigned long *dirty_bitmap;
 
-		r = -ENOMEM;
-		dirty_bitmap = vmalloc(n);
-		if (!dirty_bitmap)
-			goto out;
+		dirty_bitmap = memslot->dirty_bitmap_head;
+		if (memslot->dirty_bitmap == dirty_bitmap)
+			dirty_bitmap += n / sizeof(long);
 		memset(dirty_bitmap, 0, n);
 
 		r = -ENOMEM;
 		slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
-		if (!slots) {
-			vfree(dirty_bitmap);
+		if (!slots)
 			goto out;
-		}
 		memcpy(slots, kvm->memslots, sizeof(struct kvm_memslots));
 		slots->memslots[log->slot].dirty_bitmap = dirty_bitmap;
 		slots->generation++;
@@ -3235,11 +3232,8 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 		spin_unlock(&kvm->mmu_lock);
 
 		r = -EFAULT;
-		if (copy_to_user(log->dirty_bitmap, dirty_bitmap, n)) {
-			vfree(dirty_bitmap);
+		if (copy_to_user(log->dirty_bitmap, dirty_bitmap, n))
 			goto out;
-		}
-		vfree(dirty_bitmap);
 	} else {
 		r = -EFAULT;
 		if (clear_user(log->dirty_bitmap, n))
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ee4314e..8f0eb40 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -150,6 +150,7 @@ struct kvm_memory_slot {
 	unsigned long flags;
 	unsigned long *rmap;
 	unsigned long *dirty_bitmap;
+	unsigned long *dirty_bitmap_head;
 	struct {
 		unsigned long rmap_pde;
 		int write_count;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3e2d57e..81bc1c7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -449,8 +449,9 @@ static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot)
 	if (!memslot->dirty_bitmap)
 		return;
 
-	vfree(memslot->dirty_bitmap);
+	vfree(memslot->dirty_bitmap_head);
 	memslot->dirty_bitmap = NULL;
+	memslot->dirty_bitmap_head = NULL;
 }
 
 /*
@@ -537,15 +538,21 @@ static int kvm_vm_release(struct inode *inode, struct file *filp)
 	return 0;
 }
 
+/*
+ * Allocation size is twice as large as the actual dirty bitmap size.
+ * This makes it possible to do double buffering: see x86's
+ * kvm_vm_ioctl_get_dirty_log().
+ */
 static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
 {
-	unsigned long dirty_bytes = kvm_dirty_bitmap_bytes(memslot);
+	unsigned long dirty_bytes = 2 * kvm_dirty_bitmap_bytes(memslot);
 
 	memslot->dirty_bitmap = vmalloc(dirty_bytes);
 	if (!memslot->dirty_bitmap)
 		return -ENOMEM;
 
 	memset(memslot->dirty_bitmap, 0, dirty_bytes);
+	memslot->dirty_bitmap_head = memslot->dirty_bitmap;
 	return 0;
 }
 
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3 rebased] KVM: use kmalloc() for small dirty bitmaps
  2010-10-27  9:21 [PATCH 0/3] KVM: dirty logging optimization - double buffering Takuya Yoshikawa
  2010-10-27  9:22 ` [PATCH 1/3 rebased] KVM: introduce wrapper functions for creating/destroying dirty bitmaps Takuya Yoshikawa
  2010-10-27  9:23 ` [PATCH 2/3 rebased] KVM: pre-allocate one more dirty bitmap to avoid vmalloc() Takuya Yoshikawa
@ 2010-10-27  9:25 ` Takuya Yoshikawa
  2010-11-01  5:36   ` [PATCH 3/3 rebased-v2] " Takuya Yoshikawa
  2010-11-01 16:42 ` [PATCH 0/3] KVM: dirty logging optimization - double buffering Marcelo Tosatti
  3 siblings, 1 reply; 6+ messages in thread
From: Takuya Yoshikawa @ 2010-10-27  9:25 UTC (permalink / raw)
  To: avi, mtosatti; +Cc: kvm, takuya.yoshikawa, fernando

Currently we are using vmalloc() for all dirty bitmaps even if
they are small enough, say less than K bytes.

We use kmalloc() if dirty bitmap size is less than or equal to
PAGE_SIZE so that we can avoid vmalloc area usage for VGA.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
---
 virt/kvm/kvm_main.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 81bc1c7..236f560 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -449,7 +449,11 @@ static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot)
 	if (!memslot->dirty_bitmap)
 		return;
 
-	vfree(memslot->dirty_bitmap_head);
+	if (2 * kvm_dirty_bitmap_bytes(memslot) > PAGE_SIZE)
+		vfree(memslot->dirty_bitmap_head);
+	else
+		kfree(memslot->dirty_bitmap_head);
+
 	memslot->dirty_bitmap = NULL;
 	memslot->dirty_bitmap_head = NULL;
 }
@@ -547,7 +551,11 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
 {
 	unsigned long dirty_bytes = 2 * kvm_dirty_bitmap_bytes(memslot);
 
-	memslot->dirty_bitmap = vmalloc(dirty_bytes);
+	if (dirty_bytes > PAGE_SIZE)
+		memslot->dirty_bitmap = vmalloc(dirty_bytes);
+	else
+		memslot->dirty_bitmap = kmalloc(dirty_bytes, GFP_KERNEL);
+
 	if (!memslot->dirty_bitmap)
 		return -ENOMEM;
 
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3 rebased-v2] KVM: use kmalloc() for small dirty bitmaps
  2010-10-27  9:25 ` [PATCH 3/3 rebased] KVM: use kmalloc() for small dirty bitmaps Takuya Yoshikawa
@ 2010-11-01  5:36   ` Takuya Yoshikawa
  0 siblings, 0 replies; 6+ messages in thread
From: Takuya Yoshikawa @ 2010-11-01  5:36 UTC (permalink / raw)
  To: avi, mtosatti; +Cc: kvm, takuya.yoshikawa, fernando

I updated this one[3/3] only to use [vk]zalloc

Changelog:
  I replaced ((vmalloc | kmalloc) + memset) with (vzalloc | kzalloc).
  Also, the patch description was updated.

vzalloc() was introduced recently, so this one may be better.

BTW, I noticed that bitmaps are created/destroyed many times:
logging stop/start frequently from qemu side?

So even after the double buffering patch[2/3], this patch may have
more performance gain than I expected.

Takuya

===
Currently we are using vmalloc() for all dirty bitmaps even if
they are small enough, say less than K bytes.

We use kmalloc() if dirty bitmap size is less than or equal to
PAGE_SIZE so that we can avoid vmalloc area usage for VGA.

This will also make the logging start/stop faster.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
---
 virt/kvm/kvm_main.c |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b28581..45d2cef 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -449,7 +449,11 @@ static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot)
 	if (!memslot->dirty_bitmap)
 		return;
 
-	vfree(memslot->dirty_bitmap_head);
+	if (2 * kvm_dirty_bitmap_bytes(memslot) > PAGE_SIZE)
+		vfree(memslot->dirty_bitmap_head);
+	else
+		kfree(memslot->dirty_bitmap_head);
+
 	memslot->dirty_bitmap = NULL;
 	memslot->dirty_bitmap_head = NULL;
 }
@@ -547,11 +551,14 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
 {
 	unsigned long dirty_bytes = 2 * kvm_dirty_bitmap_bytes(memslot);
 
-	memslot->dirty_bitmap = vmalloc(dirty_bytes);
+	if (dirty_bytes > PAGE_SIZE)
+		memslot->dirty_bitmap = vzalloc(dirty_bytes);
+	else
+		memslot->dirty_bitmap = kzalloc(dirty_bytes, GFP_KERNEL);
+
 	if (!memslot->dirty_bitmap)
 		return -ENOMEM;
 
-	memset(memslot->dirty_bitmap, 0, dirty_bytes);
 	memslot->dirty_bitmap_head = memslot->dirty_bitmap;
 	return 0;
 }
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/3] KVM: dirty logging optimization - double buffering
  2010-10-27  9:21 [PATCH 0/3] KVM: dirty logging optimization - double buffering Takuya Yoshikawa
                   ` (2 preceding siblings ...)
  2010-10-27  9:25 ` [PATCH 3/3 rebased] KVM: use kmalloc() for small dirty bitmaps Takuya Yoshikawa
@ 2010-11-01 16:42 ` Marcelo Tosatti
  3 siblings, 0 replies; 6+ messages in thread
From: Marcelo Tosatti @ 2010-11-01 16:42 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: avi, kvm, takuya.yoshikawa, fernando

On Wed, Oct 27, 2010 at 06:21:02PM +0900, Takuya Yoshikawa wrote:
> This patch series just change the way we allocate dirty bitmaps but don't
> change timing related issues.
> 
>  - Changelog
>   I have not changed anything about patch 1 and 2 since I got
>   "looks good" comment from Marcelo. Just rebased.
> 
>   Patch 3 has become simpler by rebasing on top of this series.
> 
> Thanks,
>   Takuya

Applied, thanks.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-11-01 17:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-10-27  9:21 [PATCH 0/3] KVM: dirty logging optimization - double buffering Takuya Yoshikawa
2010-10-27  9:22 ` [PATCH 1/3 rebased] KVM: introduce wrapper functions for creating/destroying dirty bitmaps Takuya Yoshikawa
2010-10-27  9:23 ` [PATCH 2/3 rebased] KVM: pre-allocate one more dirty bitmap to avoid vmalloc() Takuya Yoshikawa
2010-10-27  9:25 ` [PATCH 3/3 rebased] KVM: use kmalloc() for small dirty bitmaps Takuya Yoshikawa
2010-11-01  5:36   ` [PATCH 3/3 rebased-v2] " Takuya Yoshikawa
2010-11-01 16:42 ` [PATCH 0/3] KVM: dirty logging optimization - double buffering Marcelo Tosatti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox