* Re: [Qemu-devel] [PATCH 0/4] KVM: Dirty logging optimization using rmap
[not found] ` <4EC10BFE.7050704@redhat.com>
@ 2011-11-16 8:17 ` Takuya Yoshikawa
[not found] ` <4EC33C0B.1060807@oss.ntt.co.jp>
1 sibling, 0 replies; 6+ messages in thread
From: Takuya Yoshikawa @ 2011-11-16 8:17 UTC (permalink / raw)
To: Avi Kivity; +Cc: mtosatti, takuya.yoshikawa, kvm, qemu-devel
Adding qemu-devel to Cc.
(2011/11/14 21:39), Avi Kivity wrote:
> On 11/14/2011 12:56 PM, Takuya Yoshikawa wrote:
>> (2011/11/14 19:25), Avi Kivity wrote:
>>> On 11/14/2011 11:20 AM, Takuya Yoshikawa wrote:
>>>> This is a revised version of my previous work. I hope that
>>>> the patches are more self-explanatory than before.
>>>>
>>>
>>> It looks good. I'll let Marcelo (or anyone else?) review it as well
>>> before applying.
>>>
>>> Do you have performance measurements?
>>>
>>
>> For VGA, 30-40us became 3-5us when the display was quiet, with a
>> sufficiently warmed-up guest.
>>
>
> That's a nice improvement.
>
>> Near the criterion, the numbers did not differ much from the
>> original version.
>>
>> For live migration, I forgot the exact numbers but the result was good.
>> But my test cases did not cover every pattern, so I made
>> the criterion a bit more conservative.
>>
>> More tests may be able to find a better criterion.
>> I am not in a hurry about this, so it is OK to add some tests
>> before merging this.
>
> I think we can merge it as is; it's clear we get an improvement.
I did a simple test to get some numbers!
Here, a 4GB guest was migrated locally while a file was being copied inside it.
Case 1 corresponds to the original method and case 2 to the optimized one.
The small numbers are, most likely, from VGA:
Case 1. about 30us
Case 2. about 3us
Other numbers are from the system RAM (triggered by live migration):
Case 1. about 500us, 2000us
Case 2. about 80us, 2000us (not exactly averaged, see below for details)
* 2000us was when rmap was not used, so equal to that of case 1.
So I can say that my patch worked well for both VGA and live migration.
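For reference, a minimal sketch of the shape of this optimization; the
criterion, the arguments and the kvm_mmu_rmap_write_protect() helper shown
here are illustrative assumptions, not necessarily what the patches do:

/*
 * Sketch: if only a few pages were dirtied since the last GET_DIRTY_LOG,
 * write protect just those pages through the rmap; otherwise fall back
 * to scanning the whole slot as before.
 */
static void write_protect_slot(struct kvm *kvm,
                               struct kvm_memory_slot *memslot,
                               unsigned long *dirty_bitmap,
                               unsigned long nr_dirty_pages)
{
        /* Assumed criterion: few dirty pages compared to shadow page count. */
        if (nr_dirty_pages < kvm->arch.n_used_mmu_pages) {
                unsigned long gfn_offset;

                for_each_set_bit(gfn_offset, dirty_bitmap, memslot->npages) {
                        unsigned long gfn = memslot->base_gfn + gfn_offset;

                        spin_lock(&kvm->mmu_lock);
                        kvm_mmu_rmap_write_protect(kvm, gfn, memslot);
                        spin_unlock(&kvm->mmu_lock);
                }
        } else {
                spin_lock(&kvm->mmu_lock);
                kvm_mmu_slot_remove_write_access(kvm, memslot->id);
                spin_unlock(&kvm->mmu_lock);
        }
}

In the traces below, the long ~2000us calls are the full-scan path and the
short write_protect_slot() entries are the rmap path.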
Takuya
=== measurement snippet ===
Case 1. kvm_mmu_slot_remove_write_access() only (same as the original method):
qemu-system-x86-25413 [000] 6546.215009: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.215010: funcgraph_entry: ! 2039.512 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.217051: funcgraph_exit: ! 2040.487 us | }
qemu-system-x86-25413 [002] 6546.217347: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [002] 6546.217349: funcgraph_entry: ! 571.121 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [002] 6546.217921: funcgraph_exit: ! 572.525 us | }
qemu-system-x86-25413 [000] 6546.314583: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.314585: funcgraph_entry: + 29.598 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.314616: funcgraph_exit: + 31.053 us | }
qemu-system-x86-25413 [000] 6546.314784: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.314785: funcgraph_entry: ! 2002.591 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.316788: funcgraph_exit: ! 2003.537 us | }
qemu-system-x86-25413 [000] 6546.317082: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.317083: funcgraph_entry: ! 624.445 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.317709: funcgraph_exit: ! 625.861 us | }
qemu-system-x86-25413 [000] 6546.414261: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.414263: funcgraph_entry: + 29.593 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.414293: funcgraph_exit: + 30.944 us | }
qemu-system-x86-25413 [000] 6546.414528: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.414529: funcgraph_entry: ! 1990.363 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.416520: funcgraph_exit: ! 1991.370 us | }
qemu-system-x86-25413 [000] 6546.416775: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.416776: funcgraph_entry: ! 594.333 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.417371: funcgraph_exit: ! 595.415 us | }
qemu-system-x86-25413 [000] 6546.514133: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.514135: funcgraph_entry: + 24.032 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.514160: funcgraph_exit: + 25.074 us | }
qemu-system-x86-25413 [000] 6546.514312: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.514313: funcgraph_entry: ! 2035.365 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.516349: funcgraph_exit: ! 2036.298 us | }
qemu-system-x86-25413 [000] 6546.516642: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.516643: funcgraph_entry: ! 598.308 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.517242: funcgraph_exit: ! 599.344 us | }
qemu-system-x86-25413 [000] 6546.613895: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.613897: funcgraph_entry: + 27.765 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.613926: funcgraph_exit: + 29.051 us | }
qemu-system-x86-25413 [000] 6546.614052: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.614053: funcgraph_entry: ! 1401.083 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.615454: funcgraph_exit: ! 1401.778 us | }
qemu-system-x86-25413 [000] 6546.615656: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.615657: funcgraph_entry: ! 405.810 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.616063: funcgraph_exit: ! 406.415 us | }
qemu-system-x86-25413 [001] 6546.713523: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [001] 6546.713525: funcgraph_entry: + 33.166 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [001] 6546.713559: funcgraph_exit: + 34.644 us | }
qemu-system-x86-25413 [003] 6546.713688: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [003] 6546.713691: funcgraph_entry: ! 1872.491 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [003] 6546.715564: funcgraph_exit: ! 1874.775 us | }
qemu-system-x86-25413 [001] 6546.715829: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [001] 6546.715830: funcgraph_entry: + 25.002 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [001] 6546.715856: funcgraph_exit: + 26.177 us | }
qemu-system-x86-25413 [001] 6546.715969: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [001] 6546.715970: funcgraph_entry: ! 604.399 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [001] 6546.716575: funcgraph_exit: ! 605.502 us | }
qemu-system-x86-25413 [000] 6546.813387: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.813389: funcgraph_entry: + 32.248 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.813422: funcgraph_exit: + 33.592 us | }
qemu-system-x86-25413 [000] 6546.813565: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-25413 [000] 6546.813566: funcgraph_entry: ! 1970.585 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-25413 [000] 6546.815537: funcgraph_exit: ! 1971.752 us | }
Case 2. kvm_mmu_slot_remove_write_access() + rmap (my method):
qemu-system-x86-7772 [000] 6096.185229: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [000] 6096.185230: funcgraph_entry: ! 2090.108 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [000] 6096.187322: funcgraph_exit: ! 2091.865 us | }
qemu-system-x86-7772 [001] 6096.187634: funcgraph_entry: 6.623 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.187765: funcgraph_entry: ! 110.571 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.284811: funcgraph_entry: 2.343 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.284971: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [000] 6096.284972: funcgraph_entry: ! 1999.656 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [000] 6096.286973: funcgraph_exit: ! 2000.955 us | }
qemu-system-x86-7772 [000] 6096.287255: funcgraph_entry: + 79.547 us | write_protect_slot();
qemu-system-x86-7772 [001] 6096.384401: funcgraph_entry: 4.977 us | write_protect_slot();
qemu-system-x86-7772 [001] 6096.384512: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [001] 6096.384513: funcgraph_entry: ! 1887.579 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [001] 6096.386401: funcgraph_exit: ! 1889.068 us | }
qemu-system-x86-7772 [001] 6096.386631: funcgraph_entry: + 80.816 us | write_protect_slot();
qemu-system-x86-7772 [001] 6096.484212: funcgraph_entry: 4.249 us | write_protect_slot();
qemu-system-x86-7772 [001] 6096.484280: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [001] 6096.484280: funcgraph_entry: ! 1872.626 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [001] 6096.486154: funcgraph_exit: ! 1873.982 us | }
qemu-system-x86-7772 [001] 6096.486398: funcgraph_entry: + 99.259 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.584165: funcgraph_entry: 2.354 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.584450: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [000] 6096.584451: funcgraph_entry: ! 2012.011 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [000] 6096.586464: funcgraph_exit: ! 2013.625 us | }
qemu-system-x86-7772 [000] 6096.586791: funcgraph_entry: + 74.855 us | write_protect_slot();
qemu-system-x86-7772 [001] 6096.683636: funcgraph_entry: 2.386 us | write_protect_slot();
qemu-system-x86-7772 [001] 6096.683749: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [001] 6096.683749: funcgraph_entry: ! 1922.750 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [001] 6096.685674: funcgraph_exit: ! 1924.311 us | }
qemu-system-x86-7772 [000] 6096.685926: funcgraph_entry: + 76.410 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.783620: funcgraph_entry: 2.195 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.783715: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [000] 6096.783716: funcgraph_entry: ! 2110.459 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [000] 6096.785827: funcgraph_exit: ! 2111.781 us | }
qemu-system-x86-7772 [000] 6096.786243: funcgraph_entry: 3.493 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.786334: funcgraph_entry: ! 186.657 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.883166: funcgraph_entry: 2.289 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.883376: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [000] 6096.883376: funcgraph_entry: ! 2031.813 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [000] 6096.885410: funcgraph_exit: ! 2033.311 us | }
qemu-system-x86-7772 [000] 6096.885696: funcgraph_entry: + 53.929 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.983096: funcgraph_entry: 2.177 us | write_protect_slot();
qemu-system-x86-7772 [000] 6096.983288: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [000] 6096.983288: funcgraph_entry: ! 2096.257 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [000] 6096.985386: funcgraph_exit: ! 2097.685 us | }
qemu-system-x86-7772 [000] 6096.985697: funcgraph_entry: + 78.442 us | write_protect_slot();
qemu-system-x86-7772 [000] 6097.082658: funcgraph_entry: 2.286 us | write_protect_slot();
qemu-system-x86-7772 [000] 6097.082800: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [000] 6097.082801: funcgraph_entry: ! 2049.809 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [000] 6097.084851: funcgraph_exit: ! 2051.107 us | }
qemu-system-x86-7772 [000] 6097.085290: funcgraph_entry: + 69.201 us | write_protect_slot();
qemu-system-x86-7772 [000] 6097.182384: funcgraph_entry: 2.239 us | write_protect_slot();
qemu-system-x86-7772 [000] 6097.182545: funcgraph_entry: | write_protect_slot() {
qemu-system-x86-7772 [000] 6097.182546: funcgraph_entry: ! 2158.647 us | kvm_mmu_slot_remove_write_access();
qemu-system-x86-7772 [000] 6097.184706: funcgraph_exit: ! 2160.028 us | }
* Re: [Qemu-devel] [PATCH 0/4] KVM: Dirty logging optimization using rmap
[not found] ` <4ED4E626.5010507@redhat.com>
@ 2011-11-30 5:02 ` Takuya Yoshikawa
2011-11-30 5:15 ` Takuya Yoshikawa
0 siblings, 1 reply; 6+ messages in thread
From: Takuya Yoshikawa @ 2011-11-30 5:02 UTC (permalink / raw)
To: Avi Kivity
Cc: KVM, quintela, Marcelo Tosatti, qemu-devel, Xiao Guangrong,
Takuya Yoshikawa
CCing qemu-devel and Juan,
(2011/11/29 23:03), Avi Kivity wrote:
> On 11/29/2011 02:01 PM, Avi Kivity wrote:
>> On 11/29/2011 01:56 PM, Xiao Guangrong wrote:
>>> On 11/29/2011 07:20 PM, Avi Kivity wrote:
>>>
>>>
>>>> We used to have a bitmap in a shadow page with a bit set for every slot
>>>> pointed to by the page. If we extend this to non-leaf pages (so, when
>>>> we set a bit, we propagate it through its parent_ptes list), then we do
>>>> the following on write fault:
>>>>
>>>
>>>
>>> Thanks for the detail.
>>>
>>> Um, propagating the slot bit to parent ptes is a little slow; in particular,
>>> it is overhead for guests without Xwindow, which are dirty logged only during
>>> migration (I guess most Linux guests run in this mode and migration
>>> is not frequent). No?
>>
>> You need to propagate very infrequently. The first pte added to a page
>> will need to propagate, but the second (if from the same slot, which is
>> likely) will already have the bit set in the page, so we're assured it's
>> set in all its parents.
>
> btw, if you plan to work on this, let's agree on pseudocode/data
> structures first to minimize churn. I'll also want this documented in
> mmu.txt. Of course we can still end up with something different than
> planned, but let's at least try to think of the issues in advance.
>
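For concreteness, one possible shape of the slot-bit propagation described
above; the function name and the parent-walk iterator are assumptions, not
existing code or an agreed design:

/*
 * Mark a shadow page as pointing into a given slot and propagate the bit
 * up through its parent_ptes chain.  The early return when the bit is
 * already set is what keeps propagation infrequent, as noted above.
 */
static void mmu_page_mark_slot(struct kvm_mmu_page *sp, int slot_id)
{
        u64 *parent_pte;

        if (test_and_set_bit(slot_id, sp->slot_bitmap))
                return;         /* already propagated for this slot */

        /* Visit every non-leaf shadow page that points to this one. */
        for_each_parent_pte(sp, parent_pte)     /* assumed iterator */
                mmu_page_mark_slot(page_header(__pa(parent_pte)), slot_id);
}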
I would like to hear the overall view as well.
We are now trying to improve the case where there are too many dirty pages during
live migration.
I did some measurements of live migration a few months ago on a dedicated 10Gbps line,
with the two servers connected directly, and saw that transferring even a few MB of
memory took latency on the order of milliseconds, even after excluding other QEMU-side
overheads: this matches a simple calculation (for example, 4MB at 10Gbps is about 3ms
on the wire).
In another test, I found that even with a relatively normal workload, a pause of a few
seconds was needed at the final stage.
Juan, do you have more data?
So the current scheme does not scale with the number of dirty pages,
and administrators should avoid migrating during such workloads if possible.
Overnight server consolidation will be fine, but dynamic load balancing
may not work well under such restrictions: I am currently more interested in the
former.
With that in mind, I set the target at 1K dirty pages (4MB of memory) when
I did the rmap optimization. Write protecting that many pages now takes a few ms
or so, IIRC: that is not so bad compared to the overall latency?
So, though I like the O(1) method, I would like to hear about the expected
improvements in a bit more detail, if possible.
IIUC, even though the O(1) method is O(1) at GET_DIRTY_LOG time, it still needs O(N)
write protections with respect to the total number of dirty pages: the work is
distributed, but each page fault that should be logged does some write protection?
In general, what kind of improvements are actually needed for live migration?
Thanks,
Takuya
* Re: [Qemu-devel] [PATCH 0/4] KVM: Dirty logging optimization using rmap
2011-11-30 5:02 ` Takuya Yoshikawa
@ 2011-11-30 5:15 ` Takuya Yoshikawa
2011-12-01 15:18 ` Avi Kivity
0 siblings, 1 reply; 6+ messages in thread
From: Takuya Yoshikawa @ 2011-11-30 5:15 UTC (permalink / raw)
To: Avi Kivity
Cc: KVM, quintela, Marcelo Tosatti, qemu-devel, Xiao Guangrong,
Takuya Yoshikawa
(2011/11/30 14:02), Takuya Yoshikawa wrote:
> IIUC, even though the O(1) method is O(1) at GET_DIRTY_LOG time, it still needs O(N)
> write protections with respect to the total number of dirty pages: the work is
> distributed, but each page fault that should be logged does some write protection?
Sorry, I was not precise. It depends on the level, and it is not completely distributed.
But I think it is O(N), and the total cost will not change much,
I guess.
Takuya
>
> In general, what kind of improvements are actually needed for live migration?
>
> Thanks,
> Takuya
* Re: [Qemu-devel] [PATCH 0/4] KVM: Dirty logging optimization using rmap
2011-11-30 5:15 ` Takuya Yoshikawa
@ 2011-12-01 15:18 ` Avi Kivity
2011-12-03 4:37 ` Takuya Yoshikawa
0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2011-12-01 15:18 UTC (permalink / raw)
To: Takuya Yoshikawa
Cc: KVM, quintela, Marcelo Tosatti, qemu-devel, Xiao Guangrong,
Takuya Yoshikawa
On 11/30/2011 07:15 AM, Takuya Yoshikawa wrote:
> (2011/11/30 14:02), Takuya Yoshikawa wrote:
>
>> IIUC, even though the O(1) method is O(1) at GET_DIRTY_LOG time, it still
>> needs O(N) write protections with respect to the total number of dirty
>> pages: the work is distributed, but each page fault that should be logged
>> does some write protection?
>
> Sorry, I was not precise. It depends on the level, and it is not completely
> distributed.
> But I think it is O(N), and the total cost will not change much,
> I guess.
That's true. But some applications do require low latency, and the
current code can impose a lot of time with the mmu spinlock held.
The total amount of work actually increases slightly, from O(N) to O(N
log N), but since the tree is so wide, the overhead is small.
--
error compiling committee.c: too many arguments to function
* Re: [Qemu-devel] [PATCH 0/4] KVM: Dirty logging optimization using rmap
2011-12-01 15:18 ` Avi Kivity
@ 2011-12-03 4:37 ` Takuya Yoshikawa
2011-12-04 10:20 ` Avi Kivity
0 siblings, 1 reply; 6+ messages in thread
From: Takuya Yoshikawa @ 2011-12-03 4:37 UTC (permalink / raw)
To: Avi Kivity
Cc: KVM, quintela, Marcelo Tosatti, qemu-devel, Xiao Guangrong,
Takuya Yoshikawa
Avi Kivity <avi@redhat.com> wrote:
> That's true. But some applications do require low latency, and the
> current code can impose a lot of time with the mmu spinlock held.
>
> The total amount of work actually increases slightly, from O(N) to O(N
> log N), but since the tree is so wide, the overhead is small.
>
Controlling the latency can be achieved by having user space limit
the number of dirty pages to scan, without hacking the core mmu code.
The fact that we cannot transfer that many pages over the network at
once suggests this is reasonable.
With the rmap write protection method in KVM, the only thing we need is
a new GET_DIRTY_LOG API which takes a [gfn_start, gfn_end] range to scan,
or optionally max_write_protections.
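Something along these lines, purely as a sketch of the proposed interface
(neither this struct nor the ioctl number exists today):

/*
 * Hypothetical extension of KVM_GET_DIRTY_LOG illustrating the
 * [gfn_start, gfn_end] / max_write_protections idea.
 */
struct kvm_dirty_log_range {
        __u32 slot;
        __u32 max_write_protections;    /* 0: no limit */
        __u64 gfn_start;                /* first gfn to scan */
        __u64 gfn_end;                  /* last gfn to scan (inclusive) */
        union {
                void __user *dirty_bitmap;      /* out: one bit per gfn in range */
                __u64 padding;
        };
};

/* The ioctl number is arbitrary, chosen only for this illustration. */
#define KVM_GET_DIRTY_LOG_RANGE _IOW(KVMIO, 0xd0, struct kvm_dirty_log_range)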
I remember that someone suggested splitting the slot at KVM forum.
Same effect with less effort.
QEMU can also avoid unwanted page faults by using this api wisely.
E.g. you can use this for "Interactivity improvements" TODO on
KVM wiki, I think.
Furthermore, QEMU may be able to use multiple threads for the memory
copy task.
Each thread has its own range of memory to copy, and does
GET_DIRTY_LOG independently. This will make it easy to
add further optimizations in QEMU.
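As a usage illustration only, building on the hypothetical range ioctl
sketched above; the worker_args fields and the helper functions are assumed:

/*
 * Per-thread worker: repeatedly fetch the dirty log for this thread's own
 * gfn range and send only the pages dirtied in that range.
 */
static void *dirty_log_worker(void *opaque)
{
        struct worker_args *w = opaque;         /* kvm fd, slot, gfn range, bitmap */
        struct kvm_dirty_log_range log = {
                .slot         = w->slot,
                .gfn_start    = w->gfn_start,
                .gfn_end      = w->gfn_end,
                .dirty_bitmap = w->bitmap,
        };

        while (!migration_is_done(w)) {
                if (ioctl(w->kvm_fd, KVM_GET_DIRTY_LOG_RANGE, &log) < 0)
                        break;
                send_dirty_pages(w, w->bitmap); /* copy this range only */
        }
        return NULL;
}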
In summary, my impression is that the main cause of the current latency
problem is not KVM's write protection but the strategy of trying
to handle the whole large slot in one go.
What do you think?
Takuya
* Re: [Qemu-devel] [PATCH 0/4] KVM: Dirty logging optimization using rmap
2011-12-03 4:37 ` Takuya Yoshikawa
@ 2011-12-04 10:20 ` Avi Kivity
0 siblings, 0 replies; 6+ messages in thread
From: Avi Kivity @ 2011-12-04 10:20 UTC (permalink / raw)
To: Takuya Yoshikawa
Cc: KVM, quintela, Marcelo Tosatti, qemu-devel, Xiao Guangrong,
Takuya Yoshikawa
On 12/03/2011 06:37 AM, Takuya Yoshikawa wrote:
> Avi Kivity <avi@redhat.com> wrote:
> > That's true. But some applications do require low latency, and the
> > current code can impose a lot of time with the mmu spinlock held.
> >
> > The total amount of work actually increases slightly, from O(N) to O(N
> > log N), but since the tree is so wide, the overhead is small.
> >
>
> Controlling the latency can be achieved by having user space limit
> the number of dirty pages to scan, without hacking the core mmu code.
>
> The fact that we cannot transfer that many pages over the network at
> once suggests this is reasonable.
That is true. Write protecting everything at once means that there is a
large window between sampling the dirty log and transferring the
page. Any writes within that window cause a re-transfer, even when they
should not.
>
> With the rmap write protection method in KVM, the only thing we need is
> a new GET_DIRTY_LOG API which takes a [gfn_start, gfn_end] range to scan,
> or optionally max_write_protections.
Right.
>
> I remember that someone suggested splitting the slot at KVM forum.
> Same effect with less effort.
>
> QEMU can also avoid unwanted page faults by using this api wisely.
>
> E.g. you can use this for "Interactivity improvements" TODO on
> KVM wiki, I think.
>
> Furthermore, QEMU may be able to use multiple threads for the memory
> copy task.
>
> Each thread has its own range of memory to copy, and does
> GET_DIRTY_LOG independently. This will make it easy to
> add further optimizations in QEMU.
>
> In summary, my impression is that the main cause of the current latency
> problem is not KVM's write protection but the strategy of trying
> to handle the whole large slot in one go.
>
> What do you think?
I agree. Maybe O(1) write protection has a place, but it is secondary
to fine-grained dirty logging, and if we implement it, it should be
after your idea, and further measurements.
--
error compiling committee.c: too many arguments to function