linux-arch.vger.kernel.org archive mirror
* [RFC][PATCH 0/12] KVM, x86, ppc, asm-generic: moving dirty bitmaps to user space
@ 2010-05-04 12:56 Takuya Yoshikawa
From: Takuya Yoshikawa @ 2010-05-04 12:56 UTC
  To: avi-H+wXaHxf7aLQT0dZR+AlfA, mtosatti-H+wXaHxf7aLQT0dZR+AlfA,
	agraf-l3A5Bk7waGM
  Cc: yoshikawa.takuya-gVGce1chcLdL9jVzuh4AOg,
	fernando-gVGce1chcLdL9jVzuh4AOg, kvm-u79uwXL29TY76Z2rM5mHXA,
	kvm-ppc-u79uwXL29TY76Z2rM5mHXA, kvm-ia64-u79uwXL29TY76Z2rM5mHXA,
	tglx-hfZtesqFncYOwBW4kG4KsQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	hpa-YMNOUZJC4hwAvxtiuMwx3w, x86-DgEjT+Ai2ygdnm+yROfE0A,
	benh-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r,
	paulus-eUNUBHrolfbYtjvyW6yDsg,
	linuxppc-dev-mnsaURCQ41sdnm+yROfE0A, arnd-r2nGTMty4D4,
	linux-arch-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi, sorry for sending from my personal account.
The following series are all from me:

  From: Takuya Yoshikawa <yoshikawa.takuya-gVGce1chcLdL9jVzuh4AOg@public.gmane.org>

  The 3rd version of "moving dirty bitmaps to user space".

Starting with this version, we have added the x86, ppc, and asm-generic
people to the CC lists.


[To KVM people]

Sorry for the late reply to your comments.

Avi,
 - I've written an answer to your question in patch 5/12: drivers/vhost/vhost.c.

 - I considered changing set_bit_user_non_atomic to an inline function, but
   did not because the other helpers in uaccess.h are written as macros.
   Anyway, I hope that the x86 people will give us appropriate suggestions
   about this.

 - I think the documentation about making bitmaps 64-bit aligned should be
   written when we add an API to register user-allocated bitmaps. So probably
   in the next series.
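
To make the macro versus inline function question and the 64-bit alignment
point concrete, here is a minimal sketch on plain memory. It is not the code
from patch 5/12: it does no user-space access or fault handling, and the
names dirty_bitmap_bytes, SET_BIT_NON_ATOMIC and set_bit_non_atomic are
hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for nbits, rounded up to whole 64-bit words. */
static inline size_t dirty_bitmap_bytes(size_t nbits)
{
    return ((nbits + 63) / 64) * sizeof(uint64_t);
}

/* Macro form: no type checking, and nr is evaluated twice. */
#define SET_BIT_NON_ATOMIC(nr, map) \
    ((map)[(nr) / 64] |= (UINT64_C(1) << ((nr) % 64)))

/* Inline function form: type-checked, each argument evaluated once. */
static inline void set_bit_non_atomic(size_t nr, uint64_t *map)
{
    assert(map != NULL);
    map[nr / 64] |= UINT64_C(1) << (nr % 64);
}
```

Both forms do the same read-modify-write without any locking, so they are
only safe when nobody else updates the same word concurrently. The inline
form avoids double evaluation of its arguments, which is the usual argument
for it over a macro.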

Avi, Alex,
 - Could you check the ia64 and ppc parts, please? I tried to keep the logical
   changes as small as possible.

   I tried to build these myself with cross compilers. For ia64, the build
   succeeded with my patch series. But for book3s, the build failed with the
   following errors even without my patch series:

  arch/powerpc/kvm/book3s_paired_singles.c: In function 'kvmppc_emulate_paired_single':
  arch/powerpc/kvm/book3s_paired_singles.c:1289: error: the frame size of 2288 bytes is larger than 2048 bytes
  make[1]: *** [arch/powerpc/kvm/book3s_paired_singles.o] Error 1
  make: *** [arch/powerpc/kvm] Error 2


Changelog: there are two main changes since the 2nd version:
  1. I changed the treatment of clean slots (see patch 1/12).
     This was already applied today, thanks!
  2. I changed the switch API (see patch 11/12).

To show this API's advantage, I also did a test (see the end of this mail).


[To x86 people]

Hi, Thomas, Ingo, Peter,

Please review patches 4/12 and 5/12. Because this is the first time I have
sent patches to x86, please tell me if anything is missing.


[To ppc people]

Hi, Benjamin, Paul, Alex,

Please see patches 6/12 and 7/12. First, I am sorry that I have not tested
these yet. In that sense, they may not yet be of a quality that deserves a
precise review. But I will be happy if you give me any comments.

Alex, could you help me? Though I plan to get a PPC box in the future,
currently I cannot test these.



[To asm-generic people]

Hi, Arnd,

Please review patch 8/12. Is this kind of macro acceptable?
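
For readers who have not looked at patch 8/12, the idea of a little-endian
bit offset can be sketched as follows. This is only an illustration and not
the macro from the patch: the names BITS_PER_LONG_SKETCH, LE_SWIZZLE_SKETCH,
le_bit_offset_sketch and set_le_bit_sketch are hypothetical, and the host
endianness is passed in as a parameter so that the sketch stays portable.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/*
 * XOR mask that reverses the byte order inside one long while keeping
 * the position of the bit inside its byte.
 */
#define LE_SWIZZLE_SKETCH ((BITS_PER_LONG_SKETCH - 1) & ~0x7UL)

/* Map a little-endian bit number to the native bit number of the host. */
static inline unsigned long le_bit_offset_sketch(unsigned long nr,
                                                 int host_is_big_endian)
{
    return host_is_big_endian ? (nr ^ LE_SWIZZLE_SKETCH) : nr;
}

/* Set little-endian bit nr in a bitmap made of native longs (non-atomic). */
static inline void set_le_bit_sketch(unsigned long nr, unsigned long *map,
                                     int host_is_big_endian)
{
    unsigned long n = le_bit_offset_sketch(nr, host_is_big_endian);

    assert(map != NULL);
    map[n / BITS_PER_LONG_SKETCH] |= 1UL << (n % BITS_PER_LONG_SKETCH);
}
```

With this mapping, little-endian bit nr always ends up in byte nr / 8 of the
bitmap, at bit nr % 8 of that byte, whatever the byte order of the host.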





[Performance test]

We measured the time, in TSC cycles, taken in the kernel by the ioctl()s for
getting dirty logs.

Test environment

  AMD Phenom(tm) 9850 Quad-Core Processor with 8GB memory


1. GUI test (running Ubuntu guest in graphical mode)

  sudo qemu-system-x86_64 -hda dirtylog_test.img -boot c -m 4192 -net ...

We show a relatively stable part of the run to compare how much time is
needed for the basic parts of the dirty log ioctl.

                           get.org   get.opt  switch.opt

slots[7].len=32768          278379     66398     64024
slots[8].len=32768          181246       270       160
slots[7].len=32768          263961     64673     64494
slots[8].len=32768          181655       265       160
slots[7].len=32768          263736     64701     64610
slots[8].len=32768          182785       267       160
slots[7].len=32768          260925     65360     65042
slots[8].len=32768          182579       264       160
slots[7].len=32768          267823     65915     65682
slots[8].len=32768          186350       271       160

At a glance, we can see that our optimization gives a significant
improvement over the original get dirty log ioctl. This is true for both
get.opt and switch.opt. This has a really big impact on personal KVM users
who run KVM in GUI mode on their usual PCs.
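
To put a number on the improvement, the ratio between two columns of the
table can be computed as in the sketch below. The helper name speedup is
hypothetical and is not part of the patch series.

```c
#include <assert.h>

/* Ratio of the original cost to the optimized cost, both in the same unit. */
static inline double speedup(unsigned long before, unsigned long after)
{
    assert(after != 0);
    return (double)before / (double)after;
}
```

For the first two rows of the table, speedup(278379, 66398) is a little over
4 for slots[7], and speedup(181246, 270) is in the hundreds for slots[8].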

Next, we notice that switch.opt improved by a hundred nanoseconds or so for
these slots. Although this may sound like a tiny improvement, we can feel
it as a difference in the GUI's responsiveness, such as mouse reactions.

To feel the difference, please try the GUI on your PC with our patch series!


2. Live-migration test (4GB guest, write loop with 1GB buf)

We also did a live-migration test.

                           get.org   get.opt  switch.opt

slots[0].len=655360         797383    261144    222181
slots[1].len=3757047808    2186721   1965244   1842824
slots[2].len=637534208     1433562   1012723   1031213
slots[3].len=131072         216858       331       331
slots[4].len=131072         121635       225       164
slots[5].len=131072         120863       356       164
slots[6].len=16777216       121746      1133       156
slots[7].len=32768          120415       230       278
slots[8].len=32768          120368       216       149
slots[0].len=655360         806497    194710    223582
slots[1].len=3757047808    2142922   1878025   1895369
slots[2].len=637534208     1386512   1021309   1000345
slots[3].len=131072         221118       459       296
slots[4].len=131072         121516       272       166
slots[5].len=131072         122652       244       173
slots[6].len=16777216       123226     99185       149
slots[7].len=32768          121803       457       505
slots[8].len=32768          121586       216       155
slots[0].len=655360         766113    211317    213179
slots[1].len=3757047808    2155662   1974790   1842361
slots[2].len=637534208     1481411   1020004   1031352
slots[3].len=131072         223100       351       295
slots[4].len=131072         122982       436       164
slots[5].len=131072         122100       300       503
slots[6].len=16777216       123653       779       151
slots[7].len=32768          122617       284       157
slots[8].len=32768          122737       253       149

For slots other than 0, 1, and 2 we can see a similar improvement.

Considering the fact that switch.opt does not depend on the bitmap length
except for kvm_mmu_slot_remove_write_access(), that function must be the
cause of the usec to msec of time consumed for slots 0, 1, and 2: there
might be some context switches.
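
As a rough guide to how the bitmap length relates to the slot lengths in the
table, the size can be estimated as in the sketch below. It assumes that
slots[i].len is in bytes, that pages are 4 KiB, and that the bitmap is
rounded up to whole 64-bit words. PAGE_SIZE_SKETCH and slot_bitmap_bytes are
hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096ULL /* assumption: 4 KiB pages */

/* One dirty bit per page, rounded up to a multiple of 64 bits. */
static inline uint64_t slot_bitmap_bytes(uint64_t slot_len)
{
    uint64_t npages = slot_len / PAGE_SIZE_SKETCH;

    assert(slot_len % PAGE_SIZE_SKETCH == 0);
    return ((npages + 63) / 64) * 8;
}
```

Under these assumptions, slots[1] (len=3757047808) needs a bitmap of 114656
bytes, while slots[7] and slots[8] (len=32768) need only 8 bytes each.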

But note that this test was done with a workload that dirtied memory
endlessly during the live migration.

In a usual workload, the number of dirty pages varies a lot from iteration
to iteration, and we should gain a lot in the relatively clean cases.


