* [RFC kvm-unit-tests] api: Add dirty logging performance test
From: Takuya Yoshikawa @ 2012-01-08 13:58 UTC (permalink / raw)
To: avi, mtosatti; +Cc: kvm
Check how long it takes to get dirty log according to the number of
dirty pages, like:
get dirty log: 49 us for 1 dirty pages
get dirty log: 49 us for 2 dirty pages
get dirty log: 45 us for 4 dirty pages
get dirty log: 41 us for 8 dirty pages
get dirty log: 40 us for 16 dirty pages
get dirty log: 44 us for 32 dirty pages
get dirty log: 39 us for 64 dirty pages
get dirty log: 42 us for 128 dirty pages
get dirty log: 45 us for 256 dirty pages
get dirty log: 53 us for 512 dirty pages
get dirty log: 72 us for 1024 dirty pages
get dirty log: 99 us for 2048 dirty pages
get dirty log: 132 us for 4096 dirty pages
get dirty log: 224 us for 8192 dirty pages
get dirty log: 383 us for 16384 dirty pages
get dirty log: 725 us for 32768 dirty pages
get dirty log: 1412 us for 65536 dirty pages
get dirty log: 2746 us for 131072 dirty pages
get dirty log: 5455 us for 262144 dirty pages
Signed-off-by: Takuya Yoshikawa <takuya.yoshikawa@gmail.com>
---
api/dirty-log-perf.cc | 107 +++++++++++++++++++++++++++++++++++++++++++++++++
config-x86-common.mak | 3 +
2 files changed, 110 insertions(+), 0 deletions(-)
create mode 100644 api/dirty-log-perf.cc
diff --git a/api/dirty-log-perf.cc b/api/dirty-log-perf.cc
new file mode 100644
index 0000000..83413ce
--- /dev/null
+++ b/api/dirty-log-perf.cc
@@ -0,0 +1,107 @@
+#include "kvmxx.hh"
+#include "memmap.hh"
+#include "identity.hh"
+#include <boost/thread/thread.hpp>
+#include <stdlib.h>
+#include <stdio.h>
+#include <sys/time.h>
+
+namespace {
+
+const int page_size = 4096;
+const int nr_pages = 256 * 1024;
+
+void delay_loop(unsigned n)
+{
+    for (unsigned i = 0; i < n; ++i) {
+        asm volatile("pause");
+    }
+}
+
+void write_mem(volatile bool& running, volatile int& nr_dirty_pages,
+               void* logged_slot_virt)
+{
+    while (nr_dirty_pages >= 0) {
+        char* var = static_cast<char*>(logged_slot_virt);
+
+        while (!running) {
+            delay_loop(1000);
+        }
+        for (int i = 0; i < nr_dirty_pages; ++i) {
+            ++(*var);
+            var += page_size;
+        }
+        running = false;
+    }
+}
+
+void check_dirty_log(mem_slot& slot,
+                     volatile bool& running,
+                     volatile int& nr_dirty_pages)
+{
+    slot.set_dirty_logging(true);
+    slot.update_dirty_log();
+
+    for (int i = 1; i <= nr_pages; i *= 2) {
+        struct timeval start_time, end_time;
+        long time_usec;
+
+        nr_dirty_pages = i;
+        running = true;
+        // wait until the guest finishes writing
+        while (running) {
+            delay_loop(1000);
+        }
+
+        gettimeofday(&start_time, NULL);
+        slot.update_dirty_log();
+        gettimeofday(&end_time, NULL);
+
+        time_usec = 1000 * 1000 * (end_time.tv_sec - start_time.tv_sec);
+        time_usec += end_time.tv_usec - start_time.tv_usec;
+        printf("get dirty log: %6ld us for %10d dirty pages\n",
+               time_usec, nr_dirty_pages);
+    }
+
+    // stop the guest
+    nr_dirty_pages = -1;
+    running = true;
+    slot.set_dirty_logging(false);
+}
+
+}
+
+using boost::ref;
+using std::tr1::bind;
+
+int main(int ac, char **av)
+{
+    kvm::system sys;
+    kvm::vm vm(sys);
+    mem_map memmap(vm);
+
+    void* logged_slot_virt;
+    int memory_size = nr_pages * page_size;
+    if (posix_memalign(&logged_slot_virt, page_size, memory_size)) {
+        printf("dirty-log-perf: Could not allocate guest memory.\n");
+        exit(1);
+    }
+
+    identity::hole hole(logged_slot_virt, memory_size);
+    identity::vm ident_vm(vm, memmap, hole);
+    kvm::vcpu vcpu(vm, 0);
+    mem_slot logged_slot(memmap,
+                         reinterpret_cast<uint64_t>(logged_slot_virt),
+                         memory_size, logged_slot_virt);
+
+    bool running = false;
+    int nr_dirty_pages = 0;
+    boost::thread host_poll_thread(check_dirty_log, ref(logged_slot),
+                                   ref(running), ref(nr_dirty_pages));
+    identity::vcpu guest_write_thread(vcpu, bind(write_mem, ref(running),
+                                                 ref(nr_dirty_pages),
+                                                 logged_slot_virt));
+    vcpu.run();
+    host_poll_thread.join();
+    return 0;
+}
diff --git a/config-x86-common.mak b/config-x86-common.mak
index 1cbc7c6..a3e8ffa 100644
--- a/config-x86-common.mak
+++ b/config-x86-common.mak
@@ -39,6 +39,7 @@ tests-common = $(TEST_DIR)/vmexit.flat $(TEST_DIR)/tsc.flat \
ifdef API
tests-common += api/api-sample
tests-common += api/dirty-log
+tests-common += api/dirty-log-perf
endif
tests_and_config = $(TEST_DIR)/*.flat $(TEST_DIR)/unittests.cfg
@@ -102,3 +103,5 @@ api/libapi.a: api/kvmxx.o api/identity.o api/exception.o api/memmap.o
api/api-sample: api/api-sample.o api/libapi.a
api/dirty-log: api/dirty-log.o api/libapi.a
+
+api/dirty-log-perf: api/dirty-log-perf.o api/libapi.a
--
1.7.5.4
* Re: [RFC kvm-unit-tests] api: Add dirty logging performance test
From: Avi Kivity @ 2012-01-08 14:21 UTC (permalink / raw)
To: Takuya Yoshikawa; +Cc: mtosatti, kvm
On 01/08/2012 03:58 PM, Takuya Yoshikawa wrote:
> Check how long it takes to get dirty log according to the number of
> dirty pages, like:
>
> get dirty log: 49 us for 1 dirty pages
> get dirty log: 49 us for 2 dirty pages
> get dirty log: 45 us for 4 dirty pages
> get dirty log: 41 us for 8 dirty pages
> get dirty log: 40 us for 16 dirty pages
> get dirty log: 44 us for 32 dirty pages
> get dirty log: 39 us for 64 dirty pages
> get dirty log: 42 us for 128 dirty pages
> get dirty log: 45 us for 256 dirty pages
> get dirty log: 53 us for 512 dirty pages
> get dirty log: 72 us for 1024 dirty pages
> get dirty log: 99 us for 2048 dirty pages
> get dirty log: 132 us for 4096 dirty pages
> get dirty log: 224 us for 8192 dirty pages
> get dirty log: 383 us for 16384 dirty pages
> get dirty log: 725 us for 32768 dirty pages
> get dirty log: 1412 us for 65536 dirty pages
> get dirty log: 2746 us for 131072 dirty pages
> get dirty log: 5455 us for 262144 dirty pages
Nice!
> +
> +void write_mem(volatile bool& running, volatile int& nr_dirty_pages,
> +               void* logged_slot_virt)
> +{
> +    while (nr_dirty_pages >= 0) {
> +        char* var = static_cast<char*>(logged_slot_virt);
> +
> +        while (!running) {
> +            delay_loop(1000);
> +        }
> +        for (int i = 0; i < nr_dirty_pages; ++i) {
> +            ++(*var);
> +            var += page_size;
> +        }
> +        running = false;
You use running both to start this loop, and signal its end. Better to
use two variables.
But why use threads at all? Just call this before reading the dirty
log, no need for synchronization.
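
A minimal sketch of the thread-free control flow suggested here, under stated assumptions: dirty_pages(), run_rounds() and the after_write callback are hypothetical names, and the writer is a plain host function. In the real test the writes would still have to be issued by the guest for KVM to log them, so this only illustrates the ordering (write, then measure), not the KVM side.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

namespace sequential_sketch {

const std::size_t page_size = 4096;

// Same job as the inner loop of write_mem() in the patch, minus the flags:
// bump the first byte of each of the first n pages.
void dirty_pages(char* mem, int n)
{
    for (int i = 0; i < n; ++i) {
        ++mem[i * page_size];
    }
}

// One round per power of two up to max_pages. `after_write` is a
// hypothetical stand-in for the timed slot.update_dirty_log() call.
// Returns the page count used in each round.
std::vector<int> run_rounds(char* mem, int max_pages,
                            const std::function<void(int)>& after_write)
{
    assert(max_pages >= 1);
    std::vector<int> rounds;
    for (int n = 1; n <= max_pages; n *= 2) {
        dirty_pages(mem, n);  // the writer has finished when this returns
        after_write(n);       // so the measurement needs no flag or polling
        rounds.push_back(n);
    }
    return rounds;
}

}  // namespace sequential_sketch
```

Because the write step returns before the measurement starts, neither a shared flag nor a delay loop is needed.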
> +    }
> +}
> +
> +void check_dirty_log(mem_slot& slot,
> +                     volatile bool& running,
> +                     volatile int& nr_dirty_pages)
> +{
> +    slot.set_dirty_logging(true);
> +    slot.update_dirty_log();
> +
> +    for (int i = 1; i <= nr_pages; i *= 2) {
> +        struct timeval start_time, end_time;
> +        long time_usec;
> +
> +        nr_dirty_pages = i;
> +        running = true;
> +        // wait until the guest finishes writing
> +        while (running) {
> +            delay_loop(1000);
> +        }
> +
> +        gettimeofday(&start_time, NULL);
> +        slot.update_dirty_log();
> +        gettimeofday(&end_time, NULL);
Nicer to have a function that returns time in nanoseconds.
> +
> +        time_usec = 1000 * 1000 * (end_time.tv_sec - start_time.tv_sec);
> +        time_usec += end_time.tv_usec - start_time.tv_usec;
> +        printf("get dirty log: %6ld us for %10d dirty pages\n",
> +               time_usec, nr_dirty_pages);
> +    }
> +
> +    // stop the guest
> +    nr_dirty_pages = -1;
> +    running = true;
> +    slot.set_dirty_logging(false);
> +}
> +
> +}
> +
>
--
error compiling committee.c: too many arguments to function
* Re: [RFC kvm-unit-tests] api: Add dirty logging performance test
From: Takuya Yoshikawa @ 2012-01-08 14:49 UTC (permalink / raw)
To: Avi Kivity; +Cc: mtosatti, kvm
On Sun, 08 Jan 2012 16:21:08 +0200
Avi Kivity <avi@redhat.com> wrote:
> On 01/08/2012 03:58 PM, Takuya Yoshikawa wrote:
> > Check how long it takes to get dirty log according to the number of
> > dirty pages, like:
> >
> > get dirty log: 49 us for 1 dirty pages
> > get dirty log: 49 us for 2 dirty pages
> > get dirty log: 45 us for 4 dirty pages
> > get dirty log: 41 us for 8 dirty pages
> > get dirty log: 40 us for 16 dirty pages
> > get dirty log: 44 us for 32 dirty pages
> > get dirty log: 39 us for 64 dirty pages
> > get dirty log: 42 us for 128 dirty pages
> > get dirty log: 45 us for 256 dirty pages
> > get dirty log: 53 us for 512 dirty pages
> > get dirty log: 72 us for 1024 dirty pages
> > get dirty log: 99 us for 2048 dirty pages
> > get dirty log: 132 us for 4096 dirty pages
> > get dirty log: 224 us for 8192 dirty pages
> > get dirty log: 383 us for 16384 dirty pages
> > get dirty log: 725 us for 32768 dirty pages
> > get dirty log: 1412 us for 65536 dirty pages
> > get dirty log: 2746 us for 131072 dirty pages
> > get dirty log: 5455 us for 262144 dirty pages
>
> Nice!
I forgot to add a warm-up pass that lets the guest scan the memory
before the test starts, so the number of shadow pages was increasing
during the test.
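
A rough sketch of such a warm-up pass, assuming a read of one byte per page is enough to populate the mappings before timing starts (whether a read or a write is required is an assumption here). warm_up_scan() is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>

// Read one byte from every page so that each page has been touched once
// before the timed rounds begin. Reads only, so no page is modified.
// Returns the number of pages scanned.
std::size_t warm_up_scan(const volatile char* mem, std::size_t nr_pages,
                         std::size_t page_size)
{
    assert(page_size > 0);
    std::size_t scanned = 0;
    for (std::size_t i = 0; i < nr_pages; ++i) {
        char c = mem[i * page_size];  // volatile read, cannot be optimized away
        (void)c;
        ++scanned;
    }
    return scanned;
}
```

In the test this would run in the guest once, before the first measured round.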
>
> > +
> > +void write_mem(volatile bool& running, volatile int& nr_dirty_pages,
> > +               void* logged_slot_virt)
> > +{
> > +    while (nr_dirty_pages >= 0) {
> > +        char* var = static_cast<char*>(logged_slot_virt);
> > +
> > +        while (!running) {
> > +            delay_loop(1000);
> > +        }
> > +        for (int i = 0; i < nr_dirty_pages; ++i) {
> > +            ++(*var);
> > +            var += page_size;
> > +        }
> > +        running = false;
>
> You use running both to start this loop, and signal its end. Better to
> use two variables.
>
> But why use threads at all? Just call this before reading the dirty
> log, no need for synchronization.
I reused a lot of your dirty-log code; I will update it.
>
> > +    }
> > +}
> > +
> > +void check_dirty_log(mem_slot& slot,
> > +                     volatile bool& running,
> > +                     volatile int& nr_dirty_pages)
> > +{
> > +    slot.set_dirty_logging(true);
> > +    slot.update_dirty_log();
> > +
> > +    for (int i = 1; i <= nr_pages; i *= 2) {
> > +        struct timeval start_time, end_time;
> > +        long time_usec;
> > +
> > +        nr_dirty_pages = i;
> > +        running = true;
> > +        // wait until the guest finishes writing
> > +        while (running) {
> > +            delay_loop(1000);
> > +        }
> > +
> > +        gettimeofday(&start_time, NULL);
> > +        slot.update_dirty_log();
> > +        gettimeofday(&end_time, NULL);
>
> Nicer to have a function that returns time in nanoseconds.
I don't know of such an API that can be used from userspace.
If we can use a nanosecond-precision timer, I would like to measure other
things too, e.g. emulation.
Takuya
--
Takuya Yoshikawa <takuya.yoshikawa@gmail.com>
* Re: [RFC kvm-unit-tests] api: Add dirty logging performance test
From: Avi Kivity @ 2012-01-08 14:54 UTC (permalink / raw)
To: Takuya Yoshikawa; +Cc: mtosatti, kvm
On 01/08/2012 04:49 PM, Takuya Yoshikawa wrote:
> > > +
> > > +        gettimeofday(&start_time, NULL);
> > > +        slot.update_dirty_log();
> > > +        gettimeofday(&end_time, NULL);
> >
> > Nicer to have a function that returns time in nanoseconds.
>
> I don't know such an API which can be used in userspace.
static uint64_t time_ns()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * (uint64_t)1000000000 + tv.tv_usec * 1000;
}
(that's not nanosecond precision, but it gives a single number instead
of a pair; there's also clock_gettime() that does have nanosecond
precision).
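
For comparison, a sketch of the clock_gettime() variant mentioned above. time_ns_monotonic() and measure_ns() are hypothetical names, and the choice of CLOCK_MONOTONIC is an assumption made so that wall-clock adjustments cannot skew an interval. Older glibc versions may need -lrt to link clock_gettime().

```cpp
#include <cassert>
#include <stdint.h>
#include <time.h>

// Nanosecond timestamp from the monotonic clock, as a single number.
static uint64_t time_ns_monotonic()
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return ts.tv_sec * (uint64_t)1000000000 + ts.tv_nsec;
}

// Hypothetical helper: elapsed nanoseconds around any callable, e.g. a
// wrapper around slot.update_dirty_log().
template <typename Fn>
static uint64_t measure_ns(Fn fn)
{
    uint64_t start = time_ns_monotonic();
    fn();
    return time_ns_monotonic() - start;
}
```

With a helper of this shape, the two gettimeofday() calls and the manual tv_sec/tv_usec arithmetic in check_dirty_log() would reduce to a single subtraction.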
> If we can use nano-precision timer, I want to check other things
> too, e.g. emulation.
>
--
error compiling committee.c: too many arguments to function