* [Qemu-devel] [RFC] optimize is_dup_page for zero pages
@ 2013-03-12 10:51 Peter Lieven
From: Peter Lieven @ 2013-03-12 10:51 UTC (permalink / raw)
  To: qemu-devel@nongnu.org
  Cc: peter.maydell, Paolo Bonzini, Kevin Wolf, Stefan Hajnoczi

Hi,

this is a second patch to optimize live migration. I generated an artificial
load to test the zero-page case. Ordinary dup and non-dup pages are not affected.

savings for zero pages (test case):
  non SSE2:    30s -> 26s
  SSE2:        27s -> 21s

Optionally, I would suggest optimizing buffer_is_zero to use SSE2 when addr
is 16-byte aligned and length is a multiple of 128 bytes.
In that case the bdrv functions could also benefit from it.
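
To illustrate the buffer_is_zero idea, here is a minimal sketch and not a
proposed implementation. The name buffer_is_zero_sketch and its signature are
hypothetical and are not QEMU's actual buffer_is_zero. It takes the SSE2 path
only when the address is 16-byte aligned and the length is a multiple of 128
bytes, and otherwise falls back to a plain byte loop:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Hypothetical helper (not the QEMU function): true if buf[0..len) is all zero. */
static bool buffer_is_zero_sketch(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    size_t i;

    assert(buf != NULL || len == 0);

#ifdef __SSE2__
    /* Fast path: 16-byte aligned address and length a multiple of 128 bytes. */
    if (((uintptr_t)p % 16) == 0 && (len % 128) == 0) {
        const __m128i *v = (const __m128i *)p;
        const __m128i zero = _mm_setzero_si128();
        size_t n = len / sizeof(__m128i);

        /* OR eight 16-byte vectors (128 bytes) together, then test once. */
        for (i = 0; i < n; i += 8) {
            __m128i t0 = _mm_or_si128(_mm_load_si128(&v[i + 0]),
                                      _mm_load_si128(&v[i + 1]));
            __m128i t1 = _mm_or_si128(_mm_load_si128(&v[i + 2]),
                                      _mm_load_si128(&v[i + 3]));
            __m128i t2 = _mm_or_si128(_mm_load_si128(&v[i + 4]),
                                      _mm_load_si128(&v[i + 5]));
            __m128i t3 = _mm_or_si128(_mm_load_si128(&v[i + 6]),
                                      _mm_load_si128(&v[i + 7]));
            __m128i all = _mm_or_si128(_mm_or_si128(t0, t1),
                                       _mm_or_si128(t2, t3));

            /* The mask is 0xFFFF only if all 16 bytes compare equal to zero. */
            if (_mm_movemask_epi8(_mm_cmpeq_epi8(all, zero)) != 0xFFFF) {
                return false;
            }
        }
        return true;
    }
#endif

    /* Generic fallback for unaligned buffers, odd lengths, or non-SSE2 builds. */
    for (i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}
```

A caller passing a page-aligned 4096-byte guest page would take the fast path
on SSE2 builds. Any other buffer still gets a correct result from the
fallback loop.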

Peter

diff --git a/arch_init.c b/arch_init.c
index 98e2bc6..e1051e6 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -164,9 +164,37 @@ int qemu_read_default_config_files(bool userconfig)
      return 0;
  }

-static int is_dup_page(uint8_t *page)
+#if __SSE2__
+static int is_zero_page_sse2(u_int8_t *page)
  {
      VECTYPE *p = (VECTYPE *)page;
+    VECTYPE zero = _mm_setzero_si128();
+    int i;
+    for (i = 0; i < (TARGET_PAGE_SIZE / sizeof(VECTYPE)); i+=8) {
+               VECTYPE tmp0 = _mm_or_si128(p[i+0],p[i+1]);
+               VECTYPE tmp1 = _mm_or_si128(p[i+2],p[i+3]);
+               VECTYPE tmp2 = _mm_or_si128(p[i+4],p[i+5]);
+               VECTYPE tmp3 = _mm_or_si128(p[i+6],p[i+7]);
+               VECTYPE tmp01 = _mm_or_si128(tmp0,tmp1);
+               VECTYPE tmp23 = _mm_or_si128(tmp2,tmp3);
+               if (!ALL_EQ(_mm_or_si128(tmp01,tmp23), zero)) {
+                   return 0;
+               }
+    }
+    return 1;
+}
+#endif
+
+static int is_dup_page(u_int8_t *page) {
+    if (!page[0]) {
+#if __SSE2__
+        return is_zero_page_sse2(page);
+#else
+        return buffer_is_zero(page, TARGET_PAGE_SIZE);
+#endif
+    }
+
+    VECTYPE *p = (VECTYPE *)page;
      VECTYPE val = SPLAT(page);
      int i;

