From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Subject: [Qemu-devel] [PATCH] migration: vectorize is_dup_page
Date: Tue, 6 Dec 2011 18:25:45 +0100 [thread overview]
Message-ID: <1323192345-22906-1-git-send-email-pbonzini@redhat.com> (raw)
is_dup_page is already proceeding in 32-bit chunks. Changing it to 16
bytes using Altivec or SSE is easy, and provides a noticeable improvement.
Pierre Riteau measured 30->25 seconds on a 16GB guest, I measured 4.6->3.9
seconds on a 6GB guest (best of three times for me; dunno for Pierre).
Both of them are approximately a 15% improvement.
I tried playing with non-temporal prefetches, but I did not get any
improvement (though I did get less cache misses, so the patch was doing
its job).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch_init.c | 28 ++++++++++++++++++++++------
1 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/arch_init.c b/arch_init.c
index cdad805..473df2d 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -94,14 +94,30 @@ const uint32_t arch_type = QEMU_ARCH;
#define RAM_SAVE_FLAG_EOS 0x10
#define RAM_SAVE_FLAG_CONTINUE 0x20
-static int is_dup_page(uint8_t *page, uint8_t ch)
+#if __ALTIVEC__
+#include <altivec.h>
+#define VECTYPE vector unsigned char
+#define SPLAT(p) vec_splat(vec_ld(0, p), 0)
+#define ALL_EQ(v1, v2) vec_all_eq(v1, v2)
+#elif __SSE2__
+#include <emmintrin.h>
+#define VECTYPE __m128i
+#define SPLAT(p) _mm_set1_epi8(*(p))
+#define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
+#else
+#define VECTYPE unsigned long
+#define SPLAT(p) (*(p) * (~0UL / 255))
+#define ALL_EQ(v1, v2) ((v1) == (v2))
+#endif
+
+static int is_dup_page(uint8_t *page)
{
- uint32_t val = ch << 24 | ch << 16 | ch << 8 | ch;
- uint32_t *array = (uint32_t *)page;
+ VECTYPE *p = (VECTYPE *)page;
+ VECTYPE val = SPLAT(p);
int i;
- for (i = 0; i < (TARGET_PAGE_SIZE / 4); i++) {
- if (array[i] != val) {
+ for (i = 0; i < TARGET_PAGE_SIZE / sizeof(VECTYPE); i++) {
+ if (!ALL_EQ(val, p[i])) {
return 0;
}
}
@@ -136,7 +152,7 @@ static int ram_save_block(QEMUFile *f)
p = block->host + offset;
- if (is_dup_page(p, *p)) {
+ if (is_dup_page(p)) {
qemu_put_be64(f, offset | cont | RAM_SAVE_FLAG_COMPRESS);
if (!cont) {
qemu_put_byte(f, strlen(block->idstr));
--
1.7.7.1
next reply other threads:[~2011-12-06 17:26 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-12-06 17:25 Paolo Bonzini [this message]
2011-12-20 14:13 ` [Qemu-devel] [PATCH] migration: vectorize is_dup_page Anthony Liguori
2011-12-20 15:24 ` Avi Kivity
2011-12-20 15:45 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1323192345-22906-1-git-send-email-pbonzini@redhat.com \
--to=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).