qemu-devel.nongnu.org archive mirror
From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Cc: Robert Hoo <robert.hu@linux.intel.com>
Subject: [PULL 07/15] util/bufferiszero: improve avx2 accelerator
Date: Thu,  2 Apr 2020 15:06:32 -0400
Message-ID: <20200402190640.1693-8-pbonzini@redhat.com>
In-Reply-To: <20200402190640.1693-1-pbonzini@redhat.com>

From: Robert Hoo <robert.hu@linux.intel.com>

By increasing the AVX2 length_to_accel threshold from 64 to 128, we can
simplify the logic of buffer_zero_avx2 and eliminate a branch.

The authorship of this patch actually belongs to Richard Henderson
<richard.henderson@linaro.org>; I only fixed a boundary case in his
original patch.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1585119021-46593-2-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
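Why the larger threshold removes the special case can be seen in a
scalar model. The sketch below is not the QEMU code: or32() and
buffer_zero_model() are hypothetical names, and a byte-wise OR stands in
for the 32-byte AVX2 loads. It assumes, as the new length_to_accel
implies, that the caller never passes len < 128. Under that assumption
buf + len - 4 * 32 never points before buf, so the four unaligned tail
loads are always in bounds and the old "len <= 128" path with its last2
label is no longer needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* OR together 32 bytes; stands in for one 32-byte vector load. */
static unsigned char or32(const unsigned char *p)
{
    unsigned char acc = 0;
    for (size_t i = 0; i < 32; i++) {
        acc |= p[i];
    }
    return acc;
}

/* Scalar model of the simplified loop; assumes len >= 128. */
static bool buffer_zero_model(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    assert(len >= 128);

    /* Unaligned head of 32 bytes. */
    unsigned char t = or32(b);
    /* First aligned position whose preceding 128 bytes start within 32 bytes of buf. */
    uintptr_t p = ((uintptr_t)b + 5 * 32) & -(uintptr_t)32;
    /* Last 32-byte aligned address not past the end of the buffer. */
    uintptr_t e = ((uintptr_t)b + len) & -(uintptr_t)32;

    /* Aligned blocks of 128 bytes; may run zero times for short buffers. */
    while (p <= e) {
        if (t) {
            return false;
        }
        const unsigned char *q = (const unsigned char *)p;
        t = or32(q - 128) | or32(q - 96) | or32(q - 64) | or32(q - 32);
        p += 128;
    }

    /* Last 128 bytes, unaligned; in bounds because len >= 128. */
    t |= or32(b + len - 4 * 32);
    t |= or32(b + len - 3 * 32);
    t |= or32(b + len - 2 * 32);
    t |= or32(b + len - 1 * 32);
    return t == 0;
}
```

When the loop body never runs, the head plus the four tail loads still
cover the whole buffer, which is what lets the patch drop the old else
branch.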
 util/bufferiszero.c | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index b8012532e4..695bb4ce28 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -158,27 +158,19 @@ buffer_zero_avx2(const void *buf, size_t len)
     __m256i *p = (__m256i *)(((uintptr_t)buf + 5 * 32) & -32);
     __m256i *e = (__m256i *)(((uintptr_t)buf + len) & -32);
 
-    if (likely(p <= e)) {
-        /* Loop over 32-byte aligned blocks of 128.  */
-        do {
-            __builtin_prefetch(p);
-            if (unlikely(!_mm256_testz_si256(t, t))) {
-                return false;
-            }
-            t = p[-4] | p[-3] | p[-2] | p[-1];
-            p += 4;
-        } while (p <= e);
-    } else {
-        t |= _mm256_loadu_si256(buf + 32);
-        if (len <= 128) {
-            goto last2;
+    /* Loop over 32-byte aligned blocks of 128.  */
+    while (p <= e) {
+        __builtin_prefetch(p);
+        if (unlikely(!_mm256_testz_si256(t, t))) {
+            return false;
         }
-    }
+        t = p[-4] | p[-3] | p[-2] | p[-1];
+        p += 4;
+    }
 
     /* Finish the last block of 128 unaligned.  */
     t |= _mm256_loadu_si256(buf + len - 4 * 32);
     t |= _mm256_loadu_si256(buf + len - 3 * 32);
- last2:
     t |= _mm256_loadu_si256(buf + len - 2 * 32);
     t |= _mm256_loadu_si256(buf + len - 1 * 32);
 
@@ -263,7 +255,7 @@ static void init_accel(unsigned cache)
     }
     if (cache & CACHE_AVX2) {
         fn = buffer_zero_avx2;
-        length_to_accel = 64;
+        length_to_accel = 128;
     }
 #endif
 #ifdef CONFIG_AVX512F_OPT
-- 
2.18.2



