From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36347)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1beRaQ-0001Lb-Ft for qemu-devel@nongnu.org;
	Mon, 29 Aug 2016 14:47:11 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1beRaJ-0001oD-H3 for qemu-devel@nongnu.org;
	Mon, 29 Aug 2016 14:47:09 -0400
Received: from mail-qk0-x242.google.com ([2607:f8b0:400d:c09::242]:36739)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1beRaJ-0001nw-B8 for qemu-devel@nongnu.org;
	Mon, 29 Aug 2016 14:47:03 -0400
Received: by mail-qk0-x242.google.com with SMTP id v123so10888823qkh.3
	for ; Mon, 29 Aug 2016 11:47:03 -0700 (PDT)
Sender: Richard Henderson
From: Richard Henderson
Date: Mon, 29 Aug 2016 11:46:17 -0700
Message-Id: <1472496380-19706-7-git-send-email-rth@twiddle.net>
In-Reply-To: <1472496380-19706-1-git-send-email-rth@twiddle.net>
References: <1472496380-19706-1-git-send-email-rth@twiddle.net>
Subject: [Qemu-devel] [PATCH v3 6/9] cutils: Add generic prefetch
To: qemu-devel@nongnu.org
Cc: pbonzini@redhat.com, vijay.kilari@gmail.com

There's no real knowledge of the cacheline size,
just prefetching one loop ahead.

Signed-off-by: Richard Henderson
---
 util/bufferiszero.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index c356415..2c5801b 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -38,6 +38,8 @@ static bool NAME(const void *buf, size_t len)                   \
     do {                                                        \
         const VECTYPE *p = buf;                                 \
         VECTYPE t;                                              \
+        __builtin_prefetch(buf + SIZE);                         \
+        barrier();                                              \
         if (SIZE == sizeof(VECTYPE) * 4) {                      \
             t = (p[0] | p[1]) | (p[2] | p[3]);                  \
         } else if (SIZE == sizeof(VECTYPE) * 8) {               \
@@ -239,6 +241,9 @@ bool buffer_is_zero(const void *buf, size_t len)
         return true;
     }
 
+    /* Fetch the beginning of the buffer while we select the accelerator.  */
+    __builtin_prefetch(buf);
+
     /* Use an optimized zero check if possible.  Note that this also
        includes a check for an unrolled loop over longs, as well as
        the unsized, unaligned fallback to buffer_zero_base.  */
-- 
2.7.4
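
For readers without the tree at hand, here is a rough standalone sketch of
what "prefetching one loop ahead" means in the commit message above. It is
not the code from util/bufferiszero.c. The names sketch_buffer_is_zero and
BLOCK_SIZE are made up for illustration, the 64-byte block is an arbitrary
choice (the patch uses the macro's SIZE argument and makes no cacheline
assumption), and the inline asm stands in for what barrier() is assumed to
be, namely a compiler-only memory barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block size for this sketch; the patch uses the macro's SIZE. */
#define BLOCK_SIZE 64

/*
 * Return true if the first len bytes of buf are all zero.
 * To keep the sketch short, len must be a non-zero multiple of BLOCK_SIZE.
 */
static bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    const unsigned char *end = p + len;

    assert(len != 0 && len % BLOCK_SIZE == 0);

    do {
        uint64_t w[BLOCK_SIZE / sizeof(uint64_t)];
        uint64_t t = 0;
        size_t i;

        /*
         * Hint the block that the next iteration will read.  On the last
         * iteration this is the one-past-the-end address; the prefetch is
         * only a hint and does not fault.
         */
        __builtin_prefetch(p + BLOCK_SIZE);

        /* Assumed equivalent of barrier(): a compiler-only barrier that
           keeps the hint ahead of the loads below. */
        __asm__ __volatile__("" ::: "memory");

        /* Copy out the current block and OR all of its words together. */
        memcpy(w, p, BLOCK_SIZE);
        for (i = 0; i < BLOCK_SIZE / sizeof(uint64_t); i++) {
            t |= w[i];
        }
        if (t != 0) {
            return false;
        }
        p += BLOCK_SIZE;
    } while (p < end);

    return true;
}
```

The prefetch distance is simply the loop stride, so the hint scales with
whatever block size the caller picks rather than with a measured cacheline
size, which is the trade-off the commit message describes.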