qemu-devel.nongnu.org archive mirror
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: Alexander Monakov <amonakov@ispras.ru>,
	Mikhail Romanov <mmromanov@ispras.ru>
Subject: [PULL 31/35] util/bufferiszero: Remove useless prefetches
Date: Mon,  8 Apr 2024 07:49:25 -1000
Message-ID: <20240408174929.862917-32-richard.henderson@linaro.org>
In-Reply-To: <20240408174929.862917-1-richard.henderson@linaro.org>

From: Alexander Monakov <amonakov@ispras.ru>

Use of prefetching in bufferiszero.c is quite questionable:

- prefetches are issued just a few CPU cycles before the corresponding
  line would be hit by demand loads;

- they are done for simple access patterns, i.e. where hardware
  prefetchers can perform better;

- they compete for load ports in loops that should be limited by load
  port throughput rather than ALU throughput.

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-5-amonakov@ispras.ru>
---
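For readers who want to experiment with the reasoning in the commit
message, here is a minimal sketch of a sequential zero check that uses
only demand loads and no __builtin_prefetch(). It is not the code in
util/bufferiszero.c: the name sketch_buffer_is_zero and the 64-byte
block size are assumptions chosen for illustration. The scan is a plain
forward walk, the kind of simple access pattern the commit message says
hardware prefetchers handle well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not QEMU's buffer_is_zero_integer(): scan a
 * buffer front to back in 64-byte blocks using only demand loads.
 */
static bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    const unsigned char *end;

    if (len == 0) {
        return true;
    }
    assert(buf != NULL);
    end = p + len;

    /* Full 64-byte blocks: eight 8-byte words OR-ed together. */
    while ((size_t)(end - p) >= 64) {
        uint64_t w[8];
        uint64_t t = 0;

        memcpy(w, p, sizeof(w));    /* alignment-safe load */
        for (int i = 0; i < 8; i++) {
            t |= w[i];
        }
        if (t) {
            return false;
        }
        p += 64;
    }

    /* Tail of fewer than 64 bytes, checked one byte at a time. */
    while (p < end) {
        if (*p++) {
            return false;
        }
    }
    return true;
}
```

To compare against a prefetching variant, one could add
__builtin_prefetch(p + 64) at the top of the block loop and time both
versions on large buffers. Results will depend on the CPU, so any such
measurement is only indicative.
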
 util/bufferiszero.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index 972f394cbd..00118d649e 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -50,7 +50,6 @@ static bool buffer_is_zero_integer(const void *buf, size_t len)
         const uint64_t *e = (uint64_t *)(((uintptr_t)buf + len) & -8);
 
         for (; p + 8 <= e; p += 8) {
-            __builtin_prefetch(p + 8);
             if (t) {
                 return false;
             }
@@ -80,7 +79,6 @@ buffer_zero_sse2(const void *buf, size_t len)
 
     /* Loop over 16-byte aligned blocks of 64.  */
     while (likely(p <= e)) {
-        __builtin_prefetch(p);
         t = _mm_cmpeq_epi8(t, zero);
         if (unlikely(_mm_movemask_epi8(t) != 0xFFFF)) {
             return false;
@@ -111,7 +109,6 @@ buffer_zero_avx2(const void *buf, size_t len)
 
     /* Loop over 32-byte aligned blocks of 128.  */
     while (p <= e) {
-        __builtin_prefetch(p);
         if (unlikely(!_mm256_testz_si256(t, t))) {
             return false;
         }
-- 
2.34.1
