From: Richard Henderson <richard.henderson@linaro.org>
To: Alexander Monakov <amonakov@ispras.ru>, qemu-devel@nongnu.org
Cc: Mikhail Romanov <mmromanov@ispras.ru>,
    Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH v3 6/6] util/bufferiszero: improve scalar variant
Date: Wed, 7 Feb 2024 08:34:38 +1000
Message-ID: <0b532b64-296a-43a6-bec9-6450eb411a65@linaro.org>
In-Reply-To: <20240206204809.9859-7-amonakov@ispras.ru>
On 2/7/24 06:48, Alexander Monakov wrote:
> - /* Otherwise, use the unaligned memory access functions to
> - handle the beginning and end of the buffer, with a couple
> + /* Use unaligned memory access functions to handle
> + the beginning and end of the buffer, with a couple
> of loops handling the middle aligned section. */
> - uint64_t t = ldq_he_p(buf);
> - const uint64_t *p = (uint64_t *)(((uintptr_t)buf + 8) & -8);
> - const uint64_t *e = (uint64_t *)(((uintptr_t)buf + len) & -8);
> + uint64_t t = ldq_he_p(buf) | ldq_he_p(buf + len - 8);
> + typedef uint64_t uint64_a __attribute__((may_alias));
> + const uint64_a *p = (void *)(((uintptr_t)buf + 8) & -8);
> + const uint64_a *e = (void *)(((uintptr_t)buf + len - 1) & -8);
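For readers without the full patch at hand, the idea in the quoted hunk can be sketched as a standalone function. This is a minimal illustration, not the patch itself: `load_u64` is a hypothetical stand-in for `ldq_he_p`, the name `is_zero_scalar_sketch` and the simple loop over the middle are assumptions, the caller is assumed to guarantee len >= 8, and `may_alias` is a GCC/Clang extension.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ldq_he_p(): host-endian 64-bit load
   from an address that may be unaligned. */
static inline uint64_t load_u64(const void *ptr)
{
    uint64_t v;
    memcpy(&v, ptr, sizeof(v));
    return v;
}

/* Sketch only; the caller must guarantee len >= 8. */
static bool is_zero_scalar_sketch(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    typedef uint64_t uint64_a __attribute__((may_alias));

    /* Two possibly unaligned loads cover the first and last 8 bytes. */
    uint64_t t = load_u64(b) | load_u64(b + len - 8);

    /* p: first 8-byte boundary strictly above b.
       e: last 8-byte boundary at or below the final byte. */
    const uint64_a *p = (const void *)(((uintptr_t)b + 8) & -(uintptr_t)8);
    const uint64_a *e = (const void *)(((uintptr_t)b + len - 1) & -(uintptr_t)8);

    /* Aligned loads over the middle; they may overlap the head and tail. */
    for (; p < e; p++) {
        t |= *p;
    }
    return t == 0;
}
```

Because the head and tail loads overlap the aligned range, no byte-wise handling of the edges is needed; the cost is two unaligned loads per call.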
You appear to be optimizing this routine for x86, which is not the primary consumer.
This is going to perform very poorly on hosts that do not support unaligned accesses (e.g.
Sparc and some RISC-V).
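To illustrate the concern: on a strict-alignment host, a 64-bit load from an arbitrary address cannot be issued as a single load instruction, so the value has to be built from narrower accesses. A rough sketch of the equivalent work follows; `load_u64_bytewise` is a hypothetical helper, the little-endian byte order is chosen only for illustration, and actual code generation depends on the compiler and target.

```c
#include <stdint.h>

/* Hypothetical helper: assemble a 64-bit value one byte at a time,
   roughly the work needed when unaligned loads are unavailable.
   Byte order does not matter for a zero test; little-endian order
   is used here only to keep the example concrete. */
static inline uint64_t load_u64_bytewise(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}
```

Each such load costs eight byte loads plus shifts and ORs, compared with a single instruction on hosts that handle unaligned accesses in hardware.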
r~