From: Jisheng Zhang <jszhang@kernel.org>
To: Paul Walmsley <pjw@kernel.org>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Alexandre Ghiti <alex@ghiti.fr>
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH 3/3] riscv: word-at-a-time: improve find_zero() for Zbb
Date: Tue, 13 Jan 2026 20:24:57 +0800
Message-ID: <20260113122457.27507-4-jszhang@kernel.org>
In-Reply-To: <20260113122457.27507-1-jszhang@kernel.org>
In commit f915a3e5b018 ("arm64: word-at-a-time: improve byte count
calculations for LE"), Linus improved find_zero() for arm64 LE.
Apply the same optimization to RISC-V with Zbb: "do __ffs() on the
intermediate value that found whether there is a zero byte, before we've
actually computed the final byte mask.", so that we get the same benefits:
"The difference between the old and the new implementation is that
"count_zero()" ends up scheduling better because it is being done on a
value that is available earlier (before the final mask).
But more importantly, it can be implemented without the insane semantics
of the standard bit finding helpers that have the off-by-one issue and
have to special-case the zero mask situation."
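For illustration only (not part of the patch), here is a minimal userspace
sketch of the two schemes. It assumes a 64-bit little-endian host and uses
the GCC/Clang builtins __builtin_clzll() and __builtin_ctzll() as stand-ins
for the kernel's __fls() and __ffs(). The names has_zero_bits(),
find_zero_via_mask(), find_zero_via_ctz() and check_equivalence() are
hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/*
 * Intermediate value: nonzero iff 'word' contains a zero byte. Its lowest
 * set bit is bit 7 of the first (lowest-addressed on LE) zero byte.
 */
static inline uint64_t has_zero_bits(uint64_t word)
{
	return (word - ONES) & ~word & HIGHS;
}

/* Old scheme: build the byte mask first, then count with an fls-style op. */
static inline unsigned int find_zero_via_mask(uint64_t bits)
{
	uint64_t mask = ((bits - 1) & ~bits) >> 7;

	/* clz is undefined for 0, hence the special case. */
	return mask ? (unsigned int)(64 - __builtin_clzll(mask)) >> 3 : 0;
}

/* New scheme: count trailing zeros on the intermediate value itself. */
static inline unsigned int find_zero_via_ctz(uint64_t bits)
{
	/* Caller guarantees bits != 0, i.e. a zero byte was found. */
	return (unsigned int)__builtin_ctzll(bits) >> 3;
}

/* Both schemes must agree whenever a zero byte is present. */
static inline void check_equivalence(uint64_t word)
{
	uint64_t bits = has_zero_bits(word);

	if (bits)
		assert(find_zero_via_mask(bits) == find_zero_via_ctz(bits));
}
```

The ctz variant needs neither the "+ 1" adjustment nor the zero-mask special
case, because it is only ever called once a zero byte has been detected.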
Before the patch:

0000000000000000 <find_zero>:
   0:   c909            beqz    a0,12 <.L1>
   2:   60051793        clz     a5,a0
   6:   03f00513        li      a0,63
   a:   8d1d            sub     a0,a0,a5
   c:   2505            addiw   a0,a0,1
   e:   4035551b        sraiw   a0,a0,0x3

0000000000000012 <.L1>:
  12:   8082            ret

After the patch:

0000000000000000 <find_zero>:
   0:   60151513        ctz     a0,a0
   4:   810d            srli    a0,a0,0x3
   6:   8082            ret

7 instructions vs 3 instructions!
As can be seen, on RV64 with Zbb the new "find_zero()" reduces to a
"ctz" plus a right shift, whose result then goes straight into the
"add to final length".
I have no hardware platform that supports Zbb, so I cannot provide
performance numbers for this patch; it has only been built and tested
on QEMU.
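To show where the "add to final length" step mentioned above comes from,
here is a rough userspace sketch of a word-at-a-time length loop. It is an
illustration under stated assumptions, not kernel code: wordwise_strlen()
is a hypothetical name, the host is assumed to be 64-bit little-endian, the
buffer size is assumed to be a multiple of 8 bytes so whole-word loads never
overrun it, and __builtin_ctzll() (GCC/Clang) stands in for __ffs().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/*
 * Hypothetical helper: length of a NUL-terminated string held in a buffer
 * whose size is a multiple of 8 bytes. Assumes a little-endian host.
 */
static size_t wordwise_strlen(const char *buf)
{
	size_t len = 0;

	for (;;) {
		uint64_t word, bits;

		/* memcpy() keeps the load free of alignment/aliasing issues. */
		memcpy(&word, buf + len, sizeof(word));

		/* Nonzero iff this word contains a zero byte. */
		bits = (word - ONES) & ~word & HIGHS;
		if (bits)
			/* Byte index of the first zero byte, added to the length. */
			return len + ((size_t)__builtin_ctzll(bits) >> 3);

		len += sizeof(word);
	}
}
```

The only work left after detecting the zero byte is the ctz, the shift and
the addition in the return statement.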
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
arch/riscv/include/asm/word-at-a-time.h | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/include/asm/word-at-a-time.h b/arch/riscv/include/asm/word-at-a-time.h
index ca3d30741ed1..8c5ac6a72f7f 100644
--- a/arch/riscv/include/asm/word-at-a-time.h
+++ b/arch/riscv/include/asm/word-at-a-time.h
@@ -38,6 +38,9 @@ static inline unsigned long prep_zero_mask(unsigned long val,
 static inline unsigned long create_zero_mask(unsigned long bits)
 {
+	if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBB))
+		return bits;
+
 	bits = (bits - 1) & ~bits;
 	return bits >> 7;
 }
@@ -69,13 +72,19 @@ static inline long count_masked_bytes(long mask)
 static inline unsigned long find_zero(unsigned long mask)
 {
 	if (riscv_has_extension_likely(RISCV_ISA_EXT_ZBB))
-		return !mask ? 0 : ((__fls(mask) + 1) >> 3);
+		return __ffs(mask) >> 3;
 	return count_masked_bytes(mask);
 }
-/* The mask we created is directly usable as a bytemask */
-#define zero_bytemask(mask) (mask)
+static inline unsigned long zero_bytemask(unsigned long bits)
+{
+	if (!riscv_has_extension_likely(RISCV_ISA_EXT_ZBB))
+		return bits;
+
+	bits = (bits - 1) & ~bits;
+	return bits >> 7;
+}
 #endif /* !(defined(CONFIG_RISCV_ISA_ZBB) && defined(CONFIG_TOOLCHAIN_HAS_ZBB)) */
--
2.51.0