From: David Laight <david.laight.linux@gmail.com>
To: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Feng Jiang <jiangfeng@kylinos.cn>,
Andy Shevchenko <andriy.shevchenko@intel.com>,
pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu,
alex@ghiti.fr, akpm@linux-foundation.org, kees@kernel.org,
andy@kernel.org, ebiggers@kernel.org, martin.petersen@oracle.com,
ardb@kernel.org, charlie@rivosinc.com,
conor.dooley@microchip.com, ajones@ventanamicro.com,
linus.walleij@linaro.org, nathan@kernel.org,
linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
linux-hardening@vger.kernel.org
Subject: Re: [PATCH v3 0/8] riscv: optimize string functions and add kunit tests
Date: Wed, 21 Jan 2026 10:57:17 +0000
Message-ID: <20260121105717.04853c5d@pumpkin>
In-Reply-To: <CAHp75Ve_pyVm430FL=aEAw1Cnf-92T3Y23qh7pEOaMYMp9iyvw@mail.gmail.com>
On Wed, 21 Jan 2026 09:01:29 +0200
Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
...
> I understand that. My point is if we move the generic implementation
> to use word-at-a-time technique the difference should not go 4x,
> right? Perhaps 1.5x or so. I believe this will be a very useful
> exercise.
I posted a version earlier.
After the initial setup (aligning the base address and loading
some constants) the loop on x86-64 is 7 instructions (it should be
similar for other architectures).
I think it will execute in 4 clocks.
You then need to find the byte in the word, easy enough on LE with
a fast ffs() - but harder otherwise.
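
For readers following along, here is a minimal sketch of the word-at-a-time
idea. It is not the version posted earlier in the thread; the name
wat_strlen() and the constants are made up for illustration. It assumes a
64-bit little-endian machine, the GCC/Clang builtin __builtin_ctzll(), and
that reading the rest of an aligned word past the terminator is harmless
(true in practice because an aligned word never crosses a page, but not
something ISO C guarantees).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Hypothetical helper: word-at-a-time strlen, little-endian only. */
static size_t wat_strlen(const char *s)
{
	const char *p = s;
	uint64_t w, m;

	/* Setup: byte loop until p is aligned to the word size. */
	while ((uintptr_t)p % sizeof(w)) {
		if (!*p)
			return p - s;
		p++;
	}

	/* Main loop: one aligned load and one zero-byte test per word. */
	for (;;) {
		memcpy(&w, p, sizeof(w));
		/* Non-zero iff some byte of w is zero. */
		m = (w - ONES) & ~w & HIGHS;
		if (m)
			break;
		p += sizeof(w);
	}

	/* On LE the lowest set bit of m marks the first zero byte. */
	return (p - s) + (__builtin_ctzll(m) >> 3);
}
```

On big-endian the first zero byte is the most significant one, and the
borrow in (w - ONES) can set marker bits above a real zero byte, so a
plain count-leading-zeros on m is not enough. That is the 'harder
otherwise' part.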
The real problem is the cost for short strings.
Like memcpy() you need a hint from the source of the 'expected' length
(as a compile-time constant) to compile-time select the algorithm.
OTOH:
	do {
		if (!ptr[0]) return ptr - start;
		ptr += 2;
	} while (ptr[-1]);
	return ptr - start - 1;
has two 'load+compare+branch' and one add per loop.
On x86 that might all overlap and give you a two-clock loop
that checks one byte every clock - faster than 'rep scasb'.
(You can get a two clock loop, but not a 1 clock loop.)
I think unrolling further will make little/no difference.
The break-even for the word-at-a-time version is probably at least 64
characters.
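
As a rough illustration of the 'expected length' hint, the sketch below
wraps the two-bytes-per-iteration loop quoted above in a function and
selects between it and a long-string routine at compile time. The names
strlen_unrolled2(), strlen_hint() and STRLEN_BREAK_EVEN are hypothetical,
the value 64 is only the guess made above, and libc strlen() is used
purely as a stand-in for a word-at-a-time implementation. It relies on
the GCC/Clang builtin __builtin_constant_p().

```c
#include <stddef.h>
#include <string.h>

/* Byte loop unrolled once: two load+compare+branch and one add per pass. */
static size_t strlen_unrolled2(const char *start)
{
	const char *ptr = start;

	do {
		if (!ptr[0])
			return ptr - start;
		ptr += 2;
	} while (ptr[-1]);
	return ptr - start - 1;
}

/* Assumed break-even length; would need measuring per architecture. */
#define STRLEN_BREAK_EVEN 64

/*
 * Hypothetical wrapper: if the caller supplies a compile-time constant
 * expected length at or above the break-even point, use the long-string
 * routine (strlen() here as a placeholder), otherwise the byte loop.
 */
#define strlen_hint(s, expected)					\
	(__builtin_constant_p(expected) &&				\
	 (expected) >= STRLEN_BREAK_EVEN				\
		? strlen(s) : strlen_unrolled2(s))
```

A caller that knows it mostly sees short names would write something
like strlen_hint(name, 16), and one scanning large buffers
strlen_hint(buf, 4096); with a constant hint the compiler drops the
unused branch.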
David