public inbox for linux-riscv@lists.infradead.org
From: David Laight <david.laight.linux@gmail.com>
To: Paul Walmsley <pjw@kernel.org>
Cc: Feng Jiang <jiangfeng@kylinos.cn>,
	palmer@dabbelt.com, aou@eecs.berkeley.edu, alex@ghiti.fr,
	samuel.holland@sifive.com, charlie@rivosinc.com,
	conor.dooley@microchip.com, linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] riscv: lib: optimize strlen loop efficiency
Date: Wed, 28 Jan 2026 18:59:04 +0000
Message-ID: <20260128185904.5ec5c24e@pumpkin>
In-Reply-To: <20260115184619.574f1b36@pumpkin>

On Thu, 15 Jan 2026 18:46:19 +0000
David Laight <david.laight.linux@gmail.com> wrote:

... 
> While I suspect the per-byte cost is 'two bytes/clock' on x86-64
> the fixed cost may move the break-even point above the length of the
> average strlen() in the kernel.
> Of course, x86 probably falls back to 'rep scasb' at (maybe)
> (40 + 2n) clocks for 'n' bytes.
> A carefully written slightly unrolled asm loop might manage one
> byte per clock!
> I could spend weeks benchmarking different versions.

I've spent a quick half-hour...

On my zen-5 in userspace:

glibc's strlen() shows the same fixed cost (50 clocks including overhead)
for sizes below (about) 100 bytes; for big buffers add 1 clock per ~50 bytes.
It must be using some simd instructions.

A simple:
	len = 0; while (s[len]) len++; return len;
loop is about 1 byte/clock, overhead ~25 clocks (probably mostly the one 'rdpmc'
instruction).
(Needs a barrier() to stop gcc converting it to a libc call.)
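
A minimal userspace sketch of that loop, assuming gcc or clang. The
barrier() macro below is a stand-in for the kernel macro of the same name,
its placement inside the loop is only one possible choice, and
byte_strlen() is a made-up name:

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's barrier(): an empty asm statement
 * with a "memory" clobber (gcc/clang extension). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical name; scans one byte per iteration. */
size_t byte_strlen(const char *s)
{
	size_t len = 0;

	while (s[len]) {
		/* The asm statement in the loop body stops the compiler
		 * recognising the strlen() idiom and emitting a libc call. */
		barrier();
		len++;
	}
	return len;
}
```

Building with -fno-builtin is another way that may keep the loop as written.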

Unrolling the loop once:
	for (len = 0; s[len]; len += 2)
		if (!s[len + 1]) return len + 1;
	return len;
actually runs twice as fast - so 2 bytes/clock.
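
Written out as a complete function (a sketch only; unrolled2_strlen() is a
made-up name and the barrier() is the same userspace stand-in assumed for
the simple loop):

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's barrier() (gcc/clang extension). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical name; tests two bytes per loop iteration. */
size_t unrolled2_strlen(const char *s)
{
	size_t len;

	for (len = 0; s[len]; len += 2) {
		barrier();
		/* s[len] was non-zero, so s[len + 1] is at worst the
		 * terminator; the loop never reads beyond it. */
		if (!s[len + 1])
			return len + 1;
	}
	/* The terminator was found at an even offset. */
	return len;
}
```

Odd lengths leave through the inner return and even lengths through the
loop condition, so both parities are covered.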

Unrolling 4 times doesn't help, and it suddenly goes somewhat slower somewhere
between 128 and 256 bytes (to 1.5 bytes/clock).

The C 'longs' loop has an overhead of ~45 clocks and does 6 bytes/clock.
So the 'longs' loop is better for buffers longer than 64 bytes.
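
For reference, a sketch of what a word-at-a-time loop of this kind
typically looks like. This is not necessarily the code that was measured:
longs_strlen(), ONES and HIGHS are names chosen here, and the zero-byte
test is the usual (v - 0x01..01) & ~v & 0x80..80 trick:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  ((unsigned long)-1 / 0xff)  /* 0x0101...01 */
#define HIGHS (ONES * 0x80)               /* 0x8080...80 */

/* Hypothetical name; scans sizeof(long) bytes per iteration. */
size_t longs_strlen(const char *s)
{
	const char *p = s;
	unsigned long v;

	/* Step byte-wise until p is aligned to sizeof(long). */
	while ((uintptr_t)p % sizeof(v)) {
		if (!*p)
			return (size_t)(p - s);
		p++;
	}

	for (;;) {
		/* Aligned word load. It can read a few bytes past the
		 * terminator, which strict ISO C does not allow, but an
		 * aligned word never crosses a page boundary. */
		memcpy(&v, p, sizeof(v));
		/* Non-zero if and only if some byte of v is zero. */
		if ((v - ONES) & ~v & HIGHS)
			break;
		p += sizeof(v);
	}

	/* Locate the zero byte inside the final word (endian-neutral). */
	while (*p)
		p++;
	return (size_t)(p - s);
}
```

The byte-wise tail keeps the sketch independent of byte order; a tuned
version would normally find the byte with a bit-scan instead.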

The 'elephant in the room' is 'repne scasb'.
The fixed cost is some 150 clocks and the per-byte cost is 3 clocks/byte.
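
Putting the figures above into a simple linear model (fixed overhead plus
a per-byte cost) makes the break-even points explicit. This is only
arithmetic on the rough zen-5 numbers in this mail, not a measurement. The
names are made up, and the overhead of the unrolled-by-two loop is assumed
to match the ~25 clocks of the simple loop, which was not measured
separately:

```c
/* Linear cost model: clocks = fixed + n * per_byte. */
struct cost {
	const char *name;
	double fixed;     /* clocks of fixed overhead */
	double per_byte;  /* clocks per byte scanned */
};

/* Rough zen-5 figures quoted in this mail. */
const struct cost byte_loop      = { "byte loop",       25.0, 1.0 };
/* Assumption: same ~25 clock overhead as the byte loop. */
const struct cost unrolled2_loop = { "unrolled by 2",   25.0, 0.5 };
const struct cost longs_loop     = { "C 'longs' loop",  45.0, 1.0 / 6.0 };
const struct cost repne_scasb    = { "repne scasb",    150.0, 3.0 };

/* Estimated clocks for a string of n bytes. */
double est_clocks(const struct cost *c, double n)
{
	return c->fixed + n * c->per_byte;
}

/* Length above which 'fast' beats 'slow'. Only meaningful when 'fast'
 * has the lower per-byte cost; returns -1 otherwise. */
double break_even(const struct cost *slow, const struct cost *fast)
{
	double d = slow->per_byte - fast->per_byte;

	if (d <= 0.0)
		return -1.0;
	return (fast->fixed - slow->fixed) / d;
}
```

With these inputs break_even(&unrolled2_loop, &longs_loop) works out to 60
bytes, in line with the '64 bytes' estimate, and
break_even(&byte_loop, &longs_loop) to 24 bytes. 'repne scasb' has both the
highest fixed cost and the highest per-byte cost, so it never wins.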

I don't think any of the Intel cpus I have will do a 'one clock loop'.
I certainly failed to get one in the past when there was a data-dependency
between the iterations.

But I don't have anything modern (newest is an i7-7xxx) and I don't have
any old amd ones.
I need to get a zen-1 (or 1a) and one of the Intel systems that should be
cheap because they won't run win-11.

	David

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
