From: Andy Shevchenko <andriy.shevchenko@intel.com>
To: Feng Jiang <jiangfeng@kylinos.cn>
Cc: pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu,
	alex@ghiti.fr, akpm@linux-foundation.org, kees@kernel.org,
	andy@kernel.org, ebiggers@kernel.org, martin.petersen@oracle.com,
	ardb@kernel.org, charlie@rivosinc.com,
	conor.dooley@microchip.com, ajones@ventanamicro.com,
	linus.walleij@linaro.org, nathan@kernel.org,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-hardening@vger.kernel.org
Subject: Re: [PATCH v3 0/8] riscv: optimize string functions and add kunit tests
Date: Tue, 20 Jan 2026 09:36:36 +0200
Message-ID: <aW8whGcAR0x6FSRJ@smile.fi.intel.com>
In-Reply-To: <20260120065852.166857-1-jiangfeng@kylinos.cn>

On Tue, Jan 20, 2026 at 02:58:44PM +0800, Feng Jiang wrote:
> This series provides optimized implementations of strnlen(), strchr(),
> and strrchr() for the RISC-V architecture. The strnlen implementation
> is derived from the existing optimized strlen. For strchr and strrchr,

strchr() and strrchr()

> the current versions use simple byte-by-byte assembly logic, which
> will serve as a baseline for future Zbb-based optimizations.
> 
> The patch series is organized into three parts:
> 1. Correctness Testing: The first three patches add KUnit test cases
>    for strlen, strnlen, and strrchr to ensure the baseline and optimized

strlen(), strnlen(), and strrchr()

>    versions are functionally correct.
> 2. Benchmarking Tool: Patches 4 and 5 extend string_kunit to include
>    performance measurement capabilities, allowing for comparative
>    analysis within the KUnit environment.
> 3. Architectural Optimizations: The final three patches introduce the
>    RISC-V specific assembly implementations.
> 
> Following suggestions from Andy Shevchenko, performance benchmarks have
> been added to string_kunit.c to provide quantifiable evidence of the
> improvements. Andy provided many specific comments on the implementation
> of the benchmark logic, which is also inspired by Eric Biggers'
> crc_benchmark(). Performance was measured in a QEMU TCG (rv64) environment,
> comparing the generic C implementation with the new RISC-V assembly versions.
> 
> Performance Summary (Improvement %):
> ---------------------------------------------------------------
> Function  |  16 B (Short) |  512 B (Mid) |  4096 B (Long)
> ---------------------------------------------------------------
> strnlen   |    +64.0%     |   +346.2%    |    +410.7%

This is still suspicious.

> strchr    |    +4.0%      |   +6.4%      |    +1.5%
> strrchr   |    +6.6%      |   +2.8%      |    +0.0%
> ---------------------------------------------------------------
> The benchmarks can be reproduced by enabling CONFIG_STRING_KUNIT_BENCH
> and running: ./tools/testing/kunit/kunit.py run --arch=riscv \
> --cross_compile=riscv64-linux-gnu- --kunitconfig=my_string.kunitconfig \
> --raw_output
> 
> The strnlen implementation leverages the Zbb 'orc.b' instruction and

strnlen()

> word-at-a-time logic, showing significant gains as the string length
> increases.

Hmm... Have you tried optimising the generic implementation to use
word-at-a-time logic and comparing it against the assembly version?
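
For illustration, here is a minimal userspace sketch of what word-at-a-time
logic in plain C could look like. It is neither the kernel's generic strnlen()
nor the code from this series: sketch_strnlen(), has_zero_byte(), ONES and HIGHS
are hypothetical names, and the zero-byte test is the portable bit trick rather
than the Zbb orc.b instruction. It assumes the caller's buffer really is
readable for maxlen bytes, because a word load may touch bytes after the NUL
(but never past s + maxlen).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  ((size_t)-1 / 0xff)	/* 0x01 repeated in every byte */
#define HIGHS (ONES * 0x80)		/* 0x80 repeated in every byte */

/* Nonzero if and only if at least one byte of w is zero. */
static int has_zero_byte(size_t w)
{
	return ((w - ONES) & ~w & HIGHS) != 0;
}

/* Hypothetical sketch: strnlen() scanning one machine word per step. */
size_t sketch_strnlen(const char *s, size_t maxlen)
{
	size_t n = 0;

	/* Head: step by bytes until s + n is word-aligned. */
	while (n < maxlen && ((uintptr_t)(s + n) % sizeof(size_t)) != 0) {
		if (s[n] == '\0')
			return n;
		n++;
	}

	/* Body: test a whole word while a full word is still in bounds. */
	while (maxlen - n >= sizeof(size_t)) {
		size_t w;

		memcpy(&w, s + n, sizeof(w));	/* aligned load, no aliasing issue */
		if (has_zero_byte(w))
			break;
		n += sizeof(w);
	}

	/* Tail: find the NUL inside the flagged word, or finish the rest. */
	while (n < maxlen && s[n] != '\0')
		n++;

	return n;
}
```

Benchmarking something along these lines against the assembly version would
show how much of the strnlen() gain comes from word-at-a-time scanning alone
and how much from orc.b.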

> For strchr and strrchr, the handwritten assembly reduces

strchr() and strrchr()

> fixed overhead by eliminating stack frame management. The gain is most
> prominent on short strings (1-16B) where function call overhead dominates,
> while the performance converges with the C implementation for longer
> strings in the TCG environment.

-- 
With Best Regards,
Andy Shevchenko




