From: Andy Shevchenko <andriy.shevchenko@intel.com>
To: Feng Jiang <jiangfeng@kylinos.cn>
Cc: pjw@kernel.org, palmer@dabbelt.com, aou@eecs.berkeley.edu,
alex@ghiti.fr, kees@kernel.org, andy@kernel.org,
akpm@linux-foundation.org, ebiggers@kernel.org,
martin.petersen@oracle.com, ardb@kernel.org,
ajones@ventanamicro.com, conor.dooley@microchip.com,
samuel.holland@sifive.com, linus.walleij@linaro.org,
nathan@kernel.org, linux-riscv@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org
Subject: Re: [PATCH v2 08/14] lib/string_kunit: add performance benchmark for strlen()
Date: Wed, 14 Jan 2026 09:21:00 +0200
Message-ID: <aWdD3N_jwnt_ncc1@smile.fi.intel.com>
In-Reply-To: <a58e97ad-a69e-498d-9382-2be4914569b0@kylinos.cn>
On Wed, Jan 14, 2026 at 03:04:58PM +0800, Feng Jiang wrote:
> On 2026/1/14 14:14, Feng Jiang wrote:
> > On 2026/1/13 16:46, Andy Shevchenko wrote:
...
> > Thank you for the catch. You are absolutely correct—the 2500x figure is heavily
> > distorted and does not reflect real-world performance.
> >
> > I've found that by using a volatile function pointer to call the implementations
> > (instead of direct calls), the results returned to a realistic range. It appears
> > the previous benchmark logic allowed the compiler to over-optimize the test loop
> > in ways that skewed the data.
> >
> > I will refactor the benchmark logic in v3, specifically referencing the crc32
> > KUnit implementation (e.g., using warm-up loops and adding preempt_disable()
> > to eliminate context-switch interference) to ensure the data is robust and accurate.
> >
>
> Just a quick follow-up: I've also verified that using a volatile variable to store
> the return value (as seen in crc_benchmark()) is equally effective at preventing
> the optimization.
>
> The core change is as follows:
>
> 	volatile size_t len;
> 	...
> 	for (unsigned int j = 0; j < iters; j++) {
> 		OPTIMIZER_HIDE_VAR(buf);
> 		len = strlen(buf);

But please make sure that this really is the Linux kernel generic implementation
(before) and not __builtin_strlen() from GCC. (OTOH, it would be nice to benchmark
that one as well, although I think that __builtin_strlen() may in general be a
slightly better choice than the Linux kernel generic implementation.) I.o.w. be
sure *what* you test.

> 	}

Or use WRITE_ONCE() :-) But that one would probably be confusing, as it usually
should be paired with READ_ONCE() somewhere else in the code. So, I agree with
the crc_benchmark() approach taken.

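To make the two points above concrete, here is a minimal userspace sketch (not the patch code) of a benchmark loop that stores the result in a volatile variable and calls an explicitly named implementation through a function pointer, so it is clear which strlen() is being measured. Everything in it is an assumption made for illustration: HIDE_VAR() is a stand-in for the kernel's OPTIMIZER_HIDE_VAR(), clock() stands in for the kernel timing helpers, and generic_strlen_sketch() and bench_strlen() are hypothetical names. It assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Userspace stand-in for OPTIMIZER_HIDE_VAR(): the compiler must assume
 * the variable may have changed, so it cannot hoist the call out of the loop. */
#define HIDE_VAR(var) __asm__ ("" : "+r" (var))

/* Hypothetical byte-at-a-time strlen. noinline keeps the call from being
 * folded into the caller. Note: depending on compiler and flags, even this
 * loop may be turned into a library strlen() call, so checking the
 * disassembly is the only way to be sure what is actually measured. */
__attribute__((noinline))
size_t generic_strlen_sketch(const char *s)
{
	const char *p = s;

	while (*p)
		p++;
	return (size_t)(p - s);
}

/* Time 'iters' calls of 'fn' on 'buf'; returns elapsed CPU time in ns
 * and reports the last computed length through 'last_len'. */
uint64_t bench_strlen(size_t (*fn)(const char *), const char *buf,
		      unsigned int iters, size_t *last_len)
{
	volatile size_t len = 0;	/* volatile sink: every store must happen */
	clock_t t0, t1;

	assert(fn && buf && last_len);

	t0 = clock();
	for (unsigned int j = 0; j < iters; j++) {
		HIDE_VAR(buf);
		len = fn(buf);
	}
	t1 = clock();

	*last_len = len;
	return (uint64_t)(t1 - t0) * (1000000000ull / CLOCKS_PER_SEC);
}
```

Passing the implementation as a pointer means the same loop can time the generic version, an arch-optimized one, or a thin wrapper around __builtin_strlen(), which makes it explicit at the call site which of them a given number belongs to.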
> Preliminary results with this change look much more reasonable:
>
> ok 4 string_test_strlen
> # string_test_strlen_bench: strlen performance (short, len: 8, iters: 100000):
> # string_test_strlen_bench: arch-optimized: 4767500 ns
> # string_test_strlen_bench: generic C: 5815800 ns
> # string_test_strlen_bench: speedup: 1.21x
> # string_test_strlen_bench: strlen performance (medium, len: 64, iters: 100000):
> # string_test_strlen_bench: arch-optimized: 6573600 ns
> # string_test_strlen_bench: generic C: 16342500 ns
> # string_test_strlen_bench: speedup: 2.48x
> # string_test_strlen_bench: strlen performance (long, len: 2048, iters: 10000):
> # string_test_strlen_bench: arch-optimized: 7931000 ns
> # string_test_strlen_bench: generic C: 35347300 ns
> # string_test_strlen_bench: speedup: 4.45x
> ok 5 string_test_strlen_bench
>
> I will adopt this pattern in v3, along with cache warm-up and preempt_disable(),
> to stay consistent with existing kernel benchmarks and ensure robust measurements.
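
As a cross-check of the figures quoted above: the reported speedups match a truncating integer division of the generic time by the arch-optimized time, scaled by 100. The sketch below only reproduces that arithmetic; whether the patch computes the value this way is an assumption, and speedup_x100() and print_speedup() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: generic/arch ratio scaled by 100, truncated
 * (integer-only, since floating point is not available in kernel code). */
uint64_t speedup_x100(uint64_t generic_ns, uint64_t arch_ns)
{
	assert(arch_ns != 0);
	return generic_ns * 100 / arch_ns;
}

/* Print the ratio with two decimal places, e.g. "speedup: 2.48x". */
void print_speedup(uint64_t generic_ns, uint64_t arch_ns)
{
	uint64_t s = speedup_x100(generic_ns, arch_ns);

	printf("speedup: %llu.%02llux\n",
	       (unsigned long long)(s / 100),
	       (unsigned long long)(s % 100));
}
```

With the numbers above, 5815800 / 4767500 is about 1.2199, which truncates to 1.21 rather than rounding to 1.22. In kernel code a 64-bit division like this would typically go through a helper such as div64_u64() rather than the plain `/` operator.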
--
With Best Regards,
Andy Shevchenko