Linux-RISC-V Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Kuan-Wei Chiu <visitorckw@gmail.com>
To: Alexandre Ghiti <alex@ghiti.fr>
Cc: paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, jserv@ccns.ncku.edu.tw,
	eleanor15x@gmail.com, linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [RFC PATCH] riscv: Optimize gcd() performance by selecting CPU_NO_EFFICIENT_FFS
Date: Fri, 4 Apr 2025 21:54:35 +0800	[thread overview]
Message-ID: <Z+/km3h1ZmnJjyId@visitorckw-System-Product-Name> (raw)
In-Reply-To: <61173b04-faea-4dfe-8e82-95a55ee33f3f@ghiti.fr>

+Cc Andrew, since this might touch lib/math/gcd.c

On Fri, Mar 28, 2025 at 03:07:36PM +0100, Alexandre Ghiti wrote:
> Hi Kuan-Wei,
> 
> First sorry for the late review.
> 
> On 17/02/2025 02:37, Kuan-Wei Chiu wrote:
> > When the Zbb extension is not supported, ffs() falls back to a software
> > implementation instead of leveraging the hardware ctz instruction for
> > fast computation. In such cases, selecting CPU_NO_EFFICIENT_FFS
> > optimizes the efficiency of gcd().
> > 
> > The implementation of gcd() depends on the CPU_NO_EFFICIENT_FFS option.
> > With hardware support for ffs, the binary GCD algorithm is used.
> > Without it, the odd-even GCD algorithm is employed for better
> > performance.
> > 
> > Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
> > Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
> > Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> > ---
> > Although selecting NO_EFFICIENT_FFS seems reasonable without ctz
> > instructions, this patch hasn't been tested on real hardware. We'd
> > greatly appreciate it if someone could help test and provide
> > performance numbers!
> > 
> >   arch/riscv/Kconfig | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index 7612c52e9b1e..2dd3699ad09b 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -91,6 +91,7 @@ config RISCV
> >   	select CLINT_TIMER if RISCV_M_MODE
> >   	select CLONE_BACKWARDS
> >   	select COMMON_CLK
> > +	select CPU_NO_EFFICIENT_FFS if !RISCV_ISA_ZBB
> >   	select CPU_PM if CPU_IDLE || HIBERNATION || SUSPEND
> >   	select EDAC_SUPPORT
> >   	select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE)
> 
> 
> So your patch is correct. But a kernel built with RISCV_ISA_ZBB does not
> mean the platform supports zbb and in that case, we'd still use the slow
> version of gcd().
> 
> Then I would use static keys instead, can you try to come up with a patch
> that does that?
> 
Here's my current thought: I'd like to add a static key named
efficient_ffs_key in gcd.c, and possibly call
static_branch_disable(&efficient_ffs_key) somewhere under arch/riscv/
when RISCV_ISA_ZBB is enabled but the Zbb extension is not detected at
runtime.

However, I'm new to the RISC-V kernel code and not sure where would be
the most appropriate place to insert that static_branch_disable() call.
Suggestions are very welcome!

Below is the diff for context.

Regards,
Kuan-Wei

diff --git a/lib/math/gcd.c b/lib/math/gcd.c
index e3b042214d1b..514b8a86b461 100644
--- a/lib/math/gcd.c
+++ b/lib/math/gcd.c
@@ -2,6 +2,7 @@
 #include <linux/kernel.h>
 #include <linux/gcd.h>
 #include <linux/export.h>
+#include <linux/jump_label.h>

 /*
  * This implements the binary GCD algorithm. (Often attributed to Stein,
@@ -11,6 +12,8 @@
  * has decent hardware division.
  */

+DEFINE_STATIC_KEY_TRUE(efficient_ffs_key);
+
 #if !defined(CONFIG_CPU_NO_EFFICIENT_FFS)

 /* If __ffs is available, the even/odd algorithm benchmarks slower. */
@@ -20,7 +23,7 @@
  * @a: first value
  * @b: second value
  */
-unsigned long gcd(unsigned long a, unsigned long b)
+static unsigned long gcd_binary(unsigned long a, unsigned long b)
 {
 	unsigned long r = a | b;

@@ -44,7 +47,7 @@ unsigned long gcd(unsigned long a, unsigned long b)
 	}
 }

-#else
+#endif

 /* If normalization is done by loops, the even/odd algorithm is a win. */
 unsigned long gcd(unsigned long a, unsigned long b)
@@ -54,6 +57,11 @@ unsigned long gcd(unsigned long a, unsigned long b)
 	if (!a || !b)
 		return r;

+#if !defined(CONFIG_CPU_NO_EFFICIENT_FFS)
+	if (static_branch_likely(&efficient_ffs_key))
+		return binary_gcd(a, b);
+#endif
+
 	/* Isolate lsbit of r */
 	r &= -r;

@@ -80,6 +88,4 @@ unsigned long gcd(unsigned long a, unsigned long b)
 	}
 }

-#endif
-
 EXPORT_SYMBOL_GPL(gcd);

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2025-04-04 14:06 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-17  1:37 [RFC PATCH] riscv: Optimize gcd() performance by selecting CPU_NO_EFFICIENT_FFS Kuan-Wei Chiu
2025-03-28 14:07 ` Alexandre Ghiti
2025-04-04 13:54   ` Kuan-Wei Chiu [this message]
2025-04-23  6:57     ` Alexandre Ghiti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z+/km3h1ZmnJjyId@visitorckw-System-Product-Name \
    --to=visitorckw@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex@ghiti.fr \
    --cc=aou@eecs.berkeley.edu \
    --cc=eleanor15x@gmail.com \
    --cc=jserv@ccns.ncku.edu.tw \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox