linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime
@ 2025-06-06 13:47 Kuan-Wei Chiu
  2025-06-06 13:47 ` [PATCH v3 1/3] lib/math/gcd: Use static key to select " Kuan-Wei Chiu
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Kuan-Wei Chiu @ 2025-06-06 13:47 UTC
  To: paul.walmsley, palmer, aou, alex, akpm
  Cc: linux-riscv, linux-kernel, jserv, Kuan-Wei Chiu, Yu-Chun Lin

The current implementation of gcd() selects between the binary GCD and
the odd-even GCD algorithm at compile time, depending on whether
CONFIG_CPU_NO_EFFICIENT_FFS is set. On platforms like RISC-V, however,
this compile-time decision can be misleading: even when the compiler
emits ctz instructions based on the assumption that they are efficient
(as is the case when CONFIG_RISCV_ISA_ZBB is enabled), the actual
hardware may lack support for the Zbb extension. In such cases, ffs()
falls back to a software implementation at runtime, making the binary
GCD algorithm significantly slower than the odd-even variant.

To address this, we introduce a static key to allow runtime selection
between the binary and odd-even GCD implementations. On RISC-V, the
kernel now checks for Zbb support during boot. If Zbb is unavailable,
the static key is disabled so that gcd() consistently uses the more
efficient odd-even algorithm in that scenario. Additionally, to further
reduce code size, we select CONFIG_CPU_NO_EFFICIENT_FFS automatically
when CONFIG_RISCV_ISA_ZBB is not enabled, avoiding compilation of the
unused binary GCD implementation entirely on systems where it would
never be executed.

This series ensures that the most efficient GCD algorithm is used in
practice, chosen according to actual hardware capabilities, and that
the binary GCD code is not compiled at all when the kernel
configuration rules it out.
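
To illustrate why the fallback matters, here is a minimal user-space
sketch of a software count-trailing-zeros next to the compiler builtin.
It is not the kernel's code: the names soft_ctz64() and hw_ctz64() are
hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler
for __builtin_ctzll(). On a CPU with a suitable instruction the builtin
typically compiles to a single operation, while the software version
runs a chain of tests and shifts on every call.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Software fallback (illustrative): binary search for the lowest set bit.
 * The argument must be nonzero.
 */
static unsigned int soft_ctz64(uint64_t word)
{
	unsigned int n = 0;

	assert(word != 0);
	if ((word & 0xffffffffu) == 0) { n += 32; word >>= 32; }
	if ((word & 0xffffu) == 0)     { n += 16; word >>= 16; }
	if ((word & 0xffu) == 0)       { n += 8;  word >>= 8; }
	if ((word & 0xfu) == 0)        { n += 4;  word >>= 4; }
	if ((word & 0x3u) == 0)        { n += 2;  word >>= 2; }
	if ((word & 0x1u) == 0)        { n += 1; }
	return n;
}

/*
 * Compiler builtin: whether this becomes one instruction depends on the
 * target the code is compiled for. The argument must be nonzero.
 */
static unsigned int hw_ctz64(uint64_t word)
{
	assert(word != 0);
	return (unsigned int)__builtin_ctzll(word);
}
```

The binary GCD calls this operation once per loop iteration, so the
per-call cost of the software path is paid many times for one gcd().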

Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>

---
This series has been tested on QEMU to verify that the correct GCD
implementation is used both with and without Zbb support.

v2 -> v3:
- Drop if (!a || !b) check in binary_gcd()
- Move DECLARE_STATIC_KEY_TRUE(efficient_ffs_key) to gcd.h
v1 -> v2:
- Use a static key to select the GCD implementation at runtime.

v2: https://lore.kernel.org/lkml/20250524155519.1142570-1-visitorckw@gmail.com/
v1: https://lore.kernel.org/lkml/20250217013708.1932496-1-visitorckw@gmail.com/

Kuan-Wei Chiu (3):
  lib/math/gcd: Use static key to select implementation at runtime
  riscv: Optimize gcd() code size when CONFIG_RISCV_ISA_ZBB is disabled
  riscv: Optimize gcd() performance on RISC-V without Zbb extension

 arch/riscv/Kconfig        |  1 +
 arch/riscv/kernel/setup.c |  5 +++++
 include/linux/gcd.h       |  3 +++
 lib/math/gcd.c            | 27 +++++++++++++++------------
 4 files changed, 24 insertions(+), 12 deletions(-)

-- 
2.34.1



* [PATCH v3 1/3] lib/math/gcd: Use static key to select implementation at runtime
  2025-06-06 13:47 [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime Kuan-Wei Chiu
@ 2025-06-06 13:47 ` Kuan-Wei Chiu
  2025-06-06 13:47 ` [PATCH v3 2/3] riscv: Optimize gcd() code size when CONFIG_RISCV_ISA_ZBB is disabled Kuan-Wei Chiu
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kuan-Wei Chiu @ 2025-06-06 13:47 UTC
  To: paul.walmsley, palmer, aou, alex, akpm
  Cc: linux-riscv, linux-kernel, jserv, Kuan-Wei Chiu, Yu-Chun Lin

On platforms like RISC-V, the compiler may generate hardware FFS
instructions even if the underlying CPU does not actually support them.
Currently, the GCD implementation is chosen at compile time based on
CONFIG_CPU_NO_EFFICIENT_FFS, which can result in suboptimal behavior on
such systems.

Introduce a static key, efficient_ffs_key, to enable runtime selection
between the binary GCD (using ffs) and the odd-even GCD implementation.
This allows the kernel to default to the faster binary GCD when FFS is
efficient, while retaining the ability to fall back when needed.
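
The selection logic can be sketched in user space as follows. This is an
illustration under stated assumptions, not the patch itself: a plain
bool named efficient_ffs (hypothetical) stands in for the static key,
__builtin_ctzl() stands in for __ffs(), and the *_sketch function names
are invented for this example. A real static key patches the branch in
the instruction stream rather than reading a variable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for efficient_ffs_key; defaults to true. */
static bool efficient_ffs = true;

/* Binary GCD: normalizes with count-trailing-zeros. Inputs must be nonzero. */
static unsigned long binary_gcd_sketch(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	assert(a && b);
	b >>= __builtin_ctzl(b);
	if (b == 1)
		return r & -r;
	for (;;) {
		a >>= __builtin_ctzl(a);
		if (a == 1)
			return r & -r;
		if (a == b)
			return a << __builtin_ctzl(r);
		if (a < b) {
			unsigned long t = a;
			a = b;
			b = t;
		}
		a -= b;
	}
}

/* Odd-even GCD: normalizes with shift loops only. Inputs must be nonzero. */
static unsigned long odd_even_gcd_sketch(unsigned long a, unsigned long b)
{
	unsigned long r = (a | b) & -(a | b);	/* lowest set bit of a | b */

	assert(a && b);
	while (!(b & r))
		b >>= 1;
	if (b == r)
		return r;
	for (;;) {
		while (!(a & r))
			a >>= 1;
		if (a == r)
			return r;
		if (a == b)
			return a;
		if (a < b) {
			unsigned long t = a;
			a = b;
			b = t;
		}
		a -= b;
		a >>= 1;
		if (a & r)
			a += b;
		a >>= 1;
	}
}

/* Entry point: handle zero once, then pick an implementation at runtime. */
unsigned long gcd_sketch(unsigned long a, unsigned long b)
{
	if (!a || !b)
		return a | b;
	if (efficient_ffs)
		return binary_gcd_sketch(a, b);
	return odd_even_gcd_sketch(a, b);
}
```

Because the zero check lives in the entry point, the binary variant can
assume nonzero inputs, which matches the v3 change that drops the
if (!a || !b) check from binary_gcd().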

Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
 include/linux/gcd.h |  3 +++
 lib/math/gcd.c      | 27 +++++++++++++++------------
 2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/include/linux/gcd.h b/include/linux/gcd.h
index cb572677fd7f..616e81a7f7e3 100644
--- a/include/linux/gcd.h
+++ b/include/linux/gcd.h
@@ -3,6 +3,9 @@
 #define _GCD_H
 
 #include <linux/compiler.h>
+#include <linux/jump_label.h>
+
+DECLARE_STATIC_KEY_TRUE(efficient_ffs_key);
 
 unsigned long gcd(unsigned long a, unsigned long b) __attribute_const__;
 
diff --git a/lib/math/gcd.c b/lib/math/gcd.c
index e3b042214d1b..62efca6787ae 100644
--- a/lib/math/gcd.c
+++ b/lib/math/gcd.c
@@ -11,22 +11,16 @@
  * has decent hardware division.
  */
 
+DEFINE_STATIC_KEY_TRUE(efficient_ffs_key);
+
 #if !defined(CONFIG_CPU_NO_EFFICIENT_FFS)
 
 /* If __ffs is available, the even/odd algorithm benchmarks slower. */
 
-/**
- * gcd - calculate and return the greatest common divisor of 2 unsigned longs
- * @a: first value
- * @b: second value
- */
-unsigned long gcd(unsigned long a, unsigned long b)
+static unsigned long binary_gcd(unsigned long a, unsigned long b)
 {
 	unsigned long r = a | b;
 
-	if (!a || !b)
-		return r;
-
 	b >>= __ffs(b);
 	if (b == 1)
 		return r & -r;
@@ -44,9 +38,15 @@ unsigned long gcd(unsigned long a, unsigned long b)
 	}
 }
 
-#else
+#endif
 
 /* If normalization is done by loops, the even/odd algorithm is a win. */
+
+/**
+ * gcd - calculate and return the greatest common divisor of 2 unsigned longs
+ * @a: first value
+ * @b: second value
+ */
 unsigned long gcd(unsigned long a, unsigned long b)
 {
 	unsigned long r = a | b;
@@ -54,6 +54,11 @@ unsigned long gcd(unsigned long a, unsigned long b)
 	if (!a || !b)
 		return r;
 
+#if !defined(CONFIG_CPU_NO_EFFICIENT_FFS)
+	if (static_branch_likely(&efficient_ffs_key))
+		return binary_gcd(a, b);
+#endif
+
 	/* Isolate lsbit of r */
 	r &= -r;
 
@@ -80,6 +85,4 @@ unsigned long gcd(unsigned long a, unsigned long b)
 	}
 }
 
-#endif
-
 EXPORT_SYMBOL_GPL(gcd);
-- 
2.34.1



* [PATCH v3 2/3] riscv: Optimize gcd() code size when CONFIG_RISCV_ISA_ZBB is disabled
  2025-06-06 13:47 [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime Kuan-Wei Chiu
  2025-06-06 13:47 ` [PATCH v3 1/3] lib/math/gcd: Use static key to select " Kuan-Wei Chiu
@ 2025-06-06 13:47 ` Kuan-Wei Chiu
  2025-06-12 12:59   ` Alexandre Ghiti
  2025-06-06 13:47 ` [PATCH v3 3/3] riscv: Optimize gcd() performance on RISC-V without Zbb extension Kuan-Wei Chiu
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Kuan-Wei Chiu @ 2025-06-06 13:47 UTC
  To: paul.walmsley, palmer, aou, alex, akpm
  Cc: linux-riscv, linux-kernel, jserv, Kuan-Wei Chiu, Yu-Chun Lin

The binary GCD implementation depends on efficient ffs(), which on
RISC-V requires hardware support for the Zbb extension. When
CONFIG_RISCV_ISA_ZBB is not enabled, the kernel will never use binary
GCD, as runtime logic will always fall back to the odd-even
implementation.

To avoid compiling unused code and reduce code size, select
CONFIG_CPU_NO_EFFICIENT_FFS when CONFIG_RISCV_ISA_ZBB is not set.

$ ./scripts/bloat-o-meter ./lib/math/gcd.o.old ./lib/math/gcd.o.new
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-274 (-274)
Function                                     old     new   delta
gcd                                          360      86    -274
Total: Before=384, After=110, chg -71.35%

Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
 arch/riscv/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index bbec87b79309..f085adc6f573 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -95,6 +95,7 @@ config RISCV
 	select CLINT_TIMER if RISCV_M_MODE
 	select CLONE_BACKWARDS
 	select COMMON_CLK
+	select CPU_NO_EFFICIENT_FFS if !RISCV_ISA_ZBB
 	select CPU_PM if CPU_IDLE || HIBERNATION || SUSPEND
 	select EDAC_SUPPORT
 	select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE)
-- 
2.34.1



* [PATCH v3 3/3] riscv: Optimize gcd() performance on RISC-V without Zbb extension
  2025-06-06 13:47 [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime Kuan-Wei Chiu
  2025-06-06 13:47 ` [PATCH v3 1/3] lib/math/gcd: Use static key to select " Kuan-Wei Chiu
  2025-06-06 13:47 ` [PATCH v3 2/3] riscv: Optimize gcd() code size when CONFIG_RISCV_ISA_ZBB is disabled Kuan-Wei Chiu
@ 2025-06-06 13:47 ` Kuan-Wei Chiu
  2025-06-12 13:00   ` Alexandre Ghiti
  2025-07-09 15:08 ` [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime Alexandre Ghiti
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Kuan-Wei Chiu @ 2025-06-06 13:47 UTC
  To: paul.walmsley, palmer, aou, alex, akpm
  Cc: linux-riscv, linux-kernel, jserv, Kuan-Wei Chiu, Yu-Chun Lin

The binary GCD implementation uses FFS (find first set), which benefits
from hardware support for the ctz instruction, provided by the Zbb
extension on RISC-V. Without Zbb, FFS falls back to a slower software
implementation.

Previously, RISC-V always used the binary GCD, regardless of actual
hardware support. This patch improves runtime efficiency by disabling
the efficient_ffs_key static branch when Zbb is either not enabled in
the kernel (config) or not supported on the executing CPU. This selects
the odd-even GCD implementation, which is faster in the absence of
efficient FFS.

This change ensures the most suitable GCD algorithm is chosen
dynamically based on actual hardware capabilities.
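
The boot-time decision amounts to a one-way switch, sketched below in
user-space C. All names here are hypothetical: efficient_ffs_enabled
stands in for the static key, and the two bool parameters stand in for
the IS_ENABLED(CONFIG_RISCV_ISA_ZBB) check and the runtime extension
probe.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the static key; it starts out enabled. */
static bool efficient_ffs_enabled = true;

/*
 * Disable the fast path unless Zbb is both compiled in and present on
 * the running CPU. The switch is never turned back on.
 */
static void select_gcd_impl(bool zbb_built_in, bool zbb_on_cpu)
{
	if (!zbb_built_in || !zbb_on_cpu)
		efficient_ffs_enabled = false;
}
```

Defaulting to enabled means architectures that never run such a check
keep using the binary GCD.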

Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---
 arch/riscv/kernel/setup.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index f7c9a1caa83e..785c7104fde7 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -21,6 +21,8 @@
 #include <linux/efi.h>
 #include <linux/crash_dump.h>
 #include <linux/panic_notifier.h>
+#include <linux/jump_label.h>
+#include <linux/gcd.h>
 
 #include <asm/acpi.h>
 #include <asm/alternative.h>
@@ -361,6 +363,9 @@ void __init setup_arch(char **cmdline_p)
 
 	riscv_user_isa_enable();
 	riscv_spinlock_init();
+
+	if (!IS_ENABLED(CONFIG_RISCV_ISA_ZBB) || !riscv_isa_extension_available(NULL, ZBB))
+		static_branch_disable(&efficient_ffs_key);
 }
 
 bool arch_cpu_is_hotpluggable(int cpu)
-- 
2.34.1



* Re: [PATCH v3 2/3] riscv: Optimize gcd() code size when CONFIG_RISCV_ISA_ZBB is disabled
  2025-06-06 13:47 ` [PATCH v3 2/3] riscv: Optimize gcd() code size when CONFIG_RISCV_ISA_ZBB is disabled Kuan-Wei Chiu
@ 2025-06-12 12:59   ` Alexandre Ghiti
  0 siblings, 0 replies; 9+ messages in thread
From: Alexandre Ghiti @ 2025-06-12 12:59 UTC
  To: Kuan-Wei Chiu, paul.walmsley, palmer, aou, akpm
  Cc: linux-riscv, linux-kernel, jserv, Yu-Chun Lin

Hi Kuan-Wei,

On 6/6/25 15:47, Kuan-Wei Chiu wrote:
> The binary GCD implementation depends on efficient ffs(), which on
> RISC-V requires hardware support for the Zbb extension. When
> CONFIG_RISCV_ISA_ZBB is not enabled, the kernel will never use binary
> GCD, as runtime logic will always fall back to the odd-even
> implementation.
>
> To avoid compiling unused code and reduce code size, select
> CONFIG_CPU_NO_EFFICIENT_FFS when CONFIG_RISCV_ISA_ZBB is not set.
>
> $ ./scripts/bloat-o-meter ./lib/math/gcd.o.old ./lib/math/gcd.o.new
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-274 (-274)
> Function                                     old     new   delta
> gcd                                          360      86    -274
> Total: Before=384, After=110, chg -71.35%
>
> Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
> Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> ---
>   arch/riscv/Kconfig | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index bbec87b79309..f085adc6f573 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -95,6 +95,7 @@ config RISCV
>   	select CLINT_TIMER if RISCV_M_MODE
>   	select CLONE_BACKWARDS
>   	select COMMON_CLK
> +	select CPU_NO_EFFICIENT_FFS if !RISCV_ISA_ZBB
>   	select CPU_PM if CPU_IDLE || HIBERNATION || SUSPEND
>   	select EDAC_SUPPORT
>   	select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE)


In v2, Andrew asked if he could merge it, so:

Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex



* Re: [PATCH v3 3/3] riscv: Optimize gcd() performance on RISC-V without Zbb extension
  2025-06-06 13:47 ` [PATCH v3 3/3] riscv: Optimize gcd() performance on RISC-V without Zbb extension Kuan-Wei Chiu
@ 2025-06-12 13:00   ` Alexandre Ghiti
  0 siblings, 0 replies; 9+ messages in thread
From: Alexandre Ghiti @ 2025-06-12 13:00 UTC
  To: Kuan-Wei Chiu, paul.walmsley, palmer, aou, akpm
  Cc: linux-riscv, linux-kernel, jserv, Yu-Chun Lin

On 6/6/25 15:47, Kuan-Wei Chiu wrote:
> The binary GCD implementation uses FFS (find first set), which benefits
> from hardware support for the ctz instruction, provided by the Zbb
> extension on RISC-V. Without Zbb, this results in slower
> software-emulated behavior.
>
> Previously, RISC-V always used the binary GCD, regardless of actual
> hardware support. This patch improves runtime efficiency by disabling
> the efficient_ffs_key static branch when Zbb is either not enabled in
> the kernel (config) or not supported on the executing CPU. This selects
> the odd-even GCD implementation, which is faster in the absence of
> efficient FFS.
>
> This change ensures the most suitable GCD algorithm is chosen
> dynamically based on actual hardware capabilities.
>
> Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
> Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
> ---
>   arch/riscv/kernel/setup.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index f7c9a1caa83e..785c7104fde7 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -21,6 +21,8 @@
>   #include <linux/efi.h>
>   #include <linux/crash_dump.h>
>   #include <linux/panic_notifier.h>
> +#include <linux/jump_label.h>
> +#include <linux/gcd.h>
>   
>   #include <asm/acpi.h>
>   #include <asm/alternative.h>
> @@ -361,6 +363,9 @@ void __init setup_arch(char **cmdline_p)
>   
>   	riscv_user_isa_enable();
>   	riscv_spinlock_init();
> +
> +	if (!IS_ENABLED(CONFIG_RISCV_ISA_ZBB) || !riscv_isa_extension_available(NULL, ZBB))
> +		static_branch_disable(&efficient_ffs_key);
>   }
>   
>   bool arch_cpu_is_hotpluggable(int cpu)


Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex



* Re: [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime
  2025-06-06 13:47 [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime Kuan-Wei Chiu
                   ` (2 preceding siblings ...)
  2025-06-06 13:47 ` [PATCH v3 3/3] riscv: Optimize gcd() performance on RISC-V without Zbb extension Kuan-Wei Chiu
@ 2025-07-09 15:08 ` Alexandre Ghiti
  2025-07-09 23:20 ` Andrew Morton
  2025-08-10 21:12 ` patchwork-bot+linux-riscv
  5 siblings, 0 replies; 9+ messages in thread
From: Alexandre Ghiti @ 2025-07-09 15:08 UTC
  To: Kuan-Wei Chiu, paul.walmsley, palmer, aou, akpm
  Cc: linux-riscv, linux-kernel, jserv, Yu-Chun Lin

Hi Kuan-Wei, Andrew,

@Andrew: Will you merge this one? I can do it through the riscv tree if 
not, no problem at all.

Thanks,

Alex

On 6/6/25 15:47, Kuan-Wei Chiu wrote:
> The current implementation of gcd() selects between the binary GCD and
> the odd-even GCD algorithm at compile time, depending on whether
> CONFIG_CPU_NO_EFFICIENT_FFS is set. On platforms like RISC-V, however,
> this compile-time decision can be misleading: even when the compiler
> emits ctz instructions based on the assumption that they are efficient
> (as is the case when CONFIG_RISCV_ISA_ZBB is enabled), the actual
> hardware may lack support for the Zbb extension. In such cases, ffs()
> falls back to a software implementation at runtime, making the binary
> GCD algorithm significantly slower than the odd-even variant.
>
> To address this, we introduce a static key to allow runtime selection
> between the binary and odd-even GCD implementations. On RISC-V, the
> kernel now checks for Zbb support during boot. If Zbb is unavailable,
> the static key is disabled so that gcd() consistently uses the more
> efficient odd-even algorithm in that scenario. Additionally, to further
> reduce code size, we select CONFIG_CPU_NO_EFFICIENT_FFS automatically
> when CONFIG_RISCV_ISA_ZBB is not enabled, avoiding compilation of the
> unused binary GCD implementation entirely on systems where it would
> never be executed.
>
> This series ensures that the most efficient GCD algorithm is used in
> practice and avoids compiling unnecessary code based on hardware
> capabilities and kernel configuration.
>
> Co-developed-by: Yu-Chun Lin <eleanor15x@gmail.com>
> Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
> Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
>
> ---
> This series has been tested on QEMU to verify that the correct GCD
> implementation is used both with and without Zbb support.
>
> v2 -> v3:
> - Drop if (!a || !b) check in binary_gcd()
> - Move DECLARE_STATIC_KEY_TRUE(efficient_ffs_key) to gcd.h
> v1 -> v2:
> - Use a static key to select the GCD implementation at runtime.
>
> v2: https://lore.kernel.org/lkml/20250524155519.1142570-1-visitorckw@gmail.com/
> v1: https://lore.kernel.org/lkml/20250217013708.1932496-1-visitorckw@gmail.com/
>
> Kuan-Wei Chiu (3):
>    lib/math/gcd: Use static key to select implementation at runtime
>    riscv: Optimize gcd() code size when CONFIG_RISCV_ISA_ZBB is disabled
>    riscv: Optimize gcd() performance on RISC-V without Zbb extension
>
>   arch/riscv/Kconfig        |  1 +
>   arch/riscv/kernel/setup.c |  5 +++++
>   include/linux/gcd.h       |  3 +++
>   lib/math/gcd.c            | 27 +++++++++++++++------------
>   4 files changed, 24 insertions(+), 12 deletions(-)
>


* Re: [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime
  2025-06-06 13:47 [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime Kuan-Wei Chiu
                   ` (3 preceding siblings ...)
  2025-07-09 15:08 ` [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime Alexandre Ghiti
@ 2025-07-09 23:20 ` Andrew Morton
  2025-08-10 21:12 ` patchwork-bot+linux-riscv
  5 siblings, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2025-07-09 23:20 UTC
  To: Kuan-Wei Chiu
  Cc: paul.walmsley, palmer, aou, alex, linux-riscv, linux-kernel,
	jserv, Yu-Chun Lin

On Fri,  6 Jun 2025 21:47:55 +0800 Kuan-Wei Chiu <visitorckw@gmail.com> wrote:

> The current implementation of gcd() selects between the binary GCD and
> the odd-even GCD algorithm at compile time, depending on whether
> CONFIG_CPU_NO_EFFICIENT_FFS is set. On platforms like RISC-V, however,
> this compile-time decision can be misleading: even when the compiler
> emits ctz instructions based on the assumption that they are efficient
> (as is the case when CONFIG_RISCV_ISA_ZBB is enabled), the actual
> hardware may lack support for the Zbb extension. In such cases, ffs()
> falls back to a software implementation at runtime, making the binary
> GCD algorithm significantly slower than the odd-even variant.
> 
> To address this, we introduce a static key to allow runtime selection
> between the binary and odd-even GCD implementations. On RISC-V, the
> kernel now checks for Zbb support during boot. If Zbb is unavailable,
> the static key is disabled so that gcd() consistently uses the more
> efficient odd-even algorithm in that scenario. Additionally, to further
> reduce code size, we select CONFIG_CPU_NO_EFFICIENT_FFS automatically
> when CONFIG_RISCV_ISA_ZBB is not enabled, avoiding compilation of the
> unused binary GCD implementation entirely on systems where it would
> never be executed.
> 
> This series ensures that the most efficient GCD algorithm is used in
> practice and avoids compiling unnecessary code based on hardware
> capabilities and kernel configuration.

I removed the v2 series from mm.git and added this, thanks.

v2 was in -next for a month, no issues of which I am aware.


* Re: [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime
  2025-06-06 13:47 [PATCH v3 0/3] Optimize GCD performance on RISC-V by selecting implementation at runtime Kuan-Wei Chiu
                   ` (4 preceding siblings ...)
  2025-07-09 23:20 ` Andrew Morton
@ 2025-08-10 21:12 ` patchwork-bot+linux-riscv
  5 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+linux-riscv @ 2025-08-10 21:12 UTC
  To: Kuan-Wei Chiu
  Cc: linux-riscv, paul.walmsley, palmer, aou, alex, akpm, linux-kernel,
	jserv, eleanor15x

Hello:

This series was applied to riscv/linux.git (fixes)
by Andrew Morton <akpm@linux-foundation.org>:

On Fri,  6 Jun 2025 21:47:55 +0800 you wrote:
> The current implementation of gcd() selects between the binary GCD and
> the odd-even GCD algorithm at compile time, depending on whether
> CONFIG_CPU_NO_EFFICIENT_FFS is set. On platforms like RISC-V, however,
> this compile-time decision can be misleading: even when the compiler
> emits ctz instructions based on the assumption that they are efficient
> (as is the case when CONFIG_RISCV_ISA_ZBB is enabled), the actual
> hardware may lack support for the Zbb extension. In such cases, ffs()
> falls back to a software implementation at runtime, making the binary
> GCD algorithm significantly slower than the odd-even variant.
> 
> [...]

Here is the summary with links:
  - [v3,1/3] lib/math/gcd: Use static key to select implementation at runtime
    https://git.kernel.org/riscv/c/b3d5fd6f82dd
  - [v3,2/3] riscv: Optimize gcd() code size when CONFIG_RISCV_ISA_ZBB is disabled
    https://git.kernel.org/riscv/c/26b537edc533
  - [v3,3/3] riscv: Optimize gcd() performance on RISC-V without Zbb extension
    https://git.kernel.org/riscv/c/36e224168721

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


