* [PATCH 0/2] riscv: Some misaligned cleanups
@ 2026-05-28 21:12 ` Nam Cao
0 siblings, 0 replies; 6+ messages in thread
From: Nam Cao @ 2026-05-28 21:12 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Clément Léger, Andrew Jones, Charlie Jenkins,
linux-riscv, linux-kernel
Cc: Nam Cao
Two minor cleanup which affects performance. The first patch addresses a bug
causing improper initialization of fast_unaligned_access_speed_key, affecting
performance of do_csum(). The second patch remove a redundant access speed
probe in CPU hotplug.
Nam Cao (2):
riscv: misaligned: Fix fast_unaligned_access_speed_key init
riscv: traps_misaligned: Avoid redundant unaligned access speed probe
arch/riscv/kernel/traps_misaligned.c | 4 +-
arch/riscv/kernel/unaligned_access_speed.c | 69 +++++++---------------
2 files changed, 24 insertions(+), 49 deletions(-)
--
2.47.3
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 0/2] riscv: Some misaligned cleanups
@ 2026-05-28 21:12 ` Nam Cao
0 siblings, 0 replies; 6+ messages in thread
From: Nam Cao @ 2026-05-28 21:12 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Clément Léger, Andrew Jones, Charlie Jenkins,
linux-riscv, linux-kernel
Cc: Nam Cao
Two minor cleanup which affects performance. The first patch addresses a bug
causing improper initialization of fast_unaligned_access_speed_key, affecting
performance of do_csum(). The second patch remove a redundant access speed
probe in CPU hotplug.
Nam Cao (2):
riscv: misaligned: Fix fast_unaligned_access_speed_key init
riscv: traps_misaligned: Avoid redundant unaligned access speed probe
arch/riscv/kernel/traps_misaligned.c | 4 +-
arch/riscv/kernel/unaligned_access_speed.c | 69 +++++++---------------
2 files changed, 24 insertions(+), 49 deletions(-)
--
2.47.3
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] riscv: misaligned: Fix fast_unaligned_access_speed_key init
2026-05-28 21:12 ` Nam Cao
@ 2026-05-28 21:12 ` Nam Cao
-1 siblings, 0 replies; 6+ messages in thread
From: Nam Cao @ 2026-05-28 21:12 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Clément Léger, Andrew Jones, Charlie Jenkins,
linux-riscv, linux-kernel
Cc: Nam Cao
When booting with unaligned_scalar_speed=fast,
fast_unaligned_access_speed_key is initialized incorrectly.
The key is currently derived from the fast_misaligned_access cpumask, but
that mask is only populated when the unaligned access speed probe runs.
Specifying unaligned_scalar_speed=fast skips the probe entirely, leaving
the mask uninitialized.
The information tracked by fast_misaligned_access is already available in
the misaligned_access_speed per-CPU variable. Use that to initialize
fast_unaligned_access_speed_key instead and remove the redundant cpumask.
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
arch/riscv/kernel/unaligned_access_speed.c | 69 +++++++---------------
1 file changed, 22 insertions(+), 47 deletions(-)
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index 11c781a4de73..bb57eb5d19df 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -27,8 +27,6 @@ DEFINE_PER_CPU(long, vector_misaligned_access) = RISCV_HWPROBE_MISALIGNED_VECTOR
static long unaligned_scalar_speed_param = RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN;
static long unaligned_vector_speed_param = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
-static cpumask_t fast_misaligned_access;
-
static u64 __maybe_unused
measure_cycles(void (*func)(void *dst, const void *src, size_t len),
void *dst, void *src, size_t len)
@@ -131,13 +129,10 @@ static int check_unaligned_access(struct page *page)
* Set the value of fast_misaligned_access of a CPU. These operations
* are atomic to avoid race conditions.
*/
- if (ret) {
+ if (ret)
per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_SCALAR_FAST;
- cpumask_set_cpu(cpu, &fast_misaligned_access);
- } else {
+ else
per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW;
- cpumask_clear_cpu(cpu, &fast_misaligned_access);
- }
return 0;
}
@@ -192,49 +187,24 @@ static void __init check_unaligned_access_speed_all_cpus(void)
DEFINE_STATIC_KEY_FALSE(fast_unaligned_access_speed_key);
-static void modify_unaligned_access_branches(cpumask_t *mask, int weight)
+static void modify_unaligned_access_branches(const cpumask_t *mask)
{
- if (cpumask_weight(mask) == weight)
+ bool fast = true;
+ int cpu;
+
+ for_each_cpu(cpu, mask) {
+ if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_FAST) {
+ fast = false;
+ break;
+ }
+ }
+
+ if (fast)
static_branch_enable_cpuslocked(&fast_unaligned_access_speed_key);
else
static_branch_disable_cpuslocked(&fast_unaligned_access_speed_key);
}
-static void set_unaligned_access_static_branches_except_cpu(int cpu)
-{
- /*
- * Same as set_unaligned_access_static_branches, except excludes the
- * given CPU from the result. When a CPU is hotplugged into an offline
- * state, this function is called before the CPU is set to offline in
- * the cpumask, and thus the CPU needs to be explicitly excluded.
- */
-
- cpumask_t fast_except_me;
-
- cpumask_and(&fast_except_me, &fast_misaligned_access, cpu_online_mask);
- cpumask_clear_cpu(cpu, &fast_except_me);
-
- modify_unaligned_access_branches(&fast_except_me, num_online_cpus() - 1);
-}
-
-static void set_unaligned_access_static_branches(void)
-{
- /*
- * This will be called after check_unaligned_access_all_cpus so the
- * result of unaligned access speed for all CPUs will be available.
- *
- * To avoid the number of online cpus changing between reading
- * cpu_online_mask and calling num_online_cpus, cpus_read_lock must be
- * held before calling this function.
- */
-
- cpumask_t fast_and_online;
-
- cpumask_and(&fast_and_online, &fast_misaligned_access, cpu_online_mask);
-
- modify_unaligned_access_branches(&fast_and_online, num_online_cpus());
-}
-
static int riscv_online_cpu(unsigned int cpu)
{
int ret = cpu_online_unaligned_access_init(cpu);
@@ -266,14 +236,19 @@ static int riscv_online_cpu(unsigned int cpu)
#endif
exit:
- set_unaligned_access_static_branches();
+ modify_unaligned_access_branches(cpu_online_mask);
return 0;
}
static int riscv_offline_cpu(unsigned int cpu)
{
- set_unaligned_access_static_branches_except_cpu(cpu);
+ cpumask_t mask;
+
+ cpumask_copy(&mask, cpu_online_mask);
+ cpumask_clear_cpu(cpu, &mask);
+
+ modify_unaligned_access_branches(&mask);
return 0;
}
@@ -430,7 +405,7 @@ static int __init check_unaligned_access_all_cpus(void)
riscv_online_cpu_vec, NULL);
cpus_read_lock();
- set_unaligned_access_static_branches();
+ modify_unaligned_access_branches(cpu_online_mask);
cpus_read_unlock();
return 0;
--
2.47.3
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 1/2] riscv: misaligned: Fix fast_unaligned_access_speed_key init
@ 2026-05-28 21:12 ` Nam Cao
0 siblings, 0 replies; 6+ messages in thread
From: Nam Cao @ 2026-05-28 21:12 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Clément Léger, Andrew Jones, Charlie Jenkins,
linux-riscv, linux-kernel
Cc: Nam Cao
When booting with unaligned_scalar_speed=fast,
fast_unaligned_access_speed_key is initialized incorrectly.
The key is currently derived from the fast_misaligned_access cpumask, but
that mask is only populated when the unaligned access speed probe runs.
Specifying unaligned_scalar_speed=fast skips the probe entirely, leaving
the mask uninitialized.
The information tracked by fast_misaligned_access is already available in
the misaligned_access_speed per-CPU variable. Use that to initialize
fast_unaligned_access_speed_key instead and remove the redundant cpumask.
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
arch/riscv/kernel/unaligned_access_speed.c | 69 +++++++---------------
1 file changed, 22 insertions(+), 47 deletions(-)
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index 11c781a4de73..bb57eb5d19df 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -27,8 +27,6 @@ DEFINE_PER_CPU(long, vector_misaligned_access) = RISCV_HWPROBE_MISALIGNED_VECTOR
static long unaligned_scalar_speed_param = RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN;
static long unaligned_vector_speed_param = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
-static cpumask_t fast_misaligned_access;
-
static u64 __maybe_unused
measure_cycles(void (*func)(void *dst, const void *src, size_t len),
void *dst, void *src, size_t len)
@@ -131,13 +129,10 @@ static int check_unaligned_access(struct page *page)
* Set the value of fast_misaligned_access of a CPU. These operations
* are atomic to avoid race conditions.
*/
- if (ret) {
+ if (ret)
per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_SCALAR_FAST;
- cpumask_set_cpu(cpu, &fast_misaligned_access);
- } else {
+ else
per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW;
- cpumask_clear_cpu(cpu, &fast_misaligned_access);
- }
return 0;
}
@@ -192,49 +187,24 @@ static void __init check_unaligned_access_speed_all_cpus(void)
DEFINE_STATIC_KEY_FALSE(fast_unaligned_access_speed_key);
-static void modify_unaligned_access_branches(cpumask_t *mask, int weight)
+static void modify_unaligned_access_branches(const cpumask_t *mask)
{
- if (cpumask_weight(mask) == weight)
+ bool fast = true;
+ int cpu;
+
+ for_each_cpu(cpu, mask) {
+ if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_FAST) {
+ fast = false;
+ break;
+ }
+ }
+
+ if (fast)
static_branch_enable_cpuslocked(&fast_unaligned_access_speed_key);
else
static_branch_disable_cpuslocked(&fast_unaligned_access_speed_key);
}
-static void set_unaligned_access_static_branches_except_cpu(int cpu)
-{
- /*
- * Same as set_unaligned_access_static_branches, except excludes the
- * given CPU from the result. When a CPU is hotplugged into an offline
- * state, this function is called before the CPU is set to offline in
- * the cpumask, and thus the CPU needs to be explicitly excluded.
- */
-
- cpumask_t fast_except_me;
-
- cpumask_and(&fast_except_me, &fast_misaligned_access, cpu_online_mask);
- cpumask_clear_cpu(cpu, &fast_except_me);
-
- modify_unaligned_access_branches(&fast_except_me, num_online_cpus() - 1);
-}
-
-static void set_unaligned_access_static_branches(void)
-{
- /*
- * This will be called after check_unaligned_access_all_cpus so the
- * result of unaligned access speed for all CPUs will be available.
- *
- * To avoid the number of online cpus changing between reading
- * cpu_online_mask and calling num_online_cpus, cpus_read_lock must be
- * held before calling this function.
- */
-
- cpumask_t fast_and_online;
-
- cpumask_and(&fast_and_online, &fast_misaligned_access, cpu_online_mask);
-
- modify_unaligned_access_branches(&fast_and_online, num_online_cpus());
-}
-
static int riscv_online_cpu(unsigned int cpu)
{
int ret = cpu_online_unaligned_access_init(cpu);
@@ -266,14 +236,19 @@ static int riscv_online_cpu(unsigned int cpu)
#endif
exit:
- set_unaligned_access_static_branches();
+ modify_unaligned_access_branches(cpu_online_mask);
return 0;
}
static int riscv_offline_cpu(unsigned int cpu)
{
- set_unaligned_access_static_branches_except_cpu(cpu);
+ cpumask_t mask;
+
+ cpumask_copy(&mask, cpu_online_mask);
+ cpumask_clear_cpu(cpu, &mask);
+
+ modify_unaligned_access_branches(&mask);
return 0;
}
@@ -430,7 +405,7 @@ static int __init check_unaligned_access_all_cpus(void)
riscv_online_cpu_vec, NULL);
cpus_read_lock();
- set_unaligned_access_static_branches();
+ modify_unaligned_access_branches(cpu_online_mask);
cpus_read_unlock();
return 0;
--
2.47.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] riscv: traps_misaligned: Avoid redundant unaligned access speed probe
2026-05-28 21:12 ` Nam Cao
@ 2026-05-28 21:12 ` Nam Cao
-1 siblings, 0 replies; 6+ messages in thread
From: Nam Cao @ 2026-05-28 21:12 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Clément Léger, Andrew Jones, Charlie Jenkins,
linux-riscv, linux-kernel
Cc: Nam Cao
When a CPU is taken offline and then is brought back online, unaligned
access speed probe always runs even though the unaligned access speed is
already known, wasting CPU cycles.
This is because when a CPU becomes online, the following happen:
1. check_unaligned_access_emulated() is called, which clears
misaligned_access_speed if there is no emulation.
2. check_unaligned_access() is called because misaligned_access_speed is
cleared, wasting CPU cycles determining something already previous
known.
Avoid the redundant access speed probe by stop clearing
misaligned_access_speed in (1). If access speed is already known, just
reuse it.
On my Visionfive 2, this reduces CPU bring-up time from 26ms to 0.8ms.
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
arch/riscv/kernel/traps_misaligned.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index 81b7682e6c6d..6e8ae6c66322 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -522,10 +522,10 @@ static bool unaligned_ctl __read_mostly;
static void check_unaligned_access_emulated(void *arg __always_unused)
{
int cpu = smp_processor_id();
- long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu);
unsigned long tmp_var, tmp_val;
- *mas_ptr = RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN;
+ if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN)
+ return;
__asm__ __volatile__ (
" "REG_L" %[tmp], 1(%[ptr])\n"
--
2.47.3
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] riscv: traps_misaligned: Avoid redundant unaligned access speed probe
@ 2026-05-28 21:12 ` Nam Cao
0 siblings, 0 replies; 6+ messages in thread
From: Nam Cao @ 2026-05-28 21:12 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Clément Léger, Andrew Jones, Charlie Jenkins,
linux-riscv, linux-kernel
Cc: Nam Cao
When a CPU is taken offline and then is brought back online, unaligned
access speed probe always runs even though the unaligned access speed is
already known, wasting CPU cycles.
This is because when a CPU becomes online, the following happen:
1. check_unaligned_access_emulated() is called, which clears
misaligned_access_speed if there is no emulation.
2. check_unaligned_access() is called because misaligned_access_speed is
cleared, wasting CPU cycles determining something already previous
known.
Avoid the redundant access speed probe by stop clearing
misaligned_access_speed in (1). If access speed is already known, just
reuse it.
On my Visionfive 2, this reduces CPU bring-up time from 26ms to 0.8ms.
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
arch/riscv/kernel/traps_misaligned.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index 81b7682e6c6d..6e8ae6c66322 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -522,10 +522,10 @@ static bool unaligned_ctl __read_mostly;
static void check_unaligned_access_emulated(void *arg __always_unused)
{
int cpu = smp_processor_id();
- long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu);
unsigned long tmp_var, tmp_val;
- *mas_ptr = RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN;
+ if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN)
+ return;
__asm__ __volatile__ (
" "REG_L" %[tmp], 1(%[ptr])\n"
--
2.47.3
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-05-28 21:13 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-28 21:12 [PATCH 0/2] riscv: Some misaligned cleanups Nam Cao
2026-05-28 21:12 ` Nam Cao
2026-05-28 21:12 ` [PATCH 1/2] riscv: misaligned: Fix fast_unaligned_access_speed_key init Nam Cao
2026-05-28 21:12 ` Nam Cao
2026-05-28 21:12 ` [PATCH 2/2] riscv: traps_misaligned: Avoid redundant unaligned access speed probe Nam Cao
2026-05-28 21:12 ` Nam Cao
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.