* [PATCH 0/2] riscv: unaligned: stop using kthread for vector speed probe
@ 2026-06-17 3:38 Nam Cao
2026-06-17 3:38 ` [PATCH 1/2] riscv: unaligned: stop using kthread for check_vector_unaligned_access() Nam Cao
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Nam Cao @ 2026-06-17 3:38 UTC
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Andrew Jones, Jingwei Wang, Anirudh Srinivasan, linux-riscv,
linux-kernel
Cc: Nam Cao
Hi,
This series is a follow-up to the discussion at:
https://lore.kernel.org/linux-riscv/20260612-vec_unaligned_drop_init-v1-1-df969210ae34@oss.tenstorrent.com/
It removes the kthread usage for the unaligned vector access speed probe, avoiding
a bug where the kthread may still be executing an __init function that has
already been freed.
It also allows removing some vDSO synchronization, simplifying the code.
This kthread has been bothering me for a while now, and the recent bug report
pushed me to post this series.
Nam Cao (2):
riscv: unaligned: stop using kthread for
check_vector_unaligned_access()
Revert "riscv: hwprobe: Fix stale vDSO data for late-initialized keys
at boot"
arch/riscv/include/asm/hwprobe.h | 7 ---
arch/riscv/include/asm/vdso/arch_data.h | 6 --
arch/riscv/kernel/sys_hwprobe.c | 70 ++++------------------
arch/riscv/kernel/unaligned_access_speed.c | 19 +-----
arch/riscv/kernel/vdso/hwprobe.c | 2 +-
5 files changed, 15 insertions(+), 89 deletions(-)
--
2.47.3
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
* [PATCH 1/2] riscv: unaligned: stop using kthread for check_vector_unaligned_access()
2026-06-17 3:38 [PATCH 0/2] riscv: unaligned: stop using kthread for vector speed probe Nam Cao
@ 2026-06-17 3:38 ` Nam Cao
2026-06-17 8:49 ` Nam Cao
2026-06-17 3:38 ` [PATCH 2/2] Revert "riscv: hwprobe: Fix stale vDSO data for late-initialized keys at boot" Nam Cao
2026-06-17 14:21 ` [PATCH 0/2] riscv: unaligned: stop using kthread for vector speed probe Anirudh Srinivasan
2 siblings, 1 reply; 5+ messages in thread
From: Nam Cao @ 2026-06-17 3:38 UTC
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Andrew Jones, Jingwei Wang, Anirudh Srinivasan, linux-riscv,
linux-kernel
Cc: Nam Cao, stable
A kthread is used to run check_vector_unaligned_access() to optimize boot
time, allowing the kernel to continue booting without waiting for the
unaligned vector speed probe to finish.
However, this asynchronous approach introduces several complications.
First, the kthread may not complete before a user reads vDSO data,
resulting in incorrect values. This was previously addressed by
commit 5d15d2ad36b0 ("riscv: hwprobe: Fix stale vDSO data for
late-initialized keys at boot"), which added complex synchronization
between the kthread and vDSO reads.
Second, it was discovered that the kthread may not finish before
vec_check_unaligned_access_speed_all_cpus() (marked with __init) is freed,
triggering a page fault.
These issues raise the question of whether the kthread is worth the added
complexity. A past boot time regression report was actually unrelated to
synchronous probing; it was caused by the probe running serially. Since
switching to a parallel probe, no further complaints have been made.
Furthermore, the unaligned scalar access speed probe takes the same amount
of time, runs synchronously, and has caused no issues.
Testing shows no noticeable boot time slowdown when running the vector
probe synchronously (0.464474s with kthread vs. 0.457991s without).
Remove the kthread usage and run the probe synchronously. This simplifies
the boot flow and allows for the revert of commit 5d15d2ad36b0 ("riscv:
hwprobe: Fix stale vDSO data for late-initialized keys at boot").
Reported-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
Closes: https://lore.kernel.org/linux-riscv/20260612-vec_unaligned_drop_init-v1-1-df969210ae34@oss.tenstorrent.com/
Fixes: a00e022be531 ("riscv: Annotate unaligned access init functions")
Cc: <stable@vger.kernel.org>
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
arch/riscv/kernel/unaligned_access_speed.c | 19 ++-----------------
1 file changed, 2 insertions(+), 17 deletions(-)
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index bb57eb5d19df..6e35bca568de 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -6,7 +6,6 @@
#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/jump_label.h>
-#include <linux/kthread.h>
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/types.h>
@@ -288,18 +287,9 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
__free_pages(page, MISALIGNED_BUFFER_ORDER);
}
-/* Measure unaligned access speed on all CPUs present at boot in parallel. */
-static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
-{
- schedule_on_each_cpu(check_vector_unaligned_access);
- riscv_hwprobe_complete_async_probe();
-
- return 0;
-}
#else /* CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS */
-static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
+static void check_vector_unaligned_access(struct work_struct *work __always_unused)
{
- return 0;
}
#endif
@@ -387,12 +377,7 @@ static int __init check_unaligned_access_all_cpus(void)
per_cpu(vector_misaligned_access, cpu) = unaligned_vector_speed_param;
} else if (!check_vector_unaligned_access_emulated_all_cpus() &&
IS_ENABLED(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS)) {
- riscv_hwprobe_register_async_probe();
- if (IS_ERR(kthread_run(vec_check_unaligned_access_speed_all_cpus,
- NULL, "vec_check_unaligned_access_speed_all_cpus"))) {
- pr_warn("Failed to create vec_unalign_check kthread\n");
- riscv_hwprobe_complete_async_probe();
- }
+ schedule_on_each_cpu(check_vector_unaligned_access);
}
/*
--
2.47.3
* [PATCH 2/2] Revert "riscv: hwprobe: Fix stale vDSO data for late-initialized keys at boot"
2026-06-17 3:38 [PATCH 0/2] riscv: unaligned: stop using kthread for vector speed probe Nam Cao
2026-06-17 3:38 ` [PATCH 1/2] riscv: unaligned: stop using kthread for check_vector_unaligned_access() Nam Cao
@ 2026-06-17 3:38 ` Nam Cao
2026-06-17 14:21 ` [PATCH 0/2] riscv: unaligned: stop using kthread for vector speed probe Anirudh Srinivasan
2 siblings, 0 replies; 5+ messages in thread
From: Nam Cao @ 2026-06-17 3:38 UTC
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Andrew Jones, Jingwei Wang, Anirudh Srinivasan, linux-riscv,
linux-kernel
Cc: Nam Cao
This reverts commit 5d15d2ad36b0 ("riscv: hwprobe: Fix stale vDSO data for
late-initialized keys at boot"). That commit added synchronization between
the unaligned vector access speed probe kthread and vDSO data reads. Now
that the kthread has been removed, the synchronization is no longer needed
and the commit can be reverted.
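For readers unfamiliar with the gate being removed, it can be sketched in
userspace C. This is only an analogy under stated assumptions: struct
cached_data, publish(), fast_query() and slow_path_query() are hypothetical
names, and C11 release/acquire atomics stand in for the kernel's smp_wmb()
and the vDSO-side check of avd->ready.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the cached vDSO data; not the kernel's struct. */
struct cached_data {
	uint64_t value;     /* cached answer filled in by the writer */
	atomic_bool ready;  /* gate, analogous to the reverted 'ready' field */
};

/* Hypothetical slow path, standing in for the real syscall. */
static uint64_t slow_path_query(void)
{
	return 42;
}

/* Writer: fill in the data first, then open the gate with release order. */
static void publish(struct cached_data *d, uint64_t v)
{
	d->value = v;
	atomic_store_explicit(&d->ready, true, memory_order_release);
}

/* Reader: use the cache only once the gate is open, else fall back. */
static uint64_t fast_query(struct cached_data *d, bool *used_cache)
{
	if (!atomic_load_explicit(&d->ready, memory_order_acquire)) {
		*used_cache = false;
		return slow_path_query();
	}
	*used_cache = true;
	return d->value;
}
```

With the probe running synchronously, the data is filled in before any
reader can exist, so neither the flag nor the ordering is needed.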
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
arch/riscv/include/asm/hwprobe.h | 7 ---
arch/riscv/include/asm/vdso/arch_data.h | 6 ---
arch/riscv/kernel/sys_hwprobe.c | 70 +++++--------------------
arch/riscv/kernel/vdso/hwprobe.c | 2 +-
4 files changed, 13 insertions(+), 72 deletions(-)
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 8b9f5e1cf4cb..9b04377c0f98 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -43,11 +43,4 @@ static inline bool riscv_hwprobe_pair_cmp(struct riscv_hwprobe *pair,
return pair->value == other_pair->value;
}
-#ifdef CONFIG_MMU
-void riscv_hwprobe_register_async_probe(void);
-void riscv_hwprobe_complete_async_probe(void);
-#else
-static inline void riscv_hwprobe_register_async_probe(void) {}
-static inline void riscv_hwprobe_complete_async_probe(void) {}
-#endif
#endif
diff --git a/arch/riscv/include/asm/vdso/arch_data.h b/arch/riscv/include/asm/vdso/arch_data.h
index 88b37af55175..da57a3786f7a 100644
--- a/arch/riscv/include/asm/vdso/arch_data.h
+++ b/arch/riscv/include/asm/vdso/arch_data.h
@@ -12,12 +12,6 @@ struct vdso_arch_data {
/* Boolean indicating all CPUs have the same static hwprobe values. */
__u8 homogeneous_cpus;
-
- /*
- * A gate to check and see if the hwprobe data is actually ready, as
- * probing is deferred to avoid boot slowdowns.
- */
- __u8 ready;
};
#endif /* __RISCV_ASM_VDSO_ARCH_DATA_H */
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 1659d31fd288..2ce162f5cf7c 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -5,9 +5,6 @@
* more details.
*/
#include <linux/syscalls.h>
-#include <linux/completion.h>
-#include <linux/atomic.h>
-#include <linux/once.h>
#include <asm/cacheflush.h>
#include <asm/cpufeature.h>
#include <asm/hwprobe.h>
@@ -504,32 +501,28 @@ static int hwprobe_get_cpus(struct riscv_hwprobe __user *pairs,
return 0;
}
-#ifdef CONFIG_MMU
-
-static DECLARE_COMPLETION(boot_probes_done);
-static atomic_t pending_boot_probes = ATOMIC_INIT(1);
-
-void riscv_hwprobe_register_async_probe(void)
+static int do_riscv_hwprobe(struct riscv_hwprobe __user *pairs,
+ size_t pair_count, size_t cpusetsize,
+ unsigned long __user *cpus_user,
+ unsigned int flags)
{
- atomic_inc(&pending_boot_probes);
-}
+ if (flags & RISCV_HWPROBE_WHICH_CPUS)
+ return hwprobe_get_cpus(pairs, pair_count, cpusetsize,
+ cpus_user, flags);
-void riscv_hwprobe_complete_async_probe(void)
-{
- if (atomic_dec_and_test(&pending_boot_probes))
- complete(&boot_probes_done);
+ return hwprobe_get_values(pairs, pair_count, cpusetsize,
+ cpus_user, flags);
}
-static int complete_hwprobe_vdso_data(void)
+#ifdef CONFIG_MMU
+
+static int __init init_hwprobe_vdso_data(void)
{
struct vdso_arch_data *avd = vdso_k_arch_data;
u64 id_bitsmash = 0;
struct riscv_hwprobe pair;
int key;
- if (unlikely(!atomic_dec_and_test(&pending_boot_probes)))
- wait_for_completion(&boot_probes_done);
-
/*
* Initialize vDSO data with the answers for the "all CPUs" case, to
* save a syscall in the common case.
@@ -557,52 +550,13 @@ static int complete_hwprobe_vdso_data(void)
* vDSO should defer to the kernel for exotic cpu masks.
*/
avd->homogeneous_cpus = id_bitsmash != 0 && id_bitsmash != -1;
-
- /*
- * Make sure all the VDSO values are visible before we look at them.
- * This pairs with the implicit "no speculativly visible accesses"
- * barrier in the VDSO hwprobe code.
- */
- smp_wmb();
- avd->ready = true;
- return 0;
-}
-
-static int __init init_hwprobe_vdso_data(void)
-{
- struct vdso_arch_data *avd = vdso_k_arch_data;
-
- /*
- * Prevent the vDSO cached values from being used, as they're not ready
- * yet.
- */
- avd->ready = false;
return 0;
}
arch_initcall_sync(init_hwprobe_vdso_data);
-#else
-
-static int complete_hwprobe_vdso_data(void) { return 0; }
-
#endif /* CONFIG_MMU */
-static int do_riscv_hwprobe(struct riscv_hwprobe __user *pairs,
- size_t pair_count, size_t cpusetsize,
- unsigned long __user *cpus_user,
- unsigned int flags)
-{
- DO_ONCE_SLEEPABLE(complete_hwprobe_vdso_data);
-
- if (flags & RISCV_HWPROBE_WHICH_CPUS)
- return hwprobe_get_cpus(pairs, pair_count, cpusetsize,
- cpus_user, flags);
-
- return hwprobe_get_values(pairs, pair_count, cpusetsize,
- cpus_user, flags);
-}
-
SYSCALL_DEFINE5(riscv_hwprobe, struct riscv_hwprobe __user *, pairs,
size_t, pair_count, size_t, cpusetsize, unsigned long __user *,
cpus, unsigned int, flags)
diff --git a/arch/riscv/kernel/vdso/hwprobe.c b/arch/riscv/kernel/vdso/hwprobe.c
index 8f45500d0a6e..2ddeba6c68dd 100644
--- a/arch/riscv/kernel/vdso/hwprobe.c
+++ b/arch/riscv/kernel/vdso/hwprobe.c
@@ -27,7 +27,7 @@ static int riscv_vdso_get_values(struct riscv_hwprobe *pairs, size_t pair_count,
* homogeneous, then this function can handle requests for arbitrary
* masks.
*/
- if (flags != 0 || (!all_cpus && !avd->homogeneous_cpus) || unlikely(!avd->ready))
+ if ((flags != 0) || (!all_cpus && !avd->homogeneous_cpus))
return riscv_hwprobe(pairs, pair_count, cpusetsize, cpus, flags);
/* This is something we can handle, fill out the pairs. */
--
2.47.3
* Re: [PATCH 1/2] riscv: unaligned: stop using kthread for check_vector_unaligned_access()
2026-06-17 3:38 ` [PATCH 1/2] riscv: unaligned: stop using kthread for check_vector_unaligned_access() Nam Cao
@ 2026-06-17 8:49 ` Nam Cao
0 siblings, 0 replies; 5+ messages in thread
From: Nam Cao @ 2026-06-17 8:49 UTC
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Andrew Jones, Jingwei Wang, Anirudh Srinivasan, linux-riscv,
linux-kernel
Cc: stable
Nam Cao <namcao@linutronix.de> writes:
> A kthread is used to run check_vector_unaligned_access() to optimize boot
> time, allowing the kernel to continue booting without waiting for the
> unaligned vector speed probe to finish.
>
> However, this asynchronous approach introduces several complications.
> First, the kthread may not complete before a user reads vDSO data,
> resulting in incorrect values. This was previously addressed by
> commit 5d15d2ad36b0 ("riscv: hwprobe: Fix stale vDSO data for
> late-initialized keys at boot"), which added complex synchronization
> between the kthread and vDSO reads.
>
> Second, it was discovered that the kthread may not finish before
> vec_check_unaligned_access_speed_all_cpus() (marked with __init) is freed,
> triggering a page fault.
>
> These issues raise the question of whether the kthread is worth the added
> complexity. A past boot time regression report was actually unrelated to
> synchronous probing; it was caused by the probe running serially. Since
> switching to a parallel probe, no further complaints have been made.
> Furthermore, the unaligned scalar access speed probe takes the same amount
> of time, runs synchronously, and has caused no issues.
Another point I forgot to include: we start the kthread so that the probe
runs asynchronously, but the probe is executed on all CPUs, including the
boot CPU. Therefore, if the kthread runs before boot has completed, the
asynchronous probe actually slows down boot because of the kthread
overhead. If the kthread runs after boot has completed, we run into the
two race conditions mentioned above.
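The first of those races is the one commit 5d15d2ad36b0 guarded against by
counting outstanding probes. A single-threaded model of that counting logic
might look like the sketch below. All names are hypothetical, and where the
kernel version blocks in wait_for_completion(), this sketch only reports
whether the reader would have to wait.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Starts at 1: the extra reference belongs to the first reader. */
static atomic_int pending_probes = 1;
/* Set when the last reference is dropped by a probe. */
static atomic_bool probes_done = false;

/* Drop one reference; true if this call brought the count to zero. */
static bool put_pending(void)
{
	return atomic_fetch_sub(&pending_probes, 1) == 1;
}

/* Called before an asynchronous probe is started. */
static void register_async_probe(void)
{
	atomic_fetch_add(&pending_probes, 1);
}

/* Called when an asynchronous probe finishes. */
static void complete_async_probe(void)
{
	if (put_pending())
		atomic_store(&probes_done, true);
}

/*
 * Called once by the first reader. Returns true if no probe is still
 * outstanding; false means the reader would have to wait.
 */
static bool reader_drop_ref(void)
{
	return put_pending();
}

/* True once a probe has dropped the final reference. */
static bool probes_completed(void)
{
	return atomic_load(&probes_done);
}
```

If the probe is still pending when the first reader arrives, the reader
drops the count from 2 to 1 and must wait; the probe later drops it to 0 and
signals completion. A synchronous probe makes this bookkeeping unnecessary.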
> Testing shows no noticeable boot time slowdown when running the vector
> probe synchronously (0.464474s with kthread vs. 0.457991s without).
>
> Remove the kthread usage and run the probe synchronously. This simplifies
> the boot flow and allows for the revert of commit 5d15d2ad36b0 ("riscv:
> hwprobe: Fix stale vDSO data for late-initialized keys at boot").
>
> Reported-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
> Closes: https://lore.kernel.org/linux-riscv/20260612-vec_unaligned_drop_init-v1-1-df969210ae34@oss.tenstorrent.com/
> Fixes: a00e022be531 ("riscv: Annotate unaligned access init functions")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Nam Cao <namcao@linutronix.de>
* Re: [PATCH 0/2] riscv: unaligned: stop using kthread for vector speed probe
2026-06-17 3:38 [PATCH 0/2] riscv: unaligned: stop using kthread for vector speed probe Nam Cao
2026-06-17 3:38 ` [PATCH 1/2] riscv: unaligned: stop using kthread for check_vector_unaligned_access() Nam Cao
2026-06-17 3:38 ` [PATCH 2/2] Revert "riscv: hwprobe: Fix stale vDSO data for late-initialized keys at boot" Nam Cao
@ 2026-06-17 14:21 ` Anirudh Srinivasan
2 siblings, 0 replies; 5+ messages in thread
From: Anirudh Srinivasan @ 2026-06-17 14:21 UTC
To: Nam Cao
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Andrew Jones, Jingwei Wang, linux-riscv, linux-kernel
Hi,
On Tue, Jun 16, 2026 at 10:38 PM Nam Cao <namcao@linutronix.de> wrote:
>
> Hi,
>
> This series is a follow-up to the discussion at:
> https://lore.kernel.org/linux-riscv/20260612-vec_unaligned_drop_init-v1-1-df969210ae34@oss.tenstorrent.com/
>
> It removes the kthread usage for the unaligned vector access speed probe, avoiding
> a bug where the kthread may still be executing an __init function that has
> already been freed.
>
> It also allows removing some vDSO synchronization, simplifying the code.
>
> This kthread has been bothering me for a while now, and the recent bug report
> pushed me to post this series.
>
> Nam Cao (2):
> riscv: unaligned: stop using kthread for
> check_vector_unaligned_access()
> Revert "riscv: hwprobe: Fix stale vDSO data for late-initialized keys
> at boot"
>
> arch/riscv/include/asm/hwprobe.h | 7 ---
> arch/riscv/include/asm/vdso/arch_data.h | 6 --
> arch/riscv/kernel/sys_hwprobe.c | 70 ++++------------------
> arch/riscv/kernel/unaligned_access_speed.c | 19 +-----
> arch/riscv/kernel/vdso/hwprobe.c | 2 +-
> 5 files changed, 15 insertions(+), 89 deletions(-)
Tested on Tenstorrent Blackhole with SiFive X280 cores. Slowing down
the unaligned vector access speed probe no longer causes boot to
fail.
Tested-by: Anirudh Srinivasan <asrinivasan@oss.tenstorrent.com>
> --
> 2.47.3
>