* [PATCH v10 1/6] RISC-V: Check scalar unaligned access on all CPUs
2024-10-17 19:00 [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses Charlie Jenkins
@ 2024-10-17 19:00 ` Charlie Jenkins
2024-10-17 19:00 ` [PATCH v10 2/6] RISC-V: Scalar unaligned access emulated on hotplug CPUs Charlie Jenkins
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Charlie Jenkins @ 2024-10-17 19:00 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Clément Léger,
Evan Green, Jonathan Corbet
Cc: Palmer Dabbelt, linux-riscv, linux-kernel, linux-doc,
Charlie Jenkins, Jesse Taube, stable
From: Jesse Taube <jesse@rivosinc.com>
Originally, check_unaligned_access_emulated_all_cpus() only checked the
boot hart. Fix the function so that it checks all harts.
Fixes: 71c54b3d169d ("riscv: report misaligned accesses emulation to hwprobe")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Cc: stable@vger.kernel.org
---
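The pattern this patch moves to (run the probe on every hart, then inspect
each hart's recorded result) can be modelled in plain userspace C. This is
only a sketch under stated assumptions: NR_HARTS, probe_on_hart() and
all_harts_emulated() are hypothetical names, the enum values are local
stand-ins for the hwprobe constants, and a simple loop stands in for
schedule_on_each_cpu(). It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the RISCV_HWPROBE_MISALIGNED_SCALAR_* values. */
enum { SCALAR_UNKNOWN = 0, SCALAR_EMULATED = 1 };

#define NR_HARTS 4 /* hypothetical hart count for the model */

/* One slot per hart, mirroring the per-CPU misaligned_access_speed. */
static long speed[NR_HARTS];

/*
 * Model of the per-hart probe. traps[hart] says whether a misaligned
 * load on that hart would trap and be emulated; in the kernel the trap
 * handler is what records the EMULATED value.
 */
static void probe_on_hart(size_t hart, const bool *traps)
{
	speed[hart] = SCALAR_UNKNOWN;
	if (traps[hart])
		speed[hart] = SCALAR_EMULATED;
}

/* Probe every hart first, then require that all of them reported EMULATED. */
static bool all_harts_emulated(const bool *traps)
{
	assert(traps != NULL);

	/* Stands in for schedule_on_each_cpu(): the probe runs on each hart. */
	for (size_t h = 0; h < NR_HARTS; h++)
		probe_on_hart(h, traps);

	for (size_t h = 0; h < NR_HARTS; h++)
		if (speed[h] != SCALAR_EMULATED)
			return false;

	return true;
}
```

In this model a single hart that does not trap is enough to make
all_harts_emulated() return false, which is the property the boot-hart-only
check could not provide.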
arch/riscv/include/asm/cpufeature.h | 2 ++
arch/riscv/kernel/traps_misaligned.c | 14 +++++++-------
2 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index 45f9c1171a48..dfa5cdddd367 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -8,6 +8,7 @@
#include <linux/bitmap.h>
#include <linux/jump_label.h>
+#include <linux/workqueue.h>
#include <asm/hwcap.h>
#include <asm/alternative-macros.h>
#include <asm/errno.h>
@@ -60,6 +61,7 @@ void riscv_user_isa_enable(void);
#if defined(CONFIG_RISCV_MISALIGNED)
bool check_unaligned_access_emulated_all_cpus(void);
+void check_unaligned_access_emulated(struct work_struct *work __always_unused);
void unaligned_emulation_finish(void);
bool unaligned_ctl_available(void);
DECLARE_PER_CPU(long, misaligned_access_speed);
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index d4fd8af7aaf5..d076dde5ad20 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -526,11 +526,11 @@ int handle_misaligned_store(struct pt_regs *regs)
return 0;
}
-static bool check_unaligned_access_emulated(int cpu)
+void check_unaligned_access_emulated(struct work_struct *work __always_unused)
{
+ int cpu = smp_processor_id();
long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu);
unsigned long tmp_var, tmp_val;
- bool misaligned_emu_detected;
*mas_ptr = RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN;
@@ -538,19 +538,16 @@ static bool check_unaligned_access_emulated(int cpu)
" "REG_L" %[tmp], 1(%[ptr])\n"
: [tmp] "=r" (tmp_val) : [ptr] "r" (&tmp_var) : "memory");
- misaligned_emu_detected = (*mas_ptr == RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED);
/*
* If unaligned_ctl is already set, this means that we detected that all
* CPUS uses emulated misaligned access at boot time. If that changed
* when hotplugging the new cpu, this is something we don't handle.
*/
- if (unlikely(unaligned_ctl && !misaligned_emu_detected)) {
+ if (unlikely(unaligned_ctl && (*mas_ptr != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED))) {
pr_crit("CPU misaligned accesses non homogeneous (expected all emulated)\n");
while (true)
cpu_relax();
}
-
- return misaligned_emu_detected;
}
bool check_unaligned_access_emulated_all_cpus(void)
@@ -562,8 +559,11 @@ bool check_unaligned_access_emulated_all_cpus(void)
* accesses emulated since tasks requesting such control can run on any
* CPU.
*/
+ schedule_on_each_cpu(check_unaligned_access_emulated);
+
for_each_online_cpu(cpu)
- if (!check_unaligned_access_emulated(cpu))
+ if (per_cpu(misaligned_access_speed, cpu)
+ != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED)
return false;
unaligned_ctl = true;
--
2.45.0
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
* [PATCH v10 2/6] RISC-V: Scalar unaligned access emulated on hotplug CPUs
2024-10-17 19:00 [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses Charlie Jenkins
2024-10-17 19:00 ` [PATCH v10 1/6] RISC-V: Check scalar unaligned access on all CPUs Charlie Jenkins
@ 2024-10-17 19:00 ` Charlie Jenkins
2024-10-17 19:00 ` [PATCH v10 3/6] RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED Charlie Jenkins
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Charlie Jenkins @ 2024-10-17 19:00 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Clément Léger,
Evan Green, Jonathan Corbet
Cc: Palmer Dabbelt, linux-riscv, linux-kernel, linux-doc,
Charlie Jenkins, Jesse Taube, stable
From: Jesse Taube <jesse@rivosinc.com>
check_unaligned_access_emulated() should have been called during CPU
hotplug to ensure that, if all CPUs had emulated unaligned accesses,
the new CPU does too.
Add the call to check_unaligned_access_emulated() in the hotplug path.
Fixes: 55e0bf49a0d0 ("RISC-V: Probe misaligned access speed in parallel")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Cc: stable@vger.kernel.org
---
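The invariant that the hotplug call enforces can be written down as a small
userspace predicate. This is a hedged sketch: hotplug_cpu_is_consistent() is
a hypothetical name, SCALAR_EMULATED is a local stand-in for
RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED, and the kernel reacts to a
violation by logging and spinning rather than by returning a value.

```c
#include <stdbool.h>

/* Local stand-in for RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED. */
#define SCALAR_EMULATED 1

/*
 * unaligned_ctl models the flag that is set once every boot-time CPU was
 * found to emulate misaligned accesses. new_cpu_speed is the value the
 * probe recorded for the CPU that is coming online.
 *
 * The system stays homogeneous unless the flag is set and the new CPU
 * does not emulate misaligned accesses.
 */
static bool hotplug_cpu_is_consistent(bool unaligned_ctl, long new_cpu_speed)
{
	if (unaligned_ctl && new_cpu_speed != SCALAR_EMULATED)
		return false;

	return true;
}
```

When the flag is clear the predicate accepts any probe result, since no task
could have been promised system-wide emulation in that case.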
arch/riscv/kernel/unaligned_access_speed.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index 160628a2116d..f3508cc54f91 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -191,6 +191,7 @@ static int riscv_online_cpu(unsigned int cpu)
if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN)
goto exit;
+ check_unaligned_access_emulated(NULL);
buf = alloc_pages(GFP_KERNEL, MISALIGNED_BUFFER_ORDER);
if (!buf) {
pr_warn("Allocation failure, not measuring misaligned performance\n");
--
2.45.0
* [PATCH v10 3/6] RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED
2024-10-17 19:00 [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses Charlie Jenkins
2024-10-17 19:00 ` [PATCH v10 1/6] RISC-V: Check scalar unaligned access on all CPUs Charlie Jenkins
2024-10-17 19:00 ` [PATCH v10 2/6] RISC-V: Scalar unaligned access emulated on hotplug CPUs Charlie Jenkins
@ 2024-10-17 19:00 ` Charlie Jenkins
2024-10-17 19:00 ` [PATCH v10 4/6] RISC-V: Detect unaligned vector accesses supported Charlie Jenkins
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Charlie Jenkins @ 2024-10-17 19:00 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Clément Léger,
Evan Green, Jonathan Corbet
Cc: Palmer Dabbelt, linux-riscv, linux-kernel, linux-doc,
Charlie Jenkins, Jesse Taube, Conor Dooley
From: Jesse Taube <jesse@rivosinc.com>
Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED to allow
for the addition of RISCV_VECTOR_MISALIGNED in a later patch.
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
---
arch/riscv/Kconfig | 6 +++---
arch/riscv/include/asm/cpufeature.h | 2 +-
arch/riscv/include/asm/entry-common.h | 2 +-
arch/riscv/kernel/Makefile | 4 ++--
arch/riscv/kernel/fpu.S | 4 ++--
5 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 939ea7f6a228..2d963d4a26d7 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -765,7 +765,7 @@ config THREAD_SIZE_ORDER
Specify the Pages of thread stack size (from 4KB to 64KB), which also
affects irq stack size, which is equal to thread stack size.
-config RISCV_MISALIGNED
+config RISCV_SCALAR_MISALIGNED
bool
select SYSCTL_ARCH_UNALIGN_ALLOW
help
@@ -782,7 +782,7 @@ choice
config RISCV_PROBE_UNALIGNED_ACCESS
bool "Probe for hardware unaligned access support"
- select RISCV_MISALIGNED
+ select RISCV_SCALAR_MISALIGNED
help
During boot, the kernel will run a series of tests to determine the
speed of unaligned accesses. This probing will dynamically determine
@@ -793,7 +793,7 @@ config RISCV_PROBE_UNALIGNED_ACCESS
config RISCV_EMULATED_UNALIGNED_ACCESS
bool "Emulate unaligned access where system support is missing"
- select RISCV_MISALIGNED
+ select RISCV_SCALAR_MISALIGNED
help
If unaligned memory accesses trap into the kernel as they are not
supported by the system, the kernel will emulate the unaligned
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index dfa5cdddd367..ccc6cf141c20 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -59,7 +59,7 @@ void riscv_user_isa_enable(void);
#define __RISCV_ISA_EXT_SUPERSET_VALIDATE(_name, _id, _sub_exts, _validate) \
_RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate)
-#if defined(CONFIG_RISCV_MISALIGNED)
+#if defined(CONFIG_RISCV_SCALAR_MISALIGNED)
bool check_unaligned_access_emulated_all_cpus(void);
void check_unaligned_access_emulated(struct work_struct *work __always_unused);
void unaligned_emulation_finish(void);
diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h
index 2293e535f865..0a4e3544c877 100644
--- a/arch/riscv/include/asm/entry-common.h
+++ b/arch/riscv/include/asm/entry-common.h
@@ -25,7 +25,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
void handle_page_fault(struct pt_regs *regs);
void handle_break(struct pt_regs *regs);
-#ifdef CONFIG_RISCV_MISALIGNED
+#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
int handle_misaligned_load(struct pt_regs *regs);
int handle_misaligned_store(struct pt_regs *regs);
#else
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 7f88cc4931f5..45624c5ea86c 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -68,8 +68,8 @@ obj-y += probes/
obj-y += tests/
obj-$(CONFIG_MMU) += vdso.o vdso/
-obj-$(CONFIG_RISCV_MISALIGNED) += traps_misaligned.o
-obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o
+obj-$(CONFIG_RISCV_SCALAR_MISALIGNED) += traps_misaligned.o
+obj-$(CONFIG_RISCV_SCALAR_MISALIGNED) += unaligned_access_speed.o
obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
obj-$(CONFIG_FPU) += fpu.o
diff --git a/arch/riscv/kernel/fpu.S b/arch/riscv/kernel/fpu.S
index 327cf527dd7e..f74f6b60e347 100644
--- a/arch/riscv/kernel/fpu.S
+++ b/arch/riscv/kernel/fpu.S
@@ -170,7 +170,7 @@ SYM_FUNC_END(__fstate_restore)
__access_func(f31)
-#ifdef CONFIG_RISCV_MISALIGNED
+#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
/*
* Disable compressed instructions set to keep a constant offset between FP
@@ -224,4 +224,4 @@ SYM_FUNC_START(get_f64_reg)
fp_access_epilogue
SYM_FUNC_END(get_f64_reg)
-#endif /* CONFIG_RISCV_MISALIGNED */
+#endif /* CONFIG_RISCV_SCALAR_MISALIGNED */
--
2.45.0
* [PATCH v10 4/6] RISC-V: Detect unaligned vector accesses supported
2024-10-17 19:00 [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses Charlie Jenkins
` (2 preceding siblings ...)
2024-10-17 19:00 ` [PATCH v10 3/6] RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED Charlie Jenkins
@ 2024-10-17 19:00 ` Charlie Jenkins
2024-10-18 23:48 ` Jesse T
2024-10-17 19:00 ` [PATCH v10 5/6] RISC-V: Report vector unaligned access speed hwprobe Charlie Jenkins
` (2 subsequent siblings)
6 siblings, 1 reply; 9+ messages in thread
From: Charlie Jenkins @ 2024-10-17 19:00 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Clément Léger,
Evan Green, Jonathan Corbet
Cc: Palmer Dabbelt, linux-riscv, linux-kernel, linux-doc,
Charlie Jenkins, Jesse Taube
From: Jesse Taube <jesse@rivosinc.com>
Run an unaligned vector access to test if the system supports
vector unaligned access. Add the result to a new key in hwprobe.
This is useful for usermode to know if vector misaligned accesses are
supported and if they are faster or slower than equivalent byte accesses.
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
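For readers who want to see how the new key's value is derived from the
per-CPU results, the reduction in hwprobe_vec_misaligned() can be modelled in
userspace. This is a sketch only: aggregate_vec_misaligned() and the VEC_*
macros are hypothetical names that mirror the
RISCV_HWPROBE_MISALIGNED_VECTOR_* values added by this patch, and a plain
array replaces the cpumask and per-CPU variable.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirror of the values this patch adds to uapi/asm/hwprobe.h. */
#define VEC_UNKNOWN     0
#define VEC_SLOW        2
#define VEC_FAST        3
#define VEC_UNSUPPORTED 4

/*
 * Reduce one result per CPU to a single answer: the common value when
 * every CPU agrees, and VEC_UNKNOWN when they disagree or when the set
 * of CPUs is empty.
 */
static uint64_t aggregate_vec_misaligned(const long *per_cpu, size_t n)
{
	uint64_t perf = UINT64_MAX; /* sentinel: no CPU seen yet */

	for (size_t i = 0; i < n; i++) {
		uint64_t this_perf = (uint64_t)per_cpu[i];

		if (perf == UINT64_MAX)
			perf = this_perf;

		/* Any disagreement collapses the answer to UNKNOWN. */
		if (perf != this_perf)
			return VEC_UNKNOWN;
	}

	if (perf == UINT64_MAX)
		return VEC_UNKNOWN;

	return perf;
}
```

The consequence for user space is that a FAST or SLOW answer is only reported
when it holds for every CPU in the requested set.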
arch/riscv/Kconfig | 36 +++++++++
arch/riscv/include/asm/cpufeature.h | 8 +-
arch/riscv/include/asm/entry-common.h | 11 ---
arch/riscv/include/asm/hwprobe.h | 2 +-
arch/riscv/include/asm/vector.h | 2 +
arch/riscv/include/uapi/asm/hwprobe.h | 5 ++
arch/riscv/kernel/Makefile | 4 +-
arch/riscv/kernel/sys_hwprobe.c | 35 ++++++++
arch/riscv/kernel/traps_misaligned.c | 125 ++++++++++++++++++++++++++++-
arch/riscv/kernel/unaligned_access_speed.c | 22 ++---
arch/riscv/kernel/vector.c | 2 +-
11 files changed, 222 insertions(+), 30 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 2d963d4a26d7..93f9a2958de7 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -765,12 +765,26 @@ config THREAD_SIZE_ORDER
Specify the Pages of thread stack size (from 4KB to 64KB), which also
affects irq stack size, which is equal to thread stack size.
+config RISCV_MISALIGNED
+ bool
+ help
+ Embed support for detecting and emulating misaligned
+ scalar or vector loads and stores.
+
config RISCV_SCALAR_MISALIGNED
bool
+ select RISCV_MISALIGNED
select SYSCTL_ARCH_UNALIGN_ALLOW
help
Embed support for emulating misaligned loads and stores.
+config RISCV_VECTOR_MISALIGNED
+ bool
+ select RISCV_MISALIGNED
+ depends on RISCV_ISA_V
+ help
+ Enable detecting support for vector misaligned loads and stores.
+
choice
prompt "Unaligned Accesses Support"
default RISCV_PROBE_UNALIGNED_ACCESS
@@ -822,6 +836,28 @@ config RISCV_EFFICIENT_UNALIGNED_ACCESS
endchoice
+choice
+ prompt "Vector unaligned Accesses Support"
+ depends on RISCV_ISA_V
+ default RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
+ help
+ This determines the level of support for vector unaligned accesses. This
+ information is used by the kernel to perform optimizations. It is also
+ exposed to user space via the hwprobe syscall. The hardware will be
+ probed at boot by default.
+
+config RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
+ bool "Probe speed of vector unaligned accesses"
+ select RISCV_VECTOR_MISALIGNED
+ depends on RISCV_ISA_V
+ help
+ During boot, the kernel will run a series of tests to determine the
+ speed of vector unaligned accesses if they are supported. This probing
+ will dynamically determine the speed of vector unaligned accesses on
+ the underlying system if they are supported.
+
+endchoice
+
source "arch/riscv/Kconfig.vendor"
endmenu # "Platform type"
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index ccc6cf141c20..85bf1bce51e6 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -59,8 +59,8 @@ void riscv_user_isa_enable(void);
#define __RISCV_ISA_EXT_SUPERSET_VALIDATE(_name, _id, _sub_exts, _validate) \
_RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate)
-#if defined(CONFIG_RISCV_SCALAR_MISALIGNED)
bool check_unaligned_access_emulated_all_cpus(void);
+#if defined(CONFIG_RISCV_SCALAR_MISALIGNED)
void check_unaligned_access_emulated(struct work_struct *work __always_unused);
void unaligned_emulation_finish(void);
bool unaligned_ctl_available(void);
@@ -72,6 +72,12 @@ static inline bool unaligned_ctl_available(void)
}
#endif
+bool check_vector_unaligned_access_emulated_all_cpus(void);
+#if defined(CONFIG_RISCV_VECTOR_MISALIGNED)
+void check_vector_unaligned_access_emulated(struct work_struct *work __always_unused);
+DECLARE_PER_CPU(long, vector_misaligned_access);
+#endif
+
#if defined(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS)
DECLARE_STATIC_KEY_FALSE(fast_unaligned_access_speed_key);
diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h
index 0a4e3544c877..7b32d2b08bb6 100644
--- a/arch/riscv/include/asm/entry-common.h
+++ b/arch/riscv/include/asm/entry-common.h
@@ -25,18 +25,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
void handle_page_fault(struct pt_regs *regs);
void handle_break(struct pt_regs *regs);
-#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
int handle_misaligned_load(struct pt_regs *regs);
int handle_misaligned_store(struct pt_regs *regs);
-#else
-static inline int handle_misaligned_load(struct pt_regs *regs)
-{
- return -1;
-}
-static inline int handle_misaligned_store(struct pt_regs *regs)
-{
- return -1;
-}
-#endif
#endif /* _ASM_RISCV_ENTRY_COMMON_H */
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index ffb9484531af..1ce1df6d0ff3 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -8,7 +8,7 @@
#include <uapi/asm/hwprobe.h>
-#define RISCV_HWPROBE_MAX_KEY 9
+#define RISCV_HWPROBE_MAX_KEY 10
static inline bool riscv_hwprobe_key_is_valid(__s64 key)
{
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index be7d309cca8a..c7c023afbacd 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -21,6 +21,7 @@
extern unsigned long riscv_v_vsize;
int riscv_v_setup_vsize(void);
+bool insn_is_vector(u32 insn_buf);
bool riscv_v_first_use_handler(struct pt_regs *regs);
void kernel_vector_begin(void);
void kernel_vector_end(void);
@@ -268,6 +269,7 @@ struct pt_regs;
static inline int riscv_v_setup_vsize(void) { return -EOPNOTSUPP; }
static __always_inline bool has_vector(void) { return false; }
+static __always_inline bool insn_is_vector(u32 insn_buf) { return false; }
static inline bool riscv_v_first_use_handler(struct pt_regs *regs) { return false; }
static inline bool riscv_v_vstate_query(struct pt_regs *regs) { return false; }
static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; }
diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
index 1e153cda57db..34c88c15322c 100644
--- a/arch/riscv/include/uapi/asm/hwprobe.h
+++ b/arch/riscv/include/uapi/asm/hwprobe.h
@@ -88,6 +88,11 @@ struct riscv_hwprobe {
#define RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW 2
#define RISCV_HWPROBE_MISALIGNED_SCALAR_FAST 3
#define RISCV_HWPROBE_MISALIGNED_SCALAR_UNSUPPORTED 4
+#define RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF 10
+#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN 0
+#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW 2
+#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST 3
+#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED 4
/* Increase RISCV_HWPROBE_MAX_KEY when adding items. */
/* Flags */
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 45624c5ea86c..7f88cc4931f5 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -68,8 +68,8 @@ obj-y += probes/
obj-y += tests/
obj-$(CONFIG_MMU) += vdso.o vdso/
-obj-$(CONFIG_RISCV_SCALAR_MISALIGNED) += traps_misaligned.o
-obj-$(CONFIG_RISCV_SCALAR_MISALIGNED) += unaligned_access_speed.o
+obj-$(CONFIG_RISCV_MISALIGNED) += traps_misaligned.o
+obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o
obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
obj-$(CONFIG_FPU) += fpu.o
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index cea0ca2bf2a2..6441baada36b 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -201,6 +201,37 @@ static u64 hwprobe_misaligned(const struct cpumask *cpus)
}
#endif
+#ifdef CONFIG_RISCV_VECTOR_MISALIGNED
+static u64 hwprobe_vec_misaligned(const struct cpumask *cpus)
+{
+ int cpu;
+ u64 perf = -1ULL;
+
+ /* Return if supported or not even if speed wasn't probed */
+ for_each_cpu(cpu, cpus) {
+ int this_perf = per_cpu(vector_misaligned_access, cpu);
+
+ if (perf == -1ULL)
+ perf = this_perf;
+
+ if (perf != this_perf) {
+ perf = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
+ break;
+ }
+ }
+
+ if (perf == -1ULL)
+ return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
+
+ return perf;
+}
+#else
+static u64 hwprobe_vec_misaligned(const struct cpumask *cpus)
+{
+ return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
+}
+#endif
+
static void hwprobe_one_pair(struct riscv_hwprobe *pair,
const struct cpumask *cpus)
{
@@ -229,6 +260,10 @@ static void hwprobe_one_pair(struct riscv_hwprobe *pair,
pair->value = hwprobe_misaligned(cpus);
break;
+ case RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF:
+ pair->value = hwprobe_vec_misaligned(cpus);
+ break;
+
case RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE:
pair->value = 0;
if (hwprobe_ext0_has(cpus, RISCV_HWPROBE_EXT_ZICBOZ))
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index d076dde5ad20..ef59ecfc64cb 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -16,6 +16,7 @@
#include <asm/entry-common.h>
#include <asm/hwprobe.h>
#include <asm/cpufeature.h>
+#include <asm/vector.h>
#define INSN_MATCH_LB 0x3
#define INSN_MASK_LB 0x707f
@@ -322,12 +323,37 @@ union reg_data {
u64 data_u64;
};
-static bool unaligned_ctl __read_mostly;
-
/* sysctl hooks */
int unaligned_enabled __read_mostly = 1; /* Enabled by default */
-int handle_misaligned_load(struct pt_regs *regs)
+#ifdef CONFIG_RISCV_VECTOR_MISALIGNED
+static int handle_vector_misaligned_load(struct pt_regs *regs)
+{
+ unsigned long epc = regs->epc;
+ unsigned long insn;
+
+ if (get_insn(regs, epc, &insn))
+ return -1;
+
+ /* Only return 0 when in check_vector_unaligned_access_emulated */
+ if (*this_cpu_ptr(&vector_misaligned_access) == RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN) {
+ *this_cpu_ptr(&vector_misaligned_access) = RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED;
+ regs->epc = epc + INSN_LEN(insn);
+ return 0;
+ }
+
+ /* If vector instruction we don't emulate it yet */
+ regs->epc = epc;
+ return -1;
+}
+#else
+static int handle_vector_misaligned_load(struct pt_regs *regs)
+{
+ return -1;
+}
+#endif
+
+static int handle_scalar_misaligned_load(struct pt_regs *regs)
{
union reg_data val;
unsigned long epc = regs->epc;
@@ -435,7 +461,7 @@ int handle_misaligned_load(struct pt_regs *regs)
return 0;
}
-int handle_misaligned_store(struct pt_regs *regs)
+static int handle_scalar_misaligned_store(struct pt_regs *regs)
{
union reg_data val;
unsigned long epc = regs->epc;
@@ -526,6 +552,91 @@ int handle_misaligned_store(struct pt_regs *regs)
return 0;
}
+int handle_misaligned_load(struct pt_regs *regs)
+{
+ unsigned long epc = regs->epc;
+ unsigned long insn;
+
+ if (IS_ENABLED(CONFIG_RISCV_VECTOR_MISALIGNED)) {
+ if (get_insn(regs, epc, &insn))
+ return -1;
+
+ if (insn_is_vector(insn))
+ return handle_vector_misaligned_load(regs);
+ }
+
+ if (IS_ENABLED(CONFIG_RISCV_SCALAR_MISALIGNED))
+ return handle_scalar_misaligned_load(regs);
+
+ return -1;
+}
+
+int handle_misaligned_store(struct pt_regs *regs)
+{
+ if (IS_ENABLED(CONFIG_RISCV_SCALAR_MISALIGNED))
+ return handle_scalar_misaligned_store(regs);
+
+ return -1;
+}
+
+#ifdef CONFIG_RISCV_VECTOR_MISALIGNED
+void check_vector_unaligned_access_emulated(struct work_struct *work __always_unused)
+{
+ long *mas_ptr = this_cpu_ptr(&vector_misaligned_access);
+ unsigned long tmp_var;
+
+ *mas_ptr = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
+
+ kernel_vector_begin();
+ /*
+ * In pre-13.0.0 versions of GCC, vector registers cannot appear in
+ * the clobber list. This inline asm clobbers v0, but since we do not
+ * currently build the kernel with V enabled, the v0 clobber arg is not
+ * needed (as the compiler will not emit vector code itself). If the kernel
+ * is changed to build with V enabled, the clobber arg will need to be
+ * added here.
+ */
+ __asm__ __volatile__ (
+ ".balign 4\n\t"
+ ".option push\n\t"
+ ".option arch, +zve32x\n\t"
+ " vsetivli zero, 1, e16, m1, ta, ma\n\t" // Vectors of 16b
+ " vle16.v v0, (%[ptr])\n\t" // Load bytes
+ ".option pop\n\t"
+ : : [ptr] "r" ((u8 *)&tmp_var + 1));
+ kernel_vector_end();
+}
+
+bool check_vector_unaligned_access_emulated_all_cpus(void)
+{
+ int cpu;
+
+ if (!has_vector()) {
+ for_each_online_cpu(cpu)
+ per_cpu(vector_misaligned_access, cpu) = RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED;
+ return false;
+ }
+
+ schedule_on_each_cpu(check_vector_unaligned_access_emulated);
+
+ for_each_online_cpu(cpu)
+ if (per_cpu(vector_misaligned_access, cpu)
+ == RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN)
+ return false;
+
+ return true;
+}
+#else
+bool check_vector_unaligned_access_emulated_all_cpus(void)
+{
+ return false;
+}
+#endif
+
+#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
+
+static bool unaligned_ctl __read_mostly;
+
void check_unaligned_access_emulated(struct work_struct *work __always_unused)
{
int cpu = smp_processor_id();
@@ -574,3 +685,9 @@ bool unaligned_ctl_available(void)
{
return unaligned_ctl;
}
+#else
+bool check_unaligned_access_emulated_all_cpus(void)
+{
+ return false;
+}
+#endif
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index f3508cc54f91..0b8b5e17453a 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -19,7 +19,8 @@
#define MISALIGNED_BUFFER_ORDER get_order(MISALIGNED_BUFFER_SIZE)
#define MISALIGNED_COPY_SIZE ((MISALIGNED_BUFFER_SIZE / 2) - 0x80)
-DEFINE_PER_CPU(long, misaligned_access_speed);
+DEFINE_PER_CPU(long, misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN;
+DEFINE_PER_CPU(long, vector_misaligned_access) = RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED;
#ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
static cpumask_t fast_misaligned_access;
@@ -260,23 +261,24 @@ static int check_unaligned_access_speed_all_cpus(void)
kfree(bufs);
return 0;
}
+#else /* CONFIG_RISCV_PROBE_UNALIGNED_ACCESS */
+static int check_unaligned_access_speed_all_cpus(void)
+{
+ return 0;
+}
+#endif
static int check_unaligned_access_all_cpus(void)
{
- bool all_cpus_emulated = check_unaligned_access_emulated_all_cpus();
+ bool all_cpus_emulated;
+
+ all_cpus_emulated = check_unaligned_access_emulated_all_cpus();
+ check_vector_unaligned_access_emulated_all_cpus();
if (!all_cpus_emulated)
return check_unaligned_access_speed_all_cpus();
return 0;
}
-#else /* CONFIG_RISCV_PROBE_UNALIGNED_ACCESS */
-static int check_unaligned_access_all_cpus(void)
-{
- check_unaligned_access_emulated_all_cpus();
-
- return 0;
-}
-#endif
arch_initcall(check_unaligned_access_all_cpus);
diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
index 682b3feee451..821818886fab 100644
--- a/arch/riscv/kernel/vector.c
+++ b/arch/riscv/kernel/vector.c
@@ -66,7 +66,7 @@ void __init riscv_v_setup_ctx_cache(void)
#endif
}
-static bool insn_is_vector(u32 insn_buf)
+bool insn_is_vector(u32 insn_buf)
{
u32 opcode = insn_buf & __INSN_OPCODE_MASK;
u32 width, csr;
--
2.45.0
* Re: [PATCH v10 4/6] RISC-V: Detect unaligned vector accesses supported
2024-10-17 19:00 ` [PATCH v10 4/6] RISC-V: Detect unaligned vector accesses supported Charlie Jenkins
@ 2024-10-18 23:48 ` Jesse T
0 siblings, 0 replies; 9+ messages in thread
From: Jesse T @ 2024-10-18 23:48 UTC (permalink / raw)
To: Charlie Jenkins
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Clément Léger,
Evan Green, Jonathan Corbet, Palmer Dabbelt, linux-riscv,
linux-kernel, linux-doc, Jesse Taube
On Thu, Oct 17, 2024 at 3:02 PM Charlie Jenkins <charlie@rivosinc.com> wrote:
>
> From: Jesse Taube <jesse@rivosinc.com>
>
> Run an unaligned vector access to test if the system supports
> vector unaligned access. Add the result to a new key in hwprobe.
> This is useful for usermode to know if vector misaligned accesses are
> supported and if they are faster or slower than equivalent byte accesses.
>
> Signed-off-by: Jesse Taube <jesse@rivosinc.com>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> ---
> arch/riscv/Kconfig | 36 +++++++++
> arch/riscv/include/asm/cpufeature.h | 8 +-
> arch/riscv/include/asm/entry-common.h | 11 ---
> arch/riscv/include/asm/hwprobe.h | 2 +-
> arch/riscv/include/asm/vector.h | 2 +
> arch/riscv/include/uapi/asm/hwprobe.h | 5 ++
> arch/riscv/kernel/Makefile | 4 +-
> arch/riscv/kernel/sys_hwprobe.c | 35 ++++++++
> arch/riscv/kernel/traps_misaligned.c | 125 ++++++++++++++++++++++++++++-
> arch/riscv/kernel/unaligned_access_speed.c | 22 ++---
> arch/riscv/kernel/vector.c | 2 +-
> 11 files changed, 222 insertions(+), 30 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 2d963d4a26d7..93f9a2958de7 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -765,12 +765,26 @@ config THREAD_SIZE_ORDER
> Specify the Pages of thread stack size (from 4KB to 64KB), which also
> affects irq stack size, which is equal to thread stack size.
>
> +config RISCV_MISALIGNED
> + bool
> + help
> + Embed support for detecting and emulating misaligned
> + scalar or vector loads and stores.
> +
> config RISCV_SCALAR_MISALIGNED
> bool
> + select RISCV_MISALIGNED
> select SYSCTL_ARCH_UNALIGN_ALLOW
> help
> Embed support for emulating misaligned loads and stores.
>
> +config RISCV_VECTOR_MISALIGNED
> + bool
> + select RISCV_MISALIGNED
> + depends on RISCV_ISA_V
> + help
> + Enable detecting support for vector misaligned loads and stores.
> +
> choice
> prompt "Unaligned Accesses Support"
> default RISCV_PROBE_UNALIGNED_ACCESS
> @@ -822,6 +836,28 @@ config RISCV_EFFICIENT_UNALIGNED_ACCESS
>
> endchoice
>
> +choice
> + prompt "Vector unaligned Accesses Support"
> + depends on RISCV_ISA_V
> + default RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
> + help
> + This determines the level of support for vector unaligned accesses. This
> + information is used by the kernel to perform optimizations. It is also
> + exposed to user space via the hwprobe syscall. The hardware will be
> + probed at boot by default.
> +
> +config RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
> + bool "Probe speed of vector unaligned accesses"
> + select RISCV_VECTOR_MISALIGNED
> + depends on RISCV_ISA_V
> + help
> + During boot, the kernel will run a series of tests to determine the
> + speed of vector unaligned accesses if they are supported. This probing
> + will dynamically determine the speed of vector unaligned accesses on
> + the underlying system if they are supported.
> +
> +endchoice
> +
> source "arch/riscv/Kconfig.vendor"
>
> endmenu # "Platform type"
> diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
> index ccc6cf141c20..85bf1bce51e6 100644
> --- a/arch/riscv/include/asm/cpufeature.h
> +++ b/arch/riscv/include/asm/cpufeature.h
> @@ -59,8 +59,8 @@ void riscv_user_isa_enable(void);
> #define __RISCV_ISA_EXT_SUPERSET_VALIDATE(_name, _id, _sub_exts, _validate) \
> _RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate)
>
> -#if defined(CONFIG_RISCV_SCALAR_MISALIGNED)
> bool check_unaligned_access_emulated_all_cpus(void);
> +#if defined(CONFIG_RISCV_SCALAR_MISALIGNED)
> void check_unaligned_access_emulated(struct work_struct *work __always_unused);
> void unaligned_emulation_finish(void);
> bool unaligned_ctl_available(void);
> @@ -72,6 +72,12 @@ static inline bool unaligned_ctl_available(void)
> }
> #endif
>
> +bool check_vector_unaligned_access_emulated_all_cpus(void);
> +#if defined(CONFIG_RISCV_VECTOR_MISALIGNED)
> +void check_vector_unaligned_access_emulated(struct work_struct *work __always_unused);
> +DECLARE_PER_CPU(long, vector_misaligned_access);
> +#endif
> +
> #if defined(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS)
> DECLARE_STATIC_KEY_FALSE(fast_unaligned_access_speed_key);
>
> diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h
> index 0a4e3544c877..7b32d2b08bb6 100644
> --- a/arch/riscv/include/asm/entry-common.h
> +++ b/arch/riscv/include/asm/entry-common.h
> @@ -25,18 +25,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
> void handle_page_fault(struct pt_regs *regs);
> void handle_break(struct pt_regs *regs);
>
> -#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
> int handle_misaligned_load(struct pt_regs *regs);
> int handle_misaligned_store(struct pt_regs *regs);
> -#else
> -static inline int handle_misaligned_load(struct pt_regs *regs)
> -{
> - return -1;
> -}
> -static inline int handle_misaligned_store(struct pt_regs *regs)
> -{
> - return -1;
> -}
> -#endif
>
> #endif /* _ASM_RISCV_ENTRY_COMMON_H */
> diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
> index ffb9484531af..1ce1df6d0ff3 100644
> --- a/arch/riscv/include/asm/hwprobe.h
> +++ b/arch/riscv/include/asm/hwprobe.h
> @@ -8,7 +8,7 @@
>
> #include <uapi/asm/hwprobe.h>
>
> -#define RISCV_HWPROBE_MAX_KEY 9
> +#define RISCV_HWPROBE_MAX_KEY 10
>
> static inline bool riscv_hwprobe_key_is_valid(__s64 key)
> {
> diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> index be7d309cca8a..c7c023afbacd 100644
> --- a/arch/riscv/include/asm/vector.h
> +++ b/arch/riscv/include/asm/vector.h
> @@ -21,6 +21,7 @@
>
> extern unsigned long riscv_v_vsize;
> int riscv_v_setup_vsize(void);
> +bool insn_is_vector(u32 insn_buf);
> bool riscv_v_first_use_handler(struct pt_regs *regs);
> void kernel_vector_begin(void);
> void kernel_vector_end(void);
> @@ -268,6 +269,7 @@ struct pt_regs;
>
> static inline int riscv_v_setup_vsize(void) { return -EOPNOTSUPP; }
> static __always_inline bool has_vector(void) { return false; }
> +static __always_inline bool insn_is_vector(u32 insn_buf) { return false; }
> static inline bool riscv_v_first_use_handler(struct pt_regs *regs) { return false; }
> static inline bool riscv_v_vstate_query(struct pt_regs *regs) { return false; }
> static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; }
> diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h
> index 1e153cda57db..34c88c15322c 100644
> --- a/arch/riscv/include/uapi/asm/hwprobe.h
> +++ b/arch/riscv/include/uapi/asm/hwprobe.h
> @@ -88,6 +88,11 @@ struct riscv_hwprobe {
> #define RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW 2
> #define RISCV_HWPROBE_MISALIGNED_SCALAR_FAST 3
> #define RISCV_HWPROBE_MISALIGNED_SCALAR_UNSUPPORTED 4
> +#define RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF 10
> +#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN 0
> +#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW 2
> +#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST 3
> +#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED 4
> /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */
>
> /* Flags */
> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
> index 45624c5ea86c..7f88cc4931f5 100644
> --- a/arch/riscv/kernel/Makefile
> +++ b/arch/riscv/kernel/Makefile
> @@ -68,8 +68,8 @@ obj-y += probes/
> obj-y += tests/
> obj-$(CONFIG_MMU) += vdso.o vdso/
>
> -obj-$(CONFIG_RISCV_SCALAR_MISALIGNED) += traps_misaligned.o
> -obj-$(CONFIG_RISCV_SCALAR_MISALIGNED) += unaligned_access_speed.o
> +obj-$(CONFIG_RISCV_MISALIGNED) += traps_misaligned.o
> +obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o
> obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
>
> obj-$(CONFIG_FPU) += fpu.o
> diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
> index cea0ca2bf2a2..6441baada36b 100644
> --- a/arch/riscv/kernel/sys_hwprobe.c
> +++ b/arch/riscv/kernel/sys_hwprobe.c
> @@ -201,6 +201,37 @@ static u64 hwprobe_misaligned(const struct cpumask *cpus)
> }
> #endif
>
> +#ifdef CONFIG_RISCV_VECTOR_MISALIGNED
> +static u64 hwprobe_vec_misaligned(const struct cpumask *cpus)
> +{
> + int cpu;
> + u64 perf = -1ULL;
> +
> + /* Return if supported or not even if speed wasn't probed */
> + for_each_cpu(cpu, cpus) {
> + int this_perf = per_cpu(vector_misaligned_access, cpu);
> +
> + if (perf == -1ULL)
> + perf = this_perf;
> +
> + if (perf != this_perf) {
> + perf = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
> + break;
> + }
> + }
> +
> + if (perf == -1ULL)
> + return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
> +
> + return perf;
> +}
> +#else
> +static u64 hwprobe_vec_misaligned(const struct cpumask *cpus)
> +{
> + return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
> +}
> +#endif
> +
> static void hwprobe_one_pair(struct riscv_hwprobe *pair,
> const struct cpumask *cpus)
> {
> @@ -229,6 +260,10 @@ static void hwprobe_one_pair(struct riscv_hwprobe *pair,
> pair->value = hwprobe_misaligned(cpus);
> break;
>
> + case RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF:
> + pair->value = hwprobe_vec_misaligned(cpus);
> + break;
> +
> case RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE:
> pair->value = 0;
> if (hwprobe_ext0_has(cpus, RISCV_HWPROBE_EXT_ZICBOZ))
> diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
> index d076dde5ad20..ef59ecfc64cb 100644
> --- a/arch/riscv/kernel/traps_misaligned.c
> +++ b/arch/riscv/kernel/traps_misaligned.c
> @@ -16,6 +16,7 @@
> #include <asm/entry-common.h>
> #include <asm/hwprobe.h>
> #include <asm/cpufeature.h>
> +#include <asm/vector.h>
>
> #define INSN_MATCH_LB 0x3
> #define INSN_MASK_LB 0x707f
> @@ -322,12 +323,37 @@ union reg_data {
> u64 data_u64;
> };
>
> -static bool unaligned_ctl __read_mostly;
> -
> /* sysctl hooks */
> int unaligned_enabled __read_mostly = 1; /* Enabled by default */
>
> -int handle_misaligned_load(struct pt_regs *regs)
> +#ifdef CONFIG_RISCV_VECTOR_MISALIGNED
> +static int handle_vector_misaligned_load(struct pt_regs *regs)
> +{
> + unsigned long epc = regs->epc;
> + unsigned long insn;
> +
> + if (get_insn(regs, epc, &insn))
> + return -1;
> +
> + /* Only return 0 when in check_vector_unaligned_access_emulated */
> + if (*this_cpu_ptr(&vector_misaligned_access) == RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN) {
> + *this_cpu_ptr(&vector_misaligned_access) = RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED;
> + regs->epc = epc + INSN_LEN(insn);
> + return 0;
> + }
> +
> + /* If vector instruction we don't emulate it yet */
> + regs->epc = epc;
> + return -1;
> +}
> +#else
> +static int handle_vector_misaligned_load(struct pt_regs *regs)
> +{
> + return -1;
> +}
> +#endif
> +
> +static int handle_scalar_misaligned_load(struct pt_regs *regs)
> {
> union reg_data val;
> unsigned long epc = regs->epc;
> @@ -435,7 +461,7 @@ int handle_misaligned_load(struct pt_regs *regs)
> return 0;
> }
>
> -int handle_misaligned_store(struct pt_regs *regs)
> +static int handle_scalar_misaligned_store(struct pt_regs *regs)
> {
> union reg_data val;
> unsigned long epc = regs->epc;
> @@ -526,6 +552,91 @@ int handle_misaligned_store(struct pt_regs *regs)
> return 0;
> }
>
> +int handle_misaligned_load(struct pt_regs *regs)
> +{
> + unsigned long epc = regs->epc;
> + unsigned long insn;
> +
> + if (IS_ENABLED(CONFIG_RISCV_VECTOR_MISALIGNED)) {
> + if (get_insn(regs, epc, &insn))
> + return -1;
> +
> + if (insn_is_vector(insn))
> + return handle_vector_misaligned_load(regs);
> + }
> +
> + if (IS_ENABLED(CONFIG_RISCV_SCALAR_MISALIGNED))
> + return handle_scalar_misaligned_load(regs);
> +
> + return -1;
> +}
> +
> +int handle_misaligned_store(struct pt_regs *regs)
> +{
> + if (IS_ENABLED(CONFIG_RISCV_SCALAR_MISALIGNED))
> + return handle_scalar_misaligned_store(regs);
> +
> + return -1;
> +}
> +
> +#ifdef CONFIG_RISCV_VECTOR_MISALIGNED
> +void check_vector_unaligned_access_emulated(struct work_struct *work __always_unused)
> +{
> + long *mas_ptr = this_cpu_ptr(&vector_misaligned_access);
> + unsigned long tmp_var;
> +
> + *mas_ptr = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
> +
> + kernel_vector_begin();
> + /*
> + * In pre-13.0.0 versions of GCC, vector registers cannot appear in
> + * the clobber list. This inline asm clobbers v0, but since we do not
> + * currently build the kernel with V enabled, the v0 clobber arg is not
> + * needed (as the compiler will not emit vector code itself). If the kernel
> + * is changed to build with V enabled, the clobber arg will need to be
> + * added here.
> + */
Interesting. Thanks for the fix!
> + __asm__ __volatile__ (
> + ".balign 4\n\t"
> + ".option push\n\t"
> + ".option arch, +zve32x\n\t"
> + " vsetivli zero, 1, e16, m1, ta, ma\n\t" // Vectors of 16b
> + " vle16.v v0, (%[ptr])\n\t" // Load bytes
> + ".option pop\n\t"
> + : : [ptr] "r" ((u8 *)&tmp_var + 1));
> + kernel_vector_end();
> +}
> +
> +bool check_vector_unaligned_access_emulated_all_cpus(void)
> +{
> + int cpu;
> +
> + if (!has_vector()) {
> + for_each_online_cpu(cpu)
> + per_cpu(vector_misaligned_access, cpu) = RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED;
> + return false;
> + }
> +
> + schedule_on_each_cpu(check_vector_unaligned_access_emulated);
> +
> + for_each_online_cpu(cpu)
> + if (per_cpu(vector_misaligned_access, cpu)
> + == RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN)
> + return false;
> +
> + return true;
> +}
> +#else
> +bool check_vector_unaligned_access_emulated_all_cpus(void)
> +{
> + return false;
> +}
> +#endif
> +
> +#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
> +
> +static bool unaligned_ctl __read_mostly;
> +
> void check_unaligned_access_emulated(struct work_struct *work __always_unused)
> {
> int cpu = smp_processor_id();
> @@ -574,3 +685,9 @@ bool unaligned_ctl_available(void)
> {
> return unaligned_ctl;
> }
> +#else
> +bool check_unaligned_access_emulated_all_cpus(void)
> +{
> + return false;
> +}
> +#endif
> diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
> index f3508cc54f91..0b8b5e17453a 100644
> --- a/arch/riscv/kernel/unaligned_access_speed.c
> +++ b/arch/riscv/kernel/unaligned_access_speed.c
> @@ -19,7 +19,8 @@
> #define MISALIGNED_BUFFER_ORDER get_order(MISALIGNED_BUFFER_SIZE)
> #define MISALIGNED_COPY_SIZE ((MISALIGNED_BUFFER_SIZE / 2) - 0x80)
>
> -DEFINE_PER_CPU(long, misaligned_access_speed);
> +DEFINE_PER_CPU(long, misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN;
> +DEFINE_PER_CPU(long, vector_misaligned_access) = RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED;
>
> #ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS
> static cpumask_t fast_misaligned_access;
> @@ -260,23 +261,24 @@ static int check_unaligned_access_speed_all_cpus(void)
> kfree(bufs);
> return 0;
> }
> +#else /* CONFIG_RISCV_PROBE_UNALIGNED_ACCESS */
> +static int check_unaligned_access_speed_all_cpus(void)
> +{
> + return 0;
> +}
> +#endif
>
> static int check_unaligned_access_all_cpus(void)
> {
> - bool all_cpus_emulated = check_unaligned_access_emulated_all_cpus();
> + bool all_cpus_emulated;
> +
> + all_cpus_emulated = check_unaligned_access_emulated_all_cpus();
> + check_vector_unaligned_access_emulated_all_cpus();
>
> if (!all_cpus_emulated)
> return check_unaligned_access_speed_all_cpus();
>
> return 0;
> }
> -#else /* CONFIG_RISCV_PROBE_UNALIGNED_ACCESS */
> -static int check_unaligned_access_all_cpus(void)
> -{
> - check_unaligned_access_emulated_all_cpus();
> -
> - return 0;
> -}
> -#endif
>
> arch_initcall(check_unaligned_access_all_cpus);
> diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
> index 682b3feee451..821818886fab 100644
> --- a/arch/riscv/kernel/vector.c
> +++ b/arch/riscv/kernel/vector.c
> @@ -66,7 +66,7 @@ void __init riscv_v_setup_ctx_cache(void)
> #endif
> }
>
> -static bool insn_is_vector(u32 insn_buf)
> +bool insn_is_vector(u32 insn_buf)
> {
> u32 opcode = insn_buf & __INSN_OPCODE_MASK;
> u32 width, csr;
>
> --
> 2.45.0
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
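For readers following the hwprobe side of the quoted patch: hwprobe_vec_misaligned() reports a single value only when every CPU in the mask agrees, and falls back to UNKNOWN otherwise. The following is a minimal userspace sketch of that rule, not the kernel code. It assumes a plain array in place of the per-CPU vector_misaligned_access variable; `aggregate_vec_perf` is a hypothetical name, and the constants mirror the uapi values added by the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Values copied from the uapi additions in the quoted patch. */
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN     0
#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW        2
#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST        3
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED 4

/*
 * Hypothetical stand-in for hwprobe_vec_misaligned(): per_cpu_perf[] plays
 * the role of the per-CPU vector_misaligned_access variable.
 */
uint64_t aggregate_vec_perf(const long *per_cpu_perf, size_t ncpus)
{
	uint64_t perf = UINT64_MAX;	/* sentinel: no CPU seen yet */
	size_t i;

	for (i = 0; i < ncpus; i++) {
		uint64_t this_perf = (uint64_t)per_cpu_perf[i];

		if (perf == UINT64_MAX)
			perf = this_perf;

		/* Any disagreement collapses the answer to UNKNOWN. */
		if (perf != this_perf)
			return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
	}

	/* An empty CPU set also reports UNKNOWN. */
	if (perf == UINT64_MAX)
		return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;

	return perf;
}
```

A mask whose CPUs all report FAST yields FAST, while a mask mixing FAST and SLOW CPUs yields UNKNOWN, so userspace that needs a definite answer may want to query a narrower CPU set.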
* [PATCH v10 5/6] RISC-V: Report vector unaligned access speed hwprobe
2024-10-17 19:00 [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses Charlie Jenkins
` (3 preceding siblings ...)
2024-10-17 19:00 ` [PATCH v10 4/6] RISC-V: Detect unaligned vector accesses supported Charlie Jenkins
@ 2024-10-17 19:00 ` Charlie Jenkins
2024-10-17 19:00 ` [PATCH v10 6/6] RISC-V: hwprobe: Document unaligned vector perf key Charlie Jenkins
2024-10-24 17:50 ` [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses patchwork-bot+linux-riscv
6 siblings, 0 replies; 9+ messages in thread
From: Charlie Jenkins @ 2024-10-17 19:00 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Clément Léger,
Evan Green, Jonathan Corbet
Cc: Palmer Dabbelt, linux-riscv, linux-kernel, linux-doc,
Charlie Jenkins, Jesse Taube
From: Jesse Taube <jesse@rivosinc.com>
Detect if vector misaligned accesses are faster or slower than
equivalent vector byte accesses. This is useful for usermode to know
whether vector byte accesses or vector misaligned accesses have a better
bandwidth for operations like memcpy.
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
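Reader's note (illustrative, not part of the commit): the verdict reached at the end of check_vector_unaligned_access() can be sketched as a pure function of the two best-case cycle counts. The helper names `classify_vec_speed` and `vec_speed_ratio_x100` are hypothetical. Returning UNKNOWN for a zero count is an assumption that models the early return in the patch, which leaves the per-CPU value untouched.

```c
#include <stdint.h>

/* Values copied from the uapi header touched by this series. */
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN 0
#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW    2
#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST    3

/*
 * Hypothetical helper: best-case cycle counts in, verdict out.
 * A zero count means the timer was too coarse to measure anything.
 */
long classify_vec_speed(uint64_t word_cycles, uint64_t byte_cycles)
{
	if (!word_cycles || !byte_cycles)
		return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;

	/* Word accesses must be strictly faster to count as "fast". */
	return word_cycles < byte_cycles ?
		RISCV_HWPROBE_MISALIGNED_VECTOR_FAST :
		RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW;
}

/*
 * Ratio of byte time to word time, scaled by 100 as in the pr_info().
 * The caller must ensure word_cycles is non-zero.
 */
unsigned int vec_speed_ratio_x100(uint64_t word_cycles, uint64_t byte_cycles)
{
	return (unsigned int)((byte_cycles * 100) / word_cycles);
}
```

With word_cycles = 100 and byte_cycles = 250 the scaled ratio is 250, which the kernel message formats as 2.50. Equal cycle counts are classified as slow because the comparison is strict.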
arch/riscv/Kconfig | 18 ++++
arch/riscv/kernel/Makefile | 3 +-
arch/riscv/kernel/copy-unaligned.h | 5 +
arch/riscv/kernel/sys_hwprobe.c | 6 ++
arch/riscv/kernel/unaligned_access_speed.c | 141 ++++++++++++++++++++++++++++-
arch/riscv/kernel/vec-copy-unaligned.S | 58 ++++++++++++
6 files changed, 228 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 93f9a2958de7..c33311fdfc8c 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -856,6 +856,24 @@ config RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
will dynamically determine the speed of vector unaligned accesses on
the underlying system if they are supported.
+config RISCV_SLOW_VECTOR_UNALIGNED_ACCESS
+ bool "Assume the system supports slow vector unaligned memory accesses"
+ depends on NONPORTABLE
+ help
+ Assume that the system supports slow vector unaligned memory accesses. The
+ kernel and userspace programs may not be able to run at all on systems
+ that do not support unaligned memory accesses.
+
+config RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
+ bool "Assume the system supports fast vector unaligned memory accesses"
+ depends on NONPORTABLE
+ help
+ Assume that the system supports fast vector unaligned memory accesses. When
+ enabled, this option improves the performance of the kernel on such
+ systems. However, the kernel and userspace programs will run much more
+ slowly, or will not be able to run at all, on systems that do not
+ support efficient unaligned memory accesses.
+
endchoice
source "arch/riscv/Kconfig.vendor"
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 7f88cc4931f5..30db92672ada 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -70,7 +70,8 @@ obj-$(CONFIG_MMU) += vdso.o vdso/
obj-$(CONFIG_RISCV_MISALIGNED) += traps_misaligned.o
obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o
-obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
+obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
+obj-$(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS) += vec-copy-unaligned.o
obj-$(CONFIG_FPU) += fpu.o
obj-$(CONFIG_FPU) += kernel_mode_fpu.o
diff --git a/arch/riscv/kernel/copy-unaligned.h b/arch/riscv/kernel/copy-unaligned.h
index e3d70d35b708..85d4d11450cb 100644
--- a/arch/riscv/kernel/copy-unaligned.h
+++ b/arch/riscv/kernel/copy-unaligned.h
@@ -10,4 +10,9 @@
void __riscv_copy_words_unaligned(void *dst, const void *src, size_t size);
void __riscv_copy_bytes_unaligned(void *dst, const void *src, size_t size);
+#ifdef CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
+void __riscv_copy_vec_words_unaligned(void *dst, const void *src, size_t size);
+void __riscv_copy_vec_bytes_unaligned(void *dst, const void *src, size_t size);
+#endif
+
#endif /* __RISCV_KERNEL_COPY_UNALIGNED_H */
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 6441baada36b..6673278e84d5 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -228,6 +228,12 @@ static u64 hwprobe_vec_misaligned(const struct cpumask *cpus)
#else
static u64 hwprobe_vec_misaligned(const struct cpumask *cpus)
{
+ if (IS_ENABLED(CONFIG_RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS))
+ return RISCV_HWPROBE_MISALIGNED_VECTOR_FAST;
+
+ if (IS_ENABLED(CONFIG_RISCV_SLOW_VECTOR_UNALIGNED_ACCESS))
+ return RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW;
+
return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
}
#endif
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index 0b8b5e17453a..91f189cf1611 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -6,11 +6,13 @@
#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/jump_label.h>
+#include <linux/kthread.h>
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/types.h>
#include <asm/cpufeature.h>
#include <asm/hwprobe.h>
+#include <asm/vector.h>
#include "copy-unaligned.h"
@@ -268,12 +270,147 @@ static int check_unaligned_access_speed_all_cpus(void)
}
#endif
+#ifdef CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
+static void check_vector_unaligned_access(struct work_struct *work __always_unused)
+{
+ int cpu = smp_processor_id();
+ u64 start_cycles, end_cycles;
+ u64 word_cycles;
+ u64 byte_cycles;
+ int ratio;
+ unsigned long start_jiffies, now;
+ struct page *page;
+ void *dst;
+ void *src;
+ long speed = RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW;
+
+ if (per_cpu(vector_misaligned_access, cpu) != RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN)
+ return;
+
+ page = alloc_pages(GFP_KERNEL, MISALIGNED_BUFFER_ORDER);
+ if (!page) {
+ pr_warn("Allocation failure, not measuring vector misaligned performance\n");
+ return;
+ }
+
+ /* Make an unaligned destination buffer. */
+ dst = (void *)((unsigned long)page_address(page) | 0x1);
+ /* Unalign src as well, but differently (off by 1 + 2 = 3). */
+ src = dst + (MISALIGNED_BUFFER_SIZE / 2);
+ src += 2;
+ word_cycles = -1ULL;
+
+ /* Do a warmup. */
+ kernel_vector_begin();
+ __riscv_copy_vec_words_unaligned(dst, src, MISALIGNED_COPY_SIZE);
+
+ start_jiffies = jiffies;
+ while ((now = jiffies) == start_jiffies)
+ cpu_relax();
+
+ /*
+ * For a fixed amount of time, repeatedly try the function, and take
+ * the best time in cycles as the measurement.
+ */
+ while (time_before(jiffies, now + (1 << MISALIGNED_ACCESS_JIFFIES_LG2))) {
+ start_cycles = get_cycles64();
+ /* Ensure the CSR read can't reorder WRT the copy. */
+ mb();
+ __riscv_copy_vec_words_unaligned(dst, src, MISALIGNED_COPY_SIZE);
+ /* Ensure the copy ends before the end time is snapped. */
+ mb();
+ end_cycles = get_cycles64();
+ if ((end_cycles - start_cycles) < word_cycles)
+ word_cycles = end_cycles - start_cycles;
+ }
+
+ byte_cycles = -1ULL;
+ __riscv_copy_vec_bytes_unaligned(dst, src, MISALIGNED_COPY_SIZE);
+ start_jiffies = jiffies;
+ while ((now = jiffies) == start_jiffies)
+ cpu_relax();
+
+ while (time_before(jiffies, now + (1 << MISALIGNED_ACCESS_JIFFIES_LG2))) {
+ start_cycles = get_cycles64();
+ /* Ensure the CSR read can't reorder WRT the copy. */
+ mb();
+ __riscv_copy_vec_bytes_unaligned(dst, src, MISALIGNED_COPY_SIZE);
+ /* Ensure the copy ends before the end time is snapped. */
+ mb();
+ end_cycles = get_cycles64();
+ if ((end_cycles - start_cycles) < byte_cycles)
+ byte_cycles = end_cycles - start_cycles;
+ }
+
+ kernel_vector_end();
+
+ /* Don't divide by zero. */
+ if (!word_cycles || !byte_cycles) {
+ pr_warn("cpu%d: rdtime lacks granularity needed to measure unaligned vector access speed\n",
+ cpu);
+
+ return;
+ }
+
+ if (word_cycles < byte_cycles)
+ speed = RISCV_HWPROBE_MISALIGNED_VECTOR_FAST;
+
+ ratio = div_u64((byte_cycles * 100), word_cycles);
+ pr_info("cpu%d: Ratio of vector byte access time to vector unaligned word access is %d.%02d, unaligned accesses are %s\n",
+ cpu,
+ ratio / 100,
+ ratio % 100,
+ (speed == RISCV_HWPROBE_MISALIGNED_VECTOR_FAST) ? "fast" : "slow");
+
+ per_cpu(vector_misaligned_access, cpu) = speed;
+}
+
+static int riscv_online_cpu_vec(unsigned int cpu)
+{
+ if (!has_vector())
+ return 0;
+
+ if (per_cpu(vector_misaligned_access, cpu) != RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED)
+ return 0;
+
+ check_vector_unaligned_access_emulated(NULL);
+ check_vector_unaligned_access(NULL);
+ return 0;
+}
+
+/* Measure unaligned access speed on all CPUs present at boot in parallel. */
+static int vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
+{
+ schedule_on_each_cpu(check_vector_unaligned_access);
+
+ /*
+ * Set up hotplug callbacks for any new CPUs that come online or go
+ * offline.
+ */
+ cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "riscv:online",
+ riscv_online_cpu_vec, NULL);
+
+ return 0;
+}
+#else /* CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS */
+static int vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
+{
+ return 0;
+}
+#endif
+
static int check_unaligned_access_all_cpus(void)
{
- bool all_cpus_emulated;
+ bool all_cpus_emulated, all_cpus_vec_unsupported;
all_cpus_emulated = check_unaligned_access_emulated_all_cpus();
- check_vector_unaligned_access_emulated_all_cpus();
+ all_cpus_vec_unsupported = check_vector_unaligned_access_emulated_all_cpus();
+
+ if (!all_cpus_vec_unsupported &&
+ IS_ENABLED(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS)) {
+ kthread_run(vec_check_unaligned_access_speed_all_cpus,
+ NULL, "vec_check_unaligned_access_speed_all_cpus");
+ }
if (!all_cpus_emulated)
return check_unaligned_access_speed_all_cpus();
diff --git a/arch/riscv/kernel/vec-copy-unaligned.S b/arch/riscv/kernel/vec-copy-unaligned.S
new file mode 100644
index 000000000000..d16f19f1b3b6
--- /dev/null
+++ b/arch/riscv/kernel/vec-copy-unaligned.S
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2024 Rivos Inc. */
+
+#include <linux/args.h>
+#include <linux/linkage.h>
+#include <asm/asm.h>
+
+ .text
+
+#define WORD_EEW 32
+
+#define WORD_SEW CONCATENATE(e, WORD_EEW)
+#define VEC_L CONCATENATE(vle, WORD_EEW).v
+#define VEC_S CONCATENATE(vse, WORD_EEW).v
+
+/* void __riscv_copy_vec_words_unaligned(void *, const void *, size_t) */
+/* Performs a memcpy without aligning buffers, using word loads and stores. */
+/* Note: The size is truncated to a multiple of WORD_EEW */
+SYM_FUNC_START(__riscv_copy_vec_words_unaligned)
+ andi a4, a2, ~(WORD_EEW-1)
+ beqz a4, 2f
+ add a3, a1, a4
+ .option push
+ .option arch, +zve32x
+1:
+ vsetivli t0, 8, WORD_SEW, m8, ta, ma
+ VEC_L v0, (a1)
+ VEC_S v0, (a0)
+ addi a0, a0, WORD_EEW
+ addi a1, a1, WORD_EEW
+ bltu a1, a3, 1b
+
+2:
+ .option pop
+ ret
+SYM_FUNC_END(__riscv_copy_vec_words_unaligned)
+
+/* void __riscv_copy_vec_bytes_unaligned(void *, const void *, size_t) */
+/* Performs a memcpy without aligning buffers, using only byte accesses. */
+/* Note: The size is truncated to a multiple of 8 */
+SYM_FUNC_START(__riscv_copy_vec_bytes_unaligned)
+ andi a4, a2, ~(8-1)
+ beqz a4, 2f
+ add a3, a1, a4
+ .option push
+ .option arch, +zve32x
+1:
+ vsetivli t0, 8, e8, m8, ta, ma
+ vle8.v v0, (a1)
+ vse8.v v0, (a0)
+ addi a0, a0, 8
+ addi a1, a1, 8
+ bltu a1, a3, 1b
+
+2:
+ .option pop
+ ret
+SYM_FUNC_END(__riscv_copy_vec_bytes_unaligned)
--
2.45.0
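The size handling of the word-copy routine above can be modelled in portable C. This is a sketch of the documented intent (a memcpy in fixed chunks), assuming each loop iteration moves 8 elements of 32 bits, that is 32 bytes. `model_copy_vec_words` and `CHUNK_BYTES` are hypothetical names, and the return value exists only to make the truncation visible, since the assembly routine returns void.

```c
#include <stddef.h>
#include <string.h>

/* Assumed bytes moved per loop iteration: 8 elements x 4 bytes. */
#define CHUNK_BYTES 32

/*
 * Hypothetical C model of __riscv_copy_vec_words_unaligned(): the size is
 * rounded down to a multiple of CHUNK_BYTES and copied chunk by chunk.
 * Returns the number of bytes actually copied.
 */
size_t model_copy_vec_words(void *dst, const void *src, size_t size)
{
	size_t len = size & ~(size_t)(CHUNK_BYTES - 1);
	size_t off;

	for (off = 0; off < len; off += CHUNK_BYTES)
		memcpy((char *)dst + off, (const char *)src + off, CHUNK_BYTES);

	return len;
}
```

A request for 40 bytes therefore copies 32, and any request below 32 bytes copies nothing, which matches the "truncated to a multiple of WORD_EEW" note in the assembly.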
* [PATCH v10 6/6] RISC-V: hwprobe: Document unaligned vector perf key
2024-10-17 19:00 [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses Charlie Jenkins
` (4 preceding siblings ...)
2024-10-17 19:00 ` [PATCH v10 5/6] RISC-V: Report vector unaligned access speed hwprobe Charlie Jenkins
@ 2024-10-17 19:00 ` Charlie Jenkins
2024-10-24 17:50 ` [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses patchwork-bot+linux-riscv
6 siblings, 0 replies; 9+ messages in thread
From: Charlie Jenkins @ 2024-10-17 19:00 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Clément Léger,
Evan Green, Jonathan Corbet
Cc: Palmer Dabbelt, linux-riscv, linux-kernel, linux-doc,
Charlie Jenkins, Jesse Taube
From: Jesse Taube <jesse@rivosinc.com>
Document key for reporting the speed of unaligned vector accesses.
The descriptions are the same as the scalar equivalent values.
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
---
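Reader's note (illustrative, not part of the commit): a userspace program might query the new key roughly as follows. This is a sketch under stated assumptions. `struct hwprobe_pair` is a local mirror of struct riscv_hwprobe (a signed 64-bit key and an unsigned 64-bit value), the function names are hypothetical, and the syscall path is compiled only where the C library headers define __NR_riscv_hwprobe. Elsewhere the query simply reports UNKNOWN.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* Values copied from the uapi header touched by this series. */
#define RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF    10
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN     0
#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW        2
#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST        3
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED 4

/* Local mirror of struct riscv_hwprobe (assumed layout: s64 key, u64 value). */
struct hwprobe_pair {
	int64_t key;
	uint64_t value;
};

/* Hypothetical helper: human-readable name for a reported value. */
const char *vec_misaligned_perf_name(uint64_t value)
{
	switch (value) {
	case RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN:     return "unknown";
	case RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW:        return "slow";
	case RISCV_HWPROBE_MISALIGNED_VECTOR_FAST:        return "fast";
	case RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED: return "unsupported";
	default:                                          return "unrecognized";
	}
}

/* Hypothetical helper: value for all online CPUs, UNKNOWN on any failure. */
uint64_t query_vec_misaligned_perf(void)
{
#if defined(__linux__) && defined(__NR_riscv_hwprobe)
	struct hwprobe_pair pair = {
		.key = RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF,
	};

	/* cpusetsize == 0 with cpus == NULL selects all online CPUs. */
	if (syscall(__NR_riscv_hwprobe, &pair, (size_t)1, (size_t)0,
		    (void *)NULL, 0U) != 0)
		return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;

	/* Kernels that do not know the key rewrite pair.key to -1. */
	if (pair.key != RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF)
		return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;

	return pair.value;
#else
	return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
#endif
}
```

A memcpy implementation could call query_vec_misaligned_perf() once at startup and select a misaligned vector path only when the result is FAST, treating every other value as a reason to stay on byte or aligned accesses.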
Documentation/arch/riscv/hwprobe.rst | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst
index 85b709257918..ea4e0b9c73e7 100644
--- a/Documentation/arch/riscv/hwprobe.rst
+++ b/Documentation/arch/riscv/hwprobe.rst
@@ -274,3 +274,19 @@ The following keys are defined:
represent the highest userspace virtual address usable.
* :c:macro:`RISCV_HWPROBE_KEY_TIME_CSR_FREQ`: Frequency (in Hz) of `time CSR`.
+
+* :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF`: An enum value describing the
+ performance of misaligned vector accesses on the selected set of processors.
+
+ * :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN`: The performance of misaligned
+ vector accesses is unknown.
+
+ * :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW`: 32-bit misaligned accesses using vector
+ registers are slower than the equivalent quantity of byte accesses via vector registers.
+ Misaligned accesses may be supported directly in hardware, or trapped and emulated by software.
+
+ * :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_FAST`: 32-bit misaligned accesses using vector
+ registers are faster than the equivalent quantity of byte accesses via vector registers.
+
+ * :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED`: Misaligned vector accesses are
+ not supported at all and will generate a misaligned address fault.
--
2.45.0
* Re: [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses
2024-10-17 19:00 [PATCH v10 0/6] RISC-V: Detect and report speed of unaligned vector accesses Charlie Jenkins
` (5 preceding siblings ...)
2024-10-17 19:00 ` [PATCH v10 6/6] RISC-V: hwprobe: Document unaligned vector perf key Charlie Jenkins
@ 2024-10-24 17:50 ` patchwork-bot+linux-riscv
6 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+linux-riscv @ 2024-10-24 17:50 UTC (permalink / raw)
To: Charlie Jenkins
Cc: linux-riscv, paul.walmsley, palmer, aou, cleger, evan, corbet,
palmer, linux-kernel, linux-doc, jesse, stable, conor.dooley
Hello:
This series was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:
On Thu, 17 Oct 2024 12:00:17 -0700 you wrote:
> Adds support for detecting and reporting the speed of unaligned vector
> accesses on RISC-V CPUs. Adds the vec_misaligned_speed key to hwprobe,
> adds Zicclsm to cpufeature, and fixes the scalar unaligned emulation
> check so that it covers all CPUs. The vec_misaligned_speed key keeps the
> same format as the scalar unaligned access speed key.
>
> This set does not emulate unaligned vector accesses on CPUs that do not
> support them. It only reports whether userspace can run them and, if
> they are supported, the speed of unaligned vector accesses.
>
> [...]
Here is the summary with links:
- [v10,1/6] RISC-V: Check scalar unaligned access on all CPUs
https://git.kernel.org/riscv/c/8d20a739f17a
- [v10,2/6] RISC-V: Scalar unaligned access emulated on hotplug CPUs
https://git.kernel.org/riscv/c/9c528b5f7927
- [v10,3/6] RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED
https://git.kernel.org/riscv/c/c05a62c92516
- [v10,4/6] RISC-V: Detect unaligned vector accesses supported
https://git.kernel.org/riscv/c/d1703dc7bc8e
- [v10,5/6] RISC-V: Report vector unaligned access speed hwprobe
https://git.kernel.org/riscv/c/e7c9d66e313b
- [v10,6/6] RISC-V: hwprobe: Document unaligned vector perf key
https://git.kernel.org/riscv/c/40e09ebd791f
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html