* [PATCH v10 0/2] riscv: Add runtime constant support
@ 2025-03-19 18:35 Charlie Jenkins
2025-03-19 18:35 ` [PATCH v10 1/2] riscv: Move nop definition to insn-def.h Charlie Jenkins
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Charlie Jenkins @ 2025-03-19 18:35 UTC
To: Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel, Ben Dooks,
Pasha Bouzarjomehri, Emil Renner Berthing, Alexandre Ghiti,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones
Cc: linux-riscv, linux-kernel, linux-trace-kernel, Charlie Jenkins
Ard brought this to my attention in this patch [1].
I benchmarked this patch on the Nezha D1 (which does not contain Zba or
Zbkb so it uses the default algorithm) by navigating through a large
directory structure. I created a 1000-deep directory structure and then
cd and ls through it. With this patch there was a 0.57% performance
improvement.
[1] https://lore.kernel.org/lkml/CAMj1kXE4DJnwFejNWQu784GvyJO=aGNrzuLjSxiowX_e7nW8QA@mail.gmail.com/
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
Changes in v10:
- Use _AC() instead of just adding U to the end of a constant
- Link to v9: https://lore.kernel.org/r/20250318-runtime_const_riscv-v9-0-ddd3534d3e8e@rivosinc.com
Changes in v9:
- Fix a bug where stale register data could be used when a lui is
replaced with a nop. To resolve this, the following addiw now adds to
register x0 instead of the stale register.
- Add locks for text_mutex before using patch_insn_write()
- Link to v8: https://lore.kernel.org/r/20250305-runtime_const_riscv-v8-1-fa66f3468dac@rivosinc.com
Changes in v8:
- Rebase to linux v6.14-rc5
- Link to v7: https://lore.kernel.org/r/20250218-runtime_const_riscv-v7-1-e431763157ff@rivosinc.com
Changes in v7:
- Added benchmarking info
- Added CONFIG_RISCV_ISA_ZBA and CONFIG_RISCV_ISA_ZBKB to check that the
compiler supports the extensions.
- Link to v6: https://lore.kernel.org/r/20250212-runtime_const_riscv-v6-1-3ef0146b310b@rivosinc.com
Changes in v6:
- .option arch only became officially supported by clang in version 17.
Add a config to check for that and guard the alternatives that use
.option arch.
- Link to v5: https://lore.kernel.org/r/20250203-runtime_const_riscv-v5-1-bc61736a3229@rivosinc.com
Changes in v5:
- Split instructions into 16-bit parcels to avoid alignment requirements (Emil)
- Link to v4: https://lore.kernel.org/r/20250130-runtime_const_riscv-v4-1-2d36c41b7b9c@rivosinc.com
Changes in v4:
- Add newlines after riscv32 assembler directives
- Align instructions along 32-bit boundary (Emil)
- Link to v3: https://lore.kernel.org/r/20250128-runtime_const_riscv-v3-1-11922989e2d3@rivosinc.com
Changes in v3:
- Leverage "pack" instruction for runtime_const_ptr() to reduce hot path
by 3 instructions if Zbkb is supported. Suggested by Pasha Bouzarjomehri (pasha@rivosinc.com)
- Link to v2: https://lore.kernel.org/r/20250127-runtime_const_riscv-v2-1-95ae7cf97a39@rivosinc.com
Changes in v2:
- Treat instructions as __le32 and do proper conversions (Ben)
- Link to v1: https://lore.kernel.org/r/20250127-runtime_const_riscv-v1-1-795b023ea20b@rivosinc.com
---
Charlie Jenkins (2):
riscv: Move nop definition to insn-def.h
riscv: Add runtime constant support
arch/riscv/Kconfig | 22 +++
arch/riscv/include/asm/asm.h | 1 +
arch/riscv/include/asm/ftrace.h | 1 -
arch/riscv/include/asm/insn-def.h | 3 +
arch/riscv/include/asm/runtime-const.h | 265 +++++++++++++++++++++++++++++++++
arch/riscv/kernel/ftrace.c | 6 +-
arch/riscv/kernel/jump_label.c | 4 +-
arch/riscv/kernel/vmlinux.lds.S | 3 +
8 files changed, 299 insertions(+), 6 deletions(-)
---
base-commit: 2014c95afecee3e76ca4a56956a936e23283f05b
change-id: 20250123-runtime_const_riscv-6cd854ee2817
--
- Charlie
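The v3 changelog entry above says that "pack" shortens the hot path when Zbkb is available. The following is a minimal user-space sketch, not kernel code, of why the three combine sequences in patch 2/2 should produce the same 64-bit value. The function names are hypothetical, and each function only emulates the documented semantics of slli/srli/add, add.uw (Zba) and pack (Zbkb) on plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* lui+addiw leaves a 32-bit value sign-extended to 64 bits in the register. */
static inline uint64_t sext32(uint32_t v)
{
	return (uint64_t)(int64_t)(int32_t)v;
}

/* Base: slli tmp,tmp,32 ; slli ret,ret,32 ; srli ret,ret,32 ; add ret,ret,tmp */
static inline uint64_t combine_base(uint64_t ret, uint64_t tmp)
{
	tmp <<= 32;
	ret <<= 32;
	ret >>= 32;
	return ret + tmp;
}

/* Zba: slli tmp,tmp,32 ; add.uw ret,ret,tmp (zero-extend low word of ret, add tmp) */
static inline uint64_t combine_zba(uint64_t ret, uint64_t tmp)
{
	tmp <<= 32;
	return (uint64_t)(uint32_t)ret + tmp;
}

/* Zbkb: pack ret,ret,tmp (low word of ret in bits 31:0, low word of tmp in bits 63:32) */
static inline uint64_t combine_zbkb(uint64_t ret, uint64_t tmp)
{
	return ((uint64_t)(uint32_t)tmp << 32) | (uint32_t)ret;
}

/* Illustrative self-check using the placeholder constants from the patch. */
static inline void combine_self_check(void)
{
	uint64_t ret = sext32(0x89abcdefu);	/* lui 0x89abd ; addiw -0x211 */
	uint64_t tmp = sext32(0x01234567u);	/* lui 0x1234  ; addiw 0x567  */

	assert(combine_base(ret, tmp) == combine_zba(ret, tmp));
	assert(combine_base(ret, tmp) == combine_zbkb(ret, tmp));
}
```

The low word needs explicit zero-extension because addiw sign-extends its result. The base sequence spends two shifts on that, while add.uw and pack fold it into a single instruction.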
* [PATCH v10 1/2] riscv: Move nop definition to insn-def.h
2025-03-19 18:35 [PATCH v10 0/2] riscv: Add runtime constant support Charlie Jenkins
@ 2025-03-19 18:35 ` Charlie Jenkins
2025-03-20 9:02 ` Andrew Jones
2025-03-19 18:35 ` [PATCH v10 2/2] riscv: Add runtime constant support Charlie Jenkins
2025-03-27 3:24 ` [PATCH v10 0/2] " patchwork-bot+linux-riscv
2 siblings, 1 reply; 12+ messages in thread
From: Charlie Jenkins @ 2025-03-19 18:35 UTC
To: Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel, Ben Dooks,
Pasha Bouzarjomehri, Emil Renner Berthing, Alexandre Ghiti,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones
Cc: linux-riscv, linux-kernel, linux-trace-kernel, Charlie Jenkins
We have duplicated the definition of the nop instruction in ftrace.h and
in jump_label.c. Move this definition into the generic file insn-def.h
so that they can share the definition with each other and with future
files.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
---
arch/riscv/include/asm/ftrace.h | 1 -
arch/riscv/include/asm/insn-def.h | 3 +++
arch/riscv/kernel/ftrace.c | 6 +++---
arch/riscv/kernel/jump_label.c | 4 ++--
4 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index c4721ce44ca474654b37b3d51bc0a63d46bc1eff..b7f361a50f6445d02a0d88eef5547ee27c1fb52e 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -79,7 +79,6 @@ struct dyn_arch_ftrace {
#define AUIPC_RA (0x00000097)
#define JALR_T0 (0x000282e7)
#define AUIPC_T0 (0x00000297)
-#define NOP4 (0x00000013)
#define to_jalr_t0(offset) \
(((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_T0)
diff --git a/arch/riscv/include/asm/insn-def.h b/arch/riscv/include/asm/insn-def.h
index 9a913010cdd93cdfdd93f467e7880e20cce0dd2b..71060a2f838e24200e3eb4ad8dfb32ef6bd2f57a 100644
--- a/arch/riscv/include/asm/insn-def.h
+++ b/arch/riscv/include/asm/insn-def.h
@@ -199,5 +199,8 @@
#define RISCV_PAUSE ".4byte 0x100000f"
#define ZAWRS_WRS_NTO ".4byte 0x00d00073"
#define ZAWRS_WRS_STO ".4byte 0x01d00073"
+#define RISCV_NOP4 ".4byte 0x00000013"
+
+#define RISCV_INSN_NOP4 _AC(0x00000013, U)
#endif /* __ASM_INSN_DEF_H */
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 3524db5e4fa014a4594465f849d898a030bfb7b8..674dcdfae7a149c339f1e791adb450535f22991b 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -36,7 +36,7 @@ static int ftrace_check_current_call(unsigned long hook_pos,
unsigned int *expected)
{
unsigned int replaced[2];
- unsigned int nops[2] = {NOP4, NOP4};
+ unsigned int nops[2] = {RISCV_INSN_NOP4, RISCV_INSN_NOP4};
/* we expect nops at the hook position */
if (!expected)
@@ -68,7 +68,7 @@ static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target,
bool enable, bool ra)
{
unsigned int call[2];
- unsigned int nops[2] = {NOP4, NOP4};
+ unsigned int nops[2] = {RISCV_INSN_NOP4, RISCV_INSN_NOP4};
if (ra)
make_call_ra(hook_pos, target, call);
@@ -97,7 +97,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
unsigned long addr)
{
- unsigned int nops[2] = {NOP4, NOP4};
+ unsigned int nops[2] = {RISCV_INSN_NOP4, RISCV_INSN_NOP4};
if (patch_insn_write((void *)rec->ip, nops, MCOUNT_INSN_SIZE))
return -EPERM;
diff --git a/arch/riscv/kernel/jump_label.c b/arch/riscv/kernel/jump_label.c
index 654ed159c830b3d5e34ac58bf367107066eb73a1..b4c1a6a3fbd28533552036194f27ed206bea305d 100644
--- a/arch/riscv/kernel/jump_label.c
+++ b/arch/riscv/kernel/jump_label.c
@@ -11,8 +11,8 @@
#include <asm/bug.h>
#include <asm/cacheflush.h>
#include <asm/text-patching.h>
+#include <asm/insn-def.h>
-#define RISCV_INSN_NOP 0x00000013U
#define RISCV_INSN_JAL 0x0000006fU
bool arch_jump_label_transform_queue(struct jump_entry *entry,
@@ -33,7 +33,7 @@ bool arch_jump_label_transform_queue(struct jump_entry *entry,
(((u32)offset & GENMASK(10, 1)) << (21 - 1)) |
(((u32)offset & GENMASK(20, 20)) << (31 - 20));
} else {
- insn = RISCV_INSN_NOP;
+ insn = RISCV_INSN_NOP4;
}
if (early_boot_irqs_disabled) {
--
2.43.0
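For reference, the shared value 0x00000013 is the standard encoding of "addi x0, x0, 0". The following is a minimal user-space sketch, with hypothetical helper names that are not part of the patch, which decodes the I-type fields of that word to confirm it.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a RISC-V I-type instruction (illustrative struct, not a kernel type). */
struct itype_fields {
	uint32_t opcode;	/* bits 6:0   */
	uint32_t rd;		/* bits 11:7  */
	uint32_t funct3;	/* bits 14:12 */
	uint32_t rs1;		/* bits 19:15 */
	int32_t imm;		/* bits 31:20, sign-extended */
};

static inline struct itype_fields decode_itype(uint32_t insn)
{
	struct itype_fields f;

	f.opcode = insn & 0x7f;
	f.rd = (insn >> 7) & 0x1f;
	f.funct3 = (insn >> 12) & 0x7;
	f.rs1 = (insn >> 15) & 0x1f;
	/* Sign-extend the 12-bit immediate without relying on signed shifts. */
	f.imm = (int32_t)(((insn >> 20) & 0xfff) ^ 0x800) - 0x800;
	return f;
}

/* Illustrative self-check: the nop word is addi (opcode 0x13, funct3 0) on x0. */
static inline void nop4_self_check(void)
{
	struct itype_fields f = decode_itype(0x00000013u);

	assert(f.opcode == 0x13 && f.funct3 == 0);
	assert(f.rd == 0 && f.rs1 == 0 && f.imm == 0);
}
```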
* [PATCH v10 2/2] riscv: Add runtime constant support
2025-03-19 18:35 [PATCH v10 0/2] riscv: Add runtime constant support Charlie Jenkins
2025-03-19 18:35 ` [PATCH v10 1/2] riscv: Move nop definition to insn-def.h Charlie Jenkins
@ 2025-03-19 18:35 ` Charlie Jenkins
2025-03-28 15:42 ` Klara Modin
2025-04-01 19:28 ` Nathan Chancellor
2025-03-27 3:24 ` [PATCH v10 0/2] " patchwork-bot+linux-riscv
2 siblings, 2 replies; 12+ messages in thread
From: Charlie Jenkins @ 2025-03-19 18:35 UTC
To: Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel, Ben Dooks,
Pasha Bouzarjomehri, Emil Renner Berthing, Alexandre Ghiti,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones
Cc: linux-riscv, linux-kernel, linux-trace-kernel, Charlie Jenkins
Implement the runtime constant infrastructure for riscv. Use this
infrastructure to generate constants to be used by the d_hash()
function.
This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
constant' support") and commit e3c92e81711d ("runtime constants: add
x86 architecture support").
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
---
arch/riscv/Kconfig | 22 +++
arch/riscv/include/asm/asm.h | 1 +
arch/riscv/include/asm/runtime-const.h | 265 +++++++++++++++++++++++++++++++++
arch/riscv/kernel/vmlinux.lds.S | 3 +
4 files changed, 291 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 7612c52e9b1e35607f1dd4603a596416d3357a71..c123f7c0579c1aca839e3c04bdb662d6856ae765 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -783,6 +783,28 @@ config RISCV_ISA_ZBC
If you don't know what to do here, say Y.
+config TOOLCHAIN_HAS_ZBKB
+ bool
+ default y
+ depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbkb)
+ depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbkb)
+ depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
+ depends on AS_HAS_OPTION_ARCH
+
+config RISCV_ISA_ZBKB
+ bool "Zbkb extension support for bit manipulation instructions"
+ depends on TOOLCHAIN_HAS_ZBKB
+ depends on RISCV_ALTERNATIVE
+ default y
+ help
+ Adds support to dynamically detect the presence of the ZBKB
+ extension (bit manipulation for cryptography) and enable its usage.
+
+ The Zbkb extension provides instructions to accelerate a number
+ of common cryptography operations (pack, zip, etc).
+
+ If you don't know what to do here, say Y.
+
config RISCV_ISA_ZICBOM
bool "Zicbom extension support for non-coherent DMA operation"
depends on MMU
diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
index 776354895b81e7dc332e58265548aaf7365a6037..a8a2af6dfe9d2406625ca8fc94014fe5180e4fec 100644
--- a/arch/riscv/include/asm/asm.h
+++ b/arch/riscv/include/asm/asm.h
@@ -27,6 +27,7 @@
#define REG_ASM __REG_SEL(.dword, .word)
#define SZREG __REG_SEL(8, 4)
#define LGREG __REG_SEL(3, 2)
+#define SRLI __REG_SEL(srliw, srli)
#if __SIZEOF_POINTER__ == 8
#ifdef __ASSEMBLY__
diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
new file mode 100644
index 0000000000000000000000000000000000000000..a23a9bd47903b2765608c75cd83f01ae578dffaa
--- /dev/null
+++ b/arch/riscv/include/asm/runtime-const.h
@@ -0,0 +1,265 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_RUNTIME_CONST_H
+#define _ASM_RISCV_RUNTIME_CONST_H
+
+#include <asm/asm.h>
+#include <asm/alternative.h>
+#include <asm/cacheflush.h>
+#include <asm/insn-def.h>
+#include <linux/memory.h>
+#include <asm/text-patching.h>
+
+#include <linux/uaccess.h>
+
+#ifdef CONFIG_32BIT
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret; \
+ asm_inline(".option push\n\t" \
+ ".option norvc\n\t" \
+ "1:\t" \
+ "lui %[__ret],0x89abd\n\t" \
+ "addi %[__ret],%[__ret],-0x211\n\t" \
+ ".option pop\n\t" \
+ ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
+ ".long 1b - .\n\t" \
+ ".popsection" \
+ : [__ret] "=r" (__ret)); \
+ __ret; \
+})
+#else
+/*
+ * Loading 64-bit constants into a register from immediates is a non-trivial
+ * task on riscv64. To get it somewhat performant, load 32 bits into two
+ * different registers and then combine the results.
+ *
+ * If the processor supports the Zbkb extension, we can combine the final
+ * "slli,slli,srli,add" into the single "pack" instruction. If the processor
+ * doesn't support Zbkb but does support the Zba extension, we can
+ * combine the final "slli,srli,add" into one instruction "add.uw".
+ */
+#define RISCV_RUNTIME_CONST_64_PREAMBLE \
+ ".option push\n\t" \
+ ".option norvc\n\t" \
+ "1:\t" \
+ "lui %[__ret],0x89abd\n\t" \
+ "lui %[__tmp],0x1234\n\t" \
+ "addiw %[__ret],%[__ret],-0x211\n\t" \
+ "addiw %[__tmp],%[__tmp],0x567\n\t" \
+
+#define RISCV_RUNTIME_CONST_64_BASE \
+ "slli %[__tmp],%[__tmp],32\n\t" \
+ "slli %[__ret],%[__ret],32\n\t" \
+ "srli %[__ret],%[__ret],32\n\t" \
+ "add %[__ret],%[__ret],%[__tmp]\n\t" \
+
+#define RISCV_RUNTIME_CONST_64_ZBA \
+ ".option push\n\t" \
+ ".option arch,+zba\n\t" \
+ "slli %[__tmp],%[__tmp],32\n\t" \
+ "add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
+ "nop\n\t" \
+ "nop\n\t" \
+ ".option pop\n\t" \
+
+#define RISCV_RUNTIME_CONST_64_ZBKB \
+ ".option push\n\t" \
+ ".option arch,+zbkb\n\t" \
+ "pack %[__ret],%[__ret],%[__tmp]\n\t" \
+ "nop\n\t" \
+ "nop\n\t" \
+ "nop\n\t" \
+ ".option pop\n\t" \
+
+#define RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ ".option pop\n\t" \
+ ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
+ ".long 1b - .\n\t" \
+ ".popsection" \
+
+#if defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_RISCV_ISA_ZBKB)
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret, __tmp; \
+ asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
+ ALTERNATIVE_2( \
+ RISCV_RUNTIME_CONST_64_BASE, \
+ RISCV_RUNTIME_CONST_64_ZBA, \
+ 0, RISCV_ISA_EXT_ZBA, 1, \
+ RISCV_RUNTIME_CONST_64_ZBKB, \
+ 0, RISCV_ISA_EXT_ZBKB, 1 \
+ ) \
+ RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
+ __ret; \
+})
+#elif defined(CONFIG_RISCV_ISA_ZBA)
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret, __tmp; \
+ asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
+ ALTERNATIVE( \
+ RISCV_RUNTIME_CONST_64_BASE, \
+ RISCV_RUNTIME_CONST_64_ZBA, \
+ 0, RISCV_ISA_EXT_ZBA, 1 \
+ ) \
+ RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
+ __ret; \
+})
+#elif defined(CONFIG_RISCV_ISA_ZBKB)
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret, __tmp; \
+ asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
+ ALTERNATIVE( \
+ RISCV_RUNTIME_CONST_64_BASE, \
+ RISCV_RUNTIME_CONST_64_ZBKB, \
+ 0, RISCV_ISA_EXT_ZBKB, 1 \
+ ) \
+ RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
+ __ret; \
+})
+#else
+#define runtime_const_ptr(sym) \
+({ \
+ typeof(sym) __ret, __tmp; \
+ asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
+ RISCV_RUNTIME_CONST_64_BASE \
+ RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
+ : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
+ __ret; \
+})
+#endif
+#endif
+
+#define runtime_const_shift_right_32(val, sym) \
+({ \
+ u32 __ret; \
+ asm_inline(".option push\n\t" \
+ ".option norvc\n\t" \
+ "1:\t" \
+ SRLI " %[__ret],%[__val],12\n\t" \
+ ".option pop\n\t" \
+ ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
+ ".long 1b - .\n\t" \
+ ".popsection" \
+ : [__ret] "=r" (__ret) \
+ : [__val] "r" (val)); \
+ __ret; \
+})
+
+#define runtime_const_init(type, sym) do { \
+ extern s32 __start_runtime_##type##_##sym[]; \
+ extern s32 __stop_runtime_##type##_##sym[]; \
+ \
+ runtime_const_fixup(__runtime_fixup_##type, \
+ (unsigned long)(sym), \
+ __start_runtime_##type##_##sym, \
+ __stop_runtime_##type##_##sym); \
+} while (0)
+
+static inline void __runtime_fixup_caches(void *where, unsigned int insns)
+{
+ /* On riscv there are currently only cache-wide flushes so va is ignored. */
+ __always_unused uintptr_t va = (uintptr_t)where;
+
+ flush_icache_range(va, va + 4 * insns);
+}
+
+/*
+ * The 32-bit immediate is stored in a lui+addi pairing.
+ * lui holds the upper 20 bits of the immediate in bits 31:12 of the instruction.
+ * addi holds the lower 12 bits of the immediate in bits 31:20 of the instruction.
+ */
+static inline void __runtime_fixup_32(__le16 *lui_parcel, __le16 *addi_parcel, unsigned int val)
+{
+ unsigned int lower_immediate, upper_immediate;
+ u32 lui_insn, addi_insn, addi_insn_mask;
+ __le32 lui_res, addi_res;
+
+ /* Mask out upper 12 bits of addi */
+ addi_insn_mask = 0x000fffff;
+
+ lui_insn = (u32)le16_to_cpu(lui_parcel[0]) | (u32)le16_to_cpu(lui_parcel[1]) << 16;
+ addi_insn = (u32)le16_to_cpu(addi_parcel[0]) | (u32)le16_to_cpu(addi_parcel[1]) << 16;
+
+ lower_immediate = sign_extend32(val, 11);
+ upper_immediate = (val - lower_immediate);
+
+ if (upper_immediate & 0xfffff000) {
+ /* replace upper 20 bits of lui with upper immediate */
+ lui_insn &= 0x00000fff;
+ lui_insn |= upper_immediate & 0xfffff000;
+ } else {
+ /* replace lui with nop if immediate is small enough to fit in addi */
+ lui_insn = RISCV_INSN_NOP4;
+ /*
+ * lui is being skipped, so do a load instead of an add. A load
+ * is performed by adding with the x0 register. Setting rs to
+ * zero with the following mask will accomplish this goal.
+ */
+ addi_insn_mask &= 0x07fff;
+ }
+
+ if (lower_immediate & 0x00000fff) {
+ /* replace upper 12 bits of addi with lower 12 bits of val */
+ addi_insn &= addi_insn_mask;
+ addi_insn |= (lower_immediate & 0x00000fff) << 20;
+ } else {
+ /* replace addi with nop if lower_immediate is empty */
+ addi_insn = RISCV_INSN_NOP4;
+ }
+
+ addi_res = cpu_to_le32(addi_insn);
+ lui_res = cpu_to_le32(lui_insn);
+ mutex_lock(&text_mutex);
+ patch_insn_write(addi_parcel, &addi_res, sizeof(addi_res));
+ patch_insn_write(lui_parcel, &lui_res, sizeof(lui_res));
+ mutex_unlock(&text_mutex);
+}
+
+static inline void __runtime_fixup_ptr(void *where, unsigned long val)
+{
+#ifdef CONFIG_32BIT
+ __runtime_fixup_32(where, where + 4, val);
+ __runtime_fixup_caches(where, 2);
+#else
+ __runtime_fixup_32(where, where + 8, val);
+ __runtime_fixup_32(where + 4, where + 12, val >> 32);
+ __runtime_fixup_caches(where, 4);
+#endif
+}
+
+/*
+ * Replace the least significant 5 bits of the srli/srliw immediate that is
+ * located at bits 20-24
+ */
+static inline void __runtime_fixup_shift(void *where, unsigned long val)
+{
+ __le16 *parcel = where;
+ __le32 res;
+ u32 insn;
+
+ insn = (u32)le16_to_cpu(parcel[0]) | (u32)le16_to_cpu(parcel[1]) << 16;
+
+ insn &= 0xfe0fffff;
+ insn |= (val & 0b11111) << 20;
+
+ res = cpu_to_le32(insn);
+ mutex_lock(&text_mutex);
+ patch_text_nosync(where, &res, sizeof(insn));
+ mutex_unlock(&text_mutex);
+}
+
+static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
+ unsigned long val, s32 *start, s32 *end)
+{
+ while (start < end) {
+ fn(*start + (void *)start, val);
+ start++;
+ }
+}
+
+#endif /* _ASM_RISCV_RUNTIME_CONST_H */
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index 002ca58dd998cb78b662837b5ebac988fb6c77bb..61bd5ba6680a786bf1db7dc37bf1acda0639b5c7 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -97,6 +97,9 @@ SECTIONS
{
EXIT_DATA
}
+
+ RUNTIME_CONST_VARIABLES
+
PERCPU_SECTION(L1_CACHE_BYTES)
.rel.dyn : {
--
2.43.0
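The immediate split performed by __runtime_fixup_32() can be tried outside the kernel. The following is a minimal user-space sketch under stated assumptions: the helper names are hypothetical, the nop and x0 special cases from the patch are left out, and the emulation models only a lui followed by an addi on a 32-bit register. It shows why the upper part is computed as val minus the sign-extended lower 12 bits: addi sign-extends its immediate, so the lui value has to compensate.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 12 bits of v, as addi does with its immediate. */
static inline int32_t sext12(uint32_t v)
{
	return (int32_t)((v & 0xfff) ^ 0x800) - 0x800;
}

/* Split val so that upper + lower == val (mod 2^32) with upper's low 12 bits clear. */
static inline void split_imm32(uint32_t val, uint32_t *upper, int32_t *lower)
{
	*lower = sext12(val);
	*upper = val - (uint32_t)*lower;
}

/* Replace bits 31:12 of a lui encoding with the upper immediate. */
static inline uint32_t patch_lui(uint32_t insn, uint32_t upper)
{
	return (insn & 0x00000fff) | (upper & 0xfffff000);
}

/* Replace bits 31:20 of an addi encoding with the lower immediate. */
static inline uint32_t patch_addi(uint32_t insn, int32_t lower)
{
	return (insn & 0x000fffff) | (((uint32_t)lower & 0xfff) << 20);
}

/* Emulate "lui rd,imm20 ; addi rd,rd,imm12" and return the final rd. */
static inline uint32_t emulate_pair(uint32_t lui_insn, uint32_t addi_insn)
{
	uint32_t rd = lui_insn & 0xfffff000;

	return rd + (uint32_t)sext12(addi_insn >> 20);
}

/* Illustrative self-check with the placeholder value used in the patch. */
static inline void split_self_check(void)
{
	uint32_t upper;
	int32_t lower;

	split_imm32(0x89abcdefu, &upper, &lower);
	assert(upper == 0x89abd000u && lower == -0x211);
}
```

In this sketch 0x00000537 and 0x00050513 can stand in for "lui a0,0" and "addi a0,a0,0" as the instructions to be patched. The choice of register is arbitrary.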
* Re: [PATCH v10 1/2] riscv: Move nop definition to insn-def.h
2025-03-19 18:35 ` [PATCH v10 1/2] riscv: Move nop definition to insn-def.h Charlie Jenkins
@ 2025-03-20 9:02 ` Andrew Jones
0 siblings, 0 replies; 12+ messages in thread
From: Andrew Jones @ 2025-03-20 9:02 UTC
To: Charlie Jenkins
Cc: Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel, Ben Dooks,
Pasha Bouzarjomehri, Emil Renner Berthing, Alexandre Ghiti,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, linux-riscv,
linux-kernel, linux-trace-kernel
On Wed, Mar 19, 2025 at 11:35:19AM -0700, Charlie Jenkins wrote:
> We have duplicated the definition of the nop instruction in ftrace.h and
> in jump_label.c. Move this definition into the generic file insn-def.h
> so that they can share the definition with each other and with future
> files.
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> ---
> arch/riscv/include/asm/ftrace.h | 1 -
> arch/riscv/include/asm/insn-def.h | 3 +++
> arch/riscv/kernel/ftrace.c | 6 +++---
> arch/riscv/kernel/jump_label.c | 4 ++--
> 4 files changed, 8 insertions(+), 6 deletions(-)
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
* Re: [PATCH v10 0/2] riscv: Add runtime constant support
2025-03-19 18:35 [PATCH v10 0/2] riscv: Add runtime constant support Charlie Jenkins
2025-03-19 18:35 ` [PATCH v10 1/2] riscv: Move nop definition to insn-def.h Charlie Jenkins
2025-03-19 18:35 ` [PATCH v10 2/2] riscv: Add runtime constant support Charlie Jenkins
@ 2025-03-27 3:24 ` patchwork-bot+linux-riscv
2 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+linux-riscv @ 2025-03-27 3:24 UTC
To: Charlie Jenkins
Cc: linux-riscv, paul.walmsley, palmer, ardb, ben.dooks, pasha,
emil.renner.berthing, alexghiti, rostedt, mhiramat, mark.rutland,
aou, peterz, jpoimboe, jbaron, ajones, linux-kernel,
linux-trace-kernel
Hello:
This series was applied to riscv/linux.git (for-next)
by Alexandre Ghiti <alexghiti@rivosinc.com>:
On Wed, 19 Mar 2025 11:35:18 -0700 you wrote:
> Ard brought this to my attention in this patch [1].
>
> I benchmarked this patch on the Nezha D1 (which does not contain Zba or
> Zbkb so it uses the default algorithm) by navigating through a large
> directory structure. I created a 1000-deep directory structure and then
> cd and ls through it. With this patch there was a 0.57% performance
> improvement.
>
> [...]
Here is the summary with links:
- [v10,1/2] riscv: Move nop definition to insn-def.h
https://git.kernel.org/riscv/c/afa8a93932aa
- [v10,2/2] riscv: Add runtime constant support
https://git.kernel.org/riscv/c/a44fb5722199
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* Re: [PATCH v10 2/2] riscv: Add runtime constant support
2025-03-19 18:35 ` [PATCH v10 2/2] riscv: Add runtime constant support Charlie Jenkins
@ 2025-03-28 15:42 ` Klara Modin
2025-03-28 17:35 ` Alexandre Ghiti
2025-03-28 19:51 ` Charlie Jenkins
2025-04-01 19:28 ` Nathan Chancellor
1 sibling, 2 replies; 12+ messages in thread
From: Klara Modin @ 2025-03-28 15:42 UTC
To: Charlie Jenkins, Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel,
Ben Dooks, Pasha Bouzarjomehri, Emil Renner Berthing,
Alexandre Ghiti, Steven Rostedt, Masami Hiramatsu, Mark Rutland,
Albert Ou, Peter Zijlstra, Josh Poimboeuf, Jason Baron,
Andrew Jones
Cc: linux-riscv, linux-kernel, linux-trace-kernel
[-- Attachment #1: Type: text/plain, Size: 12598 bytes --]
Hi,
On 3/19/25 19:35, Charlie Jenkins wrote:
> Implement the runtime constant infrastructure for riscv. Use this
> infrastructure to generate constants to be used by the d_hash()
> function.
>
> This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
> constant' support") and commit e3c92e81711d ("runtime constants: add
> x86 architecture support").
This patch causes the following build failure for me:
fs/dcache.c: Assembler messages:
fs/dcache.c:157: Error: attempt to move .org backwards
fs/dcache.c:157: Error: attempt to move .org backwards
fs/dcache.c:157: Error: attempt to move .org backwards
fs/dcache.c:157: Error: attempt to move .org backwards
fs/dcache.c:157: Error: attempt to move .org backwards
make[3]: *** [scripts/Makefile.build:203: fs/dcache.o] Error 1
The value of CONFIG_RISCV_ISA_ZBKB doesn't seem to have an impact.
Reverting the patch on top of next-20250328 resolved the issue for me. I
attached the generated fs/dcache.s.
Please let me know if there's anything else you need.
Regards,
Klara Modin
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> ---
> arch/riscv/Kconfig | 22 +++
> arch/riscv/include/asm/asm.h | 1 +
> arch/riscv/include/asm/runtime-const.h | 265 +++++++++++++++++++++++++++++++++
> arch/riscv/kernel/vmlinux.lds.S | 3 +
> 4 files changed, 291 insertions(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 7612c52e9b1e35607f1dd4603a596416d3357a71..c123f7c0579c1aca839e3c04bdb662d6856ae765 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -783,6 +783,28 @@ config RISCV_ISA_ZBC
>
> If you don't know what to do here, say Y.
>
> +config TOOLCHAIN_HAS_ZBKB
> + bool
> + default y
> + depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbkb)
> + depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbkb)
> + depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
> + depends on AS_HAS_OPTION_ARCH
> +
> +config RISCV_ISA_ZBKB
> + bool "Zbkb extension support for bit manipulation instructions"
> + depends on TOOLCHAIN_HAS_ZBKB
> + depends on RISCV_ALTERNATIVE
> + default y
> + help
> + Adds support to dynamically detect the presence of the ZBKB
> + extension (bit manipulation for cryptography) and enable its usage.
> +
> + The Zbkb extension provides instructions to accelerate a number
> + of common cryptography operations (pack, zip, etc).
> +
> + If you don't know what to do here, say Y.
> +
> config RISCV_ISA_ZICBOM
> bool "Zicbom extension support for non-coherent DMA operation"
> depends on MMU
> diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
> index 776354895b81e7dc332e58265548aaf7365a6037..a8a2af6dfe9d2406625ca8fc94014fe5180e4fec 100644
> --- a/arch/riscv/include/asm/asm.h
> +++ b/arch/riscv/include/asm/asm.h
> @@ -27,6 +27,7 @@
> #define REG_ASM __REG_SEL(.dword, .word)
> #define SZREG __REG_SEL(8, 4)
> #define LGREG __REG_SEL(3, 2)
> +#define SRLI __REG_SEL(srliw, srli)
>
> #if __SIZEOF_POINTER__ == 8
> #ifdef __ASSEMBLY__
> diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
> new file mode 100644
> index 0000000000000000000000000000000000000000..a23a9bd47903b2765608c75cd83f01ae578dffaa
> --- /dev/null
> +++ b/arch/riscv/include/asm/runtime-const.h
> @@ -0,0 +1,265 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_RISCV_RUNTIME_CONST_H
> +#define _ASM_RISCV_RUNTIME_CONST_H
> +
> +#include <asm/asm.h>
> +#include <asm/alternative.h>
> +#include <asm/cacheflush.h>
> +#include <asm/insn-def.h>
> +#include <linux/memory.h>
> +#include <asm/text-patching.h>
> +
> +#include <linux/uaccess.h>
> +
> +#ifdef CONFIG_32BIT
> +#define runtime_const_ptr(sym) \
> +({ \
> + typeof(sym) __ret; \
> + asm_inline(".option push\n\t" \
> + ".option norvc\n\t" \
> + "1:\t" \
> + "lui %[__ret],0x89abd\n\t" \
> + "addi %[__ret],%[__ret],-0x211\n\t" \
> + ".option pop\n\t" \
> + ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
> + ".long 1b - .\n\t" \
> + ".popsection" \
> + : [__ret] "=r" (__ret)); \
> + __ret; \
> +})
> +#else
> +/*
> + * Loading 64-bit constants into a register from immediates is a non-trivial
> + * task on riscv64. To get it somewhat performant, load 32 bits into two
> + * different registers and then combine the results.
> + *
> + * If the processor supports the Zbkb extension, we can combine the final
> + * "slli,slli,srli,add" into the single "pack" instruction. If the processor
> + * doesn't support Zbkb but does support the Zba extension, we can
> + * combine the final "slli,srli,add" into one instruction "add.uw".
> + */
> +#define RISCV_RUNTIME_CONST_64_PREAMBLE \
> + ".option push\n\t" \
> + ".option norvc\n\t" \
> + "1:\t" \
> + "lui %[__ret],0x89abd\n\t" \
> + "lui %[__tmp],0x1234\n\t" \
> + "addiw %[__ret],%[__ret],-0x211\n\t" \
> + "addiw %[__tmp],%[__tmp],0x567\n\t" \
> +
> +#define RISCV_RUNTIME_CONST_64_BASE \
> + "slli %[__tmp],%[__tmp],32\n\t" \
> + "slli %[__ret],%[__ret],32\n\t" \
> + "srli %[__ret],%[__ret],32\n\t" \
> + "add %[__ret],%[__ret],%[__tmp]\n\t" \
> +
> +#define RISCV_RUNTIME_CONST_64_ZBA \
> + ".option push\n\t" \
> + ".option arch,+zba\n\t" \
> + "slli %[__tmp],%[__tmp],32\n\t" \
> + "add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
> + "nop\n\t" \
> + "nop\n\t" \
> + ".option pop\n\t" \
> +
> +#define RISCV_RUNTIME_CONST_64_ZBKB \
> + ".option push\n\t" \
> + ".option arch,+zbkb\n\t" \
> + "pack %[__ret],%[__ret],%[__tmp]\n\t" \
> + "nop\n\t" \
> + "nop\n\t" \
> + "nop\n\t" \
> + ".option pop\n\t" \
> +
> +#define RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> + ".option pop\n\t" \
> + ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
> + ".long 1b - .\n\t" \
> + ".popsection" \
> +
> +#if defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_RISCV_ISA_ZBKB)
> +#define runtime_const_ptr(sym) \
> +({ \
> + typeof(sym) __ret, __tmp; \
> + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> + ALTERNATIVE_2( \
> + RISCV_RUNTIME_CONST_64_BASE, \
> + RISCV_RUNTIME_CONST_64_ZBA, \
> + 0, RISCV_ISA_EXT_ZBA, 1, \
> + RISCV_RUNTIME_CONST_64_ZBKB, \
> + 0, RISCV_ISA_EXT_ZBKB, 1 \
> + ) \
> + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> + __ret; \
> +})
> +#elif defined(CONFIG_RISCV_ISA_ZBA)
> +#define runtime_const_ptr(sym) \
> +({ \
> + typeof(sym) __ret, __tmp; \
> + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> + ALTERNATIVE( \
> + RISCV_RUNTIME_CONST_64_BASE, \
> + RISCV_RUNTIME_CONST_64_ZBA, \
> + 0, RISCV_ISA_EXT_ZBA, 1 \
> + ) \
> + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> + __ret; \
> +})
> +#elif defined(CONFIG_RISCV_ISA_ZBKB)
> +#define runtime_const_ptr(sym) \
> +({ \
> + typeof(sym) __ret, __tmp; \
> + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> + ALTERNATIVE( \
> + RISCV_RUNTIME_CONST_64_BASE, \
> + RISCV_RUNTIME_CONST_64_ZBKB, \
> + 0, RISCV_ISA_EXT_ZBKB, 1 \
> + ) \
> + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> + __ret; \
> +})
> +#else
> +#define runtime_const_ptr(sym) \
> +({ \
> + typeof(sym) __ret, __tmp; \
> + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> + RISCV_RUNTIME_CONST_64_BASE \
> + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> + __ret; \
> +})
> +#endif
> +#endif
> +
> +#define runtime_const_shift_right_32(val, sym) \
> +({ \
> + u32 __ret; \
> + asm_inline(".option push\n\t" \
> + ".option norvc\n\t" \
> + "1:\t" \
> + SRLI " %[__ret],%[__val],12\n\t" \
> + ".option pop\n\t" \
> + ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
> + ".long 1b - .\n\t" \
> + ".popsection" \
> + : [__ret] "=r" (__ret) \
> + : [__val] "r" (val)); \
> + __ret; \
> +})
> +
> +#define runtime_const_init(type, sym) do { \
> + extern s32 __start_runtime_##type##_##sym[]; \
> + extern s32 __stop_runtime_##type##_##sym[]; \
> + \
> + runtime_const_fixup(__runtime_fixup_##type, \
> + (unsigned long)(sym), \
> + __start_runtime_##type##_##sym, \
> + __stop_runtime_##type##_##sym); \
> +} while (0)
> +
> +static inline void __runtime_fixup_caches(void *where, unsigned int insns)
> +{
> + /* On riscv there are currently only cache-wide flushes so va is ignored. */
> + __always_unused uintptr_t va = (uintptr_t)where;
> +
> + flush_icache_range(va, va + 4 * insns);
> +}
> +
> +/*
> + * The 32-bit immediate is stored in a lui+addi pairing.
> + * lui holds the upper 20 bits of the immediate in the first 20 bits of the instruction.
> + * addi holds the lower 12 bits of the immediate in the first 12 bits of the instruction.
> + */
> +static inline void __runtime_fixup_32(__le16 *lui_parcel, __le16 *addi_parcel, unsigned int val)
> +{
> + unsigned int lower_immediate, upper_immediate;
> + u32 lui_insn, addi_insn, addi_insn_mask;
> + __le32 lui_res, addi_res;
> +
> + /* Mask out upper 12 bit of addi */
> + addi_insn_mask = 0x000fffff;
> +
> + lui_insn = (u32)le16_to_cpu(lui_parcel[0]) | (u32)le16_to_cpu(lui_parcel[1]) << 16;
> + addi_insn = (u32)le16_to_cpu(addi_parcel[0]) | (u32)le16_to_cpu(addi_parcel[1]) << 16;
> +
> + lower_immediate = sign_extend32(val, 11);
> + upper_immediate = (val - lower_immediate);
> +
> + if (upper_immediate & 0xfffff000) {
> + /* replace upper 20 bits of lui with upper immediate */
> + lui_insn &= 0x00000fff;
> + lui_insn |= upper_immediate & 0xfffff000;
> + } else {
> + /* replace lui with nop if immediate is small enough to fit in addi */
> + lui_insn = RISCV_INSN_NOP4;
> + /*
> + * lui is being skipped, so do a load instead of an add. A load
> + * is performed by adding with the x0 register. Setting rs to
> + * zero with the following mask will accomplish this goal.
> + */
> + addi_insn_mask &= 0x07fff;
> + }
> +
> + if (lower_immediate & 0x00000fff) {
> + /* replace upper 12 bits of addi with lower 12 bits of val */
> + addi_insn &= addi_insn_mask;
> + addi_insn |= (lower_immediate & 0x00000fff) << 20;
> + } else {
> + /* replace addi with nop if lower_immediate is empty */
> + addi_insn = RISCV_INSN_NOP4;
> + }
> +
> + addi_res = cpu_to_le32(addi_insn);
> + lui_res = cpu_to_le32(lui_insn);
> + mutex_lock(&text_mutex);
> + patch_insn_write(addi_parcel, &addi_res, sizeof(addi_res));
> + patch_insn_write(lui_parcel, &lui_res, sizeof(lui_res));
> + mutex_unlock(&text_mutex);
> +}
> +
> +static inline void __runtime_fixup_ptr(void *where, unsigned long val)
> +{
> +#ifdef CONFIG_32BIT
> + __runtime_fixup_32(where, where + 4, val);
> + __runtime_fixup_caches(where, 2);
> +#else
> + __runtime_fixup_32(where, where + 8, val);
> + __runtime_fixup_32(where + 4, where + 12, val >> 32);
> + __runtime_fixup_caches(where, 4);
> +#endif
> +}
> +
> +/*
> + * Replace the least significant 5 bits of the srli/srliw immediate that is
> + * located at bits 20-24
> + */
> +static inline void __runtime_fixup_shift(void *where, unsigned long val)
> +{
> + __le16 *parcel = where;
> + __le32 res;
> + u32 insn;
> +
> + insn = (u32)le16_to_cpu(parcel[0]) | (u32)le16_to_cpu(parcel[1]) << 16;
> +
> + insn &= 0xfe0fffff;
> + insn |= (val & 0b11111) << 20;
> +
> + res = cpu_to_le32(insn);
> + mutex_lock(&text_mutex);
> + patch_text_nosync(where, &res, sizeof(insn));
> + mutex_unlock(&text_mutex);
> +}
> +
> +static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
> + unsigned long val, s32 *start, s32 *end)
> +{
> + while (start < end) {
> + fn(*start + (void *)start, val);
> + start++;
> + }
> +}
> +
> +#endif /* _ASM_RISCV_RUNTIME_CONST_H */
> diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> index 002ca58dd998cb78b662837b5ebac988fb6c77bb..61bd5ba6680a786bf1db7dc37bf1acda0639b5c7 100644
> --- a/arch/riscv/kernel/vmlinux.lds.S
> +++ b/arch/riscv/kernel/vmlinux.lds.S
> @@ -97,6 +97,9 @@ SECTIONS
> {
> EXIT_DATA
> }
> +
> + RUNTIME_CONST_VARIABLES
> +
> PERCPU_SECTION(L1_CACHE_BYTES)
>
> .rel.dyn : {
>
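Editorial note: the immediate split performed by __runtime_fixup_32() in the quoted patch can be sketched in plain C as below. This is a minimal userspace model, not the kernel code: the helper names (addi_imm, lui_part, patch_lui, patch_addi) are hypothetical, and the nop substitution for zero upper or lower parts is left out. It only shows why the lower 12 bits are sign-extended first and then subtracted from the value to get the lui part.

```c
#include <stdint.h>

/* addi sign-extends its 12-bit immediate; model that for the low 12 bits of val. */
static int32_t addi_imm(uint32_t val)
{
	return (int32_t)((val & 0xfffu) ^ 0x800u) - 0x800;
}

/* Value lui must produce so that lui_part(val) + addi_imm(val) == val (mod 2^32). */
static uint32_t lui_part(uint32_t val)
{
	return val - (uint32_t)addi_imm(val);
}

/* Keep opcode and rd (bits 11:0), replace the immediate in bits 31:12. */
static uint32_t patch_lui(uint32_t insn, uint32_t val)
{
	return (insn & 0x00000fffu) | (lui_part(val) & 0xfffff000u);
}

/* Keep opcode, rd, funct3 and rs1 (bits 19:0), replace the immediate in bits 31:20. */
static uint32_t patch_addi(uint32_t insn, uint32_t val)
{
	return (insn & 0x000fffffu) | (((uint32_t)addi_imm(val) & 0xfffu) << 20);
}
```

For the placeholder value 0x89abcdef this gives a lui part of 0x89abd000 and an addi immediate of -0x211, which matches the "lui 0x89abd" / "addi -0x211" pair in the patch.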
[-- Attachment #2: riscv-org-move-bisect --]
[-- Type: text/plain, Size: 2780 bytes --]
# bad: [e21edb1638e82460f126a6e49bcdd958d452929c] Add linux-next specific files for 20250328
git bisect start 'next/master'
# status: waiting for good commit(s), bad commit known
# good: [5c2a430e85994f4873ea5ec42091baa1153bc731] Merge tag 'ext4-for_linus-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
git bisect good 5c2a430e85994f4873ea5ec42091baa1153bc731
# bad: [82dd76474d886e4e272cd3a6ce9e4a5cf193961d] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
git bisect bad 82dd76474d886e4e272cd3a6ce9e4a5cf193961d
# good: [04c3b8bb1d6173c070927ad07baae14aa3cda0b5] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git
git bisect good 04c3b8bb1d6173c070927ad07baae14aa3cda0b5
# bad: [cf63bbbadeaaae64b5a42bf72c5f842ec412f006] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git
git bisect bad cf63bbbadeaaae64b5a42bf72c5f842ec412f006
# bad: [5744739d8ce78876d4b62a20fd6bc65f01ba142e] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git
git bisect bad 5744739d8ce78876d4b62a20fd6bc65f01ba142e
# good: [6d1617d154d6759e6aa269bdb387ea8d7bd4bf52] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git
git bisect good 6d1617d154d6759e6aa269bdb387ea8d7bd4bf52
# good: [861efb8a48ee8b73ae4e8817509cd4e82fd52bc4] powerpc/kexec: fix physical address calculation in clear_utlb_entry()
git bisect good 861efb8a48ee8b73ae4e8817509cd4e82fd52bc4
# bad: [74f4bf9d15ad1d6862b828d486ed10ea0e874a23] Merge patch series "riscv: Add runtime constant support"
git bisect bad 74f4bf9d15ad1d6862b828d486ed10ea0e874a23
# good: [82e81b89501a9f19a3a0b0d7d9641d86c1956284] riscv: migrate to the generic rule for built-in DTB
git bisect good 82e81b89501a9f19a3a0b0d7d9641d86c1956284
# good: [5b376a68da0a395d54684987efde647d8fa9027c] Merge patch series "riscv: add support for Zaamo and Zalrsc extensions"
git bisect good 5b376a68da0a395d54684987efde647d8fa9027c
# good: [2744ec472de31141ad354907ff98843dd6040917] riscv: Fix set up of vector cpu hotplug callback
git bisect good 2744ec472de31141ad354907ff98843dd6040917
# good: [d9b65824d8f8b69a9d80a5b4d1a8a52c3244f9c0] Merge patch series "riscv: Unaligned access speed probing fixes and skipping"
git bisect good d9b65824d8f8b69a9d80a5b4d1a8a52c3244f9c0
# bad: [a44fb5722199de8338d991db5ad3d509192179bb] riscv: Add runtime constant support
git bisect bad a44fb5722199de8338d991db5ad3d509192179bb
# good: [afa8a93932aa63b107d81bd438454760d8c7c8a3] riscv: Move nop definition to insn-def.h
git bisect good afa8a93932aa63b107d81bd438454760d8c7c8a3
# first bad commit: [a44fb5722199de8338d991db5ad3d509192179bb] riscv: Add runtime constant support
[-- Attachment #3: config-zbkb.gz --]
[-- Type: application/gzip, Size: 28566 bytes --]
[-- Attachment #4: dcache.s.gz --]
[-- Type: application/gzip, Size: 73998 bytes --]
* Re: [PATCH v10 2/2] riscv: Add runtime constant support
2025-03-28 15:42 ` Klara Modin
@ 2025-03-28 17:35 ` Alexandre Ghiti
2025-03-28 20:22 ` Klara Modin
2025-03-28 19:51 ` Charlie Jenkins
1 sibling, 1 reply; 12+ messages in thread
From: Alexandre Ghiti @ 2025-03-28 17:35 UTC (permalink / raw)
To: Klara Modin
Cc: Charlie Jenkins, Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel,
Ben Dooks, Pasha Bouzarjomehri, Emil Renner Berthing,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones,
linux-riscv, linux-kernel, linux-trace-kernel
Hi Klara,
On Fri, Mar 28, 2025 at 4:42 PM Klara Modin <klarasmodin@gmail.com> wrote:
>
> Hi,
>
> On 3/19/25 19:35, Charlie Jenkins wrote:
> > Implement the runtime constant infrastructure for riscv. Use this
> > infrastructure to generate constants to be used by the d_hash()
> > function.
> >
> > This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
> > constant' support") and commit e3c92e81711d ("runtime constants: add
> > x86 architecture support").
>
> This patch causes the following build failure for me:
>
> fs/dcache.c: Assembler messages:
> fs/dcache.c:157: Error: attempt to move .org backwards
> fs/dcache.c:157: Error: attempt to move .org backwards
> fs/dcache.c:157: Error: attempt to move .org backwards
> fs/dcache.c:157: Error: attempt to move .org backwards
> fs/dcache.c:157: Error: attempt to move .org backwards
> make[3]: *** [scripts/Makefile.build:203: fs/dcache.o] Error 1
>
> The value of CONFIG_RISCV_ISA_ZBKB doesn't seem to have an impact.
> Reverting the patch on top of next-20250328 resolved the issue for me. I
> attached the generated fs/dcache.s.
Thanks for your report!

The kernel test robot reported the following issue. Do you see the same errors?

fs/dcache.c: Assembler messages:
>> fs/dcache.c:143: Warning: Unrecognized .option directive: arch,+zba
--
>> fs/dcache.c:145: Error: unrecognized opcode `add.uw s1,s1,a5'
>> fs/dcache.c:143: Warning: Unrecognized .option directive: arch,+zba
--
>> fs/dcache.c:145: Error: unrecognized opcode `add.uw a4,a4,a5'
>> fs/dcache.c:143: Warning: Unrecognized .option directive: arch,+zba
--
>> fs/dcache.c:145: Error: unrecognized opcode `add.uw s4,s4,a5'
>> fs/dcache.c:143: Warning: Unrecognized .option directive: arch,+zba
--
>> fs/dcache.c:145: Error: unrecognized opcode `add.uw s1,s1,a5'
>> fs/dcache.c:152: Error: attempt to move .org backwards
>> fs/dcache.c:152: Error: attempt to move .org backwards
>> fs/dcache.c:152: Error: attempt to move .org backwards
>> fs/dcache.c:152: Error: attempt to move .org backwards

If so, I have sent a fix; feel free to add your Tested-by:

https://lore.kernel.org/linux-riscv/c0f425ec-6c76-45b2-b1bc-8d9be028a878@rivosinc.com/T/#me1469bfb2e6f69e1422a136014b753a6acaa3bc6
Thanks,
Alex
>
> Please let me know if there's anything else you need.
>
> Regards,
> Klara Modin
>
> >
> > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > ---
> > arch/riscv/Kconfig | 22 +++
> > arch/riscv/include/asm/asm.h | 1 +
> > arch/riscv/include/asm/runtime-const.h | 265 +++++++++++++++++++++++++++++++++
> > arch/riscv/kernel/vmlinux.lds.S | 3 +
> > 4 files changed, 291 insertions(+)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index 7612c52e9b1e35607f1dd4603a596416d3357a71..c123f7c0579c1aca839e3c04bdb662d6856ae765 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -783,6 +783,28 @@ config RISCV_ISA_ZBC
> >
> > If you don't know what to do here, say Y.
> >
> > +config TOOLCHAIN_HAS_ZBKB
> > + bool
> > + default y
> > + depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbkb)
> > + depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbkb)
> > + depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
> > + depends on AS_HAS_OPTION_ARCH
> > +
> > +config RISCV_ISA_ZBKB
> > + bool "Zbkb extension support for bit manipulation instructions"
> > + depends on TOOLCHAIN_HAS_ZBKB
> > + depends on RISCV_ALTERNATIVE
> > + default y
> > + help
> > + Adds support to dynamically detect the presence of the ZBKB
> > + extension (bit manipulation for cryptography) and enable its usage.
> > +
> > + The Zbkb extension provides instructions to accelerate a number
> > + of common cryptography operations (pack, zip, etc).
> > +
> > + If you don't know what to do here, say Y.
> > +
> > config RISCV_ISA_ZICBOM
> > bool "Zicbom extension support for non-coherent DMA operation"
> > depends on MMU
> > diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
> > index 776354895b81e7dc332e58265548aaf7365a6037..a8a2af6dfe9d2406625ca8fc94014fe5180e4fec 100644
> > --- a/arch/riscv/include/asm/asm.h
> > +++ b/arch/riscv/include/asm/asm.h
> > @@ -27,6 +27,7 @@
> > #define REG_ASM __REG_SEL(.dword, .word)
> > #define SZREG __REG_SEL(8, 4)
> > #define LGREG __REG_SEL(3, 2)
> > +#define SRLI __REG_SEL(srliw, srli)
> >
> > #if __SIZEOF_POINTER__ == 8
> > #ifdef __ASSEMBLY__
> > diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..a23a9bd47903b2765608c75cd83f01ae578dffaa
> > --- /dev/null
> > +++ b/arch/riscv/include/asm/runtime-const.h
> > @@ -0,0 +1,265 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_RISCV_RUNTIME_CONST_H
> > +#define _ASM_RISCV_RUNTIME_CONST_H
> > +
> > +#include <asm/asm.h>
> > +#include <asm/alternative.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/insn-def.h>
> > +#include <linux/memory.h>
> > +#include <asm/text-patching.h>
> > +
> > +#include <linux/uaccess.h>
> > +
> > +#ifdef CONFIG_32BIT
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret; \
> > + asm_inline(".option push\n\t" \
> > + ".option norvc\n\t" \
> > + "1:\t" \
> > + "lui %[__ret],0x89abd\n\t" \
> > + "addi %[__ret],%[__ret],-0x211\n\t" \
> > + ".option pop\n\t" \
> > + ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
> > + ".long 1b - .\n\t" \
> > + ".popsection" \
> > + : [__ret] "=r" (__ret)); \
> > + __ret; \
> > +})
> > +#else
> > +/*
> > + * Loading 64-bit constants into a register from immediates is a non-trivial
> > + * task on riscv64. To get it somewhat performant, load 32 bits into two
> > + * different registers and then combine the results.
> > + *
> > + * If the processor supports the Zbkb extension, we can combine the final
> > + * "slli,slli,srli,add" into the single "pack" instruction. If the processor
> > + * doesn't support Zbkb but does support the Zba extension, we can
> > + * combine the final "slli,srli,add" into one instruction "add.uw".
> > + */
> > +#define RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ".option push\n\t" \
> > + ".option norvc\n\t" \
> > + "1:\t" \
> > + "lui %[__ret],0x89abd\n\t" \
> > + "lui %[__tmp],0x1234\n\t" \
> > + "addiw %[__ret],%[__ret],-0x211\n\t" \
> > + "addiw %[__tmp],%[__tmp],0x567\n\t" \
> > +
> > +#define RISCV_RUNTIME_CONST_64_BASE \
> > + "slli %[__tmp],%[__tmp],32\n\t" \
> > + "slli %[__ret],%[__ret],32\n\t" \
> > + "srli %[__ret],%[__ret],32\n\t" \
> > + "add %[__ret],%[__ret],%[__tmp]\n\t" \
> > +
> > +#define RISCV_RUNTIME_CONST_64_ZBA \
> > + ".option push\n\t" \
> > + ".option arch,+zba\n\t" \
> > + "slli %[__tmp],%[__tmp],32\n\t" \
> > + "add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
> > + "nop\n\t" \
> > + "nop\n\t" \
> > + ".option pop\n\t" \
> > +
> > +#define RISCV_RUNTIME_CONST_64_ZBKB \
> > + ".option push\n\t" \
> > + ".option arch,+zbkb\n\t" \
> > + "pack %[__ret],%[__ret],%[__tmp]\n\t" \
> > + "nop\n\t" \
> > + "nop\n\t" \
> > + "nop\n\t" \
> > + ".option pop\n\t" \
> > +
> > +#define RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + ".option pop\n\t" \
> > + ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
> > + ".long 1b - .\n\t" \
> > + ".popsection" \
> > +
> > +#if defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_RISCV_ISA_ZBKB)
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ALTERNATIVE_2( \
> > + RISCV_RUNTIME_CONST_64_BASE, \
> > + RISCV_RUNTIME_CONST_64_ZBA, \
> > + 0, RISCV_ISA_EXT_ZBA, 1, \
> > + RISCV_RUNTIME_CONST_64_ZBKB, \
> > + 0, RISCV_ISA_EXT_ZBKB, 1 \
> > + ) \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
> > +#elif defined(CONFIG_RISCV_ISA_ZBA)
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ALTERNATIVE( \
> > + RISCV_RUNTIME_CONST_64_BASE, \
> > + RISCV_RUNTIME_CONST_64_ZBA, \
> > + 0, RISCV_ISA_EXT_ZBA, 1 \
> > + ) \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
> > +#elif defined(CONFIG_RISCV_ISA_ZBKB)
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ALTERNATIVE( \
> > + RISCV_RUNTIME_CONST_64_BASE, \
> > + RISCV_RUNTIME_CONST_64_ZBKB, \
> > + 0, RISCV_ISA_EXT_ZBKB, 1 \
> > + ) \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
> > +#else
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + RISCV_RUNTIME_CONST_64_BASE \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
> > +#endif
> > +#endif
> > +
> > +#define runtime_const_shift_right_32(val, sym) \
> > +({ \
> > + u32 __ret; \
> > + asm_inline(".option push\n\t" \
> > + ".option norvc\n\t" \
> > + "1:\t" \
> > + SRLI " %[__ret],%[__val],12\n\t" \
> > + ".option pop\n\t" \
> > + ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
> > + ".long 1b - .\n\t" \
> > + ".popsection" \
> > + : [__ret] "=r" (__ret) \
> > + : [__val] "r" (val)); \
> > + __ret; \
> > +})
> > +
> > +#define runtime_const_init(type, sym) do { \
> > + extern s32 __start_runtime_##type##_##sym[]; \
> > + extern s32 __stop_runtime_##type##_##sym[]; \
> > + \
> > + runtime_const_fixup(__runtime_fixup_##type, \
> > + (unsigned long)(sym), \
> > + __start_runtime_##type##_##sym, \
> > + __stop_runtime_##type##_##sym); \
> > +} while (0)
> > +
> > +static inline void __runtime_fixup_caches(void *where, unsigned int insns)
> > +{
> > + /* On riscv there are currently only cache-wide flushes so va is ignored. */
> > + __always_unused uintptr_t va = (uintptr_t)where;
> > +
> > + flush_icache_range(va, va + 4 * insns);
> > +}
> > +
> > +/*
> > + * The 32-bit immediate is stored in a lui+addi pairing.
> > + * lui holds the upper 20 bits of the immediate in bits 31:12 of the instruction.
> > + * addi holds the lower 12 bits of the immediate in bits 31:20 of the instruction.
> > + */
> > +static inline void __runtime_fixup_32(__le16 *lui_parcel, __le16 *addi_parcel, unsigned int val)
> > +{
> > + unsigned int lower_immediate, upper_immediate;
> > + u32 lui_insn, addi_insn, addi_insn_mask;
> > + __le32 lui_res, addi_res;
> > +
> > + /* Mask out upper 12 bit of addi */
> > + addi_insn_mask = 0x000fffff;
> > +
> > + lui_insn = (u32)le16_to_cpu(lui_parcel[0]) | (u32)le16_to_cpu(lui_parcel[1]) << 16;
> > + addi_insn = (u32)le16_to_cpu(addi_parcel[0]) | (u32)le16_to_cpu(addi_parcel[1]) << 16;
> > +
> > + lower_immediate = sign_extend32(val, 11);
> > + upper_immediate = (val - lower_immediate);
> > +
> > + if (upper_immediate & 0xfffff000) {
> > + /* replace upper 20 bits of lui with upper immediate */
> > + lui_insn &= 0x00000fff;
> > + lui_insn |= upper_immediate & 0xfffff000;
> > + } else {
> > + /* replace lui with nop if immediate is small enough to fit in addi */
> > + lui_insn = RISCV_INSN_NOP4;
> > + /*
> > + * lui is being skipped, so do a load instead of an add. A load
> > + * is performed by adding with the x0 register. Setting rs to
> > + * zero with the following mask will accomplish this goal.
> > + */
> > + addi_insn_mask &= 0x07fff;
> > + }
> > +
> > + if (lower_immediate & 0x00000fff) {
> > + /* replace upper 12 bits of addi with lower 12 bits of val */
> > + addi_insn &= addi_insn_mask;
> > + addi_insn |= (lower_immediate & 0x00000fff) << 20;
> > + } else {
> > + /* replace addi with nop if lower_immediate is empty */
> > + addi_insn = RISCV_INSN_NOP4;
> > + }
> > +
> > + addi_res = cpu_to_le32(addi_insn);
> > + lui_res = cpu_to_le32(lui_insn);
> > + mutex_lock(&text_mutex);
> > + patch_insn_write(addi_parcel, &addi_res, sizeof(addi_res));
> > + patch_insn_write(lui_parcel, &lui_res, sizeof(lui_res));
> > + mutex_unlock(&text_mutex);
> > +}
> > +
> > +static inline void __runtime_fixup_ptr(void *where, unsigned long val)
> > +{
> > +#ifdef CONFIG_32BIT
> > + __runtime_fixup_32(where, where + 4, val);
> > + __runtime_fixup_caches(where, 2);
> > +#else
> > + __runtime_fixup_32(where, where + 8, val);
> > + __runtime_fixup_32(where + 4, where + 12, val >> 32);
> > + __runtime_fixup_caches(where, 4);
> > +#endif
> > +}
> > +
> > +/*
> > + * Replace the least significant 5 bits of the srli/srliw immediate that is
> > + * located at bits 20-24
> > + */
> > +static inline void __runtime_fixup_shift(void *where, unsigned long val)
> > +{
> > + __le16 *parcel = where;
> > + __le32 res;
> > + u32 insn;
> > +
> > + insn = (u32)le16_to_cpu(parcel[0]) | (u32)le16_to_cpu(parcel[1]) << 16;
> > +
> > + insn &= 0xfe0fffff;
> > + insn |= (val & 0b11111) << 20;
> > +
> > + res = cpu_to_le32(insn);
> > + mutex_lock(&text_mutex);
> > + patch_text_nosync(where, &res, sizeof(insn));
> > + mutex_unlock(&text_mutex);
> > +}
> > +
> > +static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
> > + unsigned long val, s32 *start, s32 *end)
> > +{
> > + while (start < end) {
> > + fn(*start + (void *)start, val);
> > + start++;
> > + }
> > +}
> > +
> > +#endif /* _ASM_RISCV_RUNTIME_CONST_H */
> > diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> > index 002ca58dd998cb78b662837b5ebac988fb6c77bb..61bd5ba6680a786bf1db7dc37bf1acda0639b5c7 100644
> > --- a/arch/riscv/kernel/vmlinux.lds.S
> > +++ b/arch/riscv/kernel/vmlinux.lds.S
> > @@ -97,6 +97,9 @@ SECTIONS
> > {
> > EXIT_DATA
> > }
> > +
> > + RUNTIME_CONST_VARIABLES
> > +
> > PERCPU_SECTION(L1_CACHE_BYTES)
> >
> > .rel.dyn : {
> >
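Editorial note: the three 64-bit combining sequences in the quoted patch (RISCV_RUNTIME_CONST_64_BASE, _ZBA and _ZBKB) are meant to produce the same result. The sketch below models them in portable C under the assumption that lui+addiw leave a sign-extended 32-bit value in each register; the function names are hypothetical and this is an illustration of the arithmetic, not kernel code.

```c
#include <stdint.h>

/* lui+addiw leave a sign-extended 32-bit value in a 64-bit register. */
static uint64_t sext32(uint32_t v)
{
	return (uint64_t)(int64_t)(int32_t)v;
}

/* Base sequence: slli tmp,32 ; slli ret,32 ; srli ret,32 ; add ret,ret,tmp */
static uint64_t combine_base(uint64_t ret, uint64_t tmp)
{
	tmp <<= 32;
	ret <<= 32;
	ret >>= 32;	/* clears the sign-extension bits left by addiw */
	return ret + tmp;
}

/* Zba sequence: slli tmp,32 ; add.uw ret,ret,tmp (tmp + zero-extended low word of ret) */
static uint64_t combine_zba(uint64_t ret, uint64_t tmp)
{
	tmp <<= 32;
	return (ret & 0xffffffffu) + tmp;
}

/* Zbkb sequence: pack ret,ret,tmp (low word of ret below, low word of tmp above) */
static uint64_t combine_zbkb(uint64_t ret, uint64_t tmp)
{
	return (ret & 0xffffffffu) | (tmp << 32);
}

/* Split val into two 32-bit halves as the preamble does, then combine them. */
static uint64_t load_const(uint64_t val, uint64_t (*combine)(uint64_t, uint64_t))
{
	return combine(sext32((uint32_t)val), sext32((uint32_t)(val >> 32)));
}
```

The zero-extension step matters whenever bit 31 of the low half is set, as with the 0x89abcdef placeholder: without it the sign bits from addiw would corrupt the upper half.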
* Re: [PATCH v10 2/2] riscv: Add runtime constant support
2025-03-28 15:42 ` Klara Modin
2025-03-28 17:35 ` Alexandre Ghiti
@ 2025-03-28 19:51 ` Charlie Jenkins
2025-03-28 20:22 ` Klara Modin
1 sibling, 1 reply; 12+ messages in thread
From: Charlie Jenkins @ 2025-03-28 19:51 UTC (permalink / raw)
To: Klara Modin
Cc: Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel, Ben Dooks,
Pasha Bouzarjomehri, Emil Renner Berthing, Alexandre Ghiti,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones,
linux-riscv, linux-kernel, linux-trace-kernel
On Fri, Mar 28, 2025 at 04:42:42PM +0100, Klara Modin wrote:
> Hi,
>
> On 3/19/25 19:35, Charlie Jenkins wrote:
> > Implement the runtime constant infrastructure for riscv. Use this
> > infrastructure to generate constants to be used by the d_hash()
> > function.
> >
> > This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
> > constant' support") and commit e3c92e81711d ("runtime constants: add
> > x86 architecture support").
>
> This patch causes the following build failure for me:
>
> fs/dcache.c: Assembler messages:
> fs/dcache.c:157: Error: attempt to move .org backwards
> fs/dcache.c:157: Error: attempt to move .org backwards
> fs/dcache.c:157: Error: attempt to move .org backwards
> fs/dcache.c:157: Error: attempt to move .org backwards
> fs/dcache.c:157: Error: attempt to move .org backwards
> make[3]: *** [scripts/Makefile.build:203: fs/dcache.o] Error 1
Thank you for the report. This looks like it may be a binutils issue;
I will look into it. Here is a minimal reproducible example:
886 :
.option push
.option norvc
nop
nop
.option pop
887 :
888 :
.option push
.option norvc
.option arch,+zba
nop
nop
.option pop
889 :
.org . - (887b - 886b) + (889b - 888b)
.org . - (889b - 888b) + (887b - 886b)
Removing the ".option arch,+zba" line fixes the issue, but that directive
shouldn't matter here: both blocks are two uncompressed nops.
- Charlie
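
Editorial note: the two .org lines in the reproducer act as an assembly-time check that both blocks have the same length. The C sketch below models only that intent, with hypothetical function names; it does not explain why binutils rejects the reproducer, which is the open question in this message.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * ".org . - a + b" moves the location counter to "current - a + b".
 * The assembler rejects the directive when that would move backwards,
 * i.e. when a > b.
 */
static bool org_ok(size_t a, size_t b)
{
	return b >= a;
}

/* Both checks together pass only when the two lengths are equal. */
static bool alt_lengths_ok(size_t first_len, size_t second_len)
{
	/* .org . - (887b - 886b) + (889b - 888b): fails if the first block is longer */
	/* .org . - (889b - 888b) + (887b - 886b): fails if the second block is longer */
	return org_ok(first_len, second_len) && org_ok(second_len, first_len);
}
```

With two 4-byte nops in each block, both lengths are 8 and the model accepts them, which is consistent with the expectation above that the reproducer should assemble.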
>
> The value of CONFIG_RISCV_ISA_ZBKB doesn't seem to have an impact. Reverting
> the patch on top of next-20250328 resolved the issue for me. I attached the
> generated fs/dcache.s.
>
> Please let me know if there's anything else you need.
>
> Regards,
> Klara Modin
>
> >
> > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > ---
> > arch/riscv/Kconfig | 22 +++
> > arch/riscv/include/asm/asm.h | 1 +
> > arch/riscv/include/asm/runtime-const.h | 265 +++++++++++++++++++++++++++++++++
> > arch/riscv/kernel/vmlinux.lds.S | 3 +
> > 4 files changed, 291 insertions(+)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index 7612c52e9b1e35607f1dd4603a596416d3357a71..c123f7c0579c1aca839e3c04bdb662d6856ae765 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -783,6 +783,28 @@ config RISCV_ISA_ZBC
> > If you don't know what to do here, say Y.
> > +config TOOLCHAIN_HAS_ZBKB
> > + bool
> > + default y
> > + depends on !64BIT || $(cc-option,-mabi=lp64 -march=rv64ima_zbkb)
> > + depends on !32BIT || $(cc-option,-mabi=ilp32 -march=rv32ima_zbkb)
> > + depends on LLD_VERSION >= 150000 || LD_VERSION >= 23900
> > + depends on AS_HAS_OPTION_ARCH
> > +
> > +config RISCV_ISA_ZBKB
> > + bool "Zbkb extension support for bit manipulation instructions"
> > + depends on TOOLCHAIN_HAS_ZBKB
> > + depends on RISCV_ALTERNATIVE
> > + default y
> > + help
> > + Adds support to dynamically detect the presence of the ZBKB
> > + extension (bit manipulation for cryptography) and enable its usage.
> > +
> > + The Zbkb extension provides instructions to accelerate a number
> > + of common cryptography operations (pack, zip, etc).
> > +
> > + If you don't know what to do here, say Y.
> > +
> > config RISCV_ISA_ZICBOM
> > bool "Zicbom extension support for non-coherent DMA operation"
> > depends on MMU
> > diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
> > index 776354895b81e7dc332e58265548aaf7365a6037..a8a2af6dfe9d2406625ca8fc94014fe5180e4fec 100644
> > --- a/arch/riscv/include/asm/asm.h
> > +++ b/arch/riscv/include/asm/asm.h
> > @@ -27,6 +27,7 @@
> > #define REG_ASM __REG_SEL(.dword, .word)
> > #define SZREG __REG_SEL(8, 4)
> > #define LGREG __REG_SEL(3, 2)
> > +#define SRLI __REG_SEL(srliw, srli)
> > #if __SIZEOF_POINTER__ == 8
> > #ifdef __ASSEMBLY__
> > diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..a23a9bd47903b2765608c75cd83f01ae578dffaa
> > --- /dev/null
> > +++ b/arch/riscv/include/asm/runtime-const.h
> > @@ -0,0 +1,265 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_RISCV_RUNTIME_CONST_H
> > +#define _ASM_RISCV_RUNTIME_CONST_H
> > +
> > +#include <asm/asm.h>
> > +#include <asm/alternative.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/insn-def.h>
> > +#include <linux/memory.h>
> > +#include <asm/text-patching.h>
> > +
> > +#include <linux/uaccess.h>
> > +
> > +#ifdef CONFIG_32BIT
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret; \
> > + asm_inline(".option push\n\t" \
> > + ".option norvc\n\t" \
> > + "1:\t" \
> > + "lui %[__ret],0x89abd\n\t" \
> > + "addi %[__ret],%[__ret],-0x211\n\t" \
> > + ".option pop\n\t" \
> > + ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
> > + ".long 1b - .\n\t" \
> > + ".popsection" \
> > + : [__ret] "=r" (__ret)); \
> > + __ret; \
> > +})
> > +#else
> > +/*
> > + * Loading 64-bit constants into a register from immediates is a non-trivial
> > + * task on riscv64. To get it somewhat performant, load 32 bits into two
> > + * different registers and then combine the results.
> > + *
> > + * If the processor supports the Zbkb extension, we can combine the final
> > + * "slli,slli,srli,add" into the single "pack" instruction. If the processor
> > + * doesn't support Zbkb but does support the Zba extension, we can
> > + * combine the final "slli,srli,add" into one instruction "add.uw".
> > + */
> > +#define RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ".option push\n\t" \
> > + ".option norvc\n\t" \
> > + "1:\t" \
> > + "lui %[__ret],0x89abd\n\t" \
> > + "lui %[__tmp],0x1234\n\t" \
> > + "addiw %[__ret],%[__ret],-0x211\n\t" \
> > + "addiw %[__tmp],%[__tmp],0x567\n\t" \
> > +
> > +#define RISCV_RUNTIME_CONST_64_BASE \
> > + "slli %[__tmp],%[__tmp],32\n\t" \
> > + "slli %[__ret],%[__ret],32\n\t" \
> > + "srli %[__ret],%[__ret],32\n\t" \
> > + "add %[__ret],%[__ret],%[__tmp]\n\t" \
> > +
> > +#define RISCV_RUNTIME_CONST_64_ZBA \
> > + ".option push\n\t" \
> > + ".option arch,+zba\n\t" \
> > + "slli %[__tmp],%[__tmp],32\n\t" \
> > + "add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
> > + "nop\n\t" \
> > + "nop\n\t" \
> > + ".option pop\n\t" \
> > +
> > +#define RISCV_RUNTIME_CONST_64_ZBKB \
> > + ".option push\n\t" \
> > + ".option arch,+zbkb\n\t" \
> > + "pack %[__ret],%[__ret],%[__tmp]\n\t" \
> > + "nop\n\t" \
> > + "nop\n\t" \
> > + "nop\n\t" \
> > + ".option pop\n\t" \
> > +
> > +#define RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + ".option pop\n\t" \
> > + ".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
> > + ".long 1b - .\n\t" \
> > + ".popsection" \
> > +
> > +#if defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_RISCV_ISA_ZBKB)
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ALTERNATIVE_2( \
> > + RISCV_RUNTIME_CONST_64_BASE, \
> > + RISCV_RUNTIME_CONST_64_ZBA, \
> > + 0, RISCV_ISA_EXT_ZBA, 1, \
> > + RISCV_RUNTIME_CONST_64_ZBKB, \
> > + 0, RISCV_ISA_EXT_ZBKB, 1 \
> > + ) \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
> > +#elif defined(CONFIG_RISCV_ISA_ZBA)
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ALTERNATIVE( \
> > + RISCV_RUNTIME_CONST_64_BASE, \
> > + RISCV_RUNTIME_CONST_64_ZBA, \
> > + 0, RISCV_ISA_EXT_ZBA, 1 \
> > + ) \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
> > +#elif defined(CONFIG_RISCV_ISA_ZBKB)
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ALTERNATIVE( \
> > + RISCV_RUNTIME_CONST_64_BASE, \
> > + RISCV_RUNTIME_CONST_64_ZBKB, \
> > + 0, RISCV_ISA_EXT_ZBKB, 1 \
> > + ) \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
> > +#else
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + RISCV_RUNTIME_CONST_64_BASE \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
> > +#endif
> > +#endif
> > +
> > +#define runtime_const_shift_right_32(val, sym) \
> > +({ \
> > + u32 __ret; \
> > + asm_inline(".option push\n\t" \
> > + ".option norvc\n\t" \
> > + "1:\t" \
> > + SRLI " %[__ret],%[__val],12\n\t" \
> > + ".option pop\n\t" \
> > + ".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
> > + ".long 1b - .\n\t" \
> > + ".popsection" \
> > + : [__ret] "=r" (__ret) \
> > + : [__val] "r" (val)); \
> > + __ret; \
> > +})
> > +
> > +#define runtime_const_init(type, sym) do { \
> > + extern s32 __start_runtime_##type##_##sym[]; \
> > + extern s32 __stop_runtime_##type##_##sym[]; \
> > + \
> > + runtime_const_fixup(__runtime_fixup_##type, \
> > + (unsigned long)(sym), \
> > + __start_runtime_##type##_##sym, \
> > + __stop_runtime_##type##_##sym); \
> > +} while (0)
> > +
> > +static inline void __runtime_fixup_caches(void *where, unsigned int insns)
> > +{
> > + /* On riscv there are currently only cache-wide flushes so va is ignored. */
> > + __always_unused uintptr_t va = (uintptr_t)where;
> > +
> > + flush_icache_range(va, va + 4 * insns);
> > +}
> > +
> > +/*
> > + * The 32-bit immediate is stored in a lui+addi pairing.
> > + * lui holds the upper 20 bits of the immediate in bits 31:12 of the instruction.
> > + * addi holds the lower 12 bits of the immediate in bits 31:20 of the instruction.
> > + */
> > +static inline void __runtime_fixup_32(__le16 *lui_parcel, __le16 *addi_parcel, unsigned int val)
> > +{
> > + unsigned int lower_immediate, upper_immediate;
> > + u32 lui_insn, addi_insn, addi_insn_mask;
> > + __le32 lui_res, addi_res;
> > +
> > + /* Mask out upper 12 bit of addi */
> > + addi_insn_mask = 0x000fffff;
> > +
> > + lui_insn = (u32)le16_to_cpu(lui_parcel[0]) | (u32)le16_to_cpu(lui_parcel[1]) << 16;
> > + addi_insn = (u32)le16_to_cpu(addi_parcel[0]) | (u32)le16_to_cpu(addi_parcel[1]) << 16;
> > +
> > + lower_immediate = sign_extend32(val, 11);
> > + upper_immediate = (val - lower_immediate);
> > +
> > + if (upper_immediate & 0xfffff000) {
> > + /* replace upper 20 bits of lui with upper immediate */
> > + lui_insn &= 0x00000fff;
> > + lui_insn |= upper_immediate & 0xfffff000;
> > + } else {
> > + /* replace lui with nop if immediate is small enough to fit in addi */
> > + lui_insn = RISCV_INSN_NOP4;
> > + /*
> > + * lui is being skipped, so do a load instead of an add. A load
> > + * is performed by adding with the x0 register. Setting rs to
> > + * zero with the following mask will accomplish this goal.
> > + */
> > + addi_insn_mask &= 0x07fff;
> > + }
> > +
> > + if (lower_immediate & 0x00000fff) {
> > + /* replace upper 12 bits of addi with lower 12 bits of val */
> > + addi_insn &= addi_insn_mask;
> > + addi_insn |= (lower_immediate & 0x00000fff) << 20;
> > + } else {
> > + /* replace addi with nop if lower_immediate is empty */
> > + addi_insn = RISCV_INSN_NOP4;
> > + }
> > +
> > + addi_res = cpu_to_le32(addi_insn);
> > + lui_res = cpu_to_le32(lui_insn);
> > + mutex_lock(&text_mutex);
> > + patch_insn_write(addi_parcel, &addi_res, sizeof(addi_res));
> > + patch_insn_write(lui_parcel, &lui_res, sizeof(lui_res));
> > + mutex_unlock(&text_mutex);
> > +}
> > +
> > +static inline void __runtime_fixup_ptr(void *where, unsigned long val)
> > +{
> > +#ifdef CONFIG_32BIT
> > + __runtime_fixup_32(where, where + 4, val);
> > + __runtime_fixup_caches(where, 2);
> > +#else
> > + __runtime_fixup_32(where, where + 8, val);
> > + __runtime_fixup_32(where + 4, where + 12, val >> 32);
> > + __runtime_fixup_caches(where, 4);
> > +#endif
> > +}
> > +
> > +/*
> > + * Replace the least significant 5 bits of the srli/srliw immediate that is
> > + * located at bits 20-24
> > + */
> > +static inline void __runtime_fixup_shift(void *where, unsigned long val)
> > +{
> > + __le16 *parcel = where;
> > + __le32 res;
> > + u32 insn;
> > +
> > + insn = (u32)le16_to_cpu(parcel[0]) | (u32)le16_to_cpu(parcel[1]) << 16;
> > +
> > + insn &= 0xfe0fffff;
> > + insn |= (val & 0b11111) << 20;
> > +
> > + res = cpu_to_le32(insn);
> > + mutex_lock(&text_mutex);
> > + patch_text_nosync(where, &res, sizeof(insn));
> > + mutex_unlock(&text_mutex);
> > +}
> > +
> > +static inline void runtime_const_fixup(void (*fn)(void *, unsigned long),
> > + unsigned long val, s32 *start, s32 *end)
> > +{
> > + while (start < end) {
> > + fn(*start + (void *)start, val);
> > + start++;
> > + }
> > +}
> > +
> > +#endif /* _ASM_RISCV_RUNTIME_CONST_H */
> > diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
> > index 002ca58dd998cb78b662837b5ebac988fb6c77bb..61bd5ba6680a786bf1db7dc37bf1acda0639b5c7 100644
> > --- a/arch/riscv/kernel/vmlinux.lds.S
> > +++ b/arch/riscv/kernel/vmlinux.lds.S
> > @@ -97,6 +97,9 @@ SECTIONS
> > {
> > EXIT_DATA
> > }
> > +
> > + RUNTIME_CONST_VARIABLES
> > +
> > PERCPU_SECTION(L1_CACHE_BYTES)
> > .rel.dyn : {
> >
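The immediate handling in the quoted __runtime_fixup_32() and __runtime_fixup_shift() can be tried out in user space. The following is only a minimal sketch under stated assumptions: the helper names (sign_extend12, split_imm32, lui_addi, patch_shamt5, selftest_split) are hypothetical and do not exist in the kernel, and the code models the arithmetic only, not text patching, locking, or icache flushing. It shows why the upper part must be computed as val minus the sign-extended low 12 bits: addi sign-extends its immediate, so lui has to compensate whenever bit 11 of val is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 12 bits of val (bit 11 is the sign). */
static int32_t sign_extend12(uint32_t val)
{
	int32_t low = (int32_t)(val & 0xfffu);

	if (low & 0x800)
		low -= 0x1000;
	return low;
}

struct imm_split {
	uint32_t lui_imm;	/* goes into bits 31:12 of the lui encoding */
	uint32_t addi_imm;	/* 12-bit field, goes into bits 31:20 of addi */
};

/* Split a 32-bit constant the same way the quoted patch does. */
static struct imm_split split_imm32(uint32_t val)
{
	int32_t lower = sign_extend12(val);
	uint32_t upper = val - (uint32_t)lower;	/* wraps modulo 2^32 */
	struct imm_split s = { upper & 0xfffff000u, (uint32_t)lower & 0xfffu };

	return s;
}

/* Model of the 32-bit result of "lui rd, hi" followed by "addi rd, rd, lo". */
static uint32_t lui_addi(struct imm_split s)
{
	return s.lui_imm + (uint32_t)sign_extend12(s.addi_imm);
}

/* Replace the 5-bit shift amount held in bits 24:20 of an srli/srliw encoding. */
static uint32_t patch_shamt5(uint32_t insn, unsigned int shamt)
{
	insn &= 0xfe0fffffu;			/* clear bits 24:20 */
	insn |= (uint32_t)(shamt & 0x1f) << 20;
	return insn;
}

/* Round-trip check over a few edge cases, including bit 11 set. */
static void selftest_split(void)
{
	static const uint32_t vals[] = {
		0x0u, 0x7ffu, 0x800u, 0xfffu, 0x12345fffu, 0xffffffffu,
	};
	unsigned int i;

	for (i = 0; i < sizeof(vals) / sizeof(vals[0]); i++)
		assert(lui_addi(split_imm32(vals[i])) == vals[i]);
}
```

In this model, a zero lui_imm corresponds to the case where the patch replaces the lui with a nop and makes the addi read from x0, and a zero addi_imm corresponds to the case where the addi itself becomes a nop.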
> # bad: [e21edb1638e82460f126a6e49bcdd958d452929c] Add linux-next specific files for 20250328
> git bisect start 'next/master'
> # status: waiting for good commit(s), bad commit known
> # good: [5c2a430e85994f4873ea5ec42091baa1153bc731] Merge tag 'ext4-for_linus-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
> git bisect good 5c2a430e85994f4873ea5ec42091baa1153bc731
> # bad: [82dd76474d886e4e272cd3a6ce9e4a5cf193961d] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
> git bisect bad 82dd76474d886e4e272cd3a6ce9e4a5cf193961d
> # good: [04c3b8bb1d6173c070927ad07baae14aa3cda0b5] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git
> git bisect good 04c3b8bb1d6173c070927ad07baae14aa3cda0b5
> # bad: [cf63bbbadeaaae64b5a42bf72c5f842ec412f006] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git
> git bisect bad cf63bbbadeaaae64b5a42bf72c5f842ec412f006
> # bad: [5744739d8ce78876d4b62a20fd6bc65f01ba142e] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git
> git bisect bad 5744739d8ce78876d4b62a20fd6bc65f01ba142e
> # good: [6d1617d154d6759e6aa269bdb387ea8d7bd4bf52] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git
> git bisect good 6d1617d154d6759e6aa269bdb387ea8d7bd4bf52
> # good: [861efb8a48ee8b73ae4e8817509cd4e82fd52bc4] powerpc/kexec: fix physical address calculation in clear_utlb_entry()
> git bisect good 861efb8a48ee8b73ae4e8817509cd4e82fd52bc4
> # bad: [74f4bf9d15ad1d6862b828d486ed10ea0e874a23] Merge patch series "riscv: Add runtime constant support"
> git bisect bad 74f4bf9d15ad1d6862b828d486ed10ea0e874a23
> # good: [82e81b89501a9f19a3a0b0d7d9641d86c1956284] riscv: migrate to the generic rule for built-in DTB
> git bisect good 82e81b89501a9f19a3a0b0d7d9641d86c1956284
> # good: [5b376a68da0a395d54684987efde647d8fa9027c] Merge patch series "riscv: add support for Zaamo and Zalrsc extensions"
> git bisect good 5b376a68da0a395d54684987efde647d8fa9027c
> # good: [2744ec472de31141ad354907ff98843dd6040917] riscv: Fix set up of vector cpu hotplug callback
> git bisect good 2744ec472de31141ad354907ff98843dd6040917
> # good: [d9b65824d8f8b69a9d80a5b4d1a8a52c3244f9c0] Merge patch series "riscv: Unaligned access speed probing fixes and skipping"
> git bisect good d9b65824d8f8b69a9d80a5b4d1a8a52c3244f9c0
> # bad: [a44fb5722199de8338d991db5ad3d509192179bb] riscv: Add runtime constant support
> git bisect bad a44fb5722199de8338d991db5ad3d509192179bb
> # good: [afa8a93932aa63b107d81bd438454760d8c7c8a3] riscv: Move nop definition to insn-def.h
> git bisect good afa8a93932aa63b107d81bd438454760d8c7c8a3
> # first bad commit: [a44fb5722199de8338d991db5ad3d509192179bb] riscv: Add runtime constant support
* Re: [PATCH v10 2/2] riscv: Add runtime constant support
2025-03-28 17:35 ` Alexandre Ghiti
@ 2025-03-28 20:22 ` Klara Modin
0 siblings, 0 replies; 12+ messages in thread
From: Klara Modin @ 2025-03-28 20:22 UTC (permalink / raw)
To: Alexandre Ghiti
Cc: Charlie Jenkins, Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel,
Ben Dooks, Pasha Bouzarjomehri, Emil Renner Berthing,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones,
linux-riscv, linux-kernel, linux-trace-kernel
Hi Alex,
On 3/28/25 18:35, Alexandre Ghiti wrote:
> Hi Klara,
>
> On Fri, Mar 28, 2025 at 4:42 PM Klara Modin <klarasmodin@gmail.com> wrote:
>>
>> Hi,
>>
>> On 3/19/25 19:35, Charlie Jenkins wrote:
>>> Implement the runtime constant infrastructure for riscv. Use this
>>> infrastructure to generate constants to be used by the d_hash()
>>> function.
>>>
>>> This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
>>> constant' support") and commit e3c92e81711d ("runtime constants: add
>>> x86 architecture support").
>>
>> This patch causes the following build failure for me:
>>
>> fs/dcache.c: Assembler messages:
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> make[3]: *** [scripts/Makefile.build:203: fs/dcache.o] Error 1
>>
>> The value of CONFIG_RISCV_ISA_ZBKB doesn't seem to have an impact.
>> Reverting the patch on top of next-20250328 resolved the issue for me. I
>> attached the generated fs/dcache.s.
>
> Thanks for your report!
>
> Kernel test robot reported the following issue, do you have the same errors?
>
> fs/dcache.c: Assembler messages:
>>> fs/dcache.c:143: Warning: Unrecognized .option directive: arch,+zba
> --
>>> fs/dcache.c:145: Error: unrecognized opcode `add.uw s1,s1,a5'
>>> fs/dcache.c:143: Warning: Unrecognized .option directive: arch,+zba
> --
>>> fs/dcache.c:145: Error: unrecognized opcode `add.uw a4,a4,a5'
>>> fs/dcache.c:143: Warning: Unrecognized .option directive: arch,+zba
> --
>>> fs/dcache.c:145: Error: unrecognized opcode `add.uw s4,s4,a5'
>>> fs/dcache.c:143: Warning: Unrecognized .option directive: arch,+zba
> --
>>> fs/dcache.c:145: Error: unrecognized opcode `add.uw s1,s1,a5'
>>> fs/dcache.c:152: Error: attempt to move .org backwards
>>> fs/dcache.c:152: Error: attempt to move .org backwards
>>> fs/dcache.c:152: Error: attempt to move .org backwards
>>> fs/dcache.c:152: Error: attempt to move .org backwards
>
> If so, I sent a fix, don't hesitate to add your Tested-by:
> https://lore.kernel.org/linux-riscv/c0f425ec-6c76-45b2-b1bc-8d9be028a878@rivosinc.com/T/#me1469bfb2e6f69e1422a136014b753a6acaa3bc6
I only saw the "attempt to move .org backwards" error. I'm using binutils
2.44 and a GCC 15 snapshot from 2025-03-23, so I don't think toolchain
support for Zba is the issue. The fix didn't make any difference for me.

However, this could be something in GCC 15: when I retried with GCC 14.2
and 12.4, I could no longer reproduce the issue.

Regards,
Klara Modin
>
> Thanks,
>
> Alex
>
>
>>
>> Please let me know if there's anything else you need.
>>
>> Regards,
>> Klara Modin
>>
* Re: [PATCH v10 2/2] riscv: Add runtime constant support
2025-03-28 19:51 ` Charlie Jenkins
@ 2025-03-28 20:22 ` Klara Modin
0 siblings, 0 replies; 12+ messages in thread
From: Klara Modin @ 2025-03-28 20:22 UTC (permalink / raw)
To: Charlie Jenkins
Cc: Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel, Ben Dooks,
Pasha Bouzarjomehri, Emil Renner Berthing, Alexandre Ghiti,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones,
linux-riscv, linux-kernel, linux-trace-kernel
On 3/28/25 20:51, Charlie Jenkins wrote:
> On Fri, Mar 28, 2025 at 04:42:42PM +0100, Klara Modin wrote:
>> Hi,
>>
>> On 3/19/25 19:35, Charlie Jenkins wrote:
>>> Implement the runtime constant infrastructure for riscv. Use this
>>> infrastructure to generate constants to be used by the d_hash()
>>> function.
>>>
>>> This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
>>> constant' support") and commit e3c92e81711d ("runtime constants: add
>>> x86 architecture support").
>>
>> This patch causes the following build failure for me:
>>
>> fs/dcache.c: Assembler messages:
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> fs/dcache.c:157: Error: attempt to move .org backwards
>> make[3]: *** [scripts/Makefile.build:203: fs/dcache.o] Error 1
>
> Thank you for the report, this seems like a binutils issue potentially.
> I will look into it. Here is a minimally reproducible example:
>
> 886 :
> .option push
> .option norvc
> nop
> nop
> .option pop
> 887 :
> 888 :
> .option push
> .option norvc
> .option arch,+zba
> nop
> nop
> .option pop
> 889 :
> .org . - (887b - 886b) + (889b - 888b)
> .org . - (889b - 888b) + (887b - 886b)
>
> Removing the ".option arch,+zba" fixes the issue but that shouldn't
> matter...
I tried again with GCC 14.2 and 12.4 (with the same binutils version)
after Alex's answer and couldn't see the issue with these. I got the
same result with your example. If I invoke `as` directly it doesn't
happen either.

The issue might be with GCC 15 then?

Regards,
Klara Modin
>
> - Charlie
>
>>
>> The value of CONFIG_RISCV_ISA_ZBKB doesn't seem to have an impact. Reverting
>> the patch on top of next-20250328 resolved the issue for me. I attached the
>> generated fs/dcache.s.
>>
>> Please let me know if there's anything else you need.
>>
>> Regards,
>> Klara Modin
>>
* Re: [PATCH v10 2/2] riscv: Add runtime constant support
2025-03-19 18:35 ` [PATCH v10 2/2] riscv: Add runtime constant support Charlie Jenkins
2025-03-28 15:42 ` Klara Modin
@ 2025-04-01 19:28 ` Nathan Chancellor
2025-04-01 20:43 ` Charlie Jenkins
1 sibling, 1 reply; 12+ messages in thread
From: Nathan Chancellor @ 2025-04-01 19:28 UTC (permalink / raw)
To: Charlie Jenkins
Cc: Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel, Ben Dooks,
Pasha Bouzarjomehri, Emil Renner Berthing, Alexandre Ghiti,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones,
linux-riscv, linux-kernel, linux-trace-kernel, llvm
Hi Charlie,
On Wed, Mar 19, 2025 at 11:35:20AM -0700, Charlie Jenkins wrote:
> Implement the runtime constant infrastructure for riscv. Use this
> infrastructure to generate constants to be used by the d_hash()
> function.
>
> This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
> constant' support") and commit e3c92e81711d ("runtime constants: add
> x86 architecture support").
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
...
> diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
...
> +#define RISCV_RUNTIME_CONST_64_ZBA \
> + ".option push\n\t" \
> + ".option arch,+zba\n\t" \
> + "slli %[__tmp],%[__tmp],32\n\t" \
> + "add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
> + "nop\n\t" \
> + "nop\n\t" \
> + ".option pop\n\t" \
...
> +#if defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_RISCV_ISA_ZBKB)
...
> +#elif defined(CONFIG_RISCV_ISA_ZBA)
> +#define runtime_const_ptr(sym) \
> +({ \
> + typeof(sym) __ret, __tmp; \
> + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> + ALTERNATIVE( \
> + RISCV_RUNTIME_CONST_64_BASE, \
> + RISCV_RUNTIME_CONST_64_ZBA, \
> + 0, RISCV_ISA_EXT_ZBA, 1 \
> + ) \
> + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> + __ret; \
> +})
This breaks the build for clang versions 16 and earlier because they do
not support '.option arch', which is used whenever CONFIG_RISCV_ISA_ZBA
is enabled; that config symbol has no dependencies and defaults to on.
$ make -skj"$(nproc)" ARCH=riscv LLVM=1 mrproper defconfig fs/dcache.o
fs/dcache.c:117:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'relax' or 'norelax' [-Winline-asm]
return runtime_const_ptr(dentry_hashtable) +
^
arch/riscv/include/asm/runtime-const.h:103:4: note: expanded from macro 'runtime_const_ptr'
RISCV_RUNTIME_CONST_64_ZBA, \
^
arch/riscv/include/asm/runtime-const.h:57:17: note: expanded from macro 'RISCV_RUNTIME_CONST_64_ZBA'
".option push\n\t" \
^
<inline asm>:32:10: note: instantiated into assembly here
.option arch,+zba
^
fs/dcache.c:117:9: error: instruction requires the following: 'Zba' (Address Generation Instructions)
return runtime_const_ptr(dentry_hashtable) +
^
arch/riscv/include/asm/runtime-const.h:103:4: note: expanded from macro 'runtime_const_ptr'
RISCV_RUNTIME_CONST_64_ZBA, \
^
arch/riscv/include/asm/runtime-const.h:59:30: note: expanded from macro 'RISCV_RUNTIME_CONST_64_ZBA'
"slli %[__tmp],%[__tmp],32\n\t" \
^
<inline asm>:34:2: note: instantiated into assembly here
add.uw a2,a2,a3
^
...
$ rg 'OPTION_ARCH|ZBA' .config
364:CONFIG_RISCV_ISA_ZBA=y
Should it grow a dependency on AS_HAS_OPTION_ARCH or should there be a
different fix?
Cheers,
Nathan
* Re: [PATCH v10 2/2] riscv: Add runtime constant support
2025-04-01 19:28 ` Nathan Chancellor
@ 2025-04-01 20:43 ` Charlie Jenkins
0 siblings, 0 replies; 12+ messages in thread
From: Charlie Jenkins @ 2025-04-01 20:43 UTC (permalink / raw)
To: Nathan Chancellor
Cc: Paul Walmsley, Palmer Dabbelt, Ard Biesheuvel, Ben Dooks,
Pasha Bouzarjomehri, Emil Renner Berthing, Alexandre Ghiti,
Steven Rostedt, Masami Hiramatsu, Mark Rutland, Albert Ou,
Peter Zijlstra, Josh Poimboeuf, Jason Baron, Andrew Jones,
linux-riscv, linux-kernel, linux-trace-kernel, llvm
On Tue, Apr 01, 2025 at 12:28:33PM -0700, Nathan Chancellor wrote:
> Hi Charlie,
>
> On Wed, Mar 19, 2025 at 11:35:20AM -0700, Charlie Jenkins wrote:
> > Implement the runtime constant infrastructure for riscv. Use this
> > infrastructure to generate constants to be used by the d_hash()
> > function.
> >
> > This is the riscv variant of commit 94a2bc0f611c ("arm64: add 'runtime
> > constant' support") and commit e3c92e81711d ("runtime constants: add
> > x86 architecture support").
> >
> > Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
> > Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> ...
> > diff --git a/arch/riscv/include/asm/runtime-const.h b/arch/riscv/include/asm/runtime-const.h
> ...
> > +#define RISCV_RUNTIME_CONST_64_ZBA \
> > + ".option push\n\t" \
> > + ".option arch,+zba\n\t" \
> > + "slli %[__tmp],%[__tmp],32\n\t" \
> > + "add.uw %[__ret],%[__ret],%[__tmp]\n\t" \
> > + "nop\n\t" \
> > + "nop\n\t" \
> > + ".option pop\n\t" \
> ...
> > +#if defined(CONFIG_RISCV_ISA_ZBA) && defined(CONFIG_RISCV_ISA_ZBKB)
> ...
> > +#elif defined(CONFIG_RISCV_ISA_ZBA)
> > +#define runtime_const_ptr(sym) \
> > +({ \
> > + typeof(sym) __ret, __tmp; \
> > + asm_inline(RISCV_RUNTIME_CONST_64_PREAMBLE \
> > + ALTERNATIVE( \
> > + RISCV_RUNTIME_CONST_64_BASE, \
> > + RISCV_RUNTIME_CONST_64_ZBA, \
> > + 0, RISCV_ISA_EXT_ZBA, 1 \
> > + ) \
> > + RISCV_RUNTIME_CONST_64_POSTAMBLE(sym) \
> > + : [__ret] "=r" (__ret), [__tmp] "=r" (__tmp)); \
> > + __ret; \
> > +})
>
> This breaks the build for clang versions 16 and earlier because they do
> not support '.option arch', which is used whenever CONFIG_RISCV_ISA_ZBA
> is enabled; that config symbol has no dependencies and defaults to on.
>
> $ make -skj"$(nproc)" ARCH=riscv LLVM=1 mrproper defconfig fs/dcache.o
> fs/dcache.c:117:9: warning: unknown option, expected 'push', 'pop', 'rvc', 'norvc', 'relax' or 'norelax' [-Winline-asm]
> return runtime_const_ptr(dentry_hashtable) +
> ^
> arch/riscv/include/asm/runtime-const.h:103:4: note: expanded from macro 'runtime_const_ptr'
> RISCV_RUNTIME_CONST_64_ZBA, \
> ^
> arch/riscv/include/asm/runtime-const.h:57:17: note: expanded from macro 'RISCV_RUNTIME_CONST_64_ZBA'
> ".option push\n\t" \
> ^
> <inline asm>:32:10: note: instantiated into assembly here
> .option arch,+zba
> ^
> fs/dcache.c:117:9: error: instruction requires the following: 'Zba' (Address Generation Instructions)
> return runtime_const_ptr(dentry_hashtable) +
> ^
> arch/riscv/include/asm/runtime-const.h:103:4: note: expanded from macro 'runtime_const_ptr'
> RISCV_RUNTIME_CONST_64_ZBA, \
> ^
> arch/riscv/include/asm/runtime-const.h:59:30: note: expanded from macro 'RISCV_RUNTIME_CONST_64_ZBA'
> "slli %[__tmp],%[__tmp],32\n\t" \
> ^
> <inline asm>:34:2: note: instantiated into assembly here
> add.uw a2,a2,a3
> ^
> ...
>
> $ rg 'OPTION_ARCH|ZBA' .config
> 364:CONFIG_RISCV_ISA_ZBA=y
>
> Should it grow a dependency on AS_HAS_OPTION_ARCH or should there be a
> different fix?
This should have been fixed by Alex's patch [1]. Zba is in an awkward
state because BPF generates Zba code without the need for toolchain
support.

[1] https://lore.kernel.org/all/20250328115422.253670-1-alexghiti@rivosinc.com/
>
> Cheers,
> Nathan
Thread overview: 12+ messages
2025-03-19 18:35 [PATCH v10 0/2] riscv: Add runtime constant support Charlie Jenkins
2025-03-19 18:35 ` [PATCH v10 1/2] riscv: Move nop definition to insn-def.h Charlie Jenkins
2025-03-20 9:02 ` Andrew Jones
2025-03-19 18:35 ` [PATCH v10 2/2] riscv: Add runtime constant support Charlie Jenkins
2025-03-28 15:42 ` Klara Modin
2025-03-28 17:35 ` Alexandre Ghiti
2025-03-28 20:22 ` Klara Modin
2025-03-28 19:51 ` Charlie Jenkins
2025-03-28 20:22 ` Klara Modin
2025-04-01 19:28 ` Nathan Chancellor
2025-04-01 20:43 ` Charlie Jenkins
2025-03-27 3:24 ` [PATCH v10 0/2] " patchwork-bot+linux-riscv