* [PATCH v3 00/13] riscv: improve boot time isa extensions handling
@ 2023-01-11 17:10 Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives Jisheng Zhang
` (12 more replies)
0 siblings, 13 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv
Generally, riscv ISA extensions are fixed for any specific hardware
platform, so a hart's features won't change after booting, this
chacteristic makes it straightforward to use a static branch to check
a specific ISA extension is supported or not to optimize performance.
However, some ISA extensions such as SVPBMT and ZICBOM are handled
via. the alternative sequences.
Basically, for ease of maintenance, we prefer to use static branches
in C code, but recently, Samuel found that the static branch usage in
cpu_relax() breaks building with CONFIG_CC_OPTIMIZE_FOR_SIZE[1]. As
Samuel pointed out, "Having a static branch in cpu_relax() is
problematic because that function is widely inlined, including in some
quite complex functions like in the VDSO. A quick measurement shows
this static branch is responsible by itself for around 40% of the jump
table."
Samuel's findings pointed out one of a few downsides of static branches
usage in C code to handle ISA extensions detected at boot time:
static branch's metadata in the __jump_table section, which is not
discarded after ISA extensions are finalized, wastes some space.
I want to try to solve the issue for all possible dynamic handling of
ISA extensions at boot time. Inspired by Mark[2], this patch introduces
riscv_has_extension_*() helpers, which work like static branches but
are patched using alternatives, thus the metadata can be freed after
patching.
Hi Heiko,
I combined your code and my code into patch1, since one of the key
patch in the merged "Allow calls in alternatives" series rolled
back to your v1. So I added your Co-developed-by and Signed-off-by
thanks
Since v2
- rebase on riscv-next
- collect Reviewed-by tag
- fix jal imm construction
- combine Heiko's code and my code for jal patching, thus add
Co-developed-by tag
- address comments from Conor
Since v1
- rebase on v6.1-rc7 + Heiko's alternative improvements[3]
- collect Reviewed-by tag
- add one patch to update jal offsets in patched alternatives
- add one patch to switch to relative alternative entries
- add patches to patch vdso
[1]https://lore.kernel.org/linux-riscv/20220922060958.44203-1-samuel@sholland.org/
[2]https://lore.kernel.org/linux-arm-kernel/20220912162210.3626215-8-mark.rutland@arm.com/
[3]https://lore.kernel.org/linux-riscv/20221130225614.1594256-1-heiko@sntech.de/
Andrew Jones (1):
riscv: KVM: Switch has_svinval() to riscv_has_extension_unlikely()
Jisheng Zhang (12):
riscv: fix jal offsets in patched alternatives
riscv: move riscv_noncoherent_supported() out of ZICBOM probe
riscv: cpufeature: detect RISCV_ALTERNATIVES_EARLY_BOOT earlier
riscv: hwcap: make ISA extension ids can be used in asm
riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA
extensions
riscv: introduce riscv_has_extension_[un]likely()
riscv: fpu: switch has_fpu() to riscv_has_extension_likely()
riscv: module: move find_section to module.h
riscv: switch to relative alternative entries
riscv: alternative: patch alternatives in the vDSO
riscv: cpu_relax: switch to riscv_has_extension_likely()
riscv: remove riscv_isa_ext_keys[] array and related usage
arch/riscv/errata/sifive/errata.c | 4 +-
arch/riscv/errata/thead/errata.c | 11 ++-
arch/riscv/include/asm/alternative-macros.h | 20 ++---
arch/riscv/include/asm/alternative.h | 12 +--
arch/riscv/include/asm/errata_list.h | 9 +-
arch/riscv/include/asm/hwcap.h | 97 +++++++++++----------
arch/riscv/include/asm/insn.h | 27 ++++++
arch/riscv/include/asm/module.h | 16 ++++
arch/riscv/include/asm/switch_to.h | 3 +-
arch/riscv/include/asm/vdso.h | 4 +
arch/riscv/include/asm/vdso/processor.h | 2 +-
arch/riscv/kernel/alternative.c | 52 +++++++++++
arch/riscv/kernel/cpufeature.c | 78 +++--------------
arch/riscv/kernel/module.c | 15 ----
arch/riscv/kernel/setup.c | 3 +
arch/riscv/kernel/vdso.c | 5 --
arch/riscv/kernel/vdso/vdso.lds.S | 7 ++
arch/riscv/kvm/tlb.c | 3 +-
18 files changed, 206 insertions(+), 162 deletions(-)
--
2.38.1
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-11 17:56 ` Andrew Jones
2023-01-11 23:31 ` Heiko Stübner
2023-01-11 17:10 ` [PATCH v3 02/13] riscv: move riscv_noncoherent_supported() out of ZICBOM probe Jisheng Zhang
` (11 subsequent siblings)
12 siblings, 2 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Heiko Stuebner
Alternatives live in a different section, so offsets used by jal
instruction will point to wrong locations after the patch got applied.
Similar to arm64, adjust the location to consider that offset.
Co-developed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
arch/riscv/include/asm/insn.h | 27 +++++++++++++++++++++++++++
arch/riscv/kernel/alternative.c | 27 +++++++++++++++++++++++++++
2 files changed, 54 insertions(+)
diff --git a/arch/riscv/include/asm/insn.h b/arch/riscv/include/asm/insn.h
index 98453535324a..1d2df245d0bd 100644
--- a/arch/riscv/include/asm/insn.h
+++ b/arch/riscv/include/asm/insn.h
@@ -291,6 +291,33 @@ static __always_inline bool riscv_insn_is_branch(u32 code)
(RVC_X(x_, RVC_B_IMM_7_6_OPOFF, RVC_B_IMM_7_6_MASK) << RVC_B_IMM_7_6_OFF) | \
(RVC_IMM_SIGN(x_) << RVC_B_IMM_SIGN_OFF); })
+/*
+ * Get the immediate from a J-type instruction.
+ *
+ * @insn: instruction to process
+ * Return: immediate
+ */
+static inline s32 riscv_insn_extract_jtype_imm(u32 insn)
+{
+ return RV_EXTRACT_JTYPE_IMM(insn);
+}
+
+/*
+ * Update a J-type instruction with an immediate value.
+ *
+ * @insn: pointer to the jtype instruction
+ * @imm: the immediate to insert into the instruction
+ */
+static inline void riscv_insn_insert_jtype_imm(u32 *insn, s32 imm)
+{
+ /* drop the old IMMs, all jal IMM bits sit at 31:12 */
+ *insn &= ~GENMASK(31, 12);
+ *insn |= (RV_X(imm, RV_J_IMM_10_1_OFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OPOFF) |
+ (RV_X(imm, RV_J_IMM_11_OFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OPOFF) |
+ (RV_X(imm, RV_J_IMM_19_12_OFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OPOFF) |
+ (RV_X(imm, RV_J_IMM_SIGN_OFF, 1) << RV_J_IMM_SIGN_OPOFF);
+}
+
/*
* Put together one immediate from a U-type and I-type instruction pair.
*
diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
index 6212ea0eed72..3d4f1f32c7f6 100644
--- a/arch/riscv/kernel/alternative.c
+++ b/arch/riscv/kernel/alternative.c
@@ -79,6 +79,21 @@ static void riscv_alternative_fix_auipc_jalr(void *ptr, u32 auipc_insn,
patch_text_nosync(ptr, call, sizeof(u32) * 2);
}
+static void riscv_alternative_fix_jal(void *ptr, u32 jal_insn, int patch_offset)
+{
+ s32 imm;
+
+ /* get and adjust new target address */
+ imm = riscv_insn_extract_jtype_imm(jal_insn);
+ imm -= patch_offset;
+
+ /* update instruction */
+ riscv_insn_insert_jtype_imm(&jal_insn, imm);
+
+ /* patch the call place again */
+ patch_text_nosync(ptr, &jal_insn, sizeof(u32));
+}
+
void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len,
int patch_offset)
{
@@ -106,6 +121,18 @@ void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len,
riscv_alternative_fix_auipc_jalr(alt_ptr + i * sizeof(u32),
insn, insn2, patch_offset);
}
+
+ if (riscv_insn_is_jal(insn)) {
+ s32 imm = riscv_insn_extract_jtype_imm(insn);
+
+ /* Don't modify jumps inside the alternative block */
+ if ((alt_ptr + i * sizeof(u32) + imm) >= alt_ptr &&
+ (alt_ptr + i * sizeof(u32) + imm) < (alt_ptr + len))
+ continue;
+
+ riscv_alternative_fix_jal(alt_ptr + i * sizeof(u32),
+ insn, patch_offset);
+ }
}
}
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 02/13] riscv: move riscv_noncoherent_supported() out of ZICBOM probe
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 03/13] riscv: cpufeature: detect RISCV_ALTERNATIVES_EARLY_BOOT earlier Jisheng Zhang
` (10 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones,
Conor Dooley
It's a bit weird to call riscv_noncoherent_supported() each time when
insmoding a module. Move the calling out of feature patch func.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
arch/riscv/kernel/cpufeature.c | 1 -
arch/riscv/kernel/setup.c | 3 +++
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 205bbd6b1fce..421b3d9578cc 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -297,7 +297,6 @@ static bool __init_or_module cpufeature_probe_zicbom(unsigned int stage)
if (!riscv_isa_extension_available(NULL, ZICBOM))
return false;
- riscv_noncoherent_supported();
return true;
}
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 86acd690d529..376d2827e736 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -300,6 +300,9 @@ void __init setup_arch(char **cmdline_p)
riscv_init_cbom_blocksize();
riscv_fill_hwcap();
apply_boot_alternatives();
+ if (IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM) &&
+ riscv_isa_extension_available(NULL, ZICBOM))
+ riscv_noncoherent_supported();
}
static int __init topology_init(void)
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 03/13] riscv: cpufeature: detect RISCV_ALTERNATIVES_EARLY_BOOT earlier
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 02/13] riscv: move riscv_noncoherent_supported() out of ZICBOM probe Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-12 21:11 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 04/13] riscv: hwcap: make ISA extension ids can be used in asm Jisheng Zhang
` (9 subsequent siblings)
12 siblings, 1 reply; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones
Currently riscv_cpufeature_patch_func() does nothing at the
RISCV_ALTERNATIVES_EARLY_BOOT stage. Add a check to detect whether we
are in this stage and exit early. This will allow us to use
riscv_cpufeature_patch_func() for scanning of all ISA extensions.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
---
arch/riscv/kernel/cpufeature.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 421b3d9578cc..37e8c5e69754 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -328,6 +328,9 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
struct alt_entry *alt;
u32 tmp;
+ if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
+ return;
+
for (alt = begin; alt < end; alt++) {
if (alt->vendor_id != 0)
continue;
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 04/13] riscv: hwcap: make ISA extension ids can be used in asm
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (2 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 03/13] riscv: cpufeature: detect RISCV_ALTERNATIVES_EARLY_BOOT earlier Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-12 21:28 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions Jisheng Zhang
` (8 subsequent siblings)
12 siblings, 1 reply; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones
We will make use of ISA extension in asm files, so make the multi-letter
RISC-V ISA extension IDs macros rather than enums and move them and
those base ISA extension IDs to suitable place.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
---
arch/riscv/include/asm/hwcap.h | 45 ++++++++++++++++------------------
1 file changed, 21 insertions(+), 24 deletions(-)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 86328e3acb02..09a7767723f6 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -12,20 +12,6 @@
#include <linux/bits.h>
#include <uapi/asm/hwcap.h>
-#ifndef __ASSEMBLY__
-#include <linux/jump_label.h>
-/*
- * This yields a mask that user programs can use to figure out what
- * instruction set this cpu supports.
- */
-#define ELF_HWCAP (elf_hwcap)
-
-enum {
- CAP_HWCAP = 1,
-};
-
-extern unsigned long elf_hwcap;
-
#define RISCV_ISA_EXT_a ('a' - 'a')
#define RISCV_ISA_EXT_c ('c' - 'a')
#define RISCV_ISA_EXT_d ('d' - 'a')
@@ -46,22 +32,33 @@ extern unsigned long elf_hwcap;
#define RISCV_ISA_EXT_BASE 26
/*
- * This enum represent the logical ID for each multi-letter RISC-V ISA extension.
+ * These macros represent the logical ID for each multi-letter RISC-V ISA extension.
* The logical ID should start from RISCV_ISA_EXT_BASE and must not exceed
* RISCV_ISA_EXT_MAX. 0-25 range is reserved for single letter
* extensions while all the multi-letter extensions should define the next
* available logical extension id.
*/
-enum riscv_isa_ext_id {
- RISCV_ISA_EXT_SSCOFPMF = RISCV_ISA_EXT_BASE,
- RISCV_ISA_EXT_SVPBMT,
- RISCV_ISA_EXT_ZICBOM,
- RISCV_ISA_EXT_ZIHINTPAUSE,
- RISCV_ISA_EXT_SSTC,
- RISCV_ISA_EXT_SVINVAL,
- RISCV_ISA_EXT_ID_MAX
+#define RISCV_ISA_EXT_SSCOFPMF 26
+#define RISCV_ISA_EXT_SVPBMT 27
+#define RISCV_ISA_EXT_ZICBOM 28
+#define RISCV_ISA_EXT_ZIHINTPAUSE 29
+#define RISCV_ISA_EXT_SSTC 30
+#define RISCV_ISA_EXT_SVINVAL 31
+
+#ifndef __ASSEMBLY__
+#include <linux/jump_label.h>
+/*
+ * This yields a mask that user programs can use to figure out what
+ * instruction set this cpu supports.
+ */
+#define ELF_HWCAP (elf_hwcap)
+
+enum {
+ CAP_HWCAP = 1,
};
-static_assert(RISCV_ISA_EXT_ID_MAX <= RISCV_ISA_EXT_MAX);
+
+extern unsigned long elf_hwcap;
+
/*
* This enum represents the logical ID for each RISC-V ISA extension static
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (3 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 04/13] riscv: hwcap: make ISA extension ids can be used in asm Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-11 23:29 ` Heiko Stübner
2023-01-11 17:10 ` [PATCH v3 06/13] riscv: introduce riscv_has_extension_[un]likely() Jisheng Zhang
` (7 subsequent siblings)
12 siblings, 1 reply; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones
riscv_cpufeature_patch_func() currently only scans a limited set of
cpufeatures, explicitly defined with macros. Extend it to probe for all
ISA extensions.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
---
arch/riscv/include/asm/errata_list.h | 9 ++--
arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
2 files changed, 11 insertions(+), 61 deletions(-)
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 4180312d2a70..274c6f889602 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -7,6 +7,7 @@
#include <asm/alternative.h>
#include <asm/csr.h>
+#include <asm/hwcap.h>
#include <asm/vendorid_list.h>
#ifdef CONFIG_ERRATA_SIFIVE
@@ -22,10 +23,6 @@
#define ERRATA_THEAD_NUMBER 3
#endif
-#define CPUFEATURE_SVPBMT 0
-#define CPUFEATURE_ZICBOM 1
-#define CPUFEATURE_NUMBER 2
-
#ifdef __ASSEMBLY__
#define ALT_INSN_FAULT(x) \
@@ -55,7 +52,7 @@ asm(ALTERNATIVE("sfence.vma %0", "sfence.vma", SIFIVE_VENDOR_ID, \
#define ALT_SVPBMT(_val, prot) \
asm(ALTERNATIVE_2("li %0, 0\t\nnop", \
"li %0, %1\t\nslli %0,%0,%3", 0, \
- CPUFEATURE_SVPBMT, CONFIG_RISCV_ISA_SVPBMT, \
+ RISCV_ISA_EXT_SVPBMT, CONFIG_RISCV_ISA_SVPBMT, \
"li %0, %2\t\nslli %0,%0,%4", THEAD_VENDOR_ID, \
ERRATA_THEAD_PBMT, CONFIG_ERRATA_THEAD_PBMT) \
: "=r"(_val) \
@@ -129,7 +126,7 @@ asm volatile(ALTERNATIVE_2( \
"add a0, a0, %0\n\t" \
"2:\n\t" \
"bltu a0, %2, 3b\n\t" \
- "nop", 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
+ "nop", 0, RISCV_ISA_EXT_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
"mv a0, %1\n\t" \
"j 2f\n\t" \
"3:\n\t" \
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 37e8c5e69754..6db8b31d9149 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -275,58 +275,11 @@ void __init riscv_fill_hwcap(void)
}
#ifdef CONFIG_RISCV_ALTERNATIVE
-static bool __init_or_module cpufeature_probe_svpbmt(unsigned int stage)
-{
- if (!IS_ENABLED(CONFIG_RISCV_ISA_SVPBMT))
- return false;
-
- if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
- return false;
-
- return riscv_isa_extension_available(NULL, SVPBMT);
-}
-
-static bool __init_or_module cpufeature_probe_zicbom(unsigned int stage)
-{
- if (!IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM))
- return false;
-
- if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
- return false;
-
- if (!riscv_isa_extension_available(NULL, ZICBOM))
- return false;
-
- return true;
-}
-
-/*
- * Probe presence of individual extensions.
- *
- * This code may also be executed before kernel relocation, so we cannot use
- * addresses generated by the address-of operator as they won't be valid in
- * this context.
- */
-static u32 __init_or_module cpufeature_probe(unsigned int stage)
-{
- u32 cpu_req_feature = 0;
-
- if (cpufeature_probe_svpbmt(stage))
- cpu_req_feature |= BIT(CPUFEATURE_SVPBMT);
-
- if (cpufeature_probe_zicbom(stage))
- cpu_req_feature |= BIT(CPUFEATURE_ZICBOM);
-
- return cpu_req_feature;
-}
-
void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
struct alt_entry *end,
unsigned int stage)
{
- u32 cpu_req_feature = cpufeature_probe(stage);
struct alt_entry *alt;
- u32 tmp;
if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
return;
@@ -334,18 +287,18 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
for (alt = begin; alt < end; alt++) {
if (alt->vendor_id != 0)
continue;
- if (alt->errata_id >= CPUFEATURE_NUMBER) {
- WARN(1, "This feature id:%d is not in kernel cpufeature list",
+ if (alt->errata_id >= RISCV_ISA_EXT_MAX) {
+ WARN(1, "This extension id:%d is not in ISA extension list",
alt->errata_id);
continue;
}
- tmp = (1U << alt->errata_id);
- if (cpu_req_feature & tmp) {
- patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
- riscv_alternative_fix_offsets(alt->old_ptr, alt->alt_len,
- alt->old_ptr - alt->alt_ptr);
- }
+ if (!__riscv_isa_extension_available(NULL, alt->errata_id))
+ continue;
+
+ patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
+ riscv_alternative_fix_offsets(alt->old_ptr, alt->alt_len,
+ alt->old_ptr - alt->alt_ptr);
}
}
#endif
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 06/13] riscv: introduce riscv_has_extension_[un]likely()
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (4 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 07/13] riscv: fpu: switch has_fpu() to riscv_has_extension_likely() Jisheng Zhang
` (6 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones
Generally, riscv ISA extensions are fixed for any specific hardware
platform, so a hart's features won't change after booting. This
chacteristic makes it straightforward to use a static branch to check
if a specific ISA extension is supported or not to optimize
performance.
However, some ISA extensions such as SVPBMT and ZICBOM are handled
via. the alternative sequences.
Basically, for ease of maintenance, we prefer to use static branches
in C code, but recently, Samuel found that the static branch usage in
cpu_relax() breaks building with CONFIG_CC_OPTIMIZE_FOR_SIZE[1]. As
Samuel pointed out, "Having a static branch in cpu_relax() is
problematic because that function is widely inlined, including in some
quite complex functions like in the VDSO. A quick measurement shows
this static branch is responsible by itself for around 40% of the jump
table."
Samuel's findings pointed out one of a few downsides of static branches
usage in C code to handle ISA extensions detected at boot time:
static branch's metadata in the __jump_table section, which is not
discarded after ISA extensions are finalized, wastes some space.
I want to try to solve the issue for all possible dynamic handling of
ISA extensions at boot time. Inspired by Mark[2], this patch introduces
riscv_has_extension_*() helpers, which work like static branches but
are patched using alternatives, thus the metadata can be freed after
patching.
Link: https://lore.kernel.org/linux-riscv/20220922060958.44203-1-samuel@sholland.org/ [1]
Link: https://lore.kernel.org/linux-arm-kernel/20220912162210.3626215-8-mark.rutland@arm.com/ [2]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
---
arch/riscv/include/asm/hwcap.h | 37 ++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 09a7767723f6..1767a9ce1a04 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -8,6 +8,7 @@
#ifndef _ASM_RISCV_HWCAP_H
#define _ASM_RISCV_HWCAP_H
+#include <asm/alternative-macros.h>
#include <asm/errno.h>
#include <linux/bits.h>
#include <uapi/asm/hwcap.h>
@@ -97,6 +98,42 @@ static __always_inline int riscv_isa_ext2key(int num)
}
}
+static __always_inline bool
+riscv_has_extension_likely(const unsigned long ext)
+{
+ compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+ "ext must be < RISCV_ISA_EXT_MAX");
+
+ asm_volatile_goto(
+ ALTERNATIVE("j %l[l_no]", "nop", 0, %[ext], 1)
+ :
+ : [ext] "i" (ext)
+ :
+ : l_no);
+
+ return true;
+l_no:
+ return false;
+}
+
+static __always_inline bool
+riscv_has_extension_unlikely(const unsigned long ext)
+{
+ compiletime_assert(ext < RISCV_ISA_EXT_MAX,
+ "ext must be < RISCV_ISA_EXT_MAX");
+
+ asm_volatile_goto(
+ ALTERNATIVE("nop", "j %l[l_yes]", 0, %[ext], 1)
+ :
+ : [ext] "i" (ext)
+ :
+ : l_yes);
+
+ return false;
+l_yes:
+ return true;
+}
+
unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap);
#define riscv_isa_extension_mask(ext) BIT_MASK(RISCV_ISA_EXT_##ext)
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 07/13] riscv: fpu: switch has_fpu() to riscv_has_extension_likely()
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (5 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 06/13] riscv: introduce riscv_has_extension_[un]likely() Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-12 21:58 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 08/13] riscv: module: move find_section to module.h Jisheng Zhang
` (5 subsequent siblings)
12 siblings, 1 reply; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones
Switch has_fpu() from statich branch to the new helper
riscv_has_extension_likely().
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
---
arch/riscv/include/asm/switch_to.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
index 11463489fec6..60f8ca01d36e 100644
--- a/arch/riscv/include/asm/switch_to.h
+++ b/arch/riscv/include/asm/switch_to.h
@@ -59,7 +59,8 @@ static inline void __switch_to_aux(struct task_struct *prev,
static __always_inline bool has_fpu(void)
{
- return static_branch_likely(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_FPU]);
+ return riscv_has_extension_likely(RISCV_ISA_EXT_f) ||
+ riscv_has_extension_likely(RISCV_ISA_EXT_d);
}
#else
static __always_inline bool has_fpu(void) { return false; }
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 08/13] riscv: module: move find_section to module.h
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (6 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 07/13] riscv: fpu: switch has_fpu() to riscv_has_extension_likely() Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 09/13] riscv: switch to relative alternative entries Jisheng Zhang
` (4 subsequent siblings)
12 siblings, 0 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Conor Dooley,
Andrew Jones
Move find_section() to module.h so that the implementation can be shared
by the alternatives code. This will allow us to use alternatives in
the vdso.
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
arch/riscv/include/asm/module.h | 16 ++++++++++++++++
arch/riscv/kernel/module.c | 15 ---------------
2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/arch/riscv/include/asm/module.h b/arch/riscv/include/asm/module.h
index 76aa96a9fc08..0f3baaa6a9a8 100644
--- a/arch/riscv/include/asm/module.h
+++ b/arch/riscv/include/asm/module.h
@@ -5,6 +5,7 @@
#define _ASM_RISCV_MODULE_H
#include <asm-generic/module.h>
+#include <linux/elf.h>
struct module;
unsigned long module_emit_got_entry(struct module *mod, unsigned long val);
@@ -111,4 +112,19 @@ static inline struct plt_entry *get_plt_entry(unsigned long val,
#endif /* CONFIG_MODULE_SECTIONS */
+static inline const Elf_Shdr *find_section(const Elf_Ehdr *hdr,
+ const Elf_Shdr *sechdrs,
+ const char *name)
+{
+ const Elf_Shdr *s, *se;
+ const char *secstrs = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
+
+ for (s = sechdrs, se = sechdrs + hdr->e_shnum; s < se; s++) {
+ if (strcmp(name, secstrs + s->sh_name) == 0)
+ return s;
+ }
+
+ return NULL;
+}
+
#endif /* _ASM_RISCV_MODULE_H */
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
index 91fe16bfaa07..76f4b9c2ec5b 100644
--- a/arch/riscv/kernel/module.c
+++ b/arch/riscv/kernel/module.c
@@ -429,21 +429,6 @@ void *module_alloc(unsigned long size)
}
#endif
-static const Elf_Shdr *find_section(const Elf_Ehdr *hdr,
- const Elf_Shdr *sechdrs,
- const char *name)
-{
- const Elf_Shdr *s, *se;
- const char *secstrs = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
-
- for (s = sechdrs, se = sechdrs + hdr->e_shnum; s < se; s++) {
- if (strcmp(name, secstrs + s->sh_name) == 0)
- return s;
- }
-
- return NULL;
-}
-
int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 09/13] riscv: switch to relative alternative entries
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (7 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 08/13] riscv: module: move find_section to module.h Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-11 18:11 ` Andrew Jones
2023-01-12 21:49 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 10/13] riscv: alternative: patch alternatives in the vDSO Jisheng Zhang
` (3 subsequent siblings)
12 siblings, 2 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv
Instead of using absolute addresses for both the old instrucions and
the alternative instructions, use offsets relative to the alt_entry
values. So this not only cuts the size of the alternative entry, but
also meets the prerequisite for patching alternatives in the vDSO,
since absolute alternative entries are subject to dynamic relocation,
which is incompatible with the vDSO building.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
arch/riscv/errata/sifive/errata.c | 4 +++-
arch/riscv/errata/thead/errata.c | 11 ++++++++---
arch/riscv/include/asm/alternative-macros.h | 20 ++++++++++----------
arch/riscv/include/asm/alternative.h | 12 ++++++------
arch/riscv/kernel/cpufeature.c | 8 +++++---
5 files changed, 32 insertions(+), 23 deletions(-)
diff --git a/arch/riscv/errata/sifive/errata.c b/arch/riscv/errata/sifive/errata.c
index 1031038423e7..0e537cdfd324 100644
--- a/arch/riscv/errata/sifive/errata.c
+++ b/arch/riscv/errata/sifive/errata.c
@@ -107,7 +107,9 @@ void __init_or_module sifive_errata_patch_func(struct alt_entry *begin,
tmp = (1U << alt->errata_id);
if (cpu_req_errata & tmp) {
- patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
+ patch_text_nosync((void *)&alt->old_offset + alt->old_offset,
+ (void *)&alt->alt_offset + alt->alt_offset,
+ alt->alt_len);
cpu_apply_errata |= tmp;
}
}
diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
index fac5742d1c1e..d56d76a529b5 100644
--- a/arch/riscv/errata/thead/errata.c
+++ b/arch/riscv/errata/thead/errata.c
@@ -87,6 +87,7 @@ void __init_or_module thead_errata_patch_func(struct alt_entry *begin, struct al
struct alt_entry *alt;
u32 cpu_req_errata = thead_errata_probe(stage, archid, impid);
u32 tmp;
+ void *oldptr, *altptr;
for (alt = begin; alt < end; alt++) {
if (alt->vendor_id != THEAD_VENDOR_ID)
@@ -96,12 +97,16 @@ void __init_or_module thead_errata_patch_func(struct alt_entry *begin, struct al
tmp = (1U << alt->errata_id);
if (cpu_req_errata & tmp) {
+ oldptr = (void *)&alt->old_offset + alt->old_offset;
+ altptr = (void *)&alt->alt_offset + alt->alt_offset;
+
/* On vm-alternatives, the mmu isn't running yet */
if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
- memcpy((void *)__pa_symbol(alt->old_ptr),
- (void *)__pa_symbol(alt->alt_ptr), alt->alt_len);
+ memcpy((void *)__pa_symbol(oldptr),
+ (void *)__pa_symbol(altptr),
+ alt->alt_len);
else
- patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
+ patch_text_nosync(oldptr, altptr, alt->alt_len);
}
}
diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
index 7226e2462584..3c3ca65e521b 100644
--- a/arch/riscv/include/asm/alternative-macros.h
+++ b/arch/riscv/include/asm/alternative-macros.h
@@ -7,11 +7,11 @@
#ifdef __ASSEMBLY__
.macro ALT_ENTRY oldptr newptr vendor_id errata_id new_len
- RISCV_PTR \oldptr
- RISCV_PTR \newptr
- REG_ASM \vendor_id
- REG_ASM \new_len
- .word \errata_id
+ .long \oldptr - .
+ .long \newptr - .
+ .short \vendor_id
+ .short \new_len
+ .long \errata_id
.endm
.macro ALT_NEW_CONTENT vendor_id, errata_id, enable = 1, new_c : vararg
@@ -59,11 +59,11 @@
#include <linux/stringify.h>
#define ALT_ENTRY(oldptr, newptr, vendor_id, errata_id, newlen) \
- RISCV_PTR " " oldptr "\n" \
- RISCV_PTR " " newptr "\n" \
- REG_ASM " " vendor_id "\n" \
- REG_ASM " " newlen "\n" \
- ".word " errata_id "\n"
+ ".long ((" oldptr ") - .) \n" \
+ ".long ((" newptr ") - .) \n" \
+ ".short " vendor_id "\n" \
+ ".short " newlen "\n" \
+ ".long " errata_id "\n"
#define ALT_NEW_CONTENT(vendor_id, errata_id, enable, new_c) \
".if " __stringify(enable) " == 1\n" \
diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
index 1bd4027d34ca..b6050a235f50 100644
--- a/arch/riscv/include/asm/alternative.h
+++ b/arch/riscv/include/asm/alternative.h
@@ -31,12 +31,12 @@ void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len,
int patch_offset);
struct alt_entry {
- void *old_ptr; /* address of original instruciton or data */
- void *alt_ptr; /* address of replacement instruction or data */
- unsigned long vendor_id; /* cpu vendor id */
- unsigned long alt_len; /* The replacement size */
- unsigned int errata_id; /* The errata id */
-} __packed;
+ s32 old_offset; /* offset relative to original instruciton or data */
+ s32 alt_offset; /* offset relative to replacement instruction or data */
+ u16 vendor_id; /* cpu vendor id */
+ u16 alt_len; /* The replacement size */
+ u32 errata_id; /* The errata id */
+};
struct errata_checkfunc_id {
unsigned long vendor_id;
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 6db8b31d9149..c394cde2560b 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -280,6 +280,7 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
unsigned int stage)
{
struct alt_entry *alt;
+ void *oldptr, *altptr;
if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
return;
@@ -293,12 +294,13 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
continue;
}
+ oldptr = (void *)&alt->old_offset + alt->old_offset;
+ altptr = (void *)&alt->alt_offset + alt->alt_offset;
if (!__riscv_isa_extension_available(NULL, alt->errata_id))
continue;
- patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
- riscv_alternative_fix_offsets(alt->old_ptr, alt->alt_len,
- alt->old_ptr - alt->alt_ptr);
+ patch_text_nosync(oldptr, altptr, alt->alt_len);
+ riscv_alternative_fix_offsets(oldptr, alt->alt_len, oldptr - altptr);
}
}
#endif
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 10/13] riscv: alternative: patch alternatives in the vDSO
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (8 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 09/13] riscv: switch to relative alternative entries Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-12 7:48 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 11/13] riscv: cpu_relax: switch to riscv_has_extension_likely() Jisheng Zhang
` (2 subsequent siblings)
12 siblings, 1 reply; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Guo Ren, Andrew Jones
Make it possible to use alternatives in the vDSO, so that better
implementations can be used if possible.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
---
arch/riscv/include/asm/vdso.h | 4 ++++
arch/riscv/kernel/alternative.c | 25 +++++++++++++++++++++++++
arch/riscv/kernel/vdso.c | 5 -----
arch/riscv/kernel/vdso/vdso.lds.S | 7 +++++++
4 files changed, 36 insertions(+), 5 deletions(-)
diff --git a/arch/riscv/include/asm/vdso.h b/arch/riscv/include/asm/vdso.h
index a7644f46d0e5..f891478829a5 100644
--- a/arch/riscv/include/asm/vdso.h
+++ b/arch/riscv/include/asm/vdso.h
@@ -28,8 +28,12 @@
#define COMPAT_VDSO_SYMBOL(base, name) \
(void __user *)((unsigned long)(base) + compat__vdso_##name##_offset)
+extern char compat_vdso_start[], compat_vdso_end[];
+
#endif /* CONFIG_COMPAT */
+extern char vdso_start[], vdso_end[];
+
#endif /* !__ASSEMBLY__ */
#endif /* CONFIG_MMU */
diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
index 3d4f1f32c7f6..a883a309139f 100644
--- a/arch/riscv/kernel/alternative.c
+++ b/arch/riscv/kernel/alternative.c
@@ -11,7 +11,9 @@
#include <linux/cpu.h>
#include <linux/uaccess.h>
#include <asm/alternative.h>
+#include <asm/module.h>
#include <asm/sections.h>
+#include <asm/vdso.h>
#include <asm/vendorid_list.h>
#include <asm/sbi.h>
#include <asm/csr.h>
@@ -160,6 +162,27 @@ static void __init_or_module _apply_alternatives(struct alt_entry *begin,
stage);
}
+static void __init apply_vdso_alternatives(void)
+{
+ const struct elf64_hdr *hdr;
+ const struct elf64_shdr *shdr;
+ const struct elf64_shdr *alt;
+ struct alt_entry *begin, *end;
+
+ hdr = (struct elf64_hdr *)vdso_start;
+ shdr = (void *)hdr + hdr->e_shoff;
+ alt = find_section(hdr, shdr, ".alternative");
+ if (!alt)
+ return;
+
+ begin = (void *)hdr + alt->sh_offset,
+ end = (void *)hdr + alt->sh_offset + alt->sh_size,
+
+ _apply_alternatives((struct alt_entry *)begin,
+ (struct alt_entry *)end,
+ RISCV_ALTERNATIVES_BOOT);
+}
+
void __init apply_boot_alternatives(void)
{
/* If called on non-boot cpu things could go wrong */
@@ -168,6 +191,8 @@ void __init apply_boot_alternatives(void)
_apply_alternatives((struct alt_entry *)__alt_start,
(struct alt_entry *)__alt_end,
RISCV_ALTERNATIVES_BOOT);
+
+ apply_vdso_alternatives();
}
/*
diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c
index e410275918ac..4e631c098f4d 100644
--- a/arch/riscv/kernel/vdso.c
+++ b/arch/riscv/kernel/vdso.c
@@ -22,11 +22,6 @@ struct vdso_data {
};
#endif
-extern char vdso_start[], vdso_end[];
-#ifdef CONFIG_COMPAT
-extern char compat_vdso_start[], compat_vdso_end[];
-#endif
-
enum vvar_pages {
VVAR_DATA_PAGE_OFFSET,
VVAR_TIMENS_PAGE_OFFSET,
diff --git a/arch/riscv/kernel/vdso/vdso.lds.S b/arch/riscv/kernel/vdso/vdso.lds.S
index 150b1a572e61..4a0606633290 100644
--- a/arch/riscv/kernel/vdso/vdso.lds.S
+++ b/arch/riscv/kernel/vdso/vdso.lds.S
@@ -40,6 +40,13 @@ SECTIONS
. = 0x800;
.text : { *(.text .text.*) } :text
+ . = ALIGN(4);
+ .alternative : {
+ __alt_start = .;
+ *(.alternative)
+ __alt_end = .;
+ }
+
.data : {
*(.got.plt) *(.got)
*(.data .data.* .gnu.linkonce.d.*)
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 11/13] riscv: cpu_relax: switch to riscv_has_extension_likely()
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (9 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 10/13] riscv: alternative: patch alternatives in the vDSO Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-12 21:59 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 12/13] riscv: KVM: Switch has_svinval() to riscv_has_extension_unlikely() Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 13/13] riscv: remove riscv_isa_ext_keys[] array and related usage Jisheng Zhang
12 siblings, 1 reply; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones, Guo Ren
Switch cpu_relax() from static branch to the new helper
riscv_has_extension_likely()
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/include/asm/vdso/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/vdso/processor.h b/arch/riscv/include/asm/vdso/processor.h
index fa70cfe507aa..edf0e25e43d1 100644
--- a/arch/riscv/include/asm/vdso/processor.h
+++ b/arch/riscv/include/asm/vdso/processor.h
@@ -10,7 +10,7 @@
static inline void cpu_relax(void)
{
- if (!static_branch_likely(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_ZIHINTPAUSE])) {
+ if (!riscv_has_extension_likely(RISCV_ISA_EXT_ZIHINTPAUSE)) {
#ifdef __riscv_muldiv
int dummy;
/* In lieu of a halt instruction, induce a long-latency stall. */
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 12/13] riscv: KVM: Switch has_svinval() to riscv_has_extension_unlikely()
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (10 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 11/13] riscv: cpu_relax: switch to riscv_has_extension_likely() Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 13/13] riscv: remove riscv_isa_ext_keys[] array and related usage Jisheng Zhang
12 siblings, 0 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones, Guo Ren
From: Andrew Jones <ajones@ventanamicro.com>
Switch has_svinval() from static branch to the new helper
riscv_has_extension_unlikely().
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/kvm/tlb.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/riscv/kvm/tlb.c b/arch/riscv/kvm/tlb.c
index 309d79b3e5cd..aa3da18ad873 100644
--- a/arch/riscv/kvm/tlb.c
+++ b/arch/riscv/kvm/tlb.c
@@ -15,8 +15,7 @@
#include <asm/hwcap.h>
#include <asm/insn-def.h>
-#define has_svinval() \
- static_branch_unlikely(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_SVINVAL])
+#define has_svinval() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVINVAL)
void kvm_riscv_local_hfence_gvma_vmid_gpa(unsigned long vmid,
gpa_t gpa, gpa_t gpsz,
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH v3 13/13] riscv: remove riscv_isa_ext_keys[] array and related usage
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
` (11 preceding siblings ...)
2023-01-11 17:10 ` [PATCH v3 12/13] riscv: KVM: Switch has_svinval() to riscv_has_extension_unlikely() Jisheng Zhang
@ 2023-01-11 17:10 ` Jisheng Zhang
12 siblings, 0 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-11 17:10 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones,
Conor Dooley, Guo Ren
All users have switched to riscv_has_extension_*, remove unused
definitions, vars and related setting code.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/include/asm/hwcap.h | 31 -------------------------------
arch/riscv/kernel/cpufeature.c | 9 ---------
2 files changed, 40 deletions(-)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 1767a9ce1a04..e3749bee5c24 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -60,19 +60,6 @@ enum {
extern unsigned long elf_hwcap;
-
-/*
- * This enum represents the logical ID for each RISC-V ISA extension static
- * keys. We can use static key to optimize code path if some ISA extensions
- * are available.
- */
-enum riscv_isa_ext_key {
- RISCV_ISA_EXT_KEY_FPU, /* For 'F' and 'D' */
- RISCV_ISA_EXT_KEY_ZIHINTPAUSE,
- RISCV_ISA_EXT_KEY_SVINVAL,
- RISCV_ISA_EXT_KEY_MAX,
-};
-
struct riscv_isa_ext_data {
/* Name of the extension displayed to userspace via /proc/cpuinfo */
char uprop[RISCV_ISA_EXT_NAME_LEN_MAX];
@@ -80,24 +67,6 @@ struct riscv_isa_ext_data {
unsigned int isa_ext_id;
};
-extern struct static_key_false riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_MAX];
-
-static __always_inline int riscv_isa_ext2key(int num)
-{
- switch (num) {
- case RISCV_ISA_EXT_f:
- return RISCV_ISA_EXT_KEY_FPU;
- case RISCV_ISA_EXT_d:
- return RISCV_ISA_EXT_KEY_FPU;
- case RISCV_ISA_EXT_ZIHINTPAUSE:
- return RISCV_ISA_EXT_KEY_ZIHINTPAUSE;
- case RISCV_ISA_EXT_SVINVAL:
- return RISCV_ISA_EXT_KEY_SVINVAL;
- default:
- return -EINVAL;
- }
-}
-
static __always_inline bool
riscv_has_extension_likely(const unsigned long ext)
{
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index c394cde2560b..5591d45e96b5 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -29,9 +29,6 @@ unsigned long elf_hwcap __read_mostly;
/* Host ISA bitmap */
static DECLARE_BITMAP(riscv_isa, RISCV_ISA_EXT_MAX) __read_mostly;
-DEFINE_STATIC_KEY_ARRAY_FALSE(riscv_isa_ext_keys, RISCV_ISA_EXT_KEY_MAX);
-EXPORT_SYMBOL(riscv_isa_ext_keys);
-
/**
* riscv_isa_extension_base() - Get base extension word
*
@@ -266,12 +263,6 @@ void __init riscv_fill_hwcap(void)
if (elf_hwcap & BIT_MASK(i))
print_str[j++] = (char)('a' + i);
pr_info("riscv: ELF capabilities %s\n", print_str);
-
- for_each_set_bit(i, riscv_isa, RISCV_ISA_EXT_MAX) {
- j = riscv_isa_ext2key(i);
- if (j >= 0)
- static_branch_enable(&riscv_isa_ext_keys[j]);
- }
}
#ifdef CONFIG_RISCV_ALTERNATIVE
--
2.38.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives
2023-01-11 17:10 ` [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives Jisheng Zhang
@ 2023-01-11 17:56 ` Andrew Jones
2023-01-11 23:31 ` Heiko Stübner
1 sibling, 0 replies; 35+ messages in thread
From: Andrew Jones @ 2023-01-11 17:56 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv,
Heiko Stuebner
On Thu, Jan 12, 2023 at 01:10:15AM +0800, Jisheng Zhang wrote:
> Alternatives live in a different section, so offsets used by jal
> instruction will point to wrong locations after the patch got applied.
>
> Similar to arm64, adjust the location to consider that offset.
>
> Co-developed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> ---
> arch/riscv/include/asm/insn.h | 27 +++++++++++++++++++++++++++
> arch/riscv/kernel/alternative.c | 27 +++++++++++++++++++++++++++
> 2 files changed, 54 insertions(+)
>
> diff --git a/arch/riscv/include/asm/insn.h b/arch/riscv/include/asm/insn.h
> index 98453535324a..1d2df245d0bd 100644
> --- a/arch/riscv/include/asm/insn.h
> +++ b/arch/riscv/include/asm/insn.h
> @@ -291,6 +291,33 @@ static __always_inline bool riscv_insn_is_branch(u32 code)
> (RVC_X(x_, RVC_B_IMM_7_6_OPOFF, RVC_B_IMM_7_6_MASK) << RVC_B_IMM_7_6_OFF) | \
> (RVC_IMM_SIGN(x_) << RVC_B_IMM_SIGN_OFF); })
>
> +/*
> + * Get the immediate from a J-type instruction.
> + *
> + * @insn: instruction to process
> + * Return: immediate
> + */
> +static inline s32 riscv_insn_extract_jtype_imm(u32 insn)
> +{
> + return RV_EXTRACT_JTYPE_IMM(insn);
> +}
> +
> +/*
> + * Update a J-type instruction with an immediate value.
> + *
> + * @insn: pointer to the jtype instruction
> + * @imm: the immediate to insert into the instruction
> + */
> +static inline void riscv_insn_insert_jtype_imm(u32 *insn, s32 imm)
> +{
> + /* drop the old IMMs, all jal IMM bits sit at 31:12 */
> + *insn &= ~GENMASK(31, 12);
> + *insn |= (RV_X(imm, RV_J_IMM_10_1_OFF, RV_J_IMM_10_1_MASK) << RV_J_IMM_10_1_OPOFF) |
> + (RV_X(imm, RV_J_IMM_11_OFF, RV_J_IMM_11_MASK) << RV_J_IMM_11_OPOFF) |
> + (RV_X(imm, RV_J_IMM_19_12_OFF, RV_J_IMM_19_12_MASK) << RV_J_IMM_19_12_OPOFF) |
> + (RV_X(imm, RV_J_IMM_SIGN_OFF, 1) << RV_J_IMM_SIGN_OPOFF);
> +}
> +
> /*
> * Put together one immediate from a U-type and I-type instruction pair.
> *
> diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> index 6212ea0eed72..3d4f1f32c7f6 100644
> --- a/arch/riscv/kernel/alternative.c
> +++ b/arch/riscv/kernel/alternative.c
> @@ -79,6 +79,21 @@ static void riscv_alternative_fix_auipc_jalr(void *ptr, u32 auipc_insn,
> patch_text_nosync(ptr, call, sizeof(u32) * 2);
> }
>
> +static void riscv_alternative_fix_jal(void *ptr, u32 jal_insn, int patch_offset)
> +{
> + s32 imm;
> +
> + /* get and adjust new target address */
> + imm = riscv_insn_extract_jtype_imm(jal_insn);
> + imm -= patch_offset;
> +
> + /* update instruction */
> + riscv_insn_insert_jtype_imm(&jal_insn, imm);
> +
> + /* patch the call place again */
> + patch_text_nosync(ptr, &jal_insn, sizeof(u32));
> +}
> +
> void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len,
> int patch_offset)
> {
> @@ -106,6 +121,18 @@ void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len,
> riscv_alternative_fix_auipc_jalr(alt_ptr + i * sizeof(u32),
> insn, insn2, patch_offset);
> }
> +
> + if (riscv_insn_is_jal(insn)) {
> + s32 imm = riscv_insn_extract_jtype_imm(insn);
> +
> + /* Don't modify jumps inside the alternative block */
> + if ((alt_ptr + i * sizeof(u32) + imm) >= alt_ptr &&
> + (alt_ptr + i * sizeof(u32) + imm) < (alt_ptr + len))
> + continue;
> +
> + riscv_alternative_fix_jal(alt_ptr + i * sizeof(u32),
> + insn, patch_offset);
> + }
> }
> }
>
> --
> 2.38.1
>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 09/13] riscv: switch to relative alternative entries
2023-01-11 17:10 ` [PATCH v3 09/13] riscv: switch to relative alternative entries Jisheng Zhang
@ 2023-01-11 18:11 ` Andrew Jones
2023-01-12 21:49 ` Conor Dooley
1 sibling, 0 replies; 35+ messages in thread
From: Andrew Jones @ 2023-01-11 18:11 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv
On Thu, Jan 12, 2023 at 01:10:23AM +0800, Jisheng Zhang wrote:
> Instead of using absolute addresses for both the old instrucions and
> the alternative instructions, use offsets relative to the alt_entry
> values. So this not only cuts the size of the alternative entry, but
> also meets the prerequisite for patching alternatives in the vDSO,
> since absolute alternative entries are subject to dynamic relocation,
> which is incompatible with the vDSO building.
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> ---
> arch/riscv/errata/sifive/errata.c | 4 +++-
> arch/riscv/errata/thead/errata.c | 11 ++++++++---
> arch/riscv/include/asm/alternative-macros.h | 20 ++++++++++----------
> arch/riscv/include/asm/alternative.h | 12 ++++++------
> arch/riscv/kernel/cpufeature.c | 8 +++++---
> 5 files changed, 32 insertions(+), 23 deletions(-)
>
> diff --git a/arch/riscv/errata/sifive/errata.c b/arch/riscv/errata/sifive/errata.c
> index 1031038423e7..0e537cdfd324 100644
> --- a/arch/riscv/errata/sifive/errata.c
> +++ b/arch/riscv/errata/sifive/errata.c
> @@ -107,7 +107,9 @@ void __init_or_module sifive_errata_patch_func(struct alt_entry *begin,
>
> tmp = (1U << alt->errata_id);
> if (cpu_req_errata & tmp) {
> - patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> + patch_text_nosync((void *)&alt->old_offset + alt->old_offset,
> + (void *)&alt->alt_offset + alt->alt_offset,
I was hoping to see Conor's macro suggestion show up in this version.
> + alt->alt_len);
> cpu_apply_errata |= tmp;
> }
> }
> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> index fac5742d1c1e..d56d76a529b5 100644
> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c
> @@ -87,6 +87,7 @@ void __init_or_module thead_errata_patch_func(struct alt_entry *begin, struct al
> struct alt_entry *alt;
> u32 cpu_req_errata = thead_errata_probe(stage, archid, impid);
> u32 tmp;
> + void *oldptr, *altptr;
>
> for (alt = begin; alt < end; alt++) {
> if (alt->vendor_id != THEAD_VENDOR_ID)
> @@ -96,12 +97,16 @@ void __init_or_module thead_errata_patch_func(struct alt_entry *begin, struct al
>
> tmp = (1U << alt->errata_id);
> if (cpu_req_errata & tmp) {
> + oldptr = (void *)&alt->old_offset + alt->old_offset;
> + altptr = (void *)&alt->alt_offset + alt->alt_offset;
> +
> /* On vm-alternatives, the mmu isn't running yet */
> if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> - memcpy((void *)__pa_symbol(alt->old_ptr),
> - (void *)__pa_symbol(alt->alt_ptr), alt->alt_len);
> + memcpy((void *)__pa_symbol(oldptr),
> + (void *)__pa_symbol(altptr),
> + alt->alt_len);
> else
> - patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> + patch_text_nosync(oldptr, altptr, alt->alt_len);
> }
> }
>
> diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
> index 7226e2462584..3c3ca65e521b 100644
> --- a/arch/riscv/include/asm/alternative-macros.h
> +++ b/arch/riscv/include/asm/alternative-macros.h
> @@ -7,11 +7,11 @@
> #ifdef __ASSEMBLY__
>
> .macro ALT_ENTRY oldptr newptr vendor_id errata_id new_len
> - RISCV_PTR \oldptr
> - RISCV_PTR \newptr
> - REG_ASM \vendor_id
> - REG_ASM \new_len
> - .word \errata_id
> + .long \oldptr - .
> + .long \newptr - .
> + .short \vendor_id
> + .short \new_len
> + .long \errata_id
nit: I like .2byte and .4byte since I always have to double check how many
bytes .long is.
> .endm
>
> .macro ALT_NEW_CONTENT vendor_id, errata_id, enable = 1, new_c : vararg
> @@ -59,11 +59,11 @@
> #include <linux/stringify.h>
>
> #define ALT_ENTRY(oldptr, newptr, vendor_id, errata_id, newlen) \
> - RISCV_PTR " " oldptr "\n" \
> - RISCV_PTR " " newptr "\n" \
> - REG_ASM " " vendor_id "\n" \
> - REG_ASM " " newlen "\n" \
> - ".word " errata_id "\n"
> + ".long ((" oldptr ") - .) \n" \
> + ".long ((" newptr ") - .) \n" \
> + ".short " vendor_id "\n" \
> + ".short " newlen "\n" \
> + ".long " errata_id "\n"
>
> #define ALT_NEW_CONTENT(vendor_id, errata_id, enable, new_c) \
> ".if " __stringify(enable) " == 1\n" \
> diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> index 1bd4027d34ca..b6050a235f50 100644
> --- a/arch/riscv/include/asm/alternative.h
> +++ b/arch/riscv/include/asm/alternative.h
> @@ -31,12 +31,12 @@ void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len,
> int patch_offset);
>
> struct alt_entry {
> - void *old_ptr; /* address of original instruciton or data */
> - void *alt_ptr; /* address of replacement instruction or data */
> - unsigned long vendor_id; /* cpu vendor id */
> - unsigned long alt_len; /* The replacement size */
> - unsigned int errata_id; /* The errata id */
> -} __packed;
> + s32 old_offset; /* offset relative to original instruciton or data */
^
instruction
(The typo was already there, but, IMO, we can fix something like that
while touching it.)
> + s32 alt_offset; /* offset relative to replacement instruction or data */
> + u16 vendor_id; /* cpu vendor id */
> + u16 alt_len; /* The replacement size */
> + u32 errata_id; /* The errata id */
> +};
>
> struct errata_checkfunc_id {
> unsigned long vendor_id;
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 6db8b31d9149..c394cde2560b 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -280,6 +280,7 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> unsigned int stage)
> {
> struct alt_entry *alt;
> + void *oldptr, *altptr;
>
> if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> return;
> @@ -293,12 +294,13 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> continue;
> }
>
> + oldptr = (void *)&alt->old_offset + alt->old_offset;
> + altptr = (void *)&alt->alt_offset + alt->alt_offset;
> if (!__riscv_isa_extension_available(NULL, alt->errata_id))
> continue;
>
> - patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> - riscv_alternative_fix_offsets(alt->old_ptr, alt->alt_len,
> - alt->old_ptr - alt->alt_ptr);
> + patch_text_nosync(oldptr, altptr, alt->alt_len);
> + riscv_alternative_fix_offsets(oldptr, alt->alt_len, oldptr - altptr);
> }
> }
> #endif
> --
> 2.38.1
Besides preferring a macro and the nits, LGTM
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Thanks,
drew
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-11 17:10 ` [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions Jisheng Zhang
@ 2023-01-11 23:29 ` Heiko Stübner
2023-01-12 9:21 ` Andrew Jones
2023-01-15 14:19 ` Jisheng Zhang
0 siblings, 2 replies; 35+ messages in thread
From: Heiko Stübner @ 2023-01-11 23:29 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Jisheng Zhang
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones
Hi Jisheng.
Am Mittwoch, 11. Januar 2023, 18:10:19 CET schrieb Jisheng Zhang:
> riscv_cpufeature_patch_func() currently only scans a limited set of
> cpufeatures, explicitly defined with macros. Extend it to probe for all
> ISA extensions.
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> ---
> arch/riscv/include/asm/errata_list.h | 9 ++--
> arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
> 2 files changed, 11 insertions(+), 61 deletions(-)
hmmm ... I do see a somewhat big caveat for this.
and would like to take back my Reviewed-by for now
With this change we would limit the patchable cpufeatures to actual
riscv extensions. But cpufeatures can also be soft features like
how performant the core handles unaligned accesses.
See Palmer's series [0].
Also this essentially codifies that each ALTERNATIVE can only ever
be attached to exactly one extension.
But contrary to vendor-errata, it is very likely that we will need
combinations of different extensions for some alternatives in the future.
In my optimization quest, I found that it's actually pretty neat to
convert the errata-id for cpufeatures to a bitfield [1], because then it's
possible to just combine extensions into said bitfield [2]:
ALTERNATIVE_2("nop",
"j strcmp_zbb_unaligned", 0, CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0, CONFIG_RISCV_ISA_ZBB,
"j variant_zbb", 0, CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED, CONFIG_RISCV_ISA_ZBB)
[the additional field there models a "not" component]
So I really feel this would limit us quite a bit.
Heiko
[0] https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=riscv-hwprobe-v1&id=510c491cb9d87dcbdc91c63558dc704968723240
[1] https://github.com/mmind/linux-riscv/commit/f57a896122ee7e666692079320fc35829434cf96
[2] https://github.com/mmind/linux-riscv/commit/8cef615dab0c00ad68af2651ee5b93d06be17f27#diff-194cb8a86f9fb9b03683295f21c8f46b456a9f94737f01726ddbcbb9e3aace2cR12
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives
2023-01-11 17:10 ` [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives Jisheng Zhang
2023-01-11 17:56 ` Andrew Jones
@ 2023-01-11 23:31 ` Heiko Stübner
2023-01-12 20:25 ` Conor Dooley
1 sibling, 1 reply; 35+ messages in thread
From: Heiko Stübner @ 2023-01-11 23:31 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Jisheng Zhang
Cc: linux-riscv, linux-kernel, kvm, kvm-riscv
Am Mittwoch, 11. Januar 2023, 18:10:15 CET schrieb Jisheng Zhang:
> Alternatives live in a different section, so offsets used by jal
> instruction will point to wrong locations after the patch got applied.
>
> Similar to arm64, adjust the location to consider that offset.
>
> Co-developed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
looks good, thanks for fixing the issues Andrew and Conor pointed
out in the variant in my zbb series. I've now switched over to this one.
I guess as you said, we really should separate this out into a single patch
[or if Palmer is fine with that, just pick this one patch to apply until the
rest is ready]
Heiko
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 10/13] riscv: alternative: patch alternatives in the vDSO
2023-01-11 17:10 ` [PATCH v3 10/13] riscv: alternative: patch alternatives in the vDSO Jisheng Zhang
@ 2023-01-12 7:48 ` Conor Dooley
2023-01-12 21:55 ` Conor Dooley
0 siblings, 1 reply; 35+ messages in thread
From: Conor Dooley @ 2023-01-12 7:48 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv,
Guo Ren, Andrew Jones
[-- Attachment #1: Type: text/plain, Size: 3975 bytes --]
Hey Jisheng,
On Thu, Jan 12, 2023 at 01:10:24AM +0800, Jisheng Zhang wrote:
> Make it possible to use alternatives in the vDSO, so that better
> implementations can be used if possible.
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> Reviewed-by: Guo Ren <guoren@kernel.org>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
FYI, from this patch onwards the rv32 build is broken.
Should be reproduceable with the in-tree rv32_defconfig.
Unfortunately no logs for you, I've got a CI bug to fix!
Thanks,
Conor.
> ---
> arch/riscv/include/asm/vdso.h | 4 ++++
> arch/riscv/kernel/alternative.c | 25 +++++++++++++++++++++++++
> arch/riscv/kernel/vdso.c | 5 -----
> arch/riscv/kernel/vdso/vdso.lds.S | 7 +++++++
> 4 files changed, 36 insertions(+), 5 deletions(-)
>
> diff --git a/arch/riscv/include/asm/vdso.h b/arch/riscv/include/asm/vdso.h
> index a7644f46d0e5..f891478829a5 100644
> --- a/arch/riscv/include/asm/vdso.h
> +++ b/arch/riscv/include/asm/vdso.h
> @@ -28,8 +28,12 @@
> #define COMPAT_VDSO_SYMBOL(base, name) \
> (void __user *)((unsigned long)(base) + compat__vdso_##name##_offset)
>
> +extern char compat_vdso_start[], compat_vdso_end[];
> +
> #endif /* CONFIG_COMPAT */
>
> +extern char vdso_start[], vdso_end[];
> +
> #endif /* !__ASSEMBLY__ */
>
> #endif /* CONFIG_MMU */
> diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> index 3d4f1f32c7f6..a883a309139f 100644
> --- a/arch/riscv/kernel/alternative.c
> +++ b/arch/riscv/kernel/alternative.c
> @@ -11,7 +11,9 @@
> #include <linux/cpu.h>
> #include <linux/uaccess.h>
> #include <asm/alternative.h>
> +#include <asm/module.h>
> #include <asm/sections.h>
> +#include <asm/vdso.h>
> #include <asm/vendorid_list.h>
> #include <asm/sbi.h>
> #include <asm/csr.h>
> @@ -160,6 +162,27 @@ static void __init_or_module _apply_alternatives(struct alt_entry *begin,
> stage);
> }
>
> +static void __init apply_vdso_alternatives(void)
> +{
> + const struct elf64_hdr *hdr;
> + const struct elf64_shdr *shdr;
> + const struct elf64_shdr *alt;
> + struct alt_entry *begin, *end;
> +
> + hdr = (struct elf64_hdr *)vdso_start;
> + shdr = (void *)hdr + hdr->e_shoff;
> + alt = find_section(hdr, shdr, ".alternative");
> + if (!alt)
> + return;
> +
> + begin = (void *)hdr + alt->sh_offset,
> + end = (void *)hdr + alt->sh_offset + alt->sh_size,
> +
> + _apply_alternatives((struct alt_entry *)begin,
> + (struct alt_entry *)end,
> + RISCV_ALTERNATIVES_BOOT);
> +}
> +
> void __init apply_boot_alternatives(void)
> {
> /* If called on non-boot cpu things could go wrong */
> @@ -168,6 +191,8 @@ void __init apply_boot_alternatives(void)
> _apply_alternatives((struct alt_entry *)__alt_start,
> (struct alt_entry *)__alt_end,
> RISCV_ALTERNATIVES_BOOT);
> +
> + apply_vdso_alternatives();
> }
>
> /*
> diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c
> index e410275918ac..4e631c098f4d 100644
> --- a/arch/riscv/kernel/vdso.c
> +++ b/arch/riscv/kernel/vdso.c
> @@ -22,11 +22,6 @@ struct vdso_data {
> };
> #endif
>
> -extern char vdso_start[], vdso_end[];
> -#ifdef CONFIG_COMPAT
> -extern char compat_vdso_start[], compat_vdso_end[];
> -#endif
> -
> enum vvar_pages {
> VVAR_DATA_PAGE_OFFSET,
> VVAR_TIMENS_PAGE_OFFSET,
> diff --git a/arch/riscv/kernel/vdso/vdso.lds.S b/arch/riscv/kernel/vdso/vdso.lds.S
> index 150b1a572e61..4a0606633290 100644
> --- a/arch/riscv/kernel/vdso/vdso.lds.S
> +++ b/arch/riscv/kernel/vdso/vdso.lds.S
> @@ -40,6 +40,13 @@ SECTIONS
> . = 0x800;
> .text : { *(.text .text.*) } :text
>
> + . = ALIGN(4);
> + .alternative : {
> + __alt_start = .;
> + *(.alternative)
> + __alt_end = .;
> + }
> +
> .data : {
> *(.got.plt) *(.got)
> *(.data .data.* .gnu.linkonce.d.*)
> --
> 2.38.1
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-11 23:29 ` Heiko Stübner
@ 2023-01-12 9:21 ` Andrew Jones
2023-01-13 15:18 ` Conor Dooley
2023-01-15 13:59 ` Jisheng Zhang
2023-01-15 14:19 ` Jisheng Zhang
1 sibling, 2 replies; 35+ messages in thread
From: Andrew Jones @ 2023-01-12 9:21 UTC (permalink / raw)
To: Heiko Stübner
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Jisheng Zhang, linux-riscv, linux-kernel, kvm, kvm-riscv
On Thu, Jan 12, 2023 at 12:29:57AM +0100, Heiko Stübner wrote:
> Hi Jisheng.
>
> Am Mittwoch, 11. Januar 2023, 18:10:19 CET schrieb Jisheng Zhang:
> > riscv_cpufeature_patch_func() currently only scans a limited set of
> > cpufeatures, explicitly defined with macros. Extend it to probe for all
> > ISA extensions.
> >
> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> > ---
> > arch/riscv/include/asm/errata_list.h | 9 ++--
> > arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
> > 2 files changed, 11 insertions(+), 61 deletions(-)
>
> hmmm ... I do see a somewhat big caveat for this.
> and would like to take back my Reviewed-by for now
>
>
> With this change we would limit the patchable cpufeatures to actual
> riscv extensions. But cpufeatures can also be soft features like
> how performant the core handles unaligned accesses.
I agree that this needs to be addressed and Jisheng also raised this
yesterday here [*]. It seems we need the concept of cpufeatures, which
may be extensions or non-extensions.
[*] https://lore.kernel.org/all/Y77xyNPNqnFQUqAx@xhacker/
>
> See Palmer's series [0].
>
>
> Also this essentially codifies that each ALTERNATIVE can only ever
> be attached to exactly one extension.
>
> But contrary to vendor-errata, it is very likely that we will need
> combinations of different extensions for some alternatives in the future.
One possible approach may be to combine extensions/non-extensions at boot
time into pseudo-cpufeatures. Then, alternatives can continue attaching to
a single "feature". (I'm not saying that's a better approach than the
bitmap, I'm just suggesting it as something else to consider.)
Thanks,
drew
>
> In my optimization quest, I found that it's actually pretty neat to
> convert the errata-id for cpufeatures to a bitfield [1], because then it's
> possible to just combine extensions into said bitfield [2]:
>
> ALTERNATIVE_2("nop",
> "j strcmp_zbb_unaligned", 0, CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0, CONFIG_RISCV_ISA_ZBB,
> "j variant_zbb", 0, CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED, CONFIG_RISCV_ISA_ZBB)
>
> [the additional field there models a "not" component]
>
> So I really feel this would limit us quite a bit.
>
>
> Heiko
>
>
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=riscv-hwprobe-v1&id=510c491cb9d87dcbdc91c63558dc704968723240
> [1] https://github.com/mmind/linux-riscv/commit/f57a896122ee7e666692079320fc35829434cf96
> [2] https://github.com/mmind/linux-riscv/commit/8cef615dab0c00ad68af2651ee5b93d06be17f27#diff-194cb8a86f9fb9b03683295f21c8f46b456a9f94737f01726ddbcbb9e3aace2cR12
>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives
2023-01-11 23:31 ` Heiko Stübner
@ 2023-01-12 20:25 ` Conor Dooley
0 siblings, 0 replies; 35+ messages in thread
From: Conor Dooley @ 2023-01-12 20:25 UTC (permalink / raw)
To: Heiko Stübner
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Jisheng Zhang, linux-riscv, linux-kernel, kvm, kvm-riscv
[-- Attachment #1: Type: text/plain, Size: 1043 bytes --]
On Thu, Jan 12, 2023 at 12:31:59AM +0100, Heiko Stübner wrote:
> Am Mittwoch, 11. Januar 2023, 18:10:15 CET schrieb Jisheng Zhang:
> > Alternatives live in a different section, so offsets used by jal
> > instruction will point to wrong locations after the patch got applied.
> >
> > Similar to arm64, adjust the location to consider that offset.
> >
> > Co-developed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
>
> looks good, thanks for fixing the issues Andrew and Conor pointed
> out in the variant in my zbb series. I've now switched over to this one.
>
> I guess as you said, we really should separate this out into a single patch
> [or if Palmer is fine with that, just pick this one patch to apply until the
> rest is ready]
Splitting it out may make it easier to flag for him during the pw sync
next week? Either way, I'm fine w/ it..
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 03/13] riscv: cpufeature: detect RISCV_ALTERNATIVES_EARLY_BOOT earlier
2023-01-11 17:10 ` [PATCH v3 03/13] riscv: cpufeature: detect RISCV_ALTERNATIVES_EARLY_BOOT earlier Jisheng Zhang
@ 2023-01-12 21:11 ` Conor Dooley
0 siblings, 0 replies; 35+ messages in thread
From: Conor Dooley @ 2023-01-12 21:11 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv,
Andrew Jones
[-- Attachment #1: Type: text/plain, Size: 1510 bytes --]
On Thu, Jan 12, 2023 at 01:10:17AM +0800, Jisheng Zhang wrote:
> Currently riscv_cpufeature_patch_func() does nothing at the
> RISCV_ALTERNATIVES_EARLY_BOOT stage. Add a check to detect whether we
> are in this stage and exit early. This will allow us to use
> riscv_cpufeature_patch_func() for scanning of all ISA extensions.
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Probably overkill at this stage, but:
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
BTW, would you mind CCing me on all patches in a series if I have
previously reviewed the series? Makes life easier :)
> ---
> arch/riscv/kernel/cpufeature.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 421b3d9578cc..37e8c5e69754 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -328,6 +328,9 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> struct alt_entry *alt;
> u32 tmp;
>
> + if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> + return;
> +
> for (alt = begin; alt < end; alt++) {
> if (alt->vendor_id != 0)
> continue;
> --
> 2.38.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 04/13] riscv: hwcap: make ISA extension ids can be used in asm
2023-01-11 17:10 ` [PATCH v3 04/13] riscv: hwcap: make ISA extension ids can be used in asm Jisheng Zhang
@ 2023-01-12 21:28 ` Conor Dooley
2023-01-15 13:13 ` Jisheng Zhang
0 siblings, 1 reply; 35+ messages in thread
From: Conor Dooley @ 2023-01-12 21:28 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv,
Andrew Jones
[-- Attachment #1: Type: text/plain, Size: 3718 bytes --]
Hey Jisheng,
On Thu, Jan 12, 2023 at 01:10:18AM +0800, Jisheng Zhang wrote:
> We will make use of ISA extension in asm files, so make the multi-letter
>
> RISC-V ISA extension IDs macros rather than enums and move them and
> those base ISA extension IDs to suitable place.
From v2:
Which base ISA extension IDs? Changelog should match the patch contents,
and it's a little unclear here since the base ISA extension IDs are
visible here but in the context not the diff.
How about something like:
"So that ISA extensions can be used in assembly files, convert the
multi-letter RISC-V ISA extension IDs enums to macros.
In order to make them visible, move the #ifndef __ASSEMBLY__ guard
to a later point in the header"
Pedantry perhaps, but referring to moving the base IDs looks odd, since
that is not what git thinks you did - even if that is the copy paste
operation you carried out.
Content itself is
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Thanks,
Conor.
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> ---
> arch/riscv/include/asm/hwcap.h | 45 ++++++++++++++++------------------
> 1 file changed, 21 insertions(+), 24 deletions(-)
>
> diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
> index 86328e3acb02..09a7767723f6 100644
> --- a/arch/riscv/include/asm/hwcap.h
> +++ b/arch/riscv/include/asm/hwcap.h
> @@ -12,20 +12,6 @@
> #include <linux/bits.h>
> #include <uapi/asm/hwcap.h>
>
> -#ifndef __ASSEMBLY__
> -#include <linux/jump_label.h>
> -/*
> - * This yields a mask that user programs can use to figure out what
> - * instruction set this cpu supports.
> - */
> -#define ELF_HWCAP (elf_hwcap)
> -
> -enum {
> - CAP_HWCAP = 1,
> -};
> -
> -extern unsigned long elf_hwcap;
> -
> #define RISCV_ISA_EXT_a ('a' - 'a')
> #define RISCV_ISA_EXT_c ('c' - 'a')
> #define RISCV_ISA_EXT_d ('d' - 'a')
> @@ -46,22 +32,33 @@ extern unsigned long elf_hwcap;
> #define RISCV_ISA_EXT_BASE 26
>
> /*
> - * This enum represent the logical ID for each multi-letter RISC-V ISA extension.
> + * These macros represent the logical ID for each multi-letter RISC-V ISA extension.
> * The logical ID should start from RISCV_ISA_EXT_BASE and must not exceed
> * RISCV_ISA_EXT_MAX. 0-25 range is reserved for single letter
> * extensions while all the multi-letter extensions should define the next
> * available logical extension id.
> */
> -enum riscv_isa_ext_id {
> - RISCV_ISA_EXT_SSCOFPMF = RISCV_ISA_EXT_BASE,
> - RISCV_ISA_EXT_SVPBMT,
> - RISCV_ISA_EXT_ZICBOM,
> - RISCV_ISA_EXT_ZIHINTPAUSE,
> - RISCV_ISA_EXT_SSTC,
> - RISCV_ISA_EXT_SVINVAL,
> - RISCV_ISA_EXT_ID_MAX
> +#define RISCV_ISA_EXT_SSCOFPMF 26
> +#define RISCV_ISA_EXT_SVPBMT 27
> +#define RISCV_ISA_EXT_ZICBOM 28
> +#define RISCV_ISA_EXT_ZIHINTPAUSE 29
> +#define RISCV_ISA_EXT_SSTC 30
> +#define RISCV_ISA_EXT_SVINVAL 31
> +
> +#ifndef __ASSEMBLY__
> +#include <linux/jump_label.h>
> +/*
> + * This yields a mask that user programs can use to figure out what
> + * instruction set this cpu supports.
> + */
> +#define ELF_HWCAP (elf_hwcap)
> +
> +enum {
> + CAP_HWCAP = 1,
> };
> -static_assert(RISCV_ISA_EXT_ID_MAX <= RISCV_ISA_EXT_MAX);
> +
> +extern unsigned long elf_hwcap;
> +
>
> /*
> * This enum represents the logical ID for each RISC-V ISA extension static
> --
> 2.38.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 09/13] riscv: switch to relative alternative entries
2023-01-11 17:10 ` [PATCH v3 09/13] riscv: switch to relative alternative entries Jisheng Zhang
2023-01-11 18:11 ` Andrew Jones
@ 2023-01-12 21:49 ` Conor Dooley
1 sibling, 0 replies; 35+ messages in thread
From: Conor Dooley @ 2023-01-12 21:49 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv
[-- Attachment #1: Type: text/plain, Size: 3396 bytes --]
Hey Jisheng,
On Thu, Jan 12, 2023 at 01:10:23AM +0800, Jisheng Zhang wrote:
> Instead of using absolute addresses for both the old instrucions and
> the alternative instructions, use offsets relative to the alt_entry
> values. So this not only cuts the size of the alternative entry, but
> also meets the prerequisite for patching alternatives in the vDSO,
> since absolute alternative entries are subject to dynamic relocation,
> which is incompatible with the vDSO building.
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> ---
> arch/riscv/errata/sifive/errata.c | 4 +++-
> arch/riscv/errata/thead/errata.c | 11 ++++++++---
> arch/riscv/include/asm/alternative-macros.h | 20 ++++++++++----------
> arch/riscv/include/asm/alternative.h | 12 ++++++------
> arch/riscv/kernel/cpufeature.c | 8 +++++---
> 5 files changed, 32 insertions(+), 23 deletions(-)
>
> diff --git a/arch/riscv/errata/sifive/errata.c b/arch/riscv/errata/sifive/errata.c
> index 1031038423e7..0e537cdfd324 100644
> --- a/arch/riscv/errata/sifive/errata.c
> +++ b/arch/riscv/errata/sifive/errata.c
> @@ -107,7 +107,9 @@ void __init_or_module sifive_errata_patch_func(struct alt_entry *begin,
>
> tmp = (1U << alt->errata_id);
> if (cpu_req_errata & tmp) {
> - patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> + patch_text_nosync((void *)&alt->old_offset + alt->old_offset,
> + (void *)&alt->alt_offset + alt->alt_offset,
> + alt->alt_len);
I left a comment on v2 that went unanswered:
https://lore.kernel.org/all/Y4+3nJ53nvmmc8+z@spud/
The TL;DR is that I would like you to create a macro for this so that
this messy operation is done in a central location, with a nice comment
explaining the offsets. If my "analysis" there was correct, feel free to
use it as a starting point for said comment.
The macro would then reduce the above to something like:
patch_text_nosync(ALT_OFFSET_ADDRESS(alt->old_offset),
ALT_OFFSET_ADDRESS(alt->alt_offset),
alt->alt_len);
Which I think is easier to understand since this "concept" will show up
in several places & is less intuitive than what we currently have.
Nothing beats having this stuff well explained in the codebase IMO.
> diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> index 1bd4027d34ca..b6050a235f50 100644
> --- a/arch/riscv/include/asm/alternative.h
> +++ b/arch/riscv/include/asm/alternative.h
> @@ -31,12 +31,12 @@ void riscv_alternative_fix_offsets(void *alt_ptr, unsigned int len,
> int patch_offset);
>
> struct alt_entry {
> - void *old_ptr; /* address of original instruciton or data */
> - void *alt_ptr; /* address of replacement instruction or data */
> - unsigned long vendor_id; /* cpu vendor id */
> - unsigned long alt_len; /* The replacement size */
> - unsigned int errata_id; /* The errata id */
> -} __packed;
> + s32 old_offset; /* offset relative to original instruciton or data */
> + s32 alt_offset; /* offset relative to replacement instruction or data */
This wording is better, but you should fix the "instruciton" typo while
you are in the area.
> + u16 vendor_id; /* cpu vendor id */
> + u16 alt_len; /* The replacement size */
> + u32 errata_id; /* The errata id */
> +};
Thanks,
Conor.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 10/13] riscv: alternative: patch alternatives in the vDSO
2023-01-12 7:48 ` Conor Dooley
@ 2023-01-12 21:55 ` Conor Dooley
0 siblings, 0 replies; 35+ messages in thread
From: Conor Dooley @ 2023-01-12 21:55 UTC (permalink / raw)
To: Conor Dooley, jszhang
Cc: Jisheng Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Anup Patel, Atish Patra, Heiko Stuebner, linux-riscv,
linux-kernel, kvm, kvm-riscv, Guo Ren, Andrew Jones
[-- Attachment #1: Type: text/plain, Size: 2090 bytes --]
On Thu, Jan 12, 2023 at 07:48:29AM +0000, Conor Dooley wrote:
> Hey Jisheng,
>
> On Thu, Jan 12, 2023 at 01:10:24AM +0800, Jisheng Zhang wrote:
> > Make it possible to use alternatives in the vDSO, so that better
> > implementations can be used if possible.
> >
> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > Reviewed-by: Guo Ren <guoren@kernel.org>
> > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
>
> FYI, from this patch onwards the rv32 build is broken.
> Should be reproduceable with the in-tree rv32_defconfig.
>
> Unfortunately no logs for you, I've got a CI bug to fix!
Here's the error:
../arch/riscv/kernel/alternative.c:174:21: error: incompatible pointer types passing 'const struct elf64_hdr *' to parameter of type 'const Elf32_Ehdr *' (aka 'const struct elf32_hdr *') [-Werror,-Wincompatible-pointer-types]
alt = find_section(hdr, shdr, ".alternative");
^~~
../arch/riscv/include/asm/module.h:115:60: note: passing argument to parameter 'hdr' here
static inline const Elf_Shdr *find_section(const Elf_Ehdr *hdr,
^
../arch/riscv/kernel/alternative.c:174:26: error: incompatible pointer types passing 'const struct elf64_shdr *' to parameter of type 'const Elf32_Shdr *' (aka 'const struct elf32_shdr *') [-Werror,-Wincompatible-pointer-types]
alt = find_section(hdr, shdr, ".alternative");
^~~~
../arch/riscv/include/asm/module.h:116:25: note: passing argument to parameter 'sechdrs' here
const Elf_Shdr *sechdrs,
^
../arch/riscv/kernel/alternative.c:174:6: error: incompatible pointer types assigning to 'const struct elf64_shdr *' from 'const Elf32_Shdr *' (aka 'const struct elf32_shdr *') [-Werror,-Wincompatible-pointer-types]
alt = find_section(hdr, shdr, ".alternative");
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 errors generated.
Thanks,
Conor.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 07/13] riscv: fpu: switch has_fpu() to riscv_has_extension_likely()
2023-01-11 17:10 ` [PATCH v3 07/13] riscv: fpu: switch has_fpu() to riscv_has_extension_likely() Jisheng Zhang
@ 2023-01-12 21:58 ` Conor Dooley
0 siblings, 0 replies; 35+ messages in thread
From: Conor Dooley @ 2023-01-12 21:58 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv,
Andrew Jones
[-- Attachment #1: Type: text/plain, Size: 1473 bytes --]
On Thu, Jan 12, 2023 at 01:10:21AM +0800, Jisheng Zhang wrote:
> Switch has_fpu() from statich branch to the new helper
s/statich/static
> riscv_has_extension_likely().
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Modulo whatever psuedo-cpufeature shenanigans require renaming the
function, you can tack my name onto this list also...
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Thanks,
Conor.
> ---
> arch/riscv/include/asm/switch_to.h | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
> index 11463489fec6..60f8ca01d36e 100644
> --- a/arch/riscv/include/asm/switch_to.h
> +++ b/arch/riscv/include/asm/switch_to.h
> @@ -59,7 +59,8 @@ static inline void __switch_to_aux(struct task_struct *prev,
>
> static __always_inline bool has_fpu(void)
> {
> - return static_branch_likely(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_FPU]);
> + return riscv_has_extension_likely(RISCV_ISA_EXT_f) ||
> + riscv_has_extension_likely(RISCV_ISA_EXT_d);
> }
> #else
> static __always_inline bool has_fpu(void) { return false; }
> --
> 2.38.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 11/13] riscv: cpu_relax: switch to riscv_has_extension_likely()
2023-01-11 17:10 ` [PATCH v3 11/13] riscv: cpu_relax: switch to riscv_has_extension_likely() Jisheng Zhang
@ 2023-01-12 21:59 ` Conor Dooley
0 siblings, 0 replies; 35+ messages in thread
From: Conor Dooley @ 2023-01-12 21:59 UTC (permalink / raw)
To: Jisheng Zhang
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv,
Andrew Jones, Guo Ren
[-- Attachment #1: Type: text/plain, Size: 1389 bytes --]
On Thu, Jan 12, 2023 at 01:10:25AM +0800, Jisheng Zhang wrote:
> Switch cpu_relax() from static branch to the new helper
> riscv_has_extension_likely()
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> Reviewed-by: Guo Ren <guoren@kernel.org>
With the same caveat here as with fpu, may as well join the
posse once more...
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
> ---
> arch/riscv/include/asm/vdso/processor.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/include/asm/vdso/processor.h b/arch/riscv/include/asm/vdso/processor.h
> index fa70cfe507aa..edf0e25e43d1 100644
> --- a/arch/riscv/include/asm/vdso/processor.h
> +++ b/arch/riscv/include/asm/vdso/processor.h
> @@ -10,7 +10,7 @@
>
> static inline void cpu_relax(void)
> {
> - if (!static_branch_likely(&riscv_isa_ext_keys[RISCV_ISA_EXT_KEY_ZIHINTPAUSE])) {
> + if (!riscv_has_extension_likely(RISCV_ISA_EXT_ZIHINTPAUSE)) {
> #ifdef __riscv_muldiv
> int dummy;
> /* In lieu of a halt instruction, induce a long-latency stall. */
> --
> 2.38.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-12 9:21 ` Andrew Jones
@ 2023-01-13 15:18 ` Conor Dooley
2023-01-14 20:32 ` Conor Dooley
2023-01-15 13:59 ` Jisheng Zhang
1 sibling, 1 reply; 35+ messages in thread
From: Conor Dooley @ 2023-01-13 15:18 UTC (permalink / raw)
To: Andrew Jones
Cc: Heiko Stübner, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Anup Patel, Atish Patra, Jisheng Zhang, linux-riscv, linux-kernel,
kvm, kvm-riscv
[-- Attachment #1: Type: text/plain, Size: 4544 bytes --]
On Thu, Jan 12, 2023 at 10:21:36AM +0100, Andrew Jones wrote:
> On Thu, Jan 12, 2023 at 12:29:57AM +0100, Heiko Stübner wrote:
> > Am Mittwoch, 11. Januar 2023, 18:10:19 CET schrieb Jisheng Zhang:
> > > riscv_cpufeature_patch_func() currently only scans a limited set of
> > > cpufeatures, explicitly defined with macros. Extend it to probe for all
> > > ISA extensions.
> > >
> > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > > Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> > > ---
> > > arch/riscv/include/asm/errata_list.h | 9 ++--
> > > arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
> > > 2 files changed, 11 insertions(+), 61 deletions(-)
> >
> > hmmm ... I do see a somewhat big caveat for this.
> > and would like to take back my Reviewed-by for now
> >
> >
> > With this change we would limit the patchable cpufeatures to actual
> > riscv extensions. But cpufeatures can also be soft features like
> > how performant the core handles unaligned accesses.
>
> I agree that this needs to be addressed and Jisheng also raised this
> yesterday here [*]. It seems we need the concept of cpufeatures, which
> may be extensions or non-extensions.
>
> [*] https://lore.kernel.org/all/Y77xyNPNqnFQUqAx@xhacker/
>
> > See Palmer's series [0].
> >
> >
> > Also this essentially codifies that each ALTERNATIVE can only ever
> > be attached to exactly one extension.
> >
> > But contrary to vendor-errata, it is very likely that we will need
> > combinations of different extensions for some alternatives in the future.
>
> One possible approach may be to combine extensions/non-extensions at boot
> time into pseudo-cpufeatures. Then, alternatives can continue attaching to
> a single "feature". (I'm not saying that's a better approach than the
> bitmap, I'm just suggesting it as something else to consider.)
> > ALTERNATIVE_2("nop",
> > "j strcmp_zbb_unaligned", 0, CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0, CONFIG_RISCV_ISA_ZBB,
> > "j variant_zbb", 0, CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED, CONFIG_RISCV_ISA_ZBB)
> >
> > [the additional field there models a "not" component]
Since we're discussing theoretical implementations, and it's a little hard
to picture all that they entail in my head, I might be making a fool of
myself here with assumptions...
Heiko's suggestion sounded along the lines of: keep probing individual
"features" as we are now. Features in this case being the presence of
the extension or non-extension capability. And then in the alternative,
make all of the decisions about which to apply.
Drew's suggestion would have significantly more defined CPUFEATURE_FOOs,
but would offload the decision making about which extensions or non-
extension capabilities constitute a feature to regular old cpufeature
code. However, the order of precedence would remain in the alt macro, as
it does now.
I think I am just a wee bit biased, but adding the complexity somewhere
other than alternative macros seems a wise choice, especially as we are
likely to find that complexity increases over time?
The other thing that came to mind, and maybe this is just looking for
holes where they don't exist (or are not worth addressing), is that
order of precedence.
I can imagine that, in some cases, the order of precedence is not a
constant per psuedo-cpufeature, but determined by implementation of
the capabilities that comprise it?
If my assumption/understanding holds, moving decision making out of the
alternative seems like it would better provision for scenarios like
that? I dunno, maybe that is whatever the corollary of "premature
optimisation" is for this discussion.
That's my unsolicited € 0.02, hopefully I wasn't off-base with the
assumptions I made.
Heiko, I figure you've got some sort of WIP stuff for this anyway since
you're interested in the fast unaligned? How close are you to posting any
of that?
While I think of it, w.r.t. extension versus (pseudo)cpufeature etc
naming, it may make sense to call the functions that this series adds
in patch 6 has_cpufeature_{un,}likely(), no matter what decision gets
made here?
IMO using cpufeature seems to make more sense for a general use API that
may be used later on for the likes of unaligned access, even if
initially it is not used for anything other than extensions.
Thanks,
Conor.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-13 15:18 ` Conor Dooley
@ 2023-01-14 20:32 ` Conor Dooley
2023-01-18 21:54 ` Conor Dooley
0 siblings, 1 reply; 35+ messages in thread
From: Conor Dooley @ 2023-01-14 20:32 UTC (permalink / raw)
To: Andrew Jones, Heiko Stübner
Cc: Andrew Jones, Heiko Stübner, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Anup Patel, Atish Patra, Jisheng Zhang, linux-riscv,
linux-kernel, kvm, kvm-riscv
[-- Attachment #1: Type: text/plain, Size: 6557 bytes --]
Hello again!
On Fri, Jan 13, 2023 at 03:18:59PM +0000, Conor Dooley wrote:
> On Thu, Jan 12, 2023 at 10:21:36AM +0100, Andrew Jones wrote:
> > On Thu, Jan 12, 2023 at 12:29:57AM +0100, Heiko Stübner wrote:
> > > Am Mittwoch, 11. Januar 2023, 18:10:19 CET schrieb Jisheng Zhang:
> > > > riscv_cpufeature_patch_func() currently only scans a limited set of
> > > > cpufeatures, explicitly defined with macros. Extend it to probe for all
> > > > ISA extensions.
> > > >
> > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > > > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > > > Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> > > > ---
> > > > arch/riscv/include/asm/errata_list.h | 9 ++--
> > > > arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
> > > > 2 files changed, 11 insertions(+), 61 deletions(-)
> > >
> > > hmmm ... I do see a somewhat big caveat for this.
> > > and would like to take back my Reviewed-by for now
> > >
> > >
> > > With this change we would limit the patchable cpufeatures to actual
> > > riscv extensions. But cpufeatures can also be soft features like
> > > how performant the core handles unaligned accesses.
> >
> > I agree that this needs to be addressed and Jisheng also raised this
> > yesterday here [*]. It seems we need the concept of cpufeatures, which
> > may be extensions or non-extensions.
> >
> > [*] https://lore.kernel.org/all/Y77xyNPNqnFQUqAx@xhacker/
> >
> > > See Palmer's series [0].
> > >
> > >
> > > Also this essentially codifies that each ALTERNATIVE can only ever
> > > be attached to exactly one extension.
> > >
> > > But contrary to vendor-errata, it is very likely that we will need
> > > combinations of different extensions for some alternatives in the future.
> >
> > One possible approach may be to combine extensions/non-extensions at boot
> > time into pseudo-cpufeatures. Then, alternatives can continue attaching to
> > a single "feature". (I'm not saying that's a better approach than the
> > bitmap, I'm just suggesting it as something else to consider.)
>
>
> > > ALTERNATIVE_2("nop",
> > > "j strcmp_zbb_unaligned", 0, CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0, CONFIG_RISCV_ISA_ZBB,
> > > "j variant_zbb", 0, CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED, CONFIG_RISCV_ISA_ZBB)
> > >
> > > [the additional field there models a "not" component]
>
> Since we're discussing theoretical implementations, and it's a little hard
> to picture all that they entail in my head, I might be making a fool of
> myself here with assumptions...
>
> Heiko's suggestion sounded along the lines of: keep probing individual
> "features" as we are now. Features in this case being the presence of
> the extension or non-extension capability. And then in the alternative,
> make all of the decisions about which to apply.
>
> Drew's suggestion would have significantly more defined CPUFEATURE_FOOs,
> but would offload the decision making about which extensions or non-
> extension capabilities constitute a feature to regular old cpufeature
> code. However, the order of precedence would remain in the alt macro, as
> it does now.
>
> I think I am just a wee bit biased, but adding the complexity somewhere
> other than alternative macros seems a wise choice, especially as we are
> likely to find that complexity increases over time?
>
> The other thing that came to mind, and maybe this is just looking for
> holes where they don't exist (or are not worth addressing), is that
> order of precedence.
> I can imagine that, in some cases, the order of precedence is not a
> constant per psuedo-cpufeature, but determined by implementation of
> the capabilities that comprise it?
Having spent longer than I maybe should've looking at your patches
Heiko, given it's a Saturday evening, the precedence stuff is still
sticking out to me..
For Zbb & fast unaligned, the order may be non-controversial, but in
the general case I don't see how it can be true that the order of
precedence for variants is a constant.
Creating pseudo cpufeatures as Drew suggested does seem like a way to
extract complexity from the alternatives themselves (which I think is a
good thing) but at the expense of eating up cpu_req_feature bits...
By itself, it doesn't help with precedence, but it may better allow us
to deal with some of the precedence in cpufeature.c, since the
alternative would operate based on the pseudo cpufeature rather than on
the individual capabilities that the pseudo cpufeature depends on.
I tried to come up with a suggestion for what to do about precedence,
but everything I thought up felt a bit horrific tbh.
The thing that fits the current model best is just allowing cpu vendors
to add, yet more, "errata" that pick the variant that works best for
their implementation... Although I'd be worried about ballooning some of
these ALT_FOO macros out to a massive degree with that sort of approach.
> If my assumption/understanding holds, moving decision making out of the
> alternative seems like it would better provision for scenarios like
> that? I dunno, maybe that is whatever the corollary of "premature
> optimisation" is for this discussion.
>
> That's my unsolicited € 0.02, hopefully I wasn't off-base with the
> assumptions I made.
The order in which an alternative is added to the macro does matter,
right? At least, that's how I thought it worked and hope I've not had
an incorrect interpretation there all along... I wasn't until I started
reading your patch and couldn't understand why you had a construct that
looked like
if (zbb && !fast_unaligned)
...
else if (zbb && fast_unaligned)
...
rather than just inverting the order and dropping the !fast_unaligned
that I realised I might have a gap in my understanding after all..
> Heiko, I figure you've got some sort of WIP stuff for this anyway since
> you're interested in the fast unaligned? How close are you to posting any
> of that?
>
> While I think of it, w.r.t. extension versus (pseudo)cpufeature etc
> naming, it may make sense to call the functions that this series adds
> in patch 6 has_cpufeature_{un,}likely(), no matter what decision gets
> made here?
> IMO using cpufeature seems to make more sense for a general use API that
> may be used later on for the likes of unaligned access, even if
> initially it is not used for anything other than extensions.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 04/13] riscv: hwcap: make ISA extension ids can be used in asm
2023-01-12 21:28 ` Conor Dooley
@ 2023-01-15 13:13 ` Jisheng Zhang
0 siblings, 0 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-15 13:13 UTC (permalink / raw)
To: Conor Dooley
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Heiko Stuebner, linux-riscv, linux-kernel, kvm, kvm-riscv,
Andrew Jones
On Thu, Jan 12, 2023 at 09:28:55PM +0000, Conor Dooley wrote:
> Hey Jisheng,
Hi Conor,
>
> On Thu, Jan 12, 2023 at 01:10:18AM +0800, Jisheng Zhang wrote:
> > We will make use of ISA extension in asm files, so make the multi-letter
> >
> > RISC-V ISA extension IDs macros rather than enums and move them and
> > those base ISA extension IDs to suitable place.
>
> From v2:
> Which base ISA extension IDs? Changelog should match the patch contents,
> and it's a little unclear here since the base ISA extension IDs are
> visible here but in the context not the diff.
"that is not what git thinks you did" is the key, see below.
>
> How about something like:
> "So that ISA extensions can be used in assembly files, convert the
> multi-letter RISC-V ISA extension IDs enums to macros.
> In order to make them visible, move the #ifndef __ASSEMBLY__ guard
> to a later point in the header"
This commit msg looks better, thanks.
>
> Pedantry perhaps, but referring to moving the base IDs looks odd, since
> that is not what git thinks you did - even if that is the copy paste
Aha, this is the key, I moved the base IDs out side of __ASSEMBLY__
macro protection and move extension IDs to macros, but the git doesn't
think I did the base IDs moving ;)
> operation you carried out.
>
> Content itself is
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
>
> Thanks,
> Conor.
>
> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > ---
> > arch/riscv/include/asm/hwcap.h | 45 ++++++++++++++++------------------
> > 1 file changed, 21 insertions(+), 24 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
> > index 86328e3acb02..09a7767723f6 100644
> > --- a/arch/riscv/include/asm/hwcap.h
> > +++ b/arch/riscv/include/asm/hwcap.h
> > @@ -12,20 +12,6 @@
> > #include <linux/bits.h>
> > #include <uapi/asm/hwcap.h>
> >
> > -#ifndef __ASSEMBLY__
> > -#include <linux/jump_label.h>
> > -/*
> > - * This yields a mask that user programs can use to figure out what
> > - * instruction set this cpu supports.
> > - */
> > -#define ELF_HWCAP (elf_hwcap)
> > -
> > -enum {
> > - CAP_HWCAP = 1,
> > -};
> > -
> > -extern unsigned long elf_hwcap;
> > -
> > #define RISCV_ISA_EXT_a ('a' - 'a')
> > #define RISCV_ISA_EXT_c ('c' - 'a')
> > #define RISCV_ISA_EXT_d ('d' - 'a')
> > @@ -46,22 +32,33 @@ extern unsigned long elf_hwcap;
> > #define RISCV_ISA_EXT_BASE 26
> >
> > /*
> > - * This enum represent the logical ID for each multi-letter RISC-V ISA extension.
> > + * These macros represent the logical ID for each multi-letter RISC-V ISA extension.
> > * The logical ID should start from RISCV_ISA_EXT_BASE and must not exceed
> > * RISCV_ISA_EXT_MAX. 0-25 range is reserved for single letter
> > * extensions while all the multi-letter extensions should define the next
> > * available logical extension id.
> > */
> > -enum riscv_isa_ext_id {
> > - RISCV_ISA_EXT_SSCOFPMF = RISCV_ISA_EXT_BASE,
> > - RISCV_ISA_EXT_SVPBMT,
> > - RISCV_ISA_EXT_ZICBOM,
> > - RISCV_ISA_EXT_ZIHINTPAUSE,
> > - RISCV_ISA_EXT_SSTC,
> > - RISCV_ISA_EXT_SVINVAL,
> > - RISCV_ISA_EXT_ID_MAX
> > +#define RISCV_ISA_EXT_SSCOFPMF 26
> > +#define RISCV_ISA_EXT_SVPBMT 27
> > +#define RISCV_ISA_EXT_ZICBOM 28
> > +#define RISCV_ISA_EXT_ZIHINTPAUSE 29
> > +#define RISCV_ISA_EXT_SSTC 30
> > +#define RISCV_ISA_EXT_SVINVAL 31
> > +
> > +#ifndef __ASSEMBLY__
> > +#include <linux/jump_label.h>
> > +/*
> > + * This yields a mask that user programs can use to figure out what
> > + * instruction set this cpu supports.
> > + */
> > +#define ELF_HWCAP (elf_hwcap)
> > +
> > +enum {
> > + CAP_HWCAP = 1,
> > };
> > -static_assert(RISCV_ISA_EXT_ID_MAX <= RISCV_ISA_EXT_MAX);
> > +
> > +extern unsigned long elf_hwcap;
> > +
> >
> > /*
> > * This enum represents the logical ID for each RISC-V ISA extension static
> > --
> > 2.38.1
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-12 9:21 ` Andrew Jones
2023-01-13 15:18 ` Conor Dooley
@ 2023-01-15 13:59 ` Jisheng Zhang
1 sibling, 0 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-15 13:59 UTC (permalink / raw)
To: Andrew Jones
Cc: Heiko Stübner, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Anup Patel, Atish Patra, linux-riscv, linux-kernel, kvm,
kvm-riscv
On Thu, Jan 12, 2023 at 10:21:36AM +0100, Andrew Jones wrote:
> On Thu, Jan 12, 2023 at 12:29:57AM +0100, Heiko Stübner wrote:
> > Hi Jisheng.
> >
> > Am Mittwoch, 11. Januar 2023, 18:10:19 CET schrieb Jisheng Zhang:
> > > riscv_cpufeature_patch_func() currently only scans a limited set of
> > > cpufeatures, explicitly defined with macros. Extend it to probe for all
> > > ISA extensions.
> > >
> > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > > Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> > > ---
> > > arch/riscv/include/asm/errata_list.h | 9 ++--
> > > arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
> > > 2 files changed, 11 insertions(+), 61 deletions(-)
> >
> > hmmm ... I do see a somewhat big caveat for this.
> > and would like to take back my Reviewed-by for now
> >
> >
> > With this change we would limit the patchable cpufeatures to actual
> > riscv extensions. But cpufeatures can also be soft features like
> > how performant the core handles unaligned accesses.
>
> I agree that this needs to be addressed and Jisheng also raised this
> yesterday here [*]. It seems we need the concept of cpufeatures, which
> may be extensions or non-extensions.
>
> [*] https://lore.kernel.org/all/Y77xyNPNqnFQUqAx@xhacker/
>
> >
> > See Palmer's series [0].
> >
> >
> > Also this essentially codifies that each ALTERNATIVE can only ever
> > be attached to exactly one extension.
> >
> > But contrary to vendor-errata, it is very likely that we will need
> > combinations of different extensions for some alternatives in the future.
>
> One possible approach may be to combine extensions/non-extensions at boot
> time into pseudo-cpufeatures. Then, alternatives can continue attaching to
> a single "feature". (I'm not saying that's a better approach than the
> bitmap, I'm just suggesting it as something else to consider.)
When swtiching pgtable_l4_enabled to static key for the first time, I
suggested bitmap for cpufeatures which cover both ISA extensions
and non-extensions-but-some-cpu-related-features [1],
but it was rejected at that time, it seems we need to revisit the idea.
[1] https://lore.kernel.org/linux-riscv/20220508160749.984-1-jszhang@kernel.org/
>
> Thanks,
> drew
>
> >
> > In my optimization quest, I found that it's actually pretty neat to
> > convert the errata-id for cpufeatures to a bitfield [1], because then it's
> > possible to just combine extensions into said bitfield [2]:
> >
> > ALTERNATIVE_2("nop",
> > "j strcmp_zbb_unaligned", 0, CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0, CONFIG_RISCV_ISA_ZBB,
> > "j variant_zbb", 0, CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED, CONFIG_RISCV_ISA_ZBB)
> >
> > [the additional field there models a "not" component]
> >
> > So I really feel this would limit us quite a bit.
> >
> >
> > Heiko
> >
> >
> >
> > [0] https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=riscv-hwprobe-v1&id=510c491cb9d87dcbdc91c63558dc704968723240
> > [1] https://github.com/mmind/linux-riscv/commit/f57a896122ee7e666692079320fc35829434cf96
> > [2] https://github.com/mmind/linux-riscv/commit/8cef615dab0c00ad68af2651ee5b93d06be17f27#diff-194cb8a86f9fb9b03683295f21c8f46b456a9f94737f01726ddbcbb9e3aace2cR12
> >
> >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-11 23:29 ` Heiko Stübner
2023-01-12 9:21 ` Andrew Jones
@ 2023-01-15 14:19 ` Jisheng Zhang
1 sibling, 0 replies; 35+ messages in thread
From: Jisheng Zhang @ 2023-01-15 14:19 UTC (permalink / raw)
To: Heiko Stübner
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
linux-riscv, linux-kernel, kvm, kvm-riscv, Andrew Jones
On Thu, Jan 12, 2023 at 12:29:57AM +0100, Heiko Stübner wrote:
> Hi Jisheng.
Hi Heiko,
>
> Am Mittwoch, 11. Januar 2023, 18:10:19 CET schrieb Jisheng Zhang:
> > riscv_cpufeature_patch_func() currently only scans a limited set of
> > cpufeatures, explicitly defined with macros. Extend it to probe for all
> > ISA extensions.
> >
> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> > ---
> > arch/riscv/include/asm/errata_list.h | 9 ++--
> > arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
> > 2 files changed, 11 insertions(+), 61 deletions(-)
>
> hmmm ... I do see a somewhat big caveat for this.
> and would like to take back my Reviewed-by for now
>
>
> With this change we would limit the patchable cpufeatures to actual
> riscv extensions. But cpufeatures can also be soft features like
> how performant the core handles unaligned accesses.
Besides Drew's comments and my reply a few minutes ago, here are
what I thought: I agree with you about "cpufeatures can also be soft
features" which I called cpu related features, but currently we
don't have that case in urgent, the SV48 and SV57 are extensions now
as Jessica pointed out[1], so I planed to send a v7 to apply the
alternative mechanism for SV48/SV57, and I think we still have time to
revisit the "expanding cpufeatures to cover soft features". But that
need to be addressed in another improvement series.
[1] https://lore.kernel.org/linux-riscv/391AFCB9-D314-4243-9E35-6D95B81C9400@jrtc27.com/
>
> See Palmer's series [0].
>
>
> Also this essentially codifies that each ALTERNATIVE can only ever
> be attached to exactly one extension.
>
> But contrary to vendor-errata, it is very likely that we will need
> combinations of different extensions for some alternatives in the future.
>
> In my optimization quest, I found that it's actually pretty neat to
> convert the errata-id for cpufeatures to a bitfield [1], because then it's
> possible to just combine extensions into said bitfield [2]:
>
> ALTERNATIVE_2("nop",
> "j strcmp_zbb_unaligned", 0, CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0, CONFIG_RISCV_ISA_ZBB,
> "j variant_zbb", 0, CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED, CONFIG_RISCV_ISA_ZBB)
>
> [the additional field there models a "not" component]
>
> So I really feel this would limit us quite a bit.
>
>
> Heiko
>
>
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=riscv-hwprobe-v1&id=510c491cb9d87dcbdc91c63558dc704968723240
> [1] https://github.com/mmind/linux-riscv/commit/f57a896122ee7e666692079320fc35829434cf96
> [2] https://github.com/mmind/linux-riscv/commit/8cef615dab0c00ad68af2651ee5b93d06be17f27#diff-194cb8a86f9fb9b03683295f21c8f46b456a9f94737f01726ddbcbb9e3aace2cR12
>
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-14 20:32 ` Conor Dooley
@ 2023-01-18 21:54 ` Conor Dooley
2023-01-19 8:29 ` Andrew Jones
0 siblings, 1 reply; 35+ messages in thread
From: Conor Dooley @ 2023-01-18 21:54 UTC (permalink / raw)
To: Andrew Jones, Heiko Stübner, Palmer Dabbelt
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Anup Patel, Atish Patra,
Jisheng Zhang, linux-riscv, linux-kernel, kvm, kvm-riscv
[-- Attachment #1: Type: text/plain, Size: 8846 bytes --]
Hey!
I guess here is the right place to follow up on all of this stuff...
On Sat, Jan 14, 2023 at 08:32:37PM +0000, Conor Dooley wrote:
> On Fri, Jan 13, 2023 at 03:18:59PM +0000, Conor Dooley wrote:
> > On Thu, Jan 12, 2023 at 10:21:36AM +0100, Andrew Jones wrote:
> > > On Thu, Jan 12, 2023 at 12:29:57AM +0100, Heiko Stübner wrote:
> > > > Am Mittwoch, 11. Januar 2023, 18:10:19 CET schrieb Jisheng Zhang:
> > > > > riscv_cpufeature_patch_func() currently only scans a limited set of
> > > > > cpufeatures, explicitly defined with macros. Extend it to probe for all
> > > > > ISA extensions.
> > > > >
> > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > > > > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > > > > Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> > > > > ---
> > > > > arch/riscv/include/asm/errata_list.h | 9 ++--
> > > > > arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
> > > > > 2 files changed, 11 insertions(+), 61 deletions(-)
> > > >
> > > > hmmm ... I do see a somewhat big caveat for this.
> > > > and would like to take back my Reviewed-by for now
> > > >
> > > >
> > > > With this change we would limit the patchable cpufeatures to actual
> > > > riscv extensions. But cpufeatures can also be soft features like
> > > > how performant the core handles unaligned accesses.
> > >
> > > I agree that this needs to be addressed and Jisheng also raised this
> > > yesterday here [*]. It seems we need the concept of cpufeatures, which
> > > may be extensions or non-extensions.
> > >
> > > [*] https://lore.kernel.org/all/Y77xyNPNqnFQUqAx@xhacker/
> > >
> > > > See Palmer's series [0].
> > > >
> > > >
> > > > Also this essentially codifies that each ALTERNATIVE can only ever
> > > > be attached to exactly one extension.
> > > >
> > > > But contrary to vendor-errata, it is very likely that we will need
> > > > combinations of different extensions for some alternatives in the future.
> > >
> > > One possible approach may be to combine extensions/non-extensions at boot
> > > time into pseudo-cpufeatures. Then, alternatives can continue attaching to
> > > a single "feature". (I'm not saying that's a better approach than the
> > > bitmap, I'm just suggesting it as something else to consider.)
> >
> >
> > > > ALTERNATIVE_2("nop",
> > > > "j strcmp_zbb_unaligned", 0, CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0, CONFIG_RISCV_ISA_ZBB,
> > > > "j variant_zbb", 0, CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED, CONFIG_RISCV_ISA_ZBB)
> > > >
> > > > [the additional field there models a "not" component]
> >
> > Since we're discussing theoretical implementations, and it's a little hard
> > to picture all that they entail in my head, I might be making a fool of
> > myself here with assumptions...
> >
> > Heiko's suggestion sounded along the lines of: keep probing individual
> > "features" as we are now. Features in this case being the presence of
> > the extension or non-extension capability. And then in the alternative,
> > make all of the decisions about which to apply.
> >
> > Drew's suggestion would have significantly more defined CPUFEATURE_FOOs,
> > but would offload the decision making about which extensions or non-
> > extension capabilities constitute a feature to regular old cpufeature
> > code. However, the order of precedence would remain in the alt macro, as
> > it does now.
> >
> > I think I am just a wee bit biased, but adding the complexity somewhere
> > other than alternative macros seems a wise choice, especially as we are
> > likely to find that complexity increases over time?
> >
> > The other thing that came to mind, and maybe this is just looking for
> > holes where they don't exist (or are not worth addressing), is that
> > order of precedence.
> > I can imagine that, in some cases, the order of precedence is not a
> > constant per psuedo-cpufeature, but determined by implementation of
> > the capabilities that comprise it?
>
> Having spent longer than I maybe should've looking at your patches
> Heiko, given it's a Saturday evening, the precedence stuff is still
> sticking out to me..
>
> For Zbb & fast unaligned, the order may be non-controversial, but in
> the general case I don't see how it can be true that the order of
> precedence for variants is a constant.
>
> Creating pseudo cpufeatures as Drew suggested does seem like a way to
> extract complexity from the alternatives themselves (which I think is a
> good thing) but at the expense of eating up cpu_req_feature bits...
> By itself, it doesn't help with precedence, but it may better allow us
> to deal with some of the precedence in cpufeature.c, since the
> alternative would operate based on the pseudo cpufeature rather than on
> the individual capabilities that the pseudo cpufeature depends on.
>
> I tried to come up with a suggestion for what to do about precedence,
> but everything I thought up felt a bit horrific tbh.
> The thing that fits the current model best is just allowing cpu vendors
> to add, yet more, "errata" that pick the variant that works best for
> their implementation... Although I'd be worried about ballooning some of
> these ALT_FOO macros out to a massive degree with that sort of approach.
>
> > If my assumption/understanding holds, moving decision making out of the
> > alternative seems like it would better provision for scenarios like
> > that? I dunno, maybe that is whatever the corollary of "premature
> > optimisation" is for this discussion.
> >
> > That's my unsolicited € 0.02, hopefully I wasn't off-base with the
> > assumptions I made.
>
> The order in which an alternative is added to the macro does matter,
> right? At least, that's how I thought it worked and hope I've not had
> an incorrect interpretation there all along... I wasn't until I started
> reading your patch and couldn't understand why you had a construct that
> looked like
>
> if (zbb && !fast_unaligned)
> ...
> else if (zbb && fast_unaligned)
> ...
>
> rather than just inverting the order and dropping the !fast_unaligned
> that I realised I might have a gap in my understanding after all..
>
> > Heiko, I figure you've got some sort of WIP stuff for this anyway since
> > you're interested in the fast unaligned? How close are you to posting any
> > of that?
> >
> > While I think of it, w.r.t. extension versus (pseudo)cpufeature etc
> > naming, it may make sense to call the functions that this series adds
> > in patch 6 has_cpufeature_{un,}likely(), no matter what decision gets
> > made here?
> > IMO using cpufeature seems to make more sense for a general use API that
> > may be used later on for the likes of unaligned access, even if
> > initially it is not used for anything other than extensions.
Today at [1] we talked a bit about the various bits going on here.
I'll attempt to summarise what I remember, but I meant to do this
several hours ago and am likely to make a hames of it.
For Zbb/unaligned stuff, the sentiment was along the lines of there
needing to be a performance benefit to justify the inclusion.
Some of us have HW that is (allegedly) capable of Zbb, and, if that's the
case, will give it a go.
I think it was similar for unaligned, since there is nothing yet that
supports this behaviour, we should wait until a benefit is demonstrable.
On the subject of grouping extension/non-extension capabilities into a
single cpufeature, Palmer mentioned that GCC does something similar,
for the likes of the Ventana vendor extensions, that are unlikely to be
present in isolation.
Those are (or were?) probed as a group of extensions rather than
individually.
I think it was said it'd make sense for us to unify extensions that will
only ever appear together single psuedo cpufeature too.
For the bitfield approach versus creating pseudo cpufeatures discussion
& how to deal with that in alternatives etc, I'm a bit less sure what the
outcome was.
IIRC, nothing concrete was said about either approach, but maybe it was
implied that we should do as GCC does, only grouping things that won't
ever been seen apart.
Figuring that out seems to have been punted down the road, as the
inclusion of our only current example of this (Zbb + unaligned) is
dependant on hardware showing up that actually benefits from it.
The plan then seemed to be press ahead with this series & test the
benefits of the Zbb str* functions in Zbb capable hardware before making
a decision there.
Hopefully I wasn't too far off with that summary...
Thanks,
Conor.
1 - https://lore.kernel.org/linux-riscv/mhng-775d4068-6c1e-48a4-a1dc-b4a76ff26bb3@palmer-ri-x1c9a/
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-18 21:54 ` Conor Dooley
@ 2023-01-19 8:29 ` Andrew Jones
2023-01-19 22:13 ` Conor Dooley
0 siblings, 1 reply; 35+ messages in thread
From: Andrew Jones @ 2023-01-19 8:29 UTC (permalink / raw)
To: Conor Dooley
Cc: Heiko Stübner, Palmer Dabbelt, Paul Walmsley, Albert Ou,
Anup Patel, Atish Patra, Jisheng Zhang, linux-riscv, linux-kernel,
kvm, kvm-riscv
On Wed, Jan 18, 2023 at 09:54:46PM +0000, Conor Dooley wrote:
> Hey!
>
> I guess here is the right place to follow up on all of this stuff...
>
> On Sat, Jan 14, 2023 at 08:32:37PM +0000, Conor Dooley wrote:
> > On Fri, Jan 13, 2023 at 03:18:59PM +0000, Conor Dooley wrote:
> > > On Thu, Jan 12, 2023 at 10:21:36AM +0100, Andrew Jones wrote:
> > > > On Thu, Jan 12, 2023 at 12:29:57AM +0100, Heiko Stübner wrote:
> > > > > Am Mittwoch, 11. Januar 2023, 18:10:19 CET schrieb Jisheng Zhang:
> > > > > > riscv_cpufeature_patch_func() currently only scans a limited set of
> > > > > > cpufeatures, explicitly defined with macros. Extend it to probe for all
> > > > > > ISA extensions.
> > > > > >
> > > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > > > > > Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
> > > > > > Reviewed-by: Heiko Stuebner <heiko@sntech.de>
> > > > > > ---
> > > > > > arch/riscv/include/asm/errata_list.h | 9 ++--
> > > > > > arch/riscv/kernel/cpufeature.c | 63 ++++------------------------
> > > > > > 2 files changed, 11 insertions(+), 61 deletions(-)
> > > > >
> > > > > hmmm ... I do see a somewhat big caveat for this.
> > > > > and would like to take back my Reviewed-by for now
> > > > >
> > > > >
> > > > > With this change we would limit the patchable cpufeatures to actual
> > > > > riscv extensions. But cpufeatures can also be soft features like
> > > > > how performant the core handles unaligned accesses.
> > > >
> > > > I agree that this needs to be addressed and Jisheng also raised this
> > > > yesterday here [*]. It seems we need the concept of cpufeatures, which
> > > > may be extensions or non-extensions.
> > > >
> > > > [*] https://lore.kernel.org/all/Y77xyNPNqnFQUqAx@xhacker/
> > > >
> > > > > See Palmer's series [0].
> > > > >
> > > > >
> > > > > Also this essentially codifies that each ALTERNATIVE can only ever
> > > > > be attached to exactly one extension.
> > > > >
> > > > > But contrary to vendor-errata, it is very likely that we will need
> > > > > combinations of different extensions for some alternatives in the future.
> > > >
> > > > One possible approach may be to combine extensions/non-extensions at boot
> > > > time into pseudo-cpufeatures. Then, alternatives can continue attaching to
> > > > a single "feature". (I'm not saying that's a better approach than the
> > > > bitmap, I'm just suggesting it as something else to consider.)
> > >
> > >
> > > > > ALTERNATIVE_2("nop",
> > > > > "j strcmp_zbb_unaligned", 0, CPUFEATURE_ZBB | CPUFEATURE_FAST_UNALIGNED, 0, CONFIG_RISCV_ISA_ZBB,
> > > > > "j variant_zbb", 0, CPUFEATURE_ZBB, CPUFEATURE_FAST_UNALIGNED, CONFIG_RISCV_ISA_ZBB)
> > > > >
> > > > > [the additional field there models a "not" component]
> > >
> > > Since we're discussing theoretical implementations, and it's a little hard
> > > to picture all that they entail in my head, I might be making a fool of
> > > myself here with assumptions...
> > >
> > > Heiko's suggestion sounded along the lines of: keep probing individual
> > > "features" as we are now. Features in this case being the presence of
> > > the extension or non-extension capability. And then in the alternative,
> > > make all of the decisions about which to apply.
> > >
> > > Drew's suggestion would have significantly more defined CPUFEATURE_FOOs,
> > > but would offload the decision making about which extensions or non-
> > > extension capabilities constitute a feature to regular old cpufeature
> > > code. However, the order of precedence would remain in the alt macro, as
> > > it does now.
> > >
> > > I think I am just a wee bit biased, but adding the complexity somewhere
> > > other than alternative macros seems a wise choice, especially as we are
> > > likely to find that complexity increases over time?
> > >
> > > The other thing that came to mind, and maybe this is just looking for
> > > holes where they don't exist (or are not worth addressing), is that
> > > order of precedence.
> > > I can imagine that, in some cases, the order of precedence is not a
> > > constant per psuedo-cpufeature, but determined by implementation of
> > > the capabilities that comprise it?
> >
> > Having spent longer than I maybe should've looking at your patches
> > Heiko, given it's a Saturday evening, the precedence stuff is still
> > sticking out to me..
> >
> > For Zbb & fast unaligned, the order may be non-controversial, but in
> > the general case I don't see how it can be true that the order of
> > precedence for variants is a constant.
> >
> > Creating pseudo cpufeatures as Drew suggested does seem like a way to
> > extract complexity from the alternatives themselves (which I think is a
> > good thing) but at the expense of eating up cpu_req_feature bits...
> > By itself, it doesn't help with precedence, but it may better allow us
> > to deal with some of the precedence in cpufeature.c, since the
> > alternative would operate based on the pseudo cpufeature rather than on
> > the individual capabilities that the pseudo cpufeature depends on.
> >
> > I tried to come up with a suggestion for what to do about precedence,
> > but everything I thought up felt a bit horrific tbh.
> > The thing that fits the current model best is just allowing cpu vendors
> > to add, yet more, "errata" that pick the variant that works best for
> > their implementation... Although I'd be worried about ballooning some of
> > these ALT_FOO macros out to a massive degree with that sort of approach.
> >
> > > If my assumption/understanding holds, moving decision making out of the
> > > alternative seems like it would better provision for scenarios like
> > > that? I dunno, maybe that is whatever the corollary of "premature
> > > optimisation" is for this discussion.
> > >
> > > That's my unsolicited € 0.02, hopefully I wasn't off-base with the
> > > assumptions I made.
> >
> > The order in which an alternative is added to the macro does matter,
> > right? At least, that's how I thought it worked and hope I've not had
> > an incorrect interpretation there all along... I wasn't until I started
> > reading your patch and couldn't understand why you had a construct that
> > looked like
> >
> > if (zbb && !fast_unaligned)
> > ...
> > else if (zbb && fast_unaligned)
> > ...
> >
> > rather than just inverting the order and dropping the !fast_unaligned
> > that I realised I might have a gap in my understanding after all..
> >
> > > Heiko, I figure you've got some sort of WIP stuff for this anyway since
> > > you're interested in the fast unaligned? How close are you to posting any
> > > of that?
> > >
> > > While I think of it, w.r.t. extension versus (pseudo)cpufeature etc
> > > naming, it may make sense to call the functions that this series adds
> > > in patch 6 has_cpufeature_{un,}likely(), no matter what decision gets
> > > made here?
> > > IMO using cpufeature seems to make more sense for a general use API that
> > > may be used later on for the likes of unaligned access, even if
> > > initially it is not used for anything other than extensions.
>
> Today at [1] we talked a bit about the various bits going on here.
> I'll attempt to summarise what I remember, but I meant to do this
> several hours ago and am likely to make a hames of it.
>
> For Zbb/unaligned stuff, the sentiment was along the lines of there
> needing to be a performance benefit to justify the inclusion.
> Some of us have HW that is (allegedly) capable of Zbb, and, if that's the
> case, will give it a go.
> I think it was similar for unaligned, since there is nothing yet that
> supports this behaviour, we should wait until a benefit is demonstrable.
>
> On the subject of grouping extension/non-extension capabilities into a
> single cpufeature, Palmer mentioned that GCC does something similar,
> for the likes of the Ventana vendor extensions, that are unlikely to be
> present in isolation.
> Those are (or were?) probed as a group of extensions rather than
> individually.
> I think it was said it'd make sense for us to unify extensions that will
> only ever appear together single psuedo cpufeature too.
>
> For the bitfield approach versus creating pseudo cpufeatures discussion
> & how to deal with that in alternatives etc, I'm a bit less sure what the
> outcome was.
> IIRC, nothing concrete was said about either approach, but maybe it was
> implied that we should do as GCC does, only grouping things that won't
> ever been seen apart.
> Figuring that out seems to have been punted down the road, as the
> inclusion of our only current example of this (Zbb + unaligned) is
> dependant on hardware showing up that actually benefits from it.
>
> The plan then seemed to be press ahead with this series & test the
> benefits of the Zbb str* functions in Zbb capable hardware before making
> a decision there.
>
> Hopefully I wasn't too far off with that summary...
This matches my recollection. Thanks for the summary, Conor.
drew
>
> Thanks,
> Conor.
>
> 1 - https://lore.kernel.org/linux-riscv/mhng-775d4068-6c1e-48a4-a1dc-b4a76ff26bb3@palmer-ri-x1c9a/
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions
2023-01-19 8:29 ` Andrew Jones
@ 2023-01-19 22:13 ` Conor Dooley
0 siblings, 0 replies; 35+ messages in thread
From: Conor Dooley @ 2023-01-19 22:13 UTC (permalink / raw)
To: Andrew Jones, Heiko Stübner, Palmer Dabbelt, kernel, bjorn
Cc: Heiko Stübner, Palmer Dabbelt, Paul Walmsley, Albert Ou,
Anup Patel, Atish Patra, Jisheng Zhang, linux-riscv, linux-kernel,
kvm, kvm-riscv, David Abdurachmanov
[-- Attachment #1: Type: text/plain, Size: 3039 bytes --]
Me again!
On Thu, Jan 19, 2023 at 09:29:03AM +0100, Andrew Jones wrote:
> On Wed, Jan 18, 2023 at 09:54:46PM +0000, Conor Dooley wrote:
> > Hey!
> >
> > I guess here is the right place to follow up on all of this stuff...
> >
> > On Sat, Jan 14, 2023 at 08:32:37PM +0000, Conor Dooley wrote:
> > Today at [1] we talked a bit about the various bits going on here.
> > I'll attempt to summarise what I remember, but I meant to do this
> > several hours ago and am likely to make a hames of it.
> >
> > For Zbb/unaligned stuff, the sentiment was along the lines of there
> > needing to be a performance benefit to justify the inclusion.
> > Some of us have HW that is (allegedly) capable of Zbb, and, if that's the
I did some very very basic testing today. Ethernet is still a no-go on
my visionfive 2 board, but the sd card works at least, so I can run w/
Zbb code people want & we can see how it goes!
At the very least, it is capable of executing the instructions that were
used in Appendix A. I didn't try to do anything else, because I am lazy
and if there were some pre-existing test programs I didn't want to go
and write out a bunch of asm myself!
impid appears to be 0x4210427, if that means anything to anyone!
> > case, will give it a go.
> > I think it was similar for unaligned, since there is nothing yet that
> > supports this behaviour, we should wait until a benefit is demonstrable.
> >
> > On the subject of grouping extension/non-extension capabilities into a
> > single cpufeature, Palmer mentioned that GCC does something similar,
> > for the likes of the Ventana vendor extensions, that are unlikely to be
> > present in isolation.
Jess pointed out on IRC that GCC doesn't support XVentanaCondOps
so maybe there was a mixup there. I don't think that really matters
though, as the point stands regardless of whether it was in GCC or not.
> > Those are (or were?) probed as a group of extensions rather than
> > individually.
> > I think it was said it'd make sense for us to unify extensions that will
> > only ever appear together single psuedo cpufeature too.
> >
> > For the bitfield approach versus creating pseudo cpufeatures discussion
> > & how to deal with that in alternatives etc, I'm a bit less sure what the
> > outcome was.
> > IIRC, nothing concrete was said about either approach, but maybe it was
> > implied that we should do as GCC does, only grouping things that won't
> > ever been seen apart.
> > Figuring that out seems to have been punted down the road, as the
> > inclusion of our only current example of this (Zbb + unaligned) is
> > dependant on hardware showing up that actually benefits from it.
> >
> > The plan then seemed to be press ahead with this series & test the
> > benefits of the Zbb str* functions in Zbb capable hardware before making
> > a decision there.
> >
> > Hopefully I wasn't too far off with that summary...
>
> This matches my recollection. Thanks for the summary, Conor.
Cool, thanks.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2023-01-19 22:30 UTC | newest]
Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-01-11 17:10 [PATCH v3 00/13] riscv: improve boot time isa extensions handling Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 01/13] riscv: fix jal offsets in patched alternatives Jisheng Zhang
2023-01-11 17:56 ` Andrew Jones
2023-01-11 23:31 ` Heiko Stübner
2023-01-12 20:25 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 02/13] riscv: move riscv_noncoherent_supported() out of ZICBOM probe Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 03/13] riscv: cpufeature: detect RISCV_ALTERNATIVES_EARLY_BOOT earlier Jisheng Zhang
2023-01-12 21:11 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 04/13] riscv: hwcap: make ISA extension ids can be used in asm Jisheng Zhang
2023-01-12 21:28 ` Conor Dooley
2023-01-15 13:13 ` Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 05/13] riscv: cpufeature: extend riscv_cpufeature_patch_func to all ISA extensions Jisheng Zhang
2023-01-11 23:29 ` Heiko Stübner
2023-01-12 9:21 ` Andrew Jones
2023-01-13 15:18 ` Conor Dooley
2023-01-14 20:32 ` Conor Dooley
2023-01-18 21:54 ` Conor Dooley
2023-01-19 8:29 ` Andrew Jones
2023-01-19 22:13 ` Conor Dooley
2023-01-15 13:59 ` Jisheng Zhang
2023-01-15 14:19 ` Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 06/13] riscv: introduce riscv_has_extension_[un]likely() Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 07/13] riscv: fpu: switch has_fpu() to riscv_has_extension_likely() Jisheng Zhang
2023-01-12 21:58 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 08/13] riscv: module: move find_section to module.h Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 09/13] riscv: switch to relative alternative entries Jisheng Zhang
2023-01-11 18:11 ` Andrew Jones
2023-01-12 21:49 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 10/13] riscv: alternative: patch alternatives in the vDSO Jisheng Zhang
2023-01-12 7:48 ` Conor Dooley
2023-01-12 21:55 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 11/13] riscv: cpu_relax: switch to riscv_has_extension_likely() Jisheng Zhang
2023-01-12 21:59 ` Conor Dooley
2023-01-11 17:10 ` [PATCH v3 12/13] riscv: KVM: Switch has_svinval() to riscv_has_extension_unlikely() Jisheng Zhang
2023-01-11 17:10 ` [PATCH v3 13/13] riscv: remove riscv_isa_ext_keys[] array and related usage Jisheng Zhang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).