devicetree.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page
@ 2023-02-21 19:09 Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 1/8] RISC-V: alternatives: Support patching multiple insns in assembly Andrew Jones
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra '

When the Zicboz extension is available we can more rapidly zero naturally
aligned Zicboz block sized chunks of memory. As pages are always page
aligned and are larger than any Zicboz block size will be, then
clear_page() appears to be a good candidate for the extension. While cycle
count and energy consumption should also be considered, we can be pretty
certain that implementing clear_page() with the Zicboz extension is a win
by comparing the new dynamic instruction count with its current count[1].
Doing so we see that the new count is just over a quarter of the old count
(see patch6's commit message for more details).

For those of you who reviewed v1[2], you may be looking for the memset()
patches. As pointed out in v1, and a couple follow-up emails, it's not
clear that patching memset() is a win yet. When I get a chance to test
on real hardware with a comprehensive benchmark collection then I can
post the memset() patches separately (assuming the benchmarks show it's
worthwhile).

Based on riscv-linux/for-next plus the dependencies listed below.

Dependencies:
https://lore.kernel.org/linux-riscv/20230221185603.570882-1-ajones@ventanamicro.com/

The patches are also available here
https://github.com/jones-drew/linux/commits/riscv/zicboz-v5

To test over QEMU this branch may be used to enable Zicboz
https://gitlab.com/jones-drew/qemu/-/commits/riscv/zicboz

To test running a KVM guest with Zicboz this kvmtool branch may be used
https://github.com/jones-drew/kvmtool/commits/riscv/zicboz

Thanks,
drew

[1] I ported the functions under test to userspace and linked them with
    a test program. Then, I ran them under gdb with a script[3] which
    counted instructions by single stepping.
[2] https://lore.kernel.org/all/20221027130247.31634-1-ajones@ventanamicro.com/
[3] https://gist.github.com/jones-drew/487791c956ceca8c18adc2847eec9c60

v5:
  - Rebased on latest for-next which allowed dropping v4's dependencies
  - Added a new dependency on an alternatives/cpufeatures cleanup series
  - Replaced "RISC-V: cpufeature: Put vendor_id to work" with "riscv:
    cpufeatures: Put the upper 16 bits of patch ID to work" since touching
    vendor_id is probably a bad idea [Conor]
  - Picked up an r-b from Conor

v4:
  - Rebased on latest for-next which allowed dropping one dependency
  - Added "RISC-V: alternatives: Support patching multiple insns in assembly"
    since I needed to use more than one instruction in an ALTERNATIVE call
    from assembly. I can post this patch separately as a fix if desired.
  - Improved the dt-binding patch commit message [Conor]
  - Picked up some tags from Conor and Rob (I kept Conor's a-b on the
    clear_page patch, even though there are several changes to it, because
    I interpreted the a-b as "OK by me to implement a Zicboz clear_page")
  - Added "RISC-V: cpufeature: Put vendor_id to work", which was necessary
    for how I wanted to use ALTERNATIVE in clear_page.
  - Fallback to memset when the Zicboz block size is too big for the
    Zicboz implementation of clear_page [Palmer]
  - Use lw without la in clear_page impl [Palmer]
  - Implement a Duff device for clear_page using the new 'vendor_id'
    alternatives support [drew]

v3:
  - CC'ed DT list
  - Improved commit message of DT bindings patch to point out relationship
    with cbom-block-size
  - Picked up an a-b from Conor

v2:
  - s/blksz/block_size/, improved commit message for "RISC-V: Add Zicboz
    detection and block size parsing", isa ext sorting [Conor]
  - Added dt binding patch [Heiko]
  - Picked up r-b's from Conor, Heiko, and Anup
  - Moved config symbol and CBO_zero() introduction to "RISC-V: Use Zicboz
    in clear_page when available" and improved its commit message and
    implementation (unrolled four times) [drew]
  - Dropped memset() patches [drew]
  - Rebased on ae4d39f75308 ("Merge patch "RISC-V: fix incorrect type of
    ARCH_CANAAN_K210_DTB_SOURCE"") plus the dependencies

Andrew Jones (8):
  RISC-V: alternatives: Support patching multiple insns in assembly
  RISC-V: Factor out body of riscv_init_cbom_blocksize loop
  dt-bindings: riscv: Document cboz-block-size
  RISC-V: Add Zicboz detection and block size parsing
  riscv: cpufeatures: Put the upper 16 bits of patch ID to work
  RISC-V: Use Zicboz in clear_page when available
  RISC-V: KVM: Provide UAPI for Zicboz block size
  RISC-V: KVM: Expose Zicboz to the guest

 .../devicetree/bindings/riscv/cpus.yaml       |  5 ++
 arch/riscv/Kconfig                            | 13 ++++
 arch/riscv/include/asm/alternative-macros.h   |  6 +-
 arch/riscv/include/asm/alternative.h          |  3 +
 arch/riscv/include/asm/cacheflush.h           |  3 +-
 arch/riscv/include/asm/hwcap.h                |  1 +
 arch/riscv/include/asm/insn-def.h             |  4 +
 arch/riscv/include/asm/page.h                 |  6 +-
 arch/riscv/include/uapi/asm/kvm.h             |  2 +
 arch/riscv/kernel/cpu.c                       |  1 +
 arch/riscv/kernel/cpufeature.c                | 58 ++++++++++++++-
 arch/riscv/kernel/setup.c                     |  2 +-
 arch/riscv/kvm/vcpu.c                         | 11 +++
 arch/riscv/lib/Makefile                       |  1 +
 arch/riscv/lib/clear_page.S                   | 73 +++++++++++++++++++
 arch/riscv/mm/cacheflush.c                    | 64 +++++++++-------
 16 files changed, 217 insertions(+), 36 deletions(-)
 create mode 100644 arch/riscv/lib/clear_page.S

-- 
2.39.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v5 1/8] RISC-V: alternatives: Support patching multiple insns in assembly
  2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
@ 2023-02-21 19:09 ` Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop Andrew Jones
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra '

As pointed out in commit d374a16539b1 ("RISC-V: fix compile error
from deduplicated __ALTERNATIVE_CFG_2"), we need quotes around
parameters passed to macros within macros to avoid spaces being
interpreted as separators. ALT_NEW_CONTENT was trying to handle
this by defining new_c has a vararg, but this isn't sufficient
for calling ALTERNATIVE() from assembly with multiple instructions
in the new/old sequences. Remove the vararg "hack" and use quotes.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/include/asm/alternative-macros.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
index 993a44a8fdac..b8c55fb3ab2c 100644
--- a/arch/riscv/include/asm/alternative-macros.h
+++ b/arch/riscv/include/asm/alternative-macros.h
@@ -14,7 +14,7 @@
 	.4byte \patch_id
 .endm
 
-.macro ALT_NEW_CONTENT vendor_id, patch_id, enable = 1, new_c : vararg
+.macro ALT_NEW_CONTENT vendor_id, patch_id, enable = 1, new_c
 	.if \enable
 	.pushsection .alternative, "a"
 	ALT_ENTRY 886b, 888f, \vendor_id, \patch_id, 889f - 888f
@@ -41,13 +41,13 @@
 	\old_c
 	.option pop
 887 :
-	ALT_NEW_CONTENT \vendor_id, \patch_id, \enable, \new_c
+	ALT_NEW_CONTENT \vendor_id, \patch_id, \enable, "\new_c"
 .endm
 
 .macro ALTERNATIVE_CFG_2 old_c, new_c_1, vendor_id_1, patch_id_1, enable_1,	\
 				new_c_2, vendor_id_2, patch_id_2, enable_2
 	ALTERNATIVE_CFG "\old_c", "\new_c_1", \vendor_id_1, \patch_id_1, \enable_1
-	ALT_NEW_CONTENT \vendor_id_2, \patch_id_2, \enable_2, \new_c_2
+	ALT_NEW_CONTENT \vendor_id_2, \patch_id_2, \enable_2, "\new_c_2"
 .endm
 
 #define __ALTERNATIVE_CFG(...)		ALTERNATIVE_CFG __VA_ARGS__
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop
  2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 1/8] RISC-V: alternatives: Support patching multiple insns in assembly Andrew Jones
@ 2023-02-21 19:09 ` Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 3/8] dt-bindings: riscv: Document cboz-block-size Andrew Jones
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra '

Refactor riscv_init_cbom_blocksize() to prepare for it to be used
for both cbom block size and cboz block size.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/mm/cacheflush.c | 45 +++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 3cc07ed45aeb..eaf23fc14966 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -98,34 +98,39 @@ void flush_icache_pte(pte_t pte)
 unsigned int riscv_cbom_block_size;
 EXPORT_SYMBOL_GPL(riscv_cbom_block_size);
 
+static void cbo_get_block_size(struct device_node *node,
+			       const char *name, u32 *block_size,
+			       unsigned long *first_hartid)
+{
+	unsigned long hartid;
+	u32 val;
+
+	if (riscv_of_processor_hartid(node, &hartid))
+		return;
+
+	if (of_property_read_u32(node, name, &val))
+		return;
+
+	if (!*block_size) {
+		*block_size = val;
+		*first_hartid = hartid;
+	} else if (*block_size != val) {
+		pr_warn("%s mismatched between harts %lu and %lu\n",
+			name, *first_hartid, hartid);
+	}
+}
+
 void riscv_init_cbom_blocksize(void)
 {
 	struct device_node *node;
 	unsigned long cbom_hartid;
-	u32 val, probed_block_size;
-	int ret;
+	u32 probed_block_size;
 
 	probed_block_size = 0;
 	for_each_of_cpu_node(node) {
-		unsigned long hartid;
-
-		ret = riscv_of_processor_hartid(node, &hartid);
-		if (ret)
-			continue;
-
 		/* set block-size for cbom extension if available */
-		ret = of_property_read_u32(node, "riscv,cbom-block-size", &val);
-		if (ret)
-			continue;
-
-		if (!probed_block_size) {
-			probed_block_size = val;
-			cbom_hartid = hartid;
-		} else {
-			if (probed_block_size != val)
-				pr_warn("cbom-block-size mismatched between harts %lu and %lu\n",
-					cbom_hartid, hartid);
-		}
+		cbo_get_block_size(node, "riscv,cbom-block-size",
+				   &probed_block_size, &cbom_hartid);
 	}
 
 	if (probed_block_size)
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 3/8] dt-bindings: riscv: Document cboz-block-size
  2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 1/8] RISC-V: alternatives: Support patching multiple insns in assembly Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop Andrew Jones
@ 2023-02-21 19:09 ` Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 4/8] RISC-V: Add Zicboz detection and block size parsing Andrew Jones
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra '

The Zicboz operation (cbo.zero) operates on a block-size defined
for the cpu-core. While we already have the riscv,cbom-block-size
property, it only provides the block size for Zicbom operations.
Even though it's likely Zicboz and Zicbom will use the same size,
that's not required by the specification. Create another property
specifically for Zicboz.

Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Rob Herring <robh@kernel.org>
---
 Documentation/devicetree/bindings/riscv/cpus.yaml | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
index 001931d526ec..f24cf9601c6e 100644
--- a/Documentation/devicetree/bindings/riscv/cpus.yaml
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -72,6 +72,11 @@ properties:
     description:
       The blocksize in bytes for the Zicbom cache operations.
 
+  riscv,cboz-block-size:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description:
+      The blocksize in bytes for the Zicboz cache operations.
+
   riscv,isa:
     description:
       Identifies the specific RISC-V instruction set architecture
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 4/8] RISC-V: Add Zicboz detection and block size parsing
  2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (2 preceding siblings ...)
  2023-02-21 19:09 ` [PATCH v5 3/8] dt-bindings: riscv: Document cboz-block-size Andrew Jones
@ 2023-02-21 19:09 ` Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 5/8] riscv: cpufeatures: Put the upper 16 bits of patch ID to work Andrew Jones
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra '

Parse "riscv,cboz-block-size" from the DT by piggybacking on Zicbom's
riscv_init_cbom_blocksize(). Additionally check the DT for the presence
of the "zicboz" extension and, when it's present, validate the parsed
cboz block size as we do Zicbom's cbom block size with
riscv_isa_extension_check().

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/include/asm/cacheflush.h |  3 ++-
 arch/riscv/include/asm/hwcap.h      |  1 +
 arch/riscv/kernel/cpu.c             |  1 +
 arch/riscv/kernel/cpufeature.c      | 10 ++++++++++
 arch/riscv/kernel/setup.c           |  2 +-
 arch/riscv/mm/cacheflush.c          | 23 +++++++++++++++--------
 6 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
index 03e3b95ae6da..8091b8bf4883 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -50,7 +50,8 @@ void flush_icache_mm(struct mm_struct *mm, bool local);
 #endif /* CONFIG_SMP */
 
 extern unsigned int riscv_cbom_block_size;
-void riscv_init_cbom_blocksize(void);
+extern unsigned int riscv_cboz_block_size;
+void riscv_init_cbo_blocksizes(void);
 
 #ifdef CONFIG_RISCV_DMA_NONCOHERENT
 void riscv_noncoherent_supported(void);
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index ee9c80fe0062..bd025e918a08 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -47,6 +47,7 @@
 #define RISCV_ISA_EXT_ZBB              30
 #define RISCV_ISA_EXT_ZICBOM           31
 #define RISCV_ISA_EXT_ZIHINTPAUSE      32
+#define RISCV_ISA_EXT_ZICBOZ           33
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 420228e219f7..7a3065544dcb 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -187,6 +187,7 @@ arch_initcall(riscv_cpuinfo_init);
 static struct riscv_isa_ext_data isa_ext_arr[] = {
 	__RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB),
 	__RISCV_ISA_EXT_DATA(zicbom, RISCV_ISA_EXT_ZICBOM),
+	__RISCV_ISA_EXT_DATA(zicboz, RISCV_ISA_EXT_ZICBOZ),
 	__RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE),
 	__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
 	__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 151e1a715db9..6102b6bb5db3 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -73,6 +73,15 @@ static bool riscv_isa_extension_check(int id)
 			return false;
 		}
 		return true;
+	case RISCV_ISA_EXT_ZICBOZ:
+		if (!riscv_cboz_block_size) {
+			pr_err("Zicboz detected in ISA string, but no cboz-block-size found\n");
+			return false;
+		} else if (!is_power_of_2(riscv_cboz_block_size)) {
+			pr_err("cboz-block-size present, but is not a power-of-2\n");
+			return false;
+		}
+		return true;
 	}
 
 	return true;
@@ -221,6 +230,7 @@ void __init riscv_fill_hwcap(void)
 				SET_ISA_EXT_MAP("svpbmt", RISCV_ISA_EXT_SVPBMT);
 				SET_ISA_EXT_MAP("zbb", RISCV_ISA_EXT_ZBB);
 				SET_ISA_EXT_MAP("zicbom", RISCV_ISA_EXT_ZICBOM);
+				SET_ISA_EXT_MAP("zicboz", RISCV_ISA_EXT_ZICBOZ);
 				SET_ISA_EXT_MAP("zihintpause", RISCV_ISA_EXT_ZIHINTPAUSE);
 			}
 #undef SET_ISA_EXT_MAP
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 376d2827e736..5d3184cbf518 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -297,7 +297,7 @@ void __init setup_arch(char **cmdline_p)
 	setup_smp();
 #endif
 
-	riscv_init_cbom_blocksize();
+	riscv_init_cbo_blocksizes();
 	riscv_fill_hwcap();
 	apply_boot_alternatives();
 	if (IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM) &&
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index eaf23fc14966..ba4832bb949b 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -98,6 +98,9 @@ void flush_icache_pte(pte_t pte)
 unsigned int riscv_cbom_block_size;
 EXPORT_SYMBOL_GPL(riscv_cbom_block_size);
 
+unsigned int riscv_cboz_block_size;
+EXPORT_SYMBOL_GPL(riscv_cboz_block_size);
+
 static void cbo_get_block_size(struct device_node *node,
 			       const char *name, u32 *block_size,
 			       unsigned long *first_hartid)
@@ -120,19 +123,23 @@ static void cbo_get_block_size(struct device_node *node,
 	}
 }
 
-void riscv_init_cbom_blocksize(void)
+void riscv_init_cbo_blocksizes(void)
 {
+	unsigned long cbom_hartid, cboz_hartid;
+	u32 cbom_block_size = 0, cboz_block_size = 0;
 	struct device_node *node;
-	unsigned long cbom_hartid;
-	u32 probed_block_size;
 
-	probed_block_size = 0;
 	for_each_of_cpu_node(node) {
-		/* set block-size for cbom extension if available */
+		/* set block-size for cbom and/or cboz extension if available */
 		cbo_get_block_size(node, "riscv,cbom-block-size",
-				   &probed_block_size, &cbom_hartid);
+				   &cbom_block_size, &cbom_hartid);
+		cbo_get_block_size(node, "riscv,cboz-block-size",
+				   &cboz_block_size, &cboz_hartid);
 	}
 
-	if (probed_block_size)
-		riscv_cbom_block_size = probed_block_size;
+	if (cbom_block_size)
+		riscv_cbom_block_size = cbom_block_size;
+
+	if (cboz_block_size)
+		riscv_cboz_block_size = cboz_block_size;
 }
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 5/8] riscv: cpufeatures: Put the upper 16 bits of patch ID to work
  2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (3 preceding siblings ...)
  2023-02-21 19:09 ` [PATCH v5 4/8] RISC-V: Add Zicboz detection and block size parsing Andrew Jones
@ 2023-02-21 19:09 ` Andrew Jones
  2023-02-22 17:27   ` Conor Dooley
  2023-02-21 19:09 ` [PATCH v5 6/8] RISC-V: Use Zicboz in clear_page when available Andrew Jones
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra '

cpufeature IDs are consecutive integers starting at 26, so a 32-bit
patch ID allows an aircraft carrier load of feature IDs. Repurposing
the upper 16 bits still leaves a boat load of feature IDs and gains
16 bits which may be used to control patching on a per patch-site
basis.

This will be initially used in Zicboz's application to clear_page(),
as Zicboz's block size must also be considered. In that case, the
upper 16-bit value's role will be to convey the maximum block size
which the Zicboz clear_page() implementation supports.

cpufeature patch sites which need to check for the existence or
absence of other cpufeatures may also be able to make use of this.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
---
 arch/riscv/include/asm/alternative.h |  3 +++
 arch/riscv/kernel/cpufeature.c       | 37 +++++++++++++++++++++++++---
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
index 8f39d4e8598d..f2cb543b0bd2 100644
--- a/arch/riscv/include/asm/alternative.h
+++ b/arch/riscv/include/asm/alternative.h
@@ -17,6 +17,9 @@
 #include <linux/stddef.h>
 #include <asm/hwcap.h>
 
+#define PATCH_ID_CPUFEATURE_ID(p)		((u16)((u32)(p) & 0xffff))
+#define PATCH_ID_CPUFEATURE_VALUE(p)		((u16)(((u32)(p) >> 16) & 0xffff))
+
 #define RISCV_ALTERNATIVES_BOOT		0 /* alternatives applied during regular boot */
 #define RISCV_ALTERNATIVES_MODULE	1 /* alternatives applied during module-init */
 #define RISCV_ALTERNATIVES_EARLY_BOOT	2 /* alternatives applied before mmu start */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 6102b6bb5db3..0594989ead63 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -273,12 +273,35 @@ void __init riscv_fill_hwcap(void)
 }
 
 #ifdef CONFIG_RISCV_ALTERNATIVE
+/*
+ * Alternative patch sites consider 48 bits when determining when to patch
+ * the old instruction sequence with the new. These bits are broken into a
+ * 16-bit vendor ID and a 32-bit patch ID. A non-zero vendor ID means the
+ * patch site is for an erratum, identified by the 32-bit patch ID. When
+ * the vendor ID is zero, the patch site is for a cpufeature. cpufeatures
+ * further break down patch ID into two 16-bit numbers. The lower 16 bits
+ * are the cpufeature ID and the upper 16 bits are used for a value specific
+ * to the cpufeature and patch site. If the upper 16 bits are zero, then it
+ * implies no specific value is specified. cpufeatures that want to control
+ * patching on a per-site basis will provide non-zero values and implement
+ * checks here. The checks return true when patching should be done, and
+ * false otherwise.
+ */
+static bool riscv_cpufeature_patch_check(u16 id, u16 value)
+{
+	if (!value)
+		return true;
+
+	return false;
+}
+
 void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
 						  struct alt_entry *end,
 						  unsigned int stage)
 {
 	struct alt_entry *alt;
 	void *oldptr, *altptr;
+	u16 id, value;
 
 	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
 		return;
@@ -286,13 +309,19 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
 	for (alt = begin; alt < end; alt++) {
 		if (alt->vendor_id != 0)
 			continue;
-		if (alt->patch_id >= RISCV_ISA_EXT_MAX) {
-			WARN(1, "This extension id:%d is not in ISA extension list",
-				alt->patch_id);
+
+		id = PATCH_ID_CPUFEATURE_ID(alt->patch_id);
+
+		if (id >= RISCV_ISA_EXT_MAX) {
+			WARN(1, "This extension id:%d is not in ISA extension list", id);
 			continue;
 		}
 
-		if (!__riscv_isa_extension_available(NULL, alt->patch_id))
+		if (!__riscv_isa_extension_available(NULL, id))
+			continue;
+
+		value = PATCH_ID_CPUFEATURE_VALUE(alt->patch_id);
+		if (!riscv_cpufeature_patch_check(id, value))
 			continue;
 
 		oldptr = ALT_OLD_PTR(alt);
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 6/8] RISC-V: Use Zicboz in clear_page when available
  2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (4 preceding siblings ...)
  2023-02-21 19:09 ` [PATCH v5 5/8] riscv: cpufeatures: Put the upper 16 bits of patch ID to work Andrew Jones
@ 2023-02-21 19:09 ` Andrew Jones
  2023-02-24 14:00   ` Ben Dooks
  2023-02-21 19:09 ` [PATCH v5 7/8] RISC-V: KVM: Provide UAPI for Zicboz block size Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 8/8] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones
  7 siblings, 1 reply; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra '

Using memset() to zero a 4K page takes 563 total instructions, where
20 are branches. clear_page(), with Zicboz and a 64 byte block size,
takes 169 total instructions, where 4 are branches and 33 are nops.
Even though the block size is a variable, thanks to alternatives, we
can still implement a Duff device without having to do any preliminary
calculations. This is achieved by using the alternatives' cpufeature
value (the upper 16 bits of patch_id). The value used is the maximum
zicboz block size order accepted at the patch site. This enables us
to stop patching / unrolling when 4K bytes have been zeroed (we would
loop and continue after 4K if the page size would be larger)

For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to
only loop a few times and larger block sizes to not loop at all. Since
cbo.zero doesn't take an offset, we also need an 'add' after each
instruction, making the loop body 112 to 160 bytes. Hopefully this
is small enough to not cause icache misses.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
---
 arch/riscv/Kconfig                | 13 ++++++
 arch/riscv/include/asm/insn-def.h |  4 ++
 arch/riscv/include/asm/page.h     |  6 ++-
 arch/riscv/kernel/cpufeature.c    | 11 +++++
 arch/riscv/lib/Makefile           |  1 +
 arch/riscv/lib/clear_page.S       | 73 +++++++++++++++++++++++++++++++
 6 files changed, 107 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/lib/clear_page.S

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 0c494c36e911..c9006bcf912d 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -456,6 +456,19 @@ config RISCV_ISA_ZICBOM
 
 	   If you don't know what to do here, say Y.
 
+config RISCV_ISA_ZICBOZ
+	bool "Zicboz extension support for faster zeroing of memory"
+	depends on !XIP_KERNEL && MMU
+	select RISCV_ALTERNATIVE
+	default y
+	help
+	   Enable the use of the ZICBOZ extension (cbo.zero instruction)
+	   when available.
+
+	   The Zicboz extension is used for faster zeroing of memory.
+
+	   If you don't know what to do here, say Y.
+
 config TOOLCHAIN_HAS_ZIHINTPAUSE
 	bool
 	default y
diff --git a/arch/riscv/include/asm/insn-def.h b/arch/riscv/include/asm/insn-def.h
index e01ab51f50d2..6960beb75f32 100644
--- a/arch/riscv/include/asm/insn-def.h
+++ b/arch/riscv/include/asm/insn-def.h
@@ -192,4 +192,8 @@
 	INSN_I(OPCODE_MISC_MEM, FUNC3(2), __RD(0),		\
 	       RS1(base), SIMM12(2))
 
+#define CBO_zero(base)						\
+	INSN_I(OPCODE_MISC_MEM, FUNC3(2), __RD(0),		\
+	       RS1(base), SIMM12(4))
+
 #endif /* __ASM_INSN_DEF_H */
diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index 9f432c1b5289..ccd168fe29d2 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -49,10 +49,14 @@
 
 #ifndef __ASSEMBLY__
 
+#ifdef CONFIG_RISCV_ISA_ZICBOZ
+void clear_page(void *page);
+#else
 #define clear_page(pgaddr)			memset((pgaddr), 0, PAGE_SIZE)
+#endif
 #define copy_page(to, from)			memcpy((to), (from), PAGE_SIZE)
 
-#define clear_user_page(pgaddr, vaddr, page)	memset((pgaddr), 0, PAGE_SIZE)
+#define clear_user_page(pgaddr, vaddr, page)	clear_page(pgaddr)
 #define copy_user_page(vto, vfrom, vaddr, topg) \
 			memcpy((vto), (vfrom), PAGE_SIZE)
 
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 0594989ead63..4a496552b812 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -292,6 +292,17 @@ static bool riscv_cpufeature_patch_check(u16 id, u16 value)
 	if (!value)
 		return true;
 
+	switch (id) {
+	case RISCV_ISA_EXT_ZICBOZ:
+		/*
+		 * Zicboz alternative applications provide the maximum
+		 * supported block size order, or zero when it doesn't
+		 * matter. If the current block size exceeds the maximum,
+		 * then the alternative cannot be applied.
+		 */
+		return riscv_cboz_block_size <= (1U << value);
+	}
+
 	return false;
 }
 
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 6c74b0bedd60..26cb2502ecf8 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -8,5 +8,6 @@ lib-y			+= strlen.o
 lib-y			+= strncmp.o
 lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
+lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
diff --git a/arch/riscv/lib/clear_page.S b/arch/riscv/lib/clear_page.S
new file mode 100644
index 000000000000..7c7fa45b5ab5
--- /dev/null
+++ b/arch/riscv/lib/clear_page.S
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2023 Ventana Micro Systems Inc.
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm/alternative-macros.h>
+#include <asm/hwcap.h>
+#include <asm/insn-def.h>
+#include <asm/page.h>
+
+#define CBOZ_ALT(order, old, new)				\
+	ALTERNATIVE(old, new, 0,				\
+		    ((order) << 16) | RISCV_ISA_EXT_ZICBOZ,	\
+		    CONFIG_RISCV_ISA_ZICBOZ)
+
+/* void clear_page(void *page) */
+ENTRY(__clear_page)
+WEAK(clear_page)
+	li	a2, PAGE_SIZE
+
+	/*
+	 * If Zicboz isn't present, or somehow has a block
+	 * size larger than 4K, then fallback to memset.
+	 */
+	CBOZ_ALT(12, "j .Lno_zicboz", "nop")
+
+	lw	a1, riscv_cboz_block_size
+	add	a2, a0, a2
+.Lzero_loop:
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBOZ_ALT(11, "bltu a0, a2, .Lzero_loop; ret", "nop; nop")
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBOZ_ALT(10, "bltu a0, a2, .Lzero_loop; ret", "nop; nop")
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBOZ_ALT(9, "bltu a0, a2, .Lzero_loop; ret", "nop; nop")
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBOZ_ALT(8, "bltu a0, a2, .Lzero_loop; ret", "nop; nop")
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	CBO_zero(a0)
+	add	a0, a0, a1
+	bltu	a0, a2, .Lzero_loop
+	ret
+.Lno_zicboz:
+	li	a1, 0
+	tail	__memset
+END(__clear_page)
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 7/8] RISC-V: KVM: Provide UAPI for Zicboz block size
  2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (5 preceding siblings ...)
  2023-02-21 19:09 ` [PATCH v5 6/8] RISC-V: Use Zicboz in clear_page when available Andrew Jones
@ 2023-02-21 19:09 ` Andrew Jones
  2023-02-21 19:09 ` [PATCH v5 8/8] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones
  7 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra ', Anup Patel

We're about to allow guests to use the Zicboz extension. KVM
userspace needs to know the cache block size in order to
properly advertise it to the guest. Provide a virtual config
register for userspace to get it with the GET_ONE_REG API, but
setting it cannot be supported, so disallow SET_ONE_REG.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
---
 arch/riscv/include/uapi/asm/kvm.h | 1 +
 arch/riscv/kvm/vcpu.c             | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
index 92af6f3f057c..c1a1bb0fa91c 100644
--- a/arch/riscv/include/uapi/asm/kvm.h
+++ b/arch/riscv/include/uapi/asm/kvm.h
@@ -52,6 +52,7 @@ struct kvm_riscv_config {
 	unsigned long mvendorid;
 	unsigned long marchid;
 	unsigned long mimpid;
+	unsigned long zicboz_block_size;
 };
 
 /* CORE registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index 7c08567097f0..e5126cefbc87 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -276,6 +276,11 @@ static int kvm_riscv_vcpu_get_reg_config(struct kvm_vcpu *vcpu,
 			return -EINVAL;
 		reg_val = riscv_cbom_block_size;
 		break;
+	case KVM_REG_RISCV_CONFIG_REG(zicboz_block_size):
+		if (!riscv_isa_extension_available(vcpu->arch.isa, ZICBOZ))
+			return -EINVAL;
+		reg_val = riscv_cboz_block_size;
+		break;
 	case KVM_REG_RISCV_CONFIG_REG(mvendorid):
 		reg_val = vcpu->arch.mvendorid;
 		break;
@@ -347,6 +352,8 @@ static int kvm_riscv_vcpu_set_reg_config(struct kvm_vcpu *vcpu,
 		break;
 	case KVM_REG_RISCV_CONFIG_REG(zicbom_block_size):
 		return -EOPNOTSUPP;
+	case KVM_REG_RISCV_CONFIG_REG(zicboz_block_size):
+		return -EOPNOTSUPP;
 	case KVM_REG_RISCV_CONFIG_REG(mvendorid):
 		if (!vcpu->arch.ran_atleast_once)
 			vcpu->arch.mvendorid = reg_val;
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 8/8] RISC-V: KVM: Expose Zicboz to the guest
  2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
                   ` (6 preceding siblings ...)
  2023-02-21 19:09 ` [PATCH v5 7/8] RISC-V: KVM: Provide UAPI for Zicboz block size Andrew Jones
@ 2023-02-21 19:09 ` Andrew Jones
  7 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-21 19:09 UTC (permalink / raw)
  To: linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Ben Dooks ',
	'Atish Patra ', Anup Patel

Guests may use the cbo.zero instruction when the CPU has the Zicboz
extension and the hypervisor sets henvcfg.CBZE.

Add Zicboz support for KVM guests which may be enabled and
disabled from KVM userspace using the ISA extension ONE_REG API.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
---
 arch/riscv/include/uapi/asm/kvm.h | 1 +
 arch/riscv/kvm/vcpu.c             | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
index c1a1bb0fa91c..e44c1e90eaa7 100644
--- a/arch/riscv/include/uapi/asm/kvm.h
+++ b/arch/riscv/include/uapi/asm/kvm.h
@@ -106,6 +106,7 @@ enum KVM_RISCV_ISA_EXT_ID {
 	KVM_RISCV_ISA_EXT_SVINVAL,
 	KVM_RISCV_ISA_EXT_ZIHINTPAUSE,
 	KVM_RISCV_ISA_EXT_ZICBOM,
+	KVM_RISCV_ISA_EXT_ZICBOZ,
 	KVM_RISCV_ISA_EXT_MAX,
 };
 
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index e5126cefbc87..198ee86cad38 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -63,6 +63,7 @@ static const unsigned long kvm_isa_ext_arr[] = {
 	KVM_ISA_EXT_ARR(SVPBMT),
 	KVM_ISA_EXT_ARR(ZIHINTPAUSE),
 	KVM_ISA_EXT_ARR(ZICBOM),
+	KVM_ISA_EXT_ARR(ZICBOZ),
 };
 
 static unsigned long kvm_riscv_vcpu_base2isa_ext(unsigned long base_ext)
@@ -865,6 +866,9 @@ static void kvm_riscv_vcpu_update_config(const unsigned long *isa)
 	if (riscv_isa_extension_available(isa, ZICBOM))
 		henvcfg |= (ENVCFG_CBIE | ENVCFG_CBCFE);
 
+	if (riscv_isa_extension_available(isa, ZICBOZ))
+		henvcfg |= ENVCFG_CBZE;
+
 	csr_write(CSR_HENVCFG, henvcfg);
 #ifdef CONFIG_32BIT
 	csr_write(CSR_HENVCFGH, henvcfg >> 32);
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 5/8] riscv: cpufeatures: Put the upper 16 bits of patch ID to work
  2023-02-21 19:09 ` [PATCH v5 5/8] riscv: cpufeatures: Put the upper 16 bits of patch ID to work Andrew Jones
@ 2023-02-22 17:27   ` Conor Dooley
  2023-02-23 12:53     ` Andrew Jones
  0 siblings, 1 reply; 14+ messages in thread
From: Conor Dooley @ 2023-02-22 17:27 UTC (permalink / raw)
  To: Andrew Jones
  Cc: linux-riscv, devicetree, kvm-riscv, 'Rob Herring ',
	'Jisheng Zhang ', 'Anup Patel ',
	'Conor Dooley ', 'Krzysztof Kozlowski ',
	'Heiko Stuebner ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Albert Ou ',
	'Ben Dooks ', 'Atish Patra '

[-- Attachment #1: Type: text/plain, Size: 4964 bytes --]

On Tue, Feb 21, 2023 at 08:09:13PM +0100, Andrew Jones wrote:
> cpufeature IDs are consecutive integers starting at 26, so a 32-bit
> patch ID allows an aircraft carrier load of feature IDs. Repurposing
> the upper 16 bits still leaves a boat load of feature IDs and gains
> 16 bits which may be used to control patching on a per patch-site
> basis.
> 
> This will be initially used in Zicboz's application to clear_page(),
> as Zicboz's block size must also be considered. In that case, the
> upper 16-bit value's role will be to convey the maximum block size
> which the Zicboz clear_page() implementation supports.
> 
> cpufeature patch sites which need to check for the existence or
> absence of other cpufeatures may also be able to make use of this.
> 
> Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
> ---
>  arch/riscv/include/asm/alternative.h |  3 +++
>  arch/riscv/kernel/cpufeature.c       | 37 +++++++++++++++++++++++++---
>  2 files changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> index 8f39d4e8598d..f2cb543b0bd2 100644
> --- a/arch/riscv/include/asm/alternative.h
> +++ b/arch/riscv/include/asm/alternative.h
> @@ -17,6 +17,9 @@
>  #include <linux/stddef.h>
>  #include <asm/hwcap.h>
>  
> +#define PATCH_ID_CPUFEATURE_ID(p)		((u16)((u32)(p) & 0xffff))
> +#define PATCH_ID_CPUFEATURE_VALUE(p)		((u16)(((u32)(p) >> 16) & 0xffff))

I was just fiddling around a bit with macros, I think these do the same
thing:
#define PATCH_ID_CPUFEATURE_ID(p)		((p) & GENMASK(15, 0))
#define PATCH_ID_CPUFEATURE_VALUE(p)		FIELD_GET(GENMASK(31, 16), (p))
Although without the same care about types - is there a specific reason
you were casting like that?

Either way, I think I prefer this approach to the vendor_id stuffing!
If we do end up needing to fit an aircraft carrier, we can come back and
revisit another parameter in the alternatives I suppose...

I don't really know if the macros do anything to help with
understandability, so with or without trying to use macros:
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Thanks,
Conor.

>  #define RISCV_ALTERNATIVES_BOOT		0 /* alternatives applied during regular boot */
>  #define RISCV_ALTERNATIVES_MODULE	1 /* alternatives applied during module-init */
>  #define RISCV_ALTERNATIVES_EARLY_BOOT	2 /* alternatives applied before mmu start */
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 6102b6bb5db3..0594989ead63 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -273,12 +273,35 @@ void __init riscv_fill_hwcap(void)
>  }
>  
>  #ifdef CONFIG_RISCV_ALTERNATIVE
> +/*
> + * Alternative patch sites consider 48 bits when determining when to patch
> + * the old instruction sequence with the new. These bits are broken into a
> + * 16-bit vendor ID and a 32-bit patch ID. A non-zero vendor ID means the
> + * patch site is for an erratum, identified by the 32-bit patch ID. When
> + * the vendor ID is zero, the patch site is for a cpufeature. cpufeatures
> + * further break down patch ID into two 16-bit numbers. The lower 16 bits
> + * are the cpufeature ID and the upper 16 bits are used for a value specific
> + * to the cpufeature and patch site. If the upper 16 bits are zero, then it
> + * implies no specific value is specified. cpufeatures that want to control
> + * patching on a per-site basis will provide non-zero values and implement
> + * checks here. The checks return true when patching should be done, and
> + * false otherwise.
> + */
> +static bool riscv_cpufeature_patch_check(u16 id, u16 value)
> +{
> +	if (!value)
> +		return true;
> +
> +	return false;
> +}
> +
>  void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
>  						  struct alt_entry *end,
>  						  unsigned int stage)
>  {
>  	struct alt_entry *alt;
>  	void *oldptr, *altptr;
> +	u16 id, value;
>  
>  	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
>  		return;
> @@ -286,13 +309,19 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
>  	for (alt = begin; alt < end; alt++) {
>  		if (alt->vendor_id != 0)
>  			continue;
> -		if (alt->patch_id >= RISCV_ISA_EXT_MAX) {
> -			WARN(1, "This extension id:%d is not in ISA extension list",
> -				alt->patch_id);
> +
> +		id = PATCH_ID_CPUFEATURE_ID(alt->patch_id);
> +
> +		if (id >= RISCV_ISA_EXT_MAX) {
> +			WARN(1, "This extension id:%d is not in ISA extension list", id);
>  			continue;
>  		}
>  
> -		if (!__riscv_isa_extension_available(NULL, alt->patch_id))
> +		if (!__riscv_isa_extension_available(NULL, id))
> +			continue;
> +
> +		value = PATCH_ID_CPUFEATURE_VALUE(alt->patch_id);
> +		if (!riscv_cpufeature_patch_check(id, value))
>  			continue;
>  
>  		oldptr = ALT_OLD_PTR(alt);
> -- 
> 2.39.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 5/8] riscv: cpufeatures: Put the upper 16 bits of patch ID to work
  2023-02-22 17:27   ` Conor Dooley
@ 2023-02-23 12:53     ` Andrew Jones
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-23 12:53 UTC (permalink / raw)
  To: Conor Dooley
  Cc: linux-riscv, devicetree, kvm-riscv, 'Rob Herring ',
	'Jisheng Zhang ', 'Anup Patel ',
	'Conor Dooley ', 'Krzysztof Kozlowski ',
	'Heiko Stuebner ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Albert Ou ',
	'Ben Dooks ', 'Atish Patra '

On Wed, Feb 22, 2023 at 05:27:47PM +0000, Conor Dooley wrote:
> On Tue, Feb 21, 2023 at 08:09:13PM +0100, Andrew Jones wrote:
> > cpufeature IDs are consecutive integers starting at 26, so a 32-bit
> > patch ID allows an aircraft carrier load of feature IDs. Repurposing
> > the upper 16 bits still leaves a boat load of feature IDs and gains
> > 16 bits which may be used to control patching on a per patch-site
> > basis.
> > 
> > This will be initially used in Zicboz's application to clear_page(),
> > as Zicboz's block size must also be considered. In that case, the
> > upper 16-bit value's role will be to convey the maximum block size
> > which the Zicboz clear_page() implementation supports.
> > 
> > cpufeature patch sites which need to check for the existence or
> > absence of other cpufeatures may also be able to make use of this.
> > 
> > Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
> > ---
> >  arch/riscv/include/asm/alternative.h |  3 +++
> >  arch/riscv/kernel/cpufeature.c       | 37 +++++++++++++++++++++++++---
> >  2 files changed, 36 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> > index 8f39d4e8598d..f2cb543b0bd2 100644
> > --- a/arch/riscv/include/asm/alternative.h
> > +++ b/arch/riscv/include/asm/alternative.h
> > @@ -17,6 +17,9 @@
> >  #include <linux/stddef.h>
> >  #include <asm/hwcap.h>
> >  
> > +#define PATCH_ID_CPUFEATURE_ID(p)		((u16)((u32)(p) & 0xffff))
> > +#define PATCH_ID_CPUFEATURE_VALUE(p)		((u16)(((u32)(p) >> 16) & 0xffff))
> 
> I was just fiddling around a bit with macros, I think these do the same
> thing:
> #define PATCH_ID_CPUFEATURE_ID(p)		((p) & GENMASK(15, 0))
> #define PATCH_ID_CPUFEATURE_VALUE(p)		FIELD_GET(GENMASK(31, 16), (p))
> Although without the same care about types - is there a specific reason
> you were casting like that?

Just to be pedantic, but I just remembered we already have
upper/lower_16_bits() in linux/kernel.h. I'll use those.

> 
> Either way, I think I prefer this approach to the vendor_id stuffing!
> If we do end up needing to fit an aircraft carrier, we can come back and
> revisit another parameter in the alternatives I suppose...
> 
> I don't really know if the macros do anything to help with
> understandability, so with or without trying to use macros:
> Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

Thanks,
drew

> 
> Thanks,
> Conor.
> 
> >  #define RISCV_ALTERNATIVES_BOOT		0 /* alternatives applied during regular boot */
> >  #define RISCV_ALTERNATIVES_MODULE	1 /* alternatives applied during module-init */
> >  #define RISCV_ALTERNATIVES_EARLY_BOOT	2 /* alternatives applied before mmu start */
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index 6102b6bb5db3..0594989ead63 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -273,12 +273,35 @@ void __init riscv_fill_hwcap(void)
> >  }
> >  
> >  #ifdef CONFIG_RISCV_ALTERNATIVE
> > +/*
> > + * Alternative patch sites consider 48 bits when determining when to patch
> > + * the old instruction sequence with the new. These bits are broken into a
> > + * 16-bit vendor ID and a 32-bit patch ID. A non-zero vendor ID means the
> > + * patch site is for an erratum, identified by the 32-bit patch ID. When
> > + * the vendor ID is zero, the patch site is for a cpufeature. cpufeatures
> > + * further break down patch ID into two 16-bit numbers. The lower 16 bits
> > + * are the cpufeature ID and the upper 16 bits are used for a value specific
> > + * to the cpufeature and patch site. If the upper 16 bits are zero, then it
> > + * implies no specific value is specified. cpufeatures that want to control
> > + * patching on a per-site basis will provide non-zero values and implement
> > + * checks here. The checks return true when patching should be done, and
> > + * false otherwise.
> > + */
> > +static bool riscv_cpufeature_patch_check(u16 id, u16 value)
> > +{
> > +	if (!value)
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> >  void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> >  						  struct alt_entry *end,
> >  						  unsigned int stage)
> >  {
> >  	struct alt_entry *alt;
> >  	void *oldptr, *altptr;
> > +	u16 id, value;
> >  
> >  	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> >  		return;
> > @@ -286,13 +309,19 @@ void __init_or_module riscv_cpufeature_patch_func(struct alt_entry *begin,
> >  	for (alt = begin; alt < end; alt++) {
> >  		if (alt->vendor_id != 0)
> >  			continue;
> > -		if (alt->patch_id >= RISCV_ISA_EXT_MAX) {
> > -			WARN(1, "This extension id:%d is not in ISA extension list",
> > -				alt->patch_id);
> > +
> > +		id = PATCH_ID_CPUFEATURE_ID(alt->patch_id);
> > +
> > +		if (id >= RISCV_ISA_EXT_MAX) {
> > +			WARN(1, "This extension id:%d is not in ISA extension list", id);
> >  			continue;
> >  		}
> >  
> > -		if (!__riscv_isa_extension_available(NULL, alt->patch_id))
> > +		if (!__riscv_isa_extension_available(NULL, id))
> > +			continue;
> > +
> > +		value = PATCH_ID_CPUFEATURE_VALUE(alt->patch_id);
> > +		if (!riscv_cpufeature_patch_check(id, value))
> >  			continue;
> >  
> >  		oldptr = ALT_OLD_PTR(alt);
> > -- 
> > 2.39.1
> > 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 6/8] RISC-V: Use Zicboz in clear_page when available
  2023-02-21 19:09 ` [PATCH v5 6/8] RISC-V: Use Zicboz in clear_page when available Andrew Jones
@ 2023-02-24 14:00   ` Ben Dooks
  2023-02-24 14:25     ` Andrew Jones
  0 siblings, 1 reply; 14+ messages in thread
From: Ben Dooks @ 2023-02-24 14:00 UTC (permalink / raw)
  To: Andrew Jones, linux-riscv, devicetree, kvm-riscv
  Cc: 'Rob Herring ', 'Jisheng Zhang ',
	'Anup Patel ', 'Conor Dooley ',
	'Krzysztof Kozlowski ', 'Heiko Stuebner ',
	'Paul Walmsley ', 'Palmer Dabbelt ',
	'Albert Ou ', 'Atish Patra '

On 21/02/2023 19:09, Andrew Jones wrote:
> Using memset() to zero a 4K page takes 563 total instructions, where
> 20 are branches. clear_page(), with Zicboz and a 64 byte block size,
> takes 169 total instructions, where 4 are branches and 33 are nops.
> Even though the block size is a variable, thanks to alternatives, we
> can still implement a Duff device without having to do any preliminary
> calculations. This is achieved by using the alternatives' cpufeature
> value (the upper 16 bits of patch_id). The value used is the maximum
> zicboz block size order accepted at the patch site. This enables us
> to stop patching / unrolling when 4K bytes have been zeroed (we would
> loop and continue after 4K if the page size would be larger)
> 
> For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to
> only loop a few times and larger block sizes to not loop at all. Since
> cbo.zero doesn't take an offset, we also need an 'add' after each
> instruction, making the loop body 112 to 160 bytes. Hopefully this
> is small enough to not cause icache misses.
> 
> Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
> Acked-by: Conor Dooley <conor.dooley@microchip.com>
> ---
>   arch/riscv/Kconfig                | 13 ++++++
>   arch/riscv/include/asm/insn-def.h |  4 ++
>   arch/riscv/include/asm/page.h     |  6 ++-
>   arch/riscv/kernel/cpufeature.c    | 11 +++++
>   arch/riscv/lib/Makefile           |  1 +
>   arch/riscv/lib/clear_page.S       | 73 +++++++++++++++++++++++++++++++
>   6 files changed, 107 insertions(+), 1 deletion(-)
>   create mode 100644 arch/riscv/lib/clear_page.S

[snip]

> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index 0594989ead63..4a496552b812 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -292,6 +292,17 @@ static bool riscv_cpufeature_patch_check(u16 id, u16 value)
>   	if (!value)
>   		return true;
>   
> +	switch (id) {
> +	case RISCV_ISA_EXT_ZICBOZ:
> +		/*
> +		 * Zicboz alternative applications provide the maximum
> +		 * supported block size order, or zero when it doesn't
> +		 * matter. If the current block size exceeds the maximum,
> +		 * then the alternative cannot be applied.
> +		 */
> +		return riscv_cboz_block_size <= (1U << value);
> +	}
> +
>   	return false;
>   }
>   
> diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> index 6c74b0bedd60..26cb2502ecf8 100644
> --- a/arch/riscv/lib/Makefile
> +++ b/arch/riscv/lib/Makefile
> @@ -8,5 +8,6 @@ lib-y			+= strlen.o
>   lib-y			+= strncmp.o
>   lib-$(CONFIG_MMU)	+= uaccess.o
>   lib-$(CONFIG_64BIT)	+= tishift.o
> +lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
>   
>   obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
> diff --git a/arch/riscv/lib/clear_page.S b/arch/riscv/lib/clear_page.S
> new file mode 100644
> index 000000000000..7c7fa45b5ab5
> --- /dev/null
> +++ b/arch/riscv/lib/clear_page.S
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2023 Ventana Micro Systems Inc.
> + */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm.h>
> +#include <asm/alternative-macros.h>
> +#include <asm/hwcap.h>
> +#include <asm/insn-def.h>
> +#include <asm/page.h>
> +
> +#define CBOZ_ALT(order, old, new)				\
> +	ALTERNATIVE(old, new, 0,				\
> +		    ((order) << 16) | RISCV_ISA_EXT_ZICBOZ,	\
> +		    CONFIG_RISCV_ISA_ZICBOZ)
> +
> +/* void clear_page(void *page) */
> +ENTRY(__clear_page)
> +WEAK(clear_page)

out of interest, why the __clear_page() entry and the
WEAK(clear_page)?

Just followed up with a patch to fix the modpost.

So far this seems to be working with qemu and a backport to 5.19.x

-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius

https://www.codethink.co.uk/privacy.html


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 6/8] RISC-V: Use Zicboz in clear_page when available
  2023-02-24 14:00   ` Ben Dooks
@ 2023-02-24 14:25     ` Andrew Jones
  2023-02-24 14:36       ` Andrew Jones
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 14:25 UTC (permalink / raw)
  To: Ben Dooks
  Cc: linux-riscv, devicetree, kvm-riscv, 'Rob Herring ',
	'Jisheng Zhang ', 'Anup Patel ',
	'Conor Dooley ', 'Krzysztof Kozlowski ',
	'Heiko Stuebner ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Albert Ou ',
	'Atish Patra '

On Fri, Feb 24, 2023 at 02:00:44PM +0000, Ben Dooks wrote:
> On 21/02/2023 19:09, Andrew Jones wrote:
> > Using memset() to zero a 4K page takes 563 total instructions, where
> > 20 are branches. clear_page(), with Zicboz and a 64 byte block size,
> > takes 169 total instructions, where 4 are branches and 33 are nops.
> > Even though the block size is a variable, thanks to alternatives, we
> > can still implement a Duff device without having to do any preliminary
> > calculations. This is achieved by using the alternatives' cpufeature
> > value (the upper 16 bits of patch_id). The value used is the maximum
> > zicboz block size order accepted at the patch site. This enables us
> > to stop patching / unrolling when 4K bytes have been zeroed (we would
> > loop and continue after 4K if the page size would be larger)
> > 
> > For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to
> > only loop a few times and larger block sizes to not loop at all. Since
> > cbo.zero doesn't take an offset, we also need an 'add' after each
> > instruction, making the loop body 112 to 160 bytes. Hopefully this
> > is small enough to not cause icache misses.
> > 
> > Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
> > Acked-by: Conor Dooley <conor.dooley@microchip.com>
> > ---
> >   arch/riscv/Kconfig                | 13 ++++++
> >   arch/riscv/include/asm/insn-def.h |  4 ++
> >   arch/riscv/include/asm/page.h     |  6 ++-
> >   arch/riscv/kernel/cpufeature.c    | 11 +++++
> >   arch/riscv/lib/Makefile           |  1 +
> >   arch/riscv/lib/clear_page.S       | 73 +++++++++++++++++++++++++++++++
> >   6 files changed, 107 insertions(+), 1 deletion(-)
> >   create mode 100644 arch/riscv/lib/clear_page.S
> 
> [snip]
> 
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index 0594989ead63..4a496552b812 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -292,6 +292,17 @@ static bool riscv_cpufeature_patch_check(u16 id, u16 value)
> >   	if (!value)
> >   		return true;
> > +	switch (id) {
> > +	case RISCV_ISA_EXT_ZICBOZ:
> > +		/*
> > +		 * Zicboz alternative applications provide the maximum
> > +		 * supported block size order, or zero when it doesn't
> > +		 * matter. If the current block size exceeds the maximum,
> > +		 * then the alternative cannot be applied.
> > +		 */
> > +		return riscv_cboz_block_size <= (1U << value);
> > +	}
> > +
> >   	return false;
> >   }
> > diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> > index 6c74b0bedd60..26cb2502ecf8 100644
> > --- a/arch/riscv/lib/Makefile
> > +++ b/arch/riscv/lib/Makefile
> > @@ -8,5 +8,6 @@ lib-y			+= strlen.o
> >   lib-y			+= strncmp.o
> >   lib-$(CONFIG_MMU)	+= uaccess.o
> >   lib-$(CONFIG_64BIT)	+= tishift.o
> > +lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
> >   obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
> > diff --git a/arch/riscv/lib/clear_page.S b/arch/riscv/lib/clear_page.S
> > new file mode 100644
> > index 000000000000..7c7fa45b5ab5
> > --- /dev/null
> > +++ b/arch/riscv/lib/clear_page.S
> > @@ -0,0 +1,73 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (c) 2023 Ventana Micro Systems Inc.
> > + */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm/alternative-macros.h>
> > +#include <asm/hwcap.h>
> > +#include <asm/insn-def.h>
> > +#include <asm/page.h>
> > +
> > +#define CBOZ_ALT(order, old, new)				\
> > +	ALTERNATIVE(old, new, 0,				\
> > +		    ((order) << 16) | RISCV_ISA_EXT_ZICBOZ,	\
> > +		    CONFIG_RISCV_ISA_ZICBOZ)
> > +
> > +/* void clear_page(void *page) */
> > +ENTRY(__clear_page)
> > +WEAK(clear_page)
> 
> out of interest, why the __clear_page() entry and the
> WEAK(clear_page)?

I was inspired by memset, but, in hindsight, it doesn't make sense for
clear_page to be weak.

> 
> Just followed up with a patch to fix the modpost.

Thanks!

> 
> So far this seems to be working with qemu and a backport to 5.19.x

Great news!

Thanks,
drew

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 6/8] RISC-V: Use Zicboz in clear_page when available
  2023-02-24 14:25     ` Andrew Jones
@ 2023-02-24 14:36       ` Andrew Jones
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Jones @ 2023-02-24 14:36 UTC (permalink / raw)
  To: Ben Dooks
  Cc: linux-riscv, devicetree, kvm-riscv, 'Rob Herring ',
	'Jisheng Zhang ', 'Anup Patel ',
	'Conor Dooley ', 'Krzysztof Kozlowski ',
	'Heiko Stuebner ', 'Paul Walmsley ',
	'Palmer Dabbelt ', 'Albert Ou ',
	'Atish Patra '

On Fri, Feb 24, 2023 at 03:25:30PM +0100, Andrew Jones wrote:
> On Fri, Feb 24, 2023 at 02:00:44PM +0000, Ben Dooks wrote:
> > On 21/02/2023 19:09, Andrew Jones wrote:
...
> > > +/* void clear_page(void *page) */
> > > +ENTRY(__clear_page)
> > > +WEAK(clear_page)
> > 
> > out of interest, why the __clear_page() entry and the
> > WEAK(clear_page)?
> 
> I was inspired by memset, but, in hindsight, it doesn't make sense for
> clear_page to be weak.

Of course I failed to completely follow the memset pattern, which also
has an export (in riscv_ksyms.c). I prefer the export in clear_page.S,
though, as you've done.

Thanks,
drew

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-02-24 14:37 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-21 19:09 [PATCH v5 0/8] RISC-V: Apply Zicboz to clear_page Andrew Jones
2023-02-21 19:09 ` [PATCH v5 1/8] RISC-V: alternatives: Support patching multiple insns in assembly Andrew Jones
2023-02-21 19:09 ` [PATCH v5 2/8] RISC-V: Factor out body of riscv_init_cbom_blocksize loop Andrew Jones
2023-02-21 19:09 ` [PATCH v5 3/8] dt-bindings: riscv: Document cboz-block-size Andrew Jones
2023-02-21 19:09 ` [PATCH v5 4/8] RISC-V: Add Zicboz detection and block size parsing Andrew Jones
2023-02-21 19:09 ` [PATCH v5 5/8] riscv: cpufeatures: Put the upper 16 bits of patch ID to work Andrew Jones
2023-02-22 17:27   ` Conor Dooley
2023-02-23 12:53     ` Andrew Jones
2023-02-21 19:09 ` [PATCH v5 6/8] RISC-V: Use Zicboz in clear_page when available Andrew Jones
2023-02-24 14:00   ` Ben Dooks
2023-02-24 14:25     ` Andrew Jones
2023-02-24 14:36       ` Andrew Jones
2023-02-21 19:09 ` [PATCH v5 7/8] RISC-V: KVM: Provide UAPI for Zicboz block size Andrew Jones
2023-02-21 19:09 ` [PATCH v5 8/8] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).