linux-kernel.vger.kernel.org archive mirror
* [PATCH V2 0/5] Add an optimization and raid6test support for RISC-V
@ 2025-07-11 10:09 Chunyan Zhang
  2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

The 1st patch is a cleanup;
Patch 2 is an optimization that follows Palmer's suggestion;
The last two patches add raid6test support and make the raid6 RVV code buildable in user space.

V2:
* Addressed comments from v1:
- Replaced one load with a move to speed up _gen/xor_syndrome();
- Added a compile-time error for builds with compiler vector support;
- Dropped the NSIZE macro in favor of using the vector length;
- Modified the has_vector() definition for user space;

Chunyan Zhang (5):
  raid6: riscv: Clean up unused header file inclusion
  raid6: riscv: replace one load with a move to speed up the calculation
  raid6: riscv: Add a compiler error
  raid6: riscv: Allow code to be compiled in userspace
  raid6: test: Add support for RISC-V

 lib/raid6/recov_rvv.c   |   9 +-
 lib/raid6/rvv.c         | 362 ++++++++++++++++++++--------------------
 lib/raid6/rvv.h         |  17 ++
 lib/raid6/test/Makefile |   8 +
 4 files changed, 211 insertions(+), 185 deletions(-)

-- 
2.34.1



* [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion
  2025-07-11 10:09 [PATCH V2 0/5] Add an optimization and raid6test support for RISC-V Chunyan Zhang
@ 2025-07-11 10:09 ` Chunyan Zhang
  2025-07-16 13:38   ` Alexandre Ghiti
  2025-07-21  7:52   ` Nutty Liu
  2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the calculation Chunyan Zhang
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

These two C files don't reference anything defined in simd.h or types.h,
so remove the redundant #includes.

Fixes: 6093faaf9593 ("raid6: Add RISC-V SIMD syndrome and recovery calculations")
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
 lib/raid6/recov_rvv.c | 2 --
 lib/raid6/rvv.c       | 3 ---
 2 files changed, 5 deletions(-)

diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c
index f29303795ccf..500da521a806 100644
--- a/lib/raid6/recov_rvv.c
+++ b/lib/raid6/recov_rvv.c
@@ -4,9 +4,7 @@
  * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
  */
 
-#include <asm/simd.h>
 #include <asm/vector.h>
-#include <crypto/internal/simd.h>
 #include <linux/raid/pq.h>
 
 static int rvv_has_vector(void)
diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
index 7d82efa5b14f..b193ea176d5d 100644
--- a/lib/raid6/rvv.c
+++ b/lib/raid6/rvv.c
@@ -9,11 +9,8 @@
  *	Copyright 2002-2004 H. Peter Anvin
  */
 
-#include <asm/simd.h>
 #include <asm/vector.h>
-#include <crypto/internal/simd.h>
 #include <linux/raid/pq.h>
-#include <linux/types.h>
 #include "rvv.h"
 
 #define NSIZE	(riscv_v_vsize / 32) /* NSIZE = vlenb */
-- 
2.34.1



* [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the calculation
  2025-07-11 10:09 [PATCH V2 0/5] Add an optimization and raid6test support for RISC-V Chunyan Zhang
  2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang
@ 2025-07-11 10:09 ` Chunyan Zhang
  2025-07-16 13:40   ` Alexandre Ghiti
  2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

Since wp$$ == wq$$, there is no need to load the same data twice. Replace
one of the loads with a move instruction to speed up the calculation.

Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
 lib/raid6/rvv.c | 60 ++++++++++++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
index b193ea176d5d..89da5fc247aa 100644
--- a/lib/raid6/rvv.c
+++ b/lib/raid6/rvv.c
@@ -44,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
 			      "vle8.v	v0, (%[wp0])\n"
-			      "vle8.v	v1, (%[wp0])\n"
+			      "vmv.v.v	v1, v0\n"
 			      ".option	pop\n"
 			      : :
 			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
@@ -117,7 +117,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
 			      "vle8.v	v0, (%[wp0])\n"
-			      "vle8.v	v1, (%[wp0])\n"
+			      "vmv.v.v	v1, v0\n"
 			      ".option	pop\n"
 			      : :
 			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
@@ -218,9 +218,9 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
 			      "vle8.v	v0, (%[wp0])\n"
-			      "vle8.v	v1, (%[wp0])\n"
+			      "vmv.v.v	v1, v0\n"
 			      "vle8.v	v4, (%[wp1])\n"
-			      "vle8.v	v5, (%[wp1])\n"
+			      "vmv.v.v	v5, v4\n"
 			      ".option	pop\n"
 			      : :
 			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
@@ -310,9 +310,9 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
 			      "vle8.v	v0, (%[wp0])\n"
-			      "vle8.v	v1, (%[wp0])\n"
+			      "vmv.v.v	v1, v0\n"
 			      "vle8.v	v4, (%[wp1])\n"
-			      "vle8.v	v5, (%[wp1])\n"
+			      "vmv.v.v	v5, v4\n"
 			      ".option	pop\n"
 			      : :
 			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
@@ -440,13 +440,13 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
 			      "vle8.v	v0, (%[wp0])\n"
-			      "vle8.v	v1, (%[wp0])\n"
+			      "vmv.v.v	v1, v0\n"
 			      "vle8.v	v4, (%[wp1])\n"
-			      "vle8.v	v5, (%[wp1])\n"
+			      "vmv.v.v	v5, v4\n"
 			      "vle8.v	v8, (%[wp2])\n"
-			      "vle8.v	v9, (%[wp2])\n"
+			      "vmv.v.v	v9, v8\n"
 			      "vle8.v	v12, (%[wp3])\n"
-			      "vle8.v	v13, (%[wp3])\n"
+			      "vmv.v.v	v13, v12\n"
 			      ".option	pop\n"
 			      : :
 			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
@@ -566,13 +566,13 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
 			      "vle8.v	v0, (%[wp0])\n"
-			      "vle8.v	v1, (%[wp0])\n"
+			      "vmv.v.v	v1, v0\n"
 			      "vle8.v	v4, (%[wp1])\n"
-			      "vle8.v	v5, (%[wp1])\n"
+			      "vmv.v.v	v5, v4\n"
 			      "vle8.v	v8, (%[wp2])\n"
-			      "vle8.v	v9, (%[wp2])\n"
+			      "vmv.v.v	v9, v8\n"
 			      "vle8.v	v12, (%[wp3])\n"
-			      "vle8.v	v13, (%[wp3])\n"
+			      "vmv.v.v	v13, v12\n"
 			      ".option	pop\n"
 			      : :
 			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
@@ -754,21 +754,21 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
 			      "vle8.v	v0, (%[wp0])\n"
-			      "vle8.v	v1, (%[wp0])\n"
+			      "vmv.v.v	v1, v0\n"
 			      "vle8.v	v4, (%[wp1])\n"
-			      "vle8.v	v5, (%[wp1])\n"
+			      "vmv.v.v	v5, v4\n"
 			      "vle8.v	v8, (%[wp2])\n"
-			      "vle8.v	v9, (%[wp2])\n"
+			      "vmv.v.v	v9, v8\n"
 			      "vle8.v	v12, (%[wp3])\n"
-			      "vle8.v	v13, (%[wp3])\n"
+			      "vmv.v.v	v13, v12\n"
 			      "vle8.v	v16, (%[wp4])\n"
-			      "vle8.v	v17, (%[wp4])\n"
+			      "vmv.v.v	v17, v16\n"
 			      "vle8.v	v20, (%[wp5])\n"
-			      "vle8.v	v21, (%[wp5])\n"
+			      "vmv.v.v	v21, v20\n"
 			      "vle8.v	v24, (%[wp6])\n"
-			      "vle8.v	v25, (%[wp6])\n"
+			      "vmv.v.v	v25, v24\n"
 			      "vle8.v	v28, (%[wp7])\n"
-			      "vle8.v	v29, (%[wp7])\n"
+			      "vmv.v.v	v29, v28\n"
 			      ".option	pop\n"
 			      : :
 			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
@@ -948,21 +948,21 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
 			      "vle8.v	v0, (%[wp0])\n"
-			      "vle8.v	v1, (%[wp0])\n"
+			      "vmv.v.v	v1, v0\n"
 			      "vle8.v	v4, (%[wp1])\n"
-			      "vle8.v	v5, (%[wp1])\n"
+			      "vmv.v.v	v5, v4\n"
 			      "vle8.v	v8, (%[wp2])\n"
-			      "vle8.v	v9, (%[wp2])\n"
+			      "vmv.v.v	v9, v8\n"
 			      "vle8.v	v12, (%[wp3])\n"
-			      "vle8.v	v13, (%[wp3])\n"
+			      "vmv.v.v	v13, v12\n"
 			      "vle8.v	v16, (%[wp4])\n"
-			      "vle8.v	v17, (%[wp4])\n"
+			      "vmv.v.v	v17, v16\n"
 			      "vle8.v	v20, (%[wp5])\n"
-			      "vle8.v	v21, (%[wp5])\n"
+			      "vmv.v.v	v21, v20\n"
 			      "vle8.v	v24, (%[wp6])\n"
-			      "vle8.v	v25, (%[wp6])\n"
+			      "vmv.v.v	v25, v24\n"
 			      "vle8.v	v28, (%[wp7])\n"
-			      "vle8.v	v29, (%[wp7])\n"
+			      "vmv.v.v	v29, v28\n"
 			      ".option	pop\n"
 			      : :
 			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
-- 
2.34.1



* [PATCH V2 3/5] raid6: riscv: Add a compiler error
  2025-07-11 10:09 [PATCH V2 0/5] Add an optimization and raid6test support for RISC-V Chunyan Zhang
  2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang
  2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the calculation Chunyan Zhang
@ 2025-07-11 10:09 ` Chunyan Zhang
  2025-07-16 13:43   ` Alexandre Ghiti
  2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang
  2025-07-11 10:09 ` [PATCH V2 5/5] raid6: test: Add support for RISC-V Chunyan Zhang
  4 siblings, 1 reply; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

Code like "u8 **dptr = (u8 **)ptrs" just won't work when built with a
compiler that is itself allowed to emit vector instructions, so add a
compile-time error for that case.

Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
 lib/raid6/rvv.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
index 89da5fc247aa..015f3ee4da25 100644
--- a/lib/raid6/rvv.c
+++ b/lib/raid6/rvv.c
@@ -20,6 +20,10 @@ static int rvv_has_vector(void)
 	return has_vector();
 }
 
+#ifdef __riscv_vector
+#error "This code must be built without compiler support for vector"
+#endif
+
 static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs)
 {
 	u8 **dptr = (u8 **)ptrs;
-- 
2.34.1



* [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace
  2025-07-11 10:09 [PATCH V2 0/5] Add an optimization and raid6test support for RISC-V Chunyan Zhang
                   ` (2 preceding siblings ...)
  2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang
@ 2025-07-11 10:09 ` Chunyan Zhang
  2025-07-17  7:04   ` Alexandre Ghiti
  2025-07-11 10:09 ` [PATCH V2 5/5] raid6: test: Add support for RISC-V Chunyan Zhang
  4 siblings, 1 reply; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

To support the userspace raid6test, guard the kernel header inclusions
with #ifdef __KERNEL__ and add userspace wrapper definitions so the code
can be compiled in userspace.

This patch also drops the NSIZE macro in favor of using the vector
length, which works for both kernel and user space.

Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
 lib/raid6/recov_rvv.c |   7 +-
 lib/raid6/rvv.c       | 297 +++++++++++++++++++++---------------------
 lib/raid6/rvv.h       |  17 +++
 3 files changed, 170 insertions(+), 151 deletions(-)

diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c
index 500da521a806..8f2be833c015 100644
--- a/lib/raid6/recov_rvv.c
+++ b/lib/raid6/recov_rvv.c
@@ -4,13 +4,8 @@
  * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
  */
 
-#include <asm/vector.h>
 #include <linux/raid/pq.h>
-
-static int rvv_has_vector(void)
-{
-	return has_vector();
-}
+#include "rvv.h"
 
 static void __raid6_2data_recov_rvv(int bytes, u8 *p, u8 *q, u8 *dp,
 				    u8 *dq, const u8 *pbmul,
diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
index 015f3ee4da25..75c9dafedb28 100644
--- a/lib/raid6/rvv.c
+++ b/lib/raid6/rvv.c
@@ -9,17 +9,8 @@
  *	Copyright 2002-2004 H. Peter Anvin
  */
 
-#include <asm/vector.h>
-#include <linux/raid/pq.h>
 #include "rvv.h"
 
-#define NSIZE	(riscv_v_vsize / 32) /* NSIZE = vlenb */
-
-static int rvv_has_vector(void)
-{
-	return has_vector();
-}
-
 #ifdef __riscv_vector
 #error "This code must be built without compiler support for vector"
 #endif
@@ -28,7 +19,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
 {
 	u8 **dptr = (u8 **)ptrs;
 	u8 *p, *q;
-	unsigned long vl, d;
+	unsigned long vl, d, nsize;
 	int z, z0;
 
 	z0 = disks - 3;		/* Highest data disk */
@@ -42,8 +33,10 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
 		      : "=&r" (vl)
 	);
 
+	nsize = vl;
+
 	 /* v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 */
-	for (d = 0; d < bytes; d += NSIZE * 1) {
+	for (d = 0; d < bytes; d += nsize * 1) {
 		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
@@ -51,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
 			      "vmv.v.v	v1, v0\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
+			      [wp0]"r"(&dptr[z0][d + 0 * nsize])
 		);
 
 		for (z = z0 - 1 ; z >= 0 ; z--) {
@@ -75,7 +68,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
 				      "vxor.vv	v0, v0, v2\n"
 				      ".option	pop\n"
 				      : :
-				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
+				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
 				      [x1d]"r"(0x1d)
 			);
 		}
@@ -90,8 +83,8 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
 			      "vse8.v	v1, (%[wq0])\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&p[d + NSIZE * 0]),
-			      [wq0]"r"(&q[d + NSIZE * 0])
+			      [wp0]"r"(&p[d + nsize * 0]),
+			      [wq0]"r"(&q[d + nsize * 0])
 		);
 	}
 }
@@ -101,7 +94,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
 {
 	u8 **dptr = (u8 **)ptrs;
 	u8 *p, *q;
-	unsigned long vl, d;
+	unsigned long vl, d, nsize;
 	int z, z0;
 
 	z0 = stop;		/* P/Q right side optimization */
@@ -115,8 +108,10 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
 		      : "=&r" (vl)
 	);
 
+	nsize = vl;
+
 	/* v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 */
-	for (d = 0 ; d < bytes ; d += NSIZE * 1) {
+	for (d = 0 ; d < bytes ; d += nsize * 1) {
 		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
@@ -124,7 +119,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
 			      "vmv.v.v	v1, v0\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
+			      [wp0]"r"(&dptr[z0][d + 0 * nsize])
 		);
 
 		/* P/Q data pages */
@@ -149,7 +144,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
 				      "vxor.vv	v0, v0, v2\n"
 				      ".option	pop\n"
 				      : :
-				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
+				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
 				      [x1d]"r"(0x1d)
 			);
 		}
@@ -189,8 +184,8 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
 			      "vse8.v	v3, (%[wq0])\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&p[d + NSIZE * 0]),
-			      [wq0]"r"(&q[d + NSIZE * 0])
+			      [wp0]"r"(&p[d + nsize * 0]),
+			      [wq0]"r"(&q[d + nsize * 0])
 		);
 	}
 }
@@ -199,7 +194,7 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
 {
 	u8 **dptr = (u8 **)ptrs;
 	u8 *p, *q;
-	unsigned long vl, d;
+	unsigned long vl, d, nsize;
 	int z, z0;
 
 	z0 = disks - 3;		/* Highest data disk */
@@ -213,11 +208,13 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
 		      : "=&r" (vl)
 	);
 
+	nsize = vl;
+
 	/*
 	 * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10
 	 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11
 	 */
-	for (d = 0; d < bytes; d += NSIZE * 2) {
+	for (d = 0; d < bytes; d += nsize * 2) {
 		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
@@ -227,8 +224,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
 			      "vmv.v.v	v5, v4\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
-			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE])
+			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
+			      [wp1]"r"(&dptr[z0][d + 1 * nsize])
 		);
 
 		for (z = z0 - 1; z >= 0; z--) {
@@ -260,8 +257,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
 				      "vxor.vv	v4, v4, v6\n"
 				      ".option	pop\n"
 				      : :
-				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
-				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
+				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
+				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
 				      [x1d]"r"(0x1d)
 			);
 		}
@@ -278,10 +275,10 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
 			      "vse8.v	v5, (%[wq1])\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&p[d + NSIZE * 0]),
-			      [wq0]"r"(&q[d + NSIZE * 0]),
-			      [wp1]"r"(&p[d + NSIZE * 1]),
-			      [wq1]"r"(&q[d + NSIZE * 1])
+			      [wp0]"r"(&p[d + nsize * 0]),
+			      [wq0]"r"(&q[d + nsize * 0]),
+			      [wp1]"r"(&p[d + nsize * 1]),
+			      [wq1]"r"(&q[d + nsize * 1])
 		);
 	}
 }
@@ -291,7 +288,7 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
 {
 	u8 **dptr = (u8 **)ptrs;
 	u8 *p, *q;
-	unsigned long vl, d;
+	unsigned long vl, d, nsize;
 	int z, z0;
 
 	z0 = stop;		/* P/Q right side optimization */
@@ -305,11 +302,13 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
 		      : "=&r" (vl)
 	);
 
+	nsize = vl;
+
 	/*
 	 * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10
 	 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11
 	 */
-	for (d = 0; d < bytes; d += NSIZE * 2) {
+	for (d = 0; d < bytes; d += nsize * 2) {
 		 /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
@@ -319,8 +318,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
 			      "vmv.v.v	v5, v4\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
-			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE])
+			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
+			      [wp1]"r"(&dptr[z0][d + 1 * nsize])
 		);
 
 		/* P/Q data pages */
@@ -353,8 +352,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
 				      "vxor.vv	v4, v4, v6\n"
 				      ".option	pop\n"
 				      : :
-				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
-				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
+				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
+				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
 				      [x1d]"r"(0x1d)
 			);
 		}
@@ -407,10 +406,10 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
 			      "vse8.v	v7, (%[wq1])\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&p[d + NSIZE * 0]),
-			      [wq0]"r"(&q[d + NSIZE * 0]),
-			      [wp1]"r"(&p[d + NSIZE * 1]),
-			      [wq1]"r"(&q[d + NSIZE * 1])
+			      [wp0]"r"(&p[d + nsize * 0]),
+			      [wq0]"r"(&q[d + nsize * 0]),
+			      [wp1]"r"(&p[d + nsize * 1]),
+			      [wq1]"r"(&q[d + nsize * 1])
 		);
 	}
 }
@@ -419,7 +418,7 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
 {
 	u8 **dptr = (u8 **)ptrs;
 	u8 *p, *q;
-	unsigned long vl, d;
+	unsigned long vl, d, nsize;
 	int z, z0;
 
 	z0 = disks - 3;	/* Highest data disk */
@@ -433,13 +432,15 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
 		      : "=&r" (vl)
 	);
 
+	nsize = vl;
+
 	/*
 	 * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10
 	 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11
 	 * v8:wp2, v9:wq2, v10:wd2/w22, v11:w12
 	 * v12:wp3, v13:wq3, v14:wd3/w23, v15:w13
 	 */
-	for (d = 0; d < bytes; d += NSIZE * 4) {
+	for (d = 0; d < bytes; d += nsize * 4) {
 		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
@@ -453,10 +454,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
 			      "vmv.v.v	v13, v12\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
-			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
-			      [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
-			      [wp3]"r"(&dptr[z0][d + 3 * NSIZE])
+			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
+			      [wp1]"r"(&dptr[z0][d + 1 * nsize]),
+			      [wp2]"r"(&dptr[z0][d + 2 * nsize]),
+			      [wp3]"r"(&dptr[z0][d + 3 * nsize])
 		);
 
 		for (z = z0 - 1; z >= 0; z--) {
@@ -504,10 +505,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
 				      "vxor.vv	v12, v12, v14\n"
 				      ".option	pop\n"
 				      : :
-				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
-				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
-				      [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
-				      [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
+				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
+				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
+				      [wd2]"r"(&dptr[z][d + 2 * nsize]),
+				      [wd3]"r"(&dptr[z][d + 3 * nsize]),
 				      [x1d]"r"(0x1d)
 			);
 		}
@@ -528,14 +529,14 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
 			      "vse8.v	v13, (%[wq3])\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&p[d + NSIZE * 0]),
-			      [wq0]"r"(&q[d + NSIZE * 0]),
-			      [wp1]"r"(&p[d + NSIZE * 1]),
-			      [wq1]"r"(&q[d + NSIZE * 1]),
-			      [wp2]"r"(&p[d + NSIZE * 2]),
-			      [wq2]"r"(&q[d + NSIZE * 2]),
-			      [wp3]"r"(&p[d + NSIZE * 3]),
-			      [wq3]"r"(&q[d + NSIZE * 3])
+			      [wp0]"r"(&p[d + nsize * 0]),
+			      [wq0]"r"(&q[d + nsize * 0]),
+			      [wp1]"r"(&p[d + nsize * 1]),
+			      [wq1]"r"(&q[d + nsize * 1]),
+			      [wp2]"r"(&p[d + nsize * 2]),
+			      [wq2]"r"(&q[d + nsize * 2]),
+			      [wp3]"r"(&p[d + nsize * 3]),
+			      [wq3]"r"(&q[d + nsize * 3])
 		);
 	}
 }
@@ -545,7 +546,7 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
 {
 	u8 **dptr = (u8 **)ptrs;
 	u8 *p, *q;
-	unsigned long vl, d;
+	unsigned long vl, d, nsize;
 	int z, z0;
 
 	z0 = stop;		/* P/Q right side optimization */
@@ -559,13 +560,15 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
 		      : "=&r" (vl)
 	);
 
+	nsize = vl;
+
 	/*
 	 * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10
 	 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11
 	 * v8:wp2, v9:wq2, v10:wd2/w22, v11:w12
 	 * v12:wp3, v13:wq3, v14:wd3/w23, v15:w13
 	 */
-	for (d = 0; d < bytes; d += NSIZE * 4) {
+	for (d = 0; d < bytes; d += nsize * 4) {
 		 /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
@@ -579,10 +582,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
 			      "vmv.v.v	v13, v12\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
-			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
-			      [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
-			      [wp3]"r"(&dptr[z0][d + 3 * NSIZE])
+			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
+			      [wp1]"r"(&dptr[z0][d + 1 * nsize]),
+			      [wp2]"r"(&dptr[z0][d + 2 * nsize]),
+			      [wp3]"r"(&dptr[z0][d + 3 * nsize])
 		);
 
 		/* P/Q data pages */
@@ -631,10 +634,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
 				      "vxor.vv	v12, v12, v14\n"
 				      ".option	pop\n"
 				      : :
-				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
-				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
-				      [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
-				      [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
+				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
+				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
+				      [wd2]"r"(&dptr[z][d + 2 * nsize]),
+				      [wd3]"r"(&dptr[z][d + 3 * nsize]),
 				      [x1d]"r"(0x1d)
 			);
 		}
@@ -713,14 +716,14 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
 			      "vse8.v	v15, (%[wq3])\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&p[d + NSIZE * 0]),
-			      [wq0]"r"(&q[d + NSIZE * 0]),
-			      [wp1]"r"(&p[d + NSIZE * 1]),
-			      [wq1]"r"(&q[d + NSIZE * 1]),
-			      [wp2]"r"(&p[d + NSIZE * 2]),
-			      [wq2]"r"(&q[d + NSIZE * 2]),
-			      [wp3]"r"(&p[d + NSIZE * 3]),
-			      [wq3]"r"(&q[d + NSIZE * 3])
+			      [wp0]"r"(&p[d + nsize * 0]),
+			      [wq0]"r"(&q[d + nsize * 0]),
+			      [wp1]"r"(&p[d + nsize * 1]),
+			      [wq1]"r"(&q[d + nsize * 1]),
+			      [wp2]"r"(&p[d + nsize * 2]),
+			      [wq2]"r"(&q[d + nsize * 2]),
+			      [wp3]"r"(&p[d + nsize * 3]),
+			      [wq3]"r"(&q[d + nsize * 3])
 		);
 	}
 }
@@ -729,7 +732,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
 {
 	u8 **dptr = (u8 **)ptrs;
 	u8 *p, *q;
-	unsigned long vl, d;
+	unsigned long vl, d, nsize;
 	int z, z0;
 
 	z0 = disks - 3;	/* Highest data disk */
@@ -743,6 +746,8 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
 		      : "=&r" (vl)
 	);
 
+	nsize = vl;
+
 	/*
 	 * v0:wp0,   v1:wq0,  v2:wd0/w20,  v3:w10
 	 * v4:wp1,   v5:wq1,  v6:wd1/w21,  v7:w11
@@ -753,7 +758,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
 	 * v24:wp6, v25:wq6, v26:wd6/w26, v27:w16
 	 * v28:wp7, v29:wq7, v30:wd7/w27, v31:w17
 	 */
-	for (d = 0; d < bytes; d += NSIZE * 8) {
+	for (d = 0; d < bytes; d += nsize * 8) {
 		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
@@ -775,14 +780,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
 			      "vmv.v.v	v29, v28\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
-			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
-			      [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
-			      [wp3]"r"(&dptr[z0][d + 3 * NSIZE]),
-			      [wp4]"r"(&dptr[z0][d + 4 * NSIZE]),
-			      [wp5]"r"(&dptr[z0][d + 5 * NSIZE]),
-			      [wp6]"r"(&dptr[z0][d + 6 * NSIZE]),
-			      [wp7]"r"(&dptr[z0][d + 7 * NSIZE])
+			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
+			      [wp1]"r"(&dptr[z0][d + 1 * nsize]),
+			      [wp2]"r"(&dptr[z0][d + 2 * nsize]),
+			      [wp3]"r"(&dptr[z0][d + 3 * nsize]),
+			      [wp4]"r"(&dptr[z0][d + 4 * nsize]),
+			      [wp5]"r"(&dptr[z0][d + 5 * nsize]),
+			      [wp6]"r"(&dptr[z0][d + 6 * nsize]),
+			      [wp7]"r"(&dptr[z0][d + 7 * nsize])
 		);
 
 		for (z = z0 - 1; z >= 0; z--) {
@@ -862,14 +867,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
 				      "vxor.vv	v28, v28, v30\n"
 				      ".option	pop\n"
 				      : :
-				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
-				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
-				      [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
-				      [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
-				      [wd4]"r"(&dptr[z][d + 4 * NSIZE]),
-				      [wd5]"r"(&dptr[z][d + 5 * NSIZE]),
-				      [wd6]"r"(&dptr[z][d + 6 * NSIZE]),
-				      [wd7]"r"(&dptr[z][d + 7 * NSIZE]),
+				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
+				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
+				      [wd2]"r"(&dptr[z][d + 2 * nsize]),
+				      [wd3]"r"(&dptr[z][d + 3 * nsize]),
+				      [wd4]"r"(&dptr[z][d + 4 * nsize]),
+				      [wd5]"r"(&dptr[z][d + 5 * nsize]),
+				      [wd6]"r"(&dptr[z][d + 6 * nsize]),
+				      [wd7]"r"(&dptr[z][d + 7 * nsize]),
 				      [x1d]"r"(0x1d)
 			);
 		}
@@ -898,22 +903,22 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
 			      "vse8.v	v29, (%[wq7])\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&p[d + NSIZE * 0]),
-			      [wq0]"r"(&q[d + NSIZE * 0]),
-			      [wp1]"r"(&p[d + NSIZE * 1]),
-			      [wq1]"r"(&q[d + NSIZE * 1]),
-			      [wp2]"r"(&p[d + NSIZE * 2]),
-			      [wq2]"r"(&q[d + NSIZE * 2]),
-			      [wp3]"r"(&p[d + NSIZE * 3]),
-			      [wq3]"r"(&q[d + NSIZE * 3]),
-			      [wp4]"r"(&p[d + NSIZE * 4]),
-			      [wq4]"r"(&q[d + NSIZE * 4]),
-			      [wp5]"r"(&p[d + NSIZE * 5]),
-			      [wq5]"r"(&q[d + NSIZE * 5]),
-			      [wp6]"r"(&p[d + NSIZE * 6]),
-			      [wq6]"r"(&q[d + NSIZE * 6]),
-			      [wp7]"r"(&p[d + NSIZE * 7]),
-			      [wq7]"r"(&q[d + NSIZE * 7])
+			      [wp0]"r"(&p[d + nsize * 0]),
+			      [wq0]"r"(&q[d + nsize * 0]),
+			      [wp1]"r"(&p[d + nsize * 1]),
+			      [wq1]"r"(&q[d + nsize * 1]),
+			      [wp2]"r"(&p[d + nsize * 2]),
+			      [wq2]"r"(&q[d + nsize * 2]),
+			      [wp3]"r"(&p[d + nsize * 3]),
+			      [wq3]"r"(&q[d + nsize * 3]),
+			      [wp4]"r"(&p[d + nsize * 4]),
+			      [wq4]"r"(&q[d + nsize * 4]),
+			      [wp5]"r"(&p[d + nsize * 5]),
+			      [wq5]"r"(&q[d + nsize * 5]),
+			      [wp6]"r"(&p[d + nsize * 6]),
+			      [wq6]"r"(&q[d + nsize * 6]),
+			      [wp7]"r"(&p[d + nsize * 7]),
+			      [wq7]"r"(&q[d + nsize * 7])
 		);
 	}
 }
@@ -923,7 +928,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
 {
 	u8 **dptr = (u8 **)ptrs;
 	u8 *p, *q;
-	unsigned long vl, d;
+	unsigned long vl, d, nsize;
 	int z, z0;
 
 	z0 = stop;		/* P/Q right side optimization */
@@ -937,6 +942,8 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
 		      : "=&r" (vl)
 	);
 
+	nsize = vl;
+
 	/*
 	 * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10
 	 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11
@@ -947,7 +954,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
 	 * v24:wp6, v25:wq6, v26:wd6/w26, v27:w16
 	 * v28:wp7, v29:wq7, v30:wd7/w27, v31:w17
 	 */
-	for (d = 0; d < bytes; d += NSIZE * 8) {
+	for (d = 0; d < bytes; d += nsize * 8) {
 		 /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
 		asm volatile (".option	push\n"
 			      ".option	arch,+v\n"
@@ -969,14 +976,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
 			      "vmv.v.v	v29, v28\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
-			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
-			      [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
-			      [wp3]"r"(&dptr[z0][d + 3 * NSIZE]),
-			      [wp4]"r"(&dptr[z0][d + 4 * NSIZE]),
-			      [wp5]"r"(&dptr[z0][d + 5 * NSIZE]),
-			      [wp6]"r"(&dptr[z0][d + 6 * NSIZE]),
-			      [wp7]"r"(&dptr[z0][d + 7 * NSIZE])
+			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
+			      [wp1]"r"(&dptr[z0][d + 1 * nsize]),
+			      [wp2]"r"(&dptr[z0][d + 2 * nsize]),
+			      [wp3]"r"(&dptr[z0][d + 3 * nsize]),
+			      [wp4]"r"(&dptr[z0][d + 4 * nsize]),
+			      [wp5]"r"(&dptr[z0][d + 5 * nsize]),
+			      [wp6]"r"(&dptr[z0][d + 6 * nsize]),
+			      [wp7]"r"(&dptr[z0][d + 7 * nsize])
 		);
 
 		/* P/Q data pages */
@@ -1057,14 +1064,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
 				      "vxor.vv	v28, v28, v30\n"
 				      ".option	pop\n"
 				      : :
-				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
-				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
-				      [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
-				      [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
-				      [wd4]"r"(&dptr[z][d + 4 * NSIZE]),
-				      [wd5]"r"(&dptr[z][d + 5 * NSIZE]),
-				      [wd6]"r"(&dptr[z][d + 6 * NSIZE]),
-				      [wd7]"r"(&dptr[z][d + 7 * NSIZE]),
+				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
+				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
+				      [wd2]"r"(&dptr[z][d + 2 * nsize]),
+				      [wd3]"r"(&dptr[z][d + 3 * nsize]),
+				      [wd4]"r"(&dptr[z][d + 4 * nsize]),
+				      [wd5]"r"(&dptr[z][d + 5 * nsize]),
+				      [wd6]"r"(&dptr[z][d + 6 * nsize]),
+				      [wd7]"r"(&dptr[z][d + 7 * nsize]),
 				      [x1d]"r"(0x1d)
 			);
 		}
@@ -1195,22 +1202,22 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
 			      "vse8.v	v31, (%[wq7])\n"
 			      ".option	pop\n"
 			      : :
-			      [wp0]"r"(&p[d + NSIZE * 0]),
-			      [wq0]"r"(&q[d + NSIZE * 0]),
-			      [wp1]"r"(&p[d + NSIZE * 1]),
-			      [wq1]"r"(&q[d + NSIZE * 1]),
-			      [wp2]"r"(&p[d + NSIZE * 2]),
-			      [wq2]"r"(&q[d + NSIZE * 2]),
-			      [wp3]"r"(&p[d + NSIZE * 3]),
-			      [wq3]"r"(&q[d + NSIZE * 3]),
-			      [wp4]"r"(&p[d + NSIZE * 4]),
-			      [wq4]"r"(&q[d + NSIZE * 4]),
-			      [wp5]"r"(&p[d + NSIZE * 5]),
-			      [wq5]"r"(&q[d + NSIZE * 5]),
-			      [wp6]"r"(&p[d + NSIZE * 6]),
-			      [wq6]"r"(&q[d + NSIZE * 6]),
-			      [wp7]"r"(&p[d + NSIZE * 7]),
-			      [wq7]"r"(&q[d + NSIZE * 7])
+			      [wp0]"r"(&p[d + nsize * 0]),
+			      [wq0]"r"(&q[d + nsize * 0]),
+			      [wp1]"r"(&p[d + nsize * 1]),
+			      [wq1]"r"(&q[d + nsize * 1]),
+			      [wp2]"r"(&p[d + nsize * 2]),
+			      [wq2]"r"(&q[d + nsize * 2]),
+			      [wp3]"r"(&p[d + nsize * 3]),
+			      [wq3]"r"(&q[d + nsize * 3]),
+			      [wp4]"r"(&p[d + nsize * 4]),
+			      [wq4]"r"(&q[d + nsize * 4]),
+			      [wp5]"r"(&p[d + nsize * 5]),
+			      [wq5]"r"(&q[d + nsize * 5]),
+			      [wp6]"r"(&p[d + nsize * 6]),
+			      [wq6]"r"(&q[d + nsize * 6]),
+			      [wp7]"r"(&p[d + nsize * 7]),
+			      [wq7]"r"(&q[d + nsize * 7])
 		);
 	}
 }
diff --git a/lib/raid6/rvv.h b/lib/raid6/rvv.h
index 94044a1b707b..6d0708a2c8a4 100644
--- a/lib/raid6/rvv.h
+++ b/lib/raid6/rvv.h
@@ -7,6 +7,23 @@
  * Definitions for RISC-V RAID-6 code
  */
 
+#ifdef __KERNEL__
+#include <asm/vector.h>
+#else
+#define kernel_vector_begin()
+#define kernel_vector_end()
+#include <sys/auxv.h>
+#include <asm/hwcap.h>
+#define has_vector() (getauxval(AT_HWCAP) & COMPAT_HWCAP_ISA_V)
+#endif
+
+#include <linux/raid/pq.h>
+
+static int rvv_has_vector(void)
+{
+	return has_vector();
+}
+
 #define RAID6_RVV_WRAPPER(_n)						\
 	static void raid6_rvv ## _n ## _gen_syndrome(int disks,		\
 					size_t bytes, void **ptrs)	\
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH V2 5/5] raid6: test: Add support for RISC-V
  2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang
                   ` (3 preceding siblings ...)
  2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang
@ 2025-07-11 10:09 ` Chunyan Zhang
  4 siblings, 0 replies; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw)
  To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

From: Chunyan Zhang <zhang.lyra@gmail.com>

Add the RISC-V objects to the test build so that the userspace raid6test
program can be built and run on RISC-V.

Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com>
---
 lib/raid6/test/Makefile | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lib/raid6/test/Makefile b/lib/raid6/test/Makefile
index 8f2dd2210ba8..09bbe2b14cce 100644
--- a/lib/raid6/test/Makefile
+++ b/lib/raid6/test/Makefile
@@ -35,6 +35,11 @@ ifeq ($(ARCH),aarch64)
         HAS_NEON = yes
 endif
 
+ifeq ($(findstring riscv,$(ARCH)),riscv)
+        CFLAGS += -I../../../arch/riscv/include -DCONFIG_RISCV=1
+        HAS_RVV = yes
+endif
+
 ifeq ($(findstring ppc,$(ARCH)),ppc)
         CFLAGS += -I../../../arch/powerpc/include
         HAS_ALTIVEC := $(shell printf '$(pound)include <altivec.h>\nvector int a;\n' |\
@@ -63,6 +68,9 @@ else ifeq ($(HAS_ALTIVEC),yes)
                 vpermxor1.o vpermxor2.o vpermxor4.o vpermxor8.o
 else ifeq ($(ARCH),loongarch64)
         OBJS += loongarch_simd.o recov_loongarch_simd.o
+else ifeq ($(HAS_RVV),yes)
+        OBJS   += rvv.o recov_rvv.o
+        CFLAGS += -DCONFIG_RISCV_ISA_V=1
 endif
 
 .c.o:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion
  2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang
@ 2025-07-16 13:38   ` Alexandre Ghiti
  2025-07-21  7:52   ` Nutty Liu
  1 sibling, 0 replies; 14+ messages in thread
From: Alexandre Ghiti @ 2025-07-16 13:38 UTC (permalink / raw)
  To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

Hi Chunyan,

On 7/11/25 12:09, Chunyan Zhang wrote:
> These two C files don't reference things defined in simd.h or types.h
> so remove these redundant #inclusions.
>
> Fixes: 6093faaf9593 ("raid6: Add RISC-V SIMD syndrome and recovery calculations")
> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> ---
>   lib/raid6/recov_rvv.c | 2 --
>   lib/raid6/rvv.c       | 3 ---
>   2 files changed, 5 deletions(-)
>
> diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c
> index f29303795ccf..500da521a806 100644
> --- a/lib/raid6/recov_rvv.c
> +++ b/lib/raid6/recov_rvv.c
> @@ -4,9 +4,7 @@
>    * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
>    */
>   
> -#include <asm/simd.h>
>   #include <asm/vector.h>
> -#include <crypto/internal/simd.h>
>   #include <linux/raid/pq.h>
>   
>   static int rvv_has_vector(void)
> diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> index 7d82efa5b14f..b193ea176d5d 100644
> --- a/lib/raid6/rvv.c
> +++ b/lib/raid6/rvv.c
> @@ -9,11 +9,8 @@
>    *	Copyright 2002-2004 H. Peter Anvin
>    */
>   
> -#include <asm/simd.h>
>   #include <asm/vector.h>
> -#include <crypto/internal/simd.h>
>   #include <linux/raid/pq.h>
> -#include <linux/types.h>
>   #include "rvv.h"
>   
>   #define NSIZE	(riscv_v_vsize / 32) /* NSIZE = vlenb */


Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation
  2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation Chunyan Zhang
@ 2025-07-16 13:40   ` Alexandre Ghiti
  2025-07-17  2:16     ` Chunyan Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Alexandre Ghiti @ 2025-07-16 13:40 UTC (permalink / raw)
  To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

On 7/11/25 12:09, Chunyan Zhang wrote:
> Since wp$$ == wq$$, there is no need to load the same data twice; replace
> one of the loads with a move instruction to make the program run faster.
>
> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> ---
>   lib/raid6/rvv.c | 60 ++++++++++++++++++++++++-------------------------
>   1 file changed, 30 insertions(+), 30 deletions(-)
>
> diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> index b193ea176d5d..89da5fc247aa 100644
> --- a/lib/raid6/rvv.c
> +++ b/lib/raid6/rvv.c
> @@ -44,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
>   			      "vle8.v	v0, (%[wp0])\n"
> -			      "vle8.v	v1, (%[wp0])\n"
> +			      "vmv.v.v	v1, v0\n"
>   			      ".option	pop\n"
>   			      : :
>   			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> @@ -117,7 +117,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
>   			      "vle8.v	v0, (%[wp0])\n"
> -			      "vle8.v	v1, (%[wp0])\n"
> +			      "vmv.v.v	v1, v0\n"
>   			      ".option	pop\n"
>   			      : :
>   			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> @@ -218,9 +218,9 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
>   			      "vle8.v	v0, (%[wp0])\n"
> -			      "vle8.v	v1, (%[wp0])\n"
> +			      "vmv.v.v	v1, v0\n"
>   			      "vle8.v	v4, (%[wp1])\n"
> -			      "vle8.v	v5, (%[wp1])\n"
> +			      "vmv.v.v	v5, v4\n"
>   			      ".option	pop\n"
>   			      : :
>   			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> @@ -310,9 +310,9 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
>   			      "vle8.v	v0, (%[wp0])\n"
> -			      "vle8.v	v1, (%[wp0])\n"
> +			      "vmv.v.v	v1, v0\n"
>   			      "vle8.v	v4, (%[wp1])\n"
> -			      "vle8.v	v5, (%[wp1])\n"
> +			      "vmv.v.v	v5, v4\n"
>   			      ".option	pop\n"
>   			      : :
>   			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> @@ -440,13 +440,13 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
>   			      "vle8.v	v0, (%[wp0])\n"
> -			      "vle8.v	v1, (%[wp0])\n"
> +			      "vmv.v.v	v1, v0\n"
>   			      "vle8.v	v4, (%[wp1])\n"
> -			      "vle8.v	v5, (%[wp1])\n"
> +			      "vmv.v.v	v5, v4\n"
>   			      "vle8.v	v8, (%[wp2])\n"
> -			      "vle8.v	v9, (%[wp2])\n"
> +			      "vmv.v.v	v9, v8\n"
>   			      "vle8.v	v12, (%[wp3])\n"
> -			      "vle8.v	v13, (%[wp3])\n"
> +			      "vmv.v.v	v13, v12\n"
>   			      ".option	pop\n"
>   			      : :
>   			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> @@ -566,13 +566,13 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
>   			      "vle8.v	v0, (%[wp0])\n"
> -			      "vle8.v	v1, (%[wp0])\n"
> +			      "vmv.v.v	v1, v0\n"
>   			      "vle8.v	v4, (%[wp1])\n"
> -			      "vle8.v	v5, (%[wp1])\n"
> +			      "vmv.v.v	v5, v4\n"
>   			      "vle8.v	v8, (%[wp2])\n"
> -			      "vle8.v	v9, (%[wp2])\n"
> +			      "vmv.v.v	v9, v8\n"
>   			      "vle8.v	v12, (%[wp3])\n"
> -			      "vle8.v	v13, (%[wp3])\n"
> +			      "vmv.v.v	v13, v12\n"
>   			      ".option	pop\n"
>   			      : :
>   			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> @@ -754,21 +754,21 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
>   			      "vle8.v	v0, (%[wp0])\n"
> -			      "vle8.v	v1, (%[wp0])\n"
> +			      "vmv.v.v	v1, v0\n"
>   			      "vle8.v	v4, (%[wp1])\n"
> -			      "vle8.v	v5, (%[wp1])\n"
> +			      "vmv.v.v	v5, v4\n"
>   			      "vle8.v	v8, (%[wp2])\n"
> -			      "vle8.v	v9, (%[wp2])\n"
> +			      "vmv.v.v	v9, v8\n"
>   			      "vle8.v	v12, (%[wp3])\n"
> -			      "vle8.v	v13, (%[wp3])\n"
> +			      "vmv.v.v	v13, v12\n"
>   			      "vle8.v	v16, (%[wp4])\n"
> -			      "vle8.v	v17, (%[wp4])\n"
> +			      "vmv.v.v	v17, v16\n"
>   			      "vle8.v	v20, (%[wp5])\n"
> -			      "vle8.v	v21, (%[wp5])\n"
> +			      "vmv.v.v	v21, v20\n"
>   			      "vle8.v	v24, (%[wp6])\n"
> -			      "vle8.v	v25, (%[wp6])\n"
> +			      "vmv.v.v	v25, v24\n"
>   			      "vle8.v	v28, (%[wp7])\n"
> -			      "vle8.v	v29, (%[wp7])\n"
> +			      "vmv.v.v	v29, v28\n"
>   			      ".option	pop\n"
>   			      : :
>   			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> @@ -948,21 +948,21 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
>   			      "vle8.v	v0, (%[wp0])\n"
> -			      "vle8.v	v1, (%[wp0])\n"
> +			      "vmv.v.v	v1, v0\n"
>   			      "vle8.v	v4, (%[wp1])\n"
> -			      "vle8.v	v5, (%[wp1])\n"
> +			      "vmv.v.v	v5, v4\n"
>   			      "vle8.v	v8, (%[wp2])\n"
> -			      "vle8.v	v9, (%[wp2])\n"
> +			      "vmv.v.v	v9, v8\n"
>   			      "vle8.v	v12, (%[wp3])\n"
> -			      "vle8.v	v13, (%[wp3])\n"
> +			      "vmv.v.v	v13, v12\n"
>   			      "vle8.v	v16, (%[wp4])\n"
> -			      "vle8.v	v17, (%[wp4])\n"
> +			      "vmv.v.v	v17, v16\n"
>   			      "vle8.v	v20, (%[wp5])\n"
> -			      "vle8.v	v21, (%[wp5])\n"
> +			      "vmv.v.v	v21, v20\n"
>   			      "vle8.v	v24, (%[wp6])\n"
> -			      "vle8.v	v25, (%[wp6])\n"
> +			      "vmv.v.v	v25, v24\n"
>   			      "vle8.v	v28, (%[wp7])\n"
> -			      "vle8.v	v29, (%[wp7])\n"
> +			      "vmv.v.v	v29, v28\n"
>   			      ".option	pop\n"
>   			      : :
>   			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),


Out of curiosity, did you notice a gain?

Anyway:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex


^ permalink raw reply	[flat|nested] 14+ messages in thread
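[Editorial note: the vle8.v/vmv.v.v sequences reviewed above vectorize the classic RAID-6 syndrome recurrence. A scalar per-byte model of that recurrence, written after the generic lib/raid6/int.uc algorithm rather than copied from the patch, shows what the vector code computes: multiplying a GF(2^8) element by 2 is a left shift plus a conditional XOR with the reduction polynomial 0x1d, the same 0x1d constant bound as [x1d] in the inline assembly.]

```c
/*
 * Scalar sketch of the per-byte step that the RVV inline assembly
 * performs across a whole vector register at once.
 */
#include <assert.h>
#include <stdint.h>

static uint8_t gf2_mul2(uint8_t wq)
{
	uint8_t w2 = (wq & 0x80) ? 0x1d : 0;	/* MASK(wq) & 0x1d */
	uint8_t w1 = (uint8_t)(wq << 1);	/* SHLBYTE(wq) */

	return w1 ^ w2;
}

/* One round of the P/Q syndrome recurrence for a single byte lane. */
static void syndrome_step(uint8_t *wp, uint8_t *wq, uint8_t wd)
{
	*wp ^= wd;			/* P: plain XOR parity */
	*wq = gf2_mul2(*wq) ^ wd;	/* Q: multiply by 2, then XOR */
}
```

The gen_syndrome loops in the patch initialize wp and wq from the highest data disk (the load-plus-vmv.v.v pair) and then apply this step once per remaining disk.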

* Re: [PATCH V2 3/5] raid6: riscv: Add a compiler error
  2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang
@ 2025-07-16 13:43   ` Alexandre Ghiti
  2025-07-17  3:16     ` Chunyan Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Alexandre Ghiti @ 2025-07-16 13:43 UTC (permalink / raw)
  To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

First, the patch title should be something like:

"raid6: riscv: Prevent compilers with vector support from building already
vectorized code"

Or something similar.

On 7/11/25 12:09, Chunyan Zhang wrote:
> Code like "u8 **dptr = (u8 **)ptrs" just won't work when built with


Why wouldn't this code ^ work?

I guess the point of preventing the compiler from vectorizing the code is to
avoid the inline assembly breaking what the compiler could have vectorized, no?


> a compiler that can use vector instructions. So add an error for that.
>
> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> ---
>   lib/raid6/rvv.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> index 89da5fc247aa..015f3ee4da25 100644
> --- a/lib/raid6/rvv.c
> +++ b/lib/raid6/rvv.c
> @@ -20,6 +20,10 @@ static int rvv_has_vector(void)
>   	return has_vector();
>   }
>   
> +#ifdef __riscv_vector
> +#error "This code must be built without compiler support for vector"
> +#endif
> +
>   static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs)
>   {
>   	u8 **dptr = (u8 **)ptrs;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation
  2025-07-16 13:40   ` Alexandre Ghiti
@ 2025-07-17  2:16     ` Chunyan Zhang
  0 siblings, 0 replies; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-17  2:16 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Charlie Jenkins, Song Liu, Yu Kuai, linux-riscv, linux-raid,
	linux-kernel

On Wed, 16 Jul 2025 at 21:40, Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> On 7/11/25 12:09, Chunyan Zhang wrote:
> > Since wp$$==wq$$, it doesn't need to load the same data twice, use move
> > instruction to replace one of the loads to let the program run faster.
> >
> > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> > ---
> >   lib/raid6/rvv.c | 60 ++++++++++++++++++++++++-------------------------
> >   1 file changed, 30 insertions(+), 30 deletions(-)
> >
> > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> > index b193ea176d5d..89da5fc247aa 100644
> > --- a/lib/raid6/rvv.c
> > +++ b/lib/raid6/rvv.c
> > @@ -44,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> >                             "vle8.v   v0, (%[wp0])\n"
> > -                           "vle8.v   v1, (%[wp0])\n"
> > +                           "vmv.v.v  v1, v0\n"
> >                             ".option  pop\n"
> >                             : :
> >                             [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> > @@ -117,7 +117,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> >                             "vle8.v   v0, (%[wp0])\n"
> > -                           "vle8.v   v1, (%[wp0])\n"
> > +                           "vmv.v.v  v1, v0\n"
> >                             ".option  pop\n"
> >                             : :
> >                             [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> > @@ -218,9 +218,9 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> >                             "vle8.v   v0, (%[wp0])\n"
> > -                           "vle8.v   v1, (%[wp0])\n"
> > +                           "vmv.v.v  v1, v0\n"
> >                             "vle8.v   v4, (%[wp1])\n"
> > -                           "vle8.v   v5, (%[wp1])\n"
> > +                           "vmv.v.v  v5, v4\n"
> >                             ".option  pop\n"
> >                             : :
> >                             [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -310,9 +310,9 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> >                             "vle8.v   v0, (%[wp0])\n"
> > -                           "vle8.v   v1, (%[wp0])\n"
> > +                           "vmv.v.v  v1, v0\n"
> >                             "vle8.v   v4, (%[wp1])\n"
> > -                           "vle8.v   v5, (%[wp1])\n"
> > +                           "vmv.v.v  v5, v4\n"
> >                             ".option  pop\n"
> >                             : :
> >                             [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -440,13 +440,13 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> >                             "vle8.v   v0, (%[wp0])\n"
> > -                           "vle8.v   v1, (%[wp0])\n"
> > +                           "vmv.v.v  v1, v0\n"
> >                             "vle8.v   v4, (%[wp1])\n"
> > -                           "vle8.v   v5, (%[wp1])\n"
> > +                           "vmv.v.v  v5, v4\n"
> >                             "vle8.v   v8, (%[wp2])\n"
> > -                           "vle8.v   v9, (%[wp2])\n"
> > +                           "vmv.v.v  v9, v8\n"
> >                             "vle8.v   v12, (%[wp3])\n"
> > -                           "vle8.v   v13, (%[wp3])\n"
> > +                           "vmv.v.v  v13, v12\n"
> >                             ".option  pop\n"
> >                             : :
> >                             [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -566,13 +566,13 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> >                             "vle8.v   v0, (%[wp0])\n"
> > -                           "vle8.v   v1, (%[wp0])\n"
> > +                           "vmv.v.v  v1, v0\n"
> >                             "vle8.v   v4, (%[wp1])\n"
> > -                           "vle8.v   v5, (%[wp1])\n"
> > +                           "vmv.v.v  v5, v4\n"
> >                             "vle8.v   v8, (%[wp2])\n"
> > -                           "vle8.v   v9, (%[wp2])\n"
> > +                           "vmv.v.v  v9, v8\n"
> >                             "vle8.v   v12, (%[wp3])\n"
> > -                           "vle8.v   v13, (%[wp3])\n"
> > +                           "vmv.v.v  v13, v12\n"
> >                             ".option  pop\n"
> >                             : :
> >                             [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -754,21 +754,21 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> >                             "vle8.v   v0, (%[wp0])\n"
> > -                           "vle8.v   v1, (%[wp0])\n"
> > +                           "vmv.v.v  v1, v0\n"
> >                             "vle8.v   v4, (%[wp1])\n"
> > -                           "vle8.v   v5, (%[wp1])\n"
> > +                           "vmv.v.v  v5, v4\n"
> >                             "vle8.v   v8, (%[wp2])\n"
> > -                           "vle8.v   v9, (%[wp2])\n"
> > +                           "vmv.v.v  v9, v8\n"
> >                             "vle8.v   v12, (%[wp3])\n"
> > -                           "vle8.v   v13, (%[wp3])\n"
> > +                           "vmv.v.v  v13, v12\n"
> >                             "vle8.v   v16, (%[wp4])\n"
> > -                           "vle8.v   v17, (%[wp4])\n"
> > +                           "vmv.v.v  v17, v16\n"
> >                             "vle8.v   v20, (%[wp5])\n"
> > -                           "vle8.v   v21, (%[wp5])\n"
> > +                           "vmv.v.v  v21, v20\n"
> >                             "vle8.v   v24, (%[wp6])\n"
> > -                           "vle8.v   v25, (%[wp6])\n"
> > +                           "vmv.v.v  v25, v24\n"
> >                             "vle8.v   v28, (%[wp7])\n"
> > -                           "vle8.v   v29, (%[wp7])\n"
> > +                           "vmv.v.v  v29, v28\n"
> >                             ".option  pop\n"
> >                             : :
> >                             [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -948,21 +948,21 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> >                             "vle8.v   v0, (%[wp0])\n"
> > -                           "vle8.v   v1, (%[wp0])\n"
> > +                           "vmv.v.v  v1, v0\n"
> >                             "vle8.v   v4, (%[wp1])\n"
> > -                           "vle8.v   v5, (%[wp1])\n"
> > +                           "vmv.v.v  v5, v4\n"
> >                             "vle8.v   v8, (%[wp2])\n"
> > -                           "vle8.v   v9, (%[wp2])\n"
> > +                           "vmv.v.v  v9, v8\n"
> >                             "vle8.v   v12, (%[wp3])\n"
> > -                           "vle8.v   v13, (%[wp3])\n"
> > +                           "vmv.v.v  v13, v12\n"
> >                             "vle8.v   v16, (%[wp4])\n"
> > -                           "vle8.v   v17, (%[wp4])\n"
> > +                           "vmv.v.v  v17, v16\n"
> >                             "vle8.v   v20, (%[wp5])\n"
> > -                           "vle8.v   v21, (%[wp5])\n"
> > +                           "vmv.v.v  v21, v20\n"
> >                             "vle8.v   v24, (%[wp6])\n"
> > -                           "vle8.v   v25, (%[wp6])\n"
> > +                           "vmv.v.v  v25, v24\n"
> >                             "vle8.v   v28, (%[wp7])\n"
> > -                           "vle8.v   v29, (%[wp7])\n"
> > +                           "vmv.v.v  v29, v28\n"
> >                             ".option  pop\n"
> >                             : :
> >                             [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
>
>
> Out of curiosity, did you notice a gain?

Yes, I can see ~3% gain on my BPI-F3.

>
> Anyway:
>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
>
> Thanks,
>
> Alex
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 3/5] raid6: riscv: Add a compiler error
  2025-07-16 13:43   ` Alexandre Ghiti
@ 2025-07-17  3:16     ` Chunyan Zhang
  0 siblings, 0 replies; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-17  3:16 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Charlie Jenkins, Song Liu, Yu Kuai, linux-riscv, linux-raid,
	linux-kernel

Hi Alex,

On Wed, 16 Jul 2025 at 21:43, Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> First, the patch title should be something like:

Yeah, I also realized the phrasing was not right when rereading it
after the patch was sent.

>
> "raid6: riscv: Prevent compiler with vector support to build already
> vectorized code"
>
> Or something similar.
>
> On 7/11/25 12:09, Chunyan Zhang wrote:
> > The code like "u8 **dptr = (u8 **)ptrs" just won't work when built with
>
>
> Why wouldn't this code ^ work?

I actually didn't quite get this compiler issue ^_^||

>
> I guess preventing the compiler to vectorize the code is to avoid the
> inline assembly code to break what the compiler could have vectorized no?
>

This states the issue clearly, I will cook a new patchset.

Thanks for the review,
Chunyan

>
> > a compiler that can use vector instructions. So add an error for that.
> >
> > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> > ---
> >   lib/raid6/rvv.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> > index 89da5fc247aa..015f3ee4da25 100644
> > --- a/lib/raid6/rvv.c
> > +++ b/lib/raid6/rvv.c
> > @@ -20,6 +20,10 @@ static int rvv_has_vector(void)
> >       return has_vector();
> >   }
> >
> > +#ifdef __riscv_vector
> > +#error "This code must be built without compiler support for vector"
> > +#endif
> > +
> >   static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs)
> >   {
> >       u8 **dptr = (u8 **)ptrs;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace
  2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang
@ 2025-07-17  7:04   ` Alexandre Ghiti
  2025-07-17  7:39     ` Chunyan Zhang
  0 siblings, 1 reply; 14+ messages in thread
From: Alexandre Ghiti @ 2025-07-17  7:04 UTC (permalink / raw)
  To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

On 7/11/25 12:09, Chunyan Zhang wrote:
> To support userspace raid6test, this patch guards kernel header inclusions
> with an __KERNEL__ ifdef and adds userspace wrapper definitions to allow the
> code to be compiled in userspace.
>
> This patch also drops the NSIZE macro in favor of the runtime vector length,
> which works in both kernel and user space.
>
> Signed-off-by: Chunyan Zhang<zhangchunyan@iscas.ac.cn>
> ---
>   lib/raid6/recov_rvv.c |   7 +-
>   lib/raid6/rvv.c       | 297 +++++++++++++++++++++---------------------
>   lib/raid6/rvv.h       |  17 +++
>   3 files changed, 170 insertions(+), 151 deletions(-)
>
> diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c
> index 500da521a806..8f2be833c015 100644
> --- a/lib/raid6/recov_rvv.c
> +++ b/lib/raid6/recov_rvv.c
> @@ -4,13 +4,8 @@
>    * Author: Chunyan Zhang<zhangchunyan@iscas.ac.cn>
>    */
>   
> -#include <asm/vector.h>
>   #include <linux/raid/pq.h>
> -
> -static int rvv_has_vector(void)
> -{
> -	return has_vector();
> -}
> +#include "rvv.h"
>   
>   static void __raid6_2data_recov_rvv(int bytes, u8 *p, u8 *q, u8 *dp,
>   				    u8 *dq, const u8 *pbmul,
> diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> index 015f3ee4da25..75c9dafedb28 100644
> --- a/lib/raid6/rvv.c
> +++ b/lib/raid6/rvv.c
> @@ -9,17 +9,8 @@
>    *	Copyright 2002-2004 H. Peter Anvin
>    */
>   
> -#include <asm/vector.h>
> -#include <linux/raid/pq.h>
>   #include "rvv.h"
>   
> -#define NSIZE	(riscv_v_vsize / 32) /* NSIZE = vlenb */
> -
> -static int rvv_has_vector(void)
> -{
> -	return has_vector();
> -}
> -
>   #ifdef __riscv_vector
>   #error "This code must be built without compiler support for vector"
>   #endif
> @@ -28,7 +19,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
>   {
>   	u8 **dptr = (u8 **)ptrs;
>   	u8 *p, *q;
> -	unsigned long vl, d;
> +	unsigned long vl, d, nsize;
>   	int z, z0;
>   
>   	z0 = disks - 3;		/* Highest data disk */
> @@ -42,8 +33,10 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
>   		      : "=&r" (vl)
>   	);
>   
> +	nsize = vl;
> +
>   	 /*v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 */
> -	for (d = 0; d < bytes; d += NSIZE * 1) {
> +	for (d = 0; d < bytes; d += nsize * 1) {
>   		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */


You missed a few NSIZE in comments


>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
> @@ -51,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
>   			      "vmv.v.v	v1, v0\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> +			      [wp0]"r"(&dptr[z0][d + 0 * nsize])
>   		);
>   
>   		for (z = z0 - 1 ; z >= 0 ; z--) {
> @@ -75,7 +68,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
>   				      "vxor.vv	v0, v0, v2\n"
>   				      ".option	pop\n"
>   				      : :
> -				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> +				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
>   				      [x1d]"r"(0x1d)
>   			);
>   		}
> @@ -90,8 +83,8 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
>   			      "vse8.v	v1, (%[wq0])\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&p[d + NSIZE * 0]),
> -			      [wq0]"r"(&q[d + NSIZE * 0])
> +			      [wp0]"r"(&p[d + nsize * 0]),
> +			      [wq0]"r"(&q[d + nsize * 0])
>   		);
>   	}
>   }
> @@ -101,7 +94,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
>   {
>   	u8 **dptr = (u8 **)ptrs;
>   	u8 *p, *q;
> -	unsigned long vl, d;
> +	unsigned long vl, d, nsize;
>   	int z, z0;
>   
>   	z0 = stop;		/* P/Q right side optimization */
> @@ -115,8 +108,10 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
>   		      : "=&r" (vl)
>   	);
>   
> +	nsize = vl;
> +
>   	/*v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 */
> -	for (d = 0 ; d < bytes ; d += NSIZE * 1) {
> +	for (d = 0 ; d < bytes ; d += nsize * 1) {
>   		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
> @@ -124,7 +119,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
>   			      "vmv.v.v	v1, v0\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> +			      [wp0]"r"(&dptr[z0][d + 0 * nsize])
>   		);
>   
>   		/* P/Q data pages */
> @@ -149,7 +144,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
>   				      "vxor.vv	v0, v0, v2\n"
>   				      ".option	pop\n"
>   				      : :
> -				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> +				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
>   				      [x1d]"r"(0x1d)
>   			);
>   		}
> @@ -189,8 +184,8 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
>   			      "vse8.v	v3, (%[wq0])\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&p[d + NSIZE * 0]),
> -			      [wq0]"r"(&q[d + NSIZE * 0])
> +			      [wp0]"r"(&p[d + nsize * 0]),
> +			      [wq0]"r"(&q[d + nsize * 0])
>   		);
>   	}
>   }
> @@ -199,7 +194,7 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
>   {
>   	u8 **dptr = (u8 **)ptrs;
>   	u8 *p, *q;
> -	unsigned long vl, d;
> +	unsigned long vl, d, nsize;
>   	int z, z0;
>   
>   	z0 = disks - 3;		/* Highest data disk */
> @@ -213,11 +208,13 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
>   		      : "=&r" (vl)
>   	);
>   
> +	nsize = vl;
> +
>   	/*
>   	 *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
>   	 *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
>   	 */
> -	for (d = 0; d < bytes; d += NSIZE * 2) {
> +	for (d = 0; d < bytes; d += nsize * 2) {
>   		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
> @@ -227,8 +224,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
>   			      "vmv.v.v	v5, v4\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> -			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE])
> +			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> +			      [wp1]"r"(&dptr[z0][d + 1 * nsize])
>   		);
>   
>   		for (z = z0 - 1; z >= 0; z--) {
> @@ -260,8 +257,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
>   				      "vxor.vv	v4, v4, v6\n"
>   				      ".option	pop\n"
>   				      : :
> -				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> -				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> +				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
> +				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
>   				      [x1d]"r"(0x1d)
>   			);
>   		}
> @@ -278,10 +275,10 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
>   			      "vse8.v	v5, (%[wq1])\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&p[d + NSIZE * 0]),
> -			      [wq0]"r"(&q[d + NSIZE * 0]),
> -			      [wp1]"r"(&p[d + NSIZE * 1]),
> -			      [wq1]"r"(&q[d + NSIZE * 1])
> +			      [wp0]"r"(&p[d + nsize * 0]),
> +			      [wq0]"r"(&q[d + nsize * 0]),
> +			      [wp1]"r"(&p[d + nsize * 1]),
> +			      [wq1]"r"(&q[d + nsize * 1])
>   		);
>   	}
>   }
> @@ -291,7 +288,7 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
>   {
>   	u8 **dptr = (u8 **)ptrs;
>   	u8 *p, *q;
> -	unsigned long vl, d;
> +	unsigned long vl, d, nsize;
>   	int z, z0;
>   
>   	z0 = stop;		/* P/Q right side optimization */
> @@ -305,11 +302,13 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
>   		      : "=&r" (vl)
>   	);
>   
> +	nsize = vl;
> +
>   	/*
>   	 *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
>   	 *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
>   	 */
> -	for (d = 0; d < bytes; d += NSIZE * 2) {
> +	for (d = 0; d < bytes; d += nsize * 2) {
>   		 /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
> @@ -319,8 +318,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
>   			      "vmv.v.v	v5, v4\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> -			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE])
> +			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> +			      [wp1]"r"(&dptr[z0][d + 1 * nsize])
>   		);
>   
>   		/* P/Q data pages */
> @@ -353,8 +352,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
>   				      "vxor.vv	v4, v4, v6\n"
>   				      ".option	pop\n"
>   				      : :
> -				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> -				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> +				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
> +				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
>   				      [x1d]"r"(0x1d)
>   			);
>   		}
> @@ -407,10 +406,10 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
>   			      "vse8.v	v7, (%[wq1])\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&p[d + NSIZE * 0]),
> -			      [wq0]"r"(&q[d + NSIZE * 0]),
> -			      [wp1]"r"(&p[d + NSIZE * 1]),
> -			      [wq1]"r"(&q[d + NSIZE * 1])
> +			      [wp0]"r"(&p[d + nsize * 0]),
> +			      [wq0]"r"(&q[d + nsize * 0]),
> +			      [wp1]"r"(&p[d + nsize * 1]),
> +			      [wq1]"r"(&q[d + nsize * 1])
>   		);
>   	}
>   }
> @@ -419,7 +418,7 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
>   {
>   	u8 **dptr = (u8 **)ptrs;
>   	u8 *p, *q;
> -	unsigned long vl, d;
> +	unsigned long vl, d, nsize;
>   	int z, z0;
>   
>   	z0 = disks - 3;	/* Highest data disk */
> @@ -433,13 +432,15 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
>   		      : "=&r" (vl)
>   	);
>   
> +	nsize = vl;
> +
>   	/*
>   	 *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
>   	 *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
>   	 *v8:wp2,v9:wq2,v10:wd2/w22,v11:w12
>   	 *v12:wp3,v13:wq3,v14:wd3/w23,v15:w13
>   	 */
> -	for (d = 0; d < bytes; d += NSIZE * 4) {
> +	for (d = 0; d < bytes; d += nsize * 4) {
>   		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
> @@ -453,10 +454,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
>   			      "vmv.v.v	v13, v12\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> -			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
> -			      [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
> -			      [wp3]"r"(&dptr[z0][d + 3 * NSIZE])
> +			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> +			      [wp1]"r"(&dptr[z0][d + 1 * nsize]),
> +			      [wp2]"r"(&dptr[z0][d + 2 * nsize]),
> +			      [wp3]"r"(&dptr[z0][d + 3 * nsize])
>   		);
>   
>   		for (z = z0 - 1; z >= 0; z--) {
> @@ -504,10 +505,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
>   				      "vxor.vv	v12, v12, v14\n"
>   				      ".option	pop\n"
>   				      : :
> -				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> -				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> -				      [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
> -				      [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
> +				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
> +				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
> +				      [wd2]"r"(&dptr[z][d + 2 * nsize]),
> +				      [wd3]"r"(&dptr[z][d + 3 * nsize]),
>   				      [x1d]"r"(0x1d)
>   			);
>   		}
> @@ -528,14 +529,14 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
>   			      "vse8.v	v13, (%[wq3])\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&p[d + NSIZE * 0]),
> -			      [wq0]"r"(&q[d + NSIZE * 0]),
> -			      [wp1]"r"(&p[d + NSIZE * 1]),
> -			      [wq1]"r"(&q[d + NSIZE * 1]),
> -			      [wp2]"r"(&p[d + NSIZE * 2]),
> -			      [wq2]"r"(&q[d + NSIZE * 2]),
> -			      [wp3]"r"(&p[d + NSIZE * 3]),
> -			      [wq3]"r"(&q[d + NSIZE * 3])
> +			      [wp0]"r"(&p[d + nsize * 0]),
> +			      [wq0]"r"(&q[d + nsize * 0]),
> +			      [wp1]"r"(&p[d + nsize * 1]),
> +			      [wq1]"r"(&q[d + nsize * 1]),
> +			      [wp2]"r"(&p[d + nsize * 2]),
> +			      [wq2]"r"(&q[d + nsize * 2]),
> +			      [wp3]"r"(&p[d + nsize * 3]),
> +			      [wq3]"r"(&q[d + nsize * 3])
>   		);
>   	}
>   }
> @@ -545,7 +546,7 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
>   {
>   	u8 **dptr = (u8 **)ptrs;
>   	u8 *p, *q;
> -	unsigned long vl, d;
> +	unsigned long vl, d, nsize;
>   	int z, z0;
>   
>   	z0 = stop;		/* P/Q right side optimization */
> @@ -559,13 +560,15 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
>   		      : "=&r" (vl)
>   	);
>   
> +	nsize = vl;
> +
>   	/*
>   	 *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
>   	 *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
>   	 *v8:wp2,v9:wq2,v10:wd2/w22,v11:w12
>   	 *v12:wp3,v13:wq3,v14:wd3/w23,v15:w13
>   	 */
> -	for (d = 0; d < bytes; d += NSIZE * 4) {
> +	for (d = 0; d < bytes; d += nsize * 4) {
>   		 /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
> @@ -579,10 +582,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
>   			      "vmv.v.v	v13, v12\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> -			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
> -			      [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
> -			      [wp3]"r"(&dptr[z0][d + 3 * NSIZE])
> +			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> +			      [wp1]"r"(&dptr[z0][d + 1 * nsize]),
> +			      [wp2]"r"(&dptr[z0][d + 2 * nsize]),
> +			      [wp3]"r"(&dptr[z0][d + 3 * nsize])
>   		);
>   
>   		/* P/Q data pages */
> @@ -631,10 +634,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
>   				      "vxor.vv	v12, v12, v14\n"
>   				      ".option	pop\n"
>   				      : :
> -				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> -				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> -				      [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
> -				      [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
> +				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
> +				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
> +				      [wd2]"r"(&dptr[z][d + 2 * nsize]),
> +				      [wd3]"r"(&dptr[z][d + 3 * nsize]),
>   				      [x1d]"r"(0x1d)
>   			);
>   		}
> @@ -713,14 +716,14 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
>   			      "vse8.v	v15, (%[wq3])\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&p[d + NSIZE * 0]),
> -			      [wq0]"r"(&q[d + NSIZE * 0]),
> -			      [wp1]"r"(&p[d + NSIZE * 1]),
> -			      [wq1]"r"(&q[d + NSIZE * 1]),
> -			      [wp2]"r"(&p[d + NSIZE * 2]),
> -			      [wq2]"r"(&q[d + NSIZE * 2]),
> -			      [wp3]"r"(&p[d + NSIZE * 3]),
> -			      [wq3]"r"(&q[d + NSIZE * 3])
> +			      [wp0]"r"(&p[d + nsize * 0]),
> +			      [wq0]"r"(&q[d + nsize * 0]),
> +			      [wp1]"r"(&p[d + nsize * 1]),
> +			      [wq1]"r"(&q[d + nsize * 1]),
> +			      [wp2]"r"(&p[d + nsize * 2]),
> +			      [wq2]"r"(&q[d + nsize * 2]),
> +			      [wp3]"r"(&p[d + nsize * 3]),
> +			      [wq3]"r"(&q[d + nsize * 3])
>   		);
>   	}
>   }
> @@ -729,7 +732,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
>   {
>   	u8 **dptr = (u8 **)ptrs;
>   	u8 *p, *q;
> -	unsigned long vl, d;
> +	unsigned long vl, d, nsize;
>   	int z, z0;
>   
>   	z0 = disks - 3;	/* Highest data disk */
> @@ -743,6 +746,8 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
>   		      : "=&r" (vl)
>   	);
>   
> +	nsize = vl;
> +
>   	/*
>   	 *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
>   	 *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
> @@ -753,7 +758,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
>   	 *v24:wp6,v25:wq6,v26:wd6/w26,v27:w16
>   	 *v28:wp7,v29:wq7,v30:wd7/w27,v31:w17
>   	 */
> -	for (d = 0; d < bytes; d += NSIZE * 8) {
> +	for (d = 0; d < bytes; d += nsize * 8) {
>   		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
> @@ -775,14 +780,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
>   			      "vmv.v.v	v29, v28\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> -			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
> -			      [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
> -			      [wp3]"r"(&dptr[z0][d + 3 * NSIZE]),
> -			      [wp4]"r"(&dptr[z0][d + 4 * NSIZE]),
> -			      [wp5]"r"(&dptr[z0][d + 5 * NSIZE]),
> -			      [wp6]"r"(&dptr[z0][d + 6 * NSIZE]),
> -			      [wp7]"r"(&dptr[z0][d + 7 * NSIZE])
> +			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> +			      [wp1]"r"(&dptr[z0][d + 1 * nsize]),
> +			      [wp2]"r"(&dptr[z0][d + 2 * nsize]),
> +			      [wp3]"r"(&dptr[z0][d + 3 * nsize]),
> +			      [wp4]"r"(&dptr[z0][d + 4 * nsize]),
> +			      [wp5]"r"(&dptr[z0][d + 5 * nsize]),
> +			      [wp6]"r"(&dptr[z0][d + 6 * nsize]),
> +			      [wp7]"r"(&dptr[z0][d + 7 * nsize])
>   		);
>   
>   		for (z = z0 - 1; z >= 0; z--) {
> @@ -862,14 +867,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
>   				      "vxor.vv	v28, v28, v30\n"
>   				      ".option	pop\n"
>   				      : :
> -				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> -				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> -				      [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
> -				      [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
> -				      [wd4]"r"(&dptr[z][d + 4 * NSIZE]),
> -				      [wd5]"r"(&dptr[z][d + 5 * NSIZE]),
> -				      [wd6]"r"(&dptr[z][d + 6 * NSIZE]),
> -				      [wd7]"r"(&dptr[z][d + 7 * NSIZE]),
> +				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
> +				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
> +				      [wd2]"r"(&dptr[z][d + 2 * nsize]),
> +				      [wd3]"r"(&dptr[z][d + 3 * nsize]),
> +				      [wd4]"r"(&dptr[z][d + 4 * nsize]),
> +				      [wd5]"r"(&dptr[z][d + 5 * nsize]),
> +				      [wd6]"r"(&dptr[z][d + 6 * nsize]),
> +				      [wd7]"r"(&dptr[z][d + 7 * nsize]),
>   				      [x1d]"r"(0x1d)
>   			);
>   		}
> @@ -898,22 +903,22 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
>   			      "vse8.v	v29, (%[wq7])\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&p[d + NSIZE * 0]),
> -			      [wq0]"r"(&q[d + NSIZE * 0]),
> -			      [wp1]"r"(&p[d + NSIZE * 1]),
> -			      [wq1]"r"(&q[d + NSIZE * 1]),
> -			      [wp2]"r"(&p[d + NSIZE * 2]),
> -			      [wq2]"r"(&q[d + NSIZE * 2]),
> -			      [wp3]"r"(&p[d + NSIZE * 3]),
> -			      [wq3]"r"(&q[d + NSIZE * 3]),
> -			      [wp4]"r"(&p[d + NSIZE * 4]),
> -			      [wq4]"r"(&q[d + NSIZE * 4]),
> -			      [wp5]"r"(&p[d + NSIZE * 5]),
> -			      [wq5]"r"(&q[d + NSIZE * 5]),
> -			      [wp6]"r"(&p[d + NSIZE * 6]),
> -			      [wq6]"r"(&q[d + NSIZE * 6]),
> -			      [wp7]"r"(&p[d + NSIZE * 7]),
> -			      [wq7]"r"(&q[d + NSIZE * 7])
> +			      [wp0]"r"(&p[d + nsize * 0]),
> +			      [wq0]"r"(&q[d + nsize * 0]),
> +			      [wp1]"r"(&p[d + nsize * 1]),
> +			      [wq1]"r"(&q[d + nsize * 1]),
> +			      [wp2]"r"(&p[d + nsize * 2]),
> +			      [wq2]"r"(&q[d + nsize * 2]),
> +			      [wp3]"r"(&p[d + nsize * 3]),
> +			      [wq3]"r"(&q[d + nsize * 3]),
> +			      [wp4]"r"(&p[d + nsize * 4]),
> +			      [wq4]"r"(&q[d + nsize * 4]),
> +			      [wp5]"r"(&p[d + nsize * 5]),
> +			      [wq5]"r"(&q[d + nsize * 5]),
> +			      [wp6]"r"(&p[d + nsize * 6]),
> +			      [wq6]"r"(&q[d + nsize * 6]),
> +			      [wp7]"r"(&p[d + nsize * 7]),
> +			      [wq7]"r"(&q[d + nsize * 7])
>   		);
>   	}
>   }
> @@ -923,7 +928,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
>   {
>   	u8 **dptr = (u8 **)ptrs;
>   	u8 *p, *q;
> -	unsigned long vl, d;
> +	unsigned long vl, d, nsize;
>   	int z, z0;
>   
>   	z0 = stop;		/* P/Q right side optimization */
> @@ -937,6 +942,8 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
>   		      : "=&r" (vl)
>   	);
>   
> +	nsize = vl;
> +
>   	/*
>   	 *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
>   	 *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
> @@ -947,7 +954,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
>   	 *v24:wp6,v25:wq6,v26:wd6/w26,v27:w16
>   	 *v28:wp7,v29:wq7,v30:wd7/w27,v31:w17
>   	 */
> -	for (d = 0; d < bytes; d += NSIZE * 8) {
> +	for (d = 0; d < bytes; d += nsize * 8) {
>   		 /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
>   		asm volatile (".option	push\n"
>   			      ".option	arch,+v\n"
> @@ -969,14 +976,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
>   			      "vmv.v.v	v29, v28\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> -			      [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
> -			      [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
> -			      [wp3]"r"(&dptr[z0][d + 3 * NSIZE]),
> -			      [wp4]"r"(&dptr[z0][d + 4 * NSIZE]),
> -			      [wp5]"r"(&dptr[z0][d + 5 * NSIZE]),
> -			      [wp6]"r"(&dptr[z0][d + 6 * NSIZE]),
> -			      [wp7]"r"(&dptr[z0][d + 7 * NSIZE])
> +			      [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> +			      [wp1]"r"(&dptr[z0][d + 1 * nsize]),
> +			      [wp2]"r"(&dptr[z0][d + 2 * nsize]),
> +			      [wp3]"r"(&dptr[z0][d + 3 * nsize]),
> +			      [wp4]"r"(&dptr[z0][d + 4 * nsize]),
> +			      [wp5]"r"(&dptr[z0][d + 5 * nsize]),
> +			      [wp6]"r"(&dptr[z0][d + 6 * nsize]),
> +			      [wp7]"r"(&dptr[z0][d + 7 * nsize])
>   		);
>   
>   		/* P/Q data pages */
> @@ -1057,14 +1064,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
>   				      "vxor.vv	v28, v28, v30\n"
>   				      ".option	pop\n"
>   				      : :
> -				      [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> -				      [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> -				      [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
> -				      [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
> -				      [wd4]"r"(&dptr[z][d + 4 * NSIZE]),
> -				      [wd5]"r"(&dptr[z][d + 5 * NSIZE]),
> -				      [wd6]"r"(&dptr[z][d + 6 * NSIZE]),
> -				      [wd7]"r"(&dptr[z][d + 7 * NSIZE]),
> +				      [wd0]"r"(&dptr[z][d + 0 * nsize]),
> +				      [wd1]"r"(&dptr[z][d + 1 * nsize]),
> +				      [wd2]"r"(&dptr[z][d + 2 * nsize]),
> +				      [wd3]"r"(&dptr[z][d + 3 * nsize]),
> +				      [wd4]"r"(&dptr[z][d + 4 * nsize]),
> +				      [wd5]"r"(&dptr[z][d + 5 * nsize]),
> +				      [wd6]"r"(&dptr[z][d + 6 * nsize]),
> +				      [wd7]"r"(&dptr[z][d + 7 * nsize]),
>   				      [x1d]"r"(0x1d)
>   			);
>   		}
> @@ -1195,22 +1202,22 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
>   			      "vse8.v	v31, (%[wq7])\n"
>   			      ".option	pop\n"
>   			      : :
> -			      [wp0]"r"(&p[d + NSIZE * 0]),
> -			      [wq0]"r"(&q[d + NSIZE * 0]),
> -			      [wp1]"r"(&p[d + NSIZE * 1]),
> -			      [wq1]"r"(&q[d + NSIZE * 1]),
> -			      [wp2]"r"(&p[d + NSIZE * 2]),
> -			      [wq2]"r"(&q[d + NSIZE * 2]),
> -			      [wp3]"r"(&p[d + NSIZE * 3]),
> -			      [wq3]"r"(&q[d + NSIZE * 3]),
> -			      [wp4]"r"(&p[d + NSIZE * 4]),
> -			      [wq4]"r"(&q[d + NSIZE * 4]),
> -			      [wp5]"r"(&p[d + NSIZE * 5]),
> -			      [wq5]"r"(&q[d + NSIZE * 5]),
> -			      [wp6]"r"(&p[d + NSIZE * 6]),
> -			      [wq6]"r"(&q[d + NSIZE * 6]),
> -			      [wp7]"r"(&p[d + NSIZE * 7]),
> -			      [wq7]"r"(&q[d + NSIZE * 7])
> +			      [wp0]"r"(&p[d + nsize * 0]),
> +			      [wq0]"r"(&q[d + nsize * 0]),
> +			      [wp1]"r"(&p[d + nsize * 1]),
> +			      [wq1]"r"(&q[d + nsize * 1]),
> +			      [wp2]"r"(&p[d + nsize * 2]),
> +			      [wq2]"r"(&q[d + nsize * 2]),
> +			      [wp3]"r"(&p[d + nsize * 3]),
> +			      [wq3]"r"(&q[d + nsize * 3]),
> +			      [wp4]"r"(&p[d + nsize * 4]),
> +			      [wq4]"r"(&q[d + nsize * 4]),
> +			      [wp5]"r"(&p[d + nsize * 5]),
> +			      [wq5]"r"(&q[d + nsize * 5]),
> +			      [wp6]"r"(&p[d + nsize * 6]),
> +			      [wq6]"r"(&q[d + nsize * 6]),
> +			      [wp7]"r"(&p[d + nsize * 7]),
> +			      [wq7]"r"(&q[d + nsize * 7])
>   		);
>   	}
>   }
> diff --git a/lib/raid6/rvv.h b/lib/raid6/rvv.h
> index 94044a1b707b..6d0708a2c8a4 100644
> --- a/lib/raid6/rvv.h
> +++ b/lib/raid6/rvv.h
> @@ -7,6 +7,23 @@
>    * Definitions for RISC-V RAID-6 code
>    */
>   
> +#ifdef __KERNEL__
> +#include <asm/vector.h>
> +#else
> +#define kernel_vector_begin()
> +#define kernel_vector_end()
> +#include <sys/auxv.h>
> +#include <asm/hwcap.h>
> +#define has_vector() (getauxval(AT_HWCAP) & COMPAT_HWCAP_ISA_V)
> +#endif
> +
> +#include <linux/raid/pq.h>
> +
> +static int rvv_has_vector(void)
> +{
> +	return has_vector();
> +}
> +
>   #define RAID6_RVV_WRAPPER(_n)						\
>   	static void raid6_rvv ## _n ## _gen_syndrome(int disks,		\
>   					size_t bytes, void **ptrs)	\


Otherwise, looks good:

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex



* Re: [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace
  2025-07-17  7:04   ` Alexandre Ghiti
@ 2025-07-17  7:39     ` Chunyan Zhang
  0 siblings, 0 replies; 14+ messages in thread
From: Chunyan Zhang @ 2025-07-17  7:39 UTC (permalink / raw)
  To: Alexandre Ghiti
  Cc: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Charlie Jenkins, Song Liu, Yu Kuai, linux-riscv, linux-raid,
	linux-kernel

Hi Alex,

On Thu, 17 Jul 2025 at 15:04, Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> On 7/11/25 12:09, Chunyan Zhang wrote:
> > To support userspace raid6test, this patch adds a __KERNEL__ ifdef around
> > the kernel header inclusions, as well as userspace wrapper definitions, to
> > allow the code to be compiled in userspace.
> >
> > This patch also drops the NSIZE macro in favour of the runtime vector
> > length, which works for both kernel and user space.
> >
> > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> > ---
> >   lib/raid6/recov_rvv.c |   7 +-
> >   lib/raid6/rvv.c       | 297 +++++++++++++++++++++---------------------
> >   lib/raid6/rvv.h       |  17 +++
> >   3 files changed, 170 insertions(+), 151 deletions(-)
> >
> > diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c
> > index 500da521a806..8f2be833c015 100644
> > --- a/lib/raid6/recov_rvv.c
> > +++ b/lib/raid6/recov_rvv.c
> > @@ -4,13 +4,8 @@
> >    * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> >    */
> >
> > -#include <asm/vector.h>
> >   #include <linux/raid/pq.h>
> > -
> > -static int rvv_has_vector(void)
> > -{
> > -     return has_vector();
> > -}
> > +#include "rvv.h"
> >
> >   static void __raid6_2data_recov_rvv(int bytes, u8 *p, u8 *q, u8 *dp,
> >                                   u8 *dq, const u8 *pbmul,
> > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> > index 015f3ee4da25..75c9dafedb28 100644
> > --- a/lib/raid6/rvv.c
> > +++ b/lib/raid6/rvv.c
> > @@ -9,17 +9,8 @@
> >    *  Copyright 2002-2004 H. Peter Anvin
> >    */
> >
> > -#include <asm/vector.h>
> > -#include <linux/raid/pq.h>
> >   #include "rvv.h"
> >
> > -#define NSIZE        (riscv_v_vsize / 32) /* NSIZE = vlenb */
> > -
> > -static int rvv_has_vector(void)
> > -{
> > -     return has_vector();
> > -}
> > -
> >   #ifdef __riscv_vector
> >   #error "This code must be built without compiler support for vector"
> >   #endif
> > @@ -28,7 +19,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
> >   {
> >       u8 **dptr = (u8 **)ptrs;
> >       u8 *p, *q;
> > -     unsigned long vl, d;
> > +     unsigned long vl, d, nsize;
> >       int z, z0;
> >
> >       z0 = disks - 3;         /* Highest data disk */
> > @@ -42,8 +33,10 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                     : "=&r" (vl)
> >       );
> >
> > +     nsize = vl;
> > +
> >        /*v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 */
> > -     for (d = 0; d < bytes; d += NSIZE * 1) {
> > +     for (d = 0; d < bytes; d += nsize * 1) {
> >               /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
>
>
> You missed a few NSIZEs in the comments

These comments come from int.uc and neon.uc.
I left NSIZE in the comments on purpose; my thought was that keeping
them matched to int.uc and neon.uc would make this code easier to read
alongside those files :)
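
The userspace has_vector() fallback in the rvv.h hunk quoted above reduces to a single AT_HWCAP probe. A minimal sketch of that check follows; the HWCAP_ISA_V value is open-coded here (an assumption matching the RISC-V convention of encoding single-letter ISA extensions as `1 << (letter - 'A')`, which is what COMPAT_HWCAP_ISA_V in <asm/hwcap.h> expands to) so the sketch builds on any Linux target, not just RISC-V:

```c
#include <sys/auxv.h>

/*
 * Open-coded stand-in for COMPAT_HWCAP_ISA_V: RISC-V encodes
 * single-letter ISA extensions in AT_HWCAP as (1 << (letter - 'A')),
 * so the 'V' (vector) extension is bit 21.
 */
#define HWCAP_ISA_V (1UL << ('V' - 'A'))

/* Userspace analogue of the rvv_has_vector() wrapper in rvv.h. */
static int riscv_has_vector(void)
{
	/* getauxval() reads the ELF auxiliary vector set up by the kernel. */
	return (getauxval(AT_HWCAP) & HWCAP_ISA_V) != 0;
}
```

On a non-RISC-V host this compiles and runs but the bit has no meaning, which is why the real header only takes this path in the !__KERNEL__ case for the raid6 test build.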

>
>
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> > @@ -51,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                             "vmv.v.v  v1, v0\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> > +                           [wp0]"r"(&dptr[z0][d + 0 * nsize])
> >               );
> >
> >               for (z = z0 - 1 ; z >= 0 ; z--) {
> > @@ -75,7 +68,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                                     "vxor.vv  v0, v0, v2\n"
> >                                     ".option  pop\n"
> >                                     : :
> > -                                   [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> > +                                   [wd0]"r"(&dptr[z][d + 0 * nsize]),
> >                                     [x1d]"r"(0x1d)
> >                       );
> >               }
> > @@ -90,8 +83,8 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                             "vse8.v   v1, (%[wq0])\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&p[d + NSIZE * 0]),
> > -                           [wq0]"r"(&q[d + NSIZE * 0])
> > +                           [wp0]"r"(&p[d + nsize * 0]),
> > +                           [wq0]"r"(&q[d + nsize * 0])
> >               );
> >       }
> >   }
> > @@ -101,7 +94,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
> >   {
> >       u8 **dptr = (u8 **)ptrs;
> >       u8 *p, *q;
> > -     unsigned long vl, d;
> > +     unsigned long vl, d, nsize;
> >       int z, z0;
> >
> >       z0 = stop;              /* P/Q right side optimization */
> > @@ -115,8 +108,10 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
> >                     : "=&r" (vl)
> >       );
> >
> > +     nsize = vl;
> > +
> >       /*v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 */
> > -     for (d = 0 ; d < bytes ; d += NSIZE * 1) {
> > +     for (d = 0 ; d < bytes ; d += nsize * 1) {
> >               /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> > @@ -124,7 +119,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
> >                             "vmv.v.v  v1, v0\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> > +                           [wp0]"r"(&dptr[z0][d + 0 * nsize])
> >               );
> >
> >               /* P/Q data pages */
> > @@ -149,7 +144,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
> >                                     "vxor.vv  v0, v0, v2\n"
> >                                     ".option  pop\n"
> >                                     : :
> > -                                   [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> > +                                   [wd0]"r"(&dptr[z][d + 0 * nsize]),
> >                                     [x1d]"r"(0x1d)
> >                       );
> >               }
> > @@ -189,8 +184,8 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
> >                             "vse8.v   v3, (%[wq0])\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&p[d + NSIZE * 0]),
> > -                           [wq0]"r"(&q[d + NSIZE * 0])
> > +                           [wp0]"r"(&p[d + nsize * 0]),
> > +                           [wq0]"r"(&q[d + nsize * 0])
> >               );
> >       }
> >   }
> > @@ -199,7 +194,7 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
> >   {
> >       u8 **dptr = (u8 **)ptrs;
> >       u8 *p, *q;
> > -     unsigned long vl, d;
> > +     unsigned long vl, d, nsize;
> >       int z, z0;
> >
> >       z0 = disks - 3;         /* Highest data disk */
> > @@ -213,11 +208,13 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                     : "=&r" (vl)
> >       );
> >
> > +     nsize = vl;
> > +
> >       /*
> >        *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
> >        *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
> >        */
> > -     for (d = 0; d < bytes; d += NSIZE * 2) {
> > +     for (d = 0; d < bytes; d += nsize * 2) {
> >               /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> > @@ -227,8 +224,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                             "vmv.v.v  v5, v4\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > -                           [wp1]"r"(&dptr[z0][d + 1 * NSIZE])
> > +                           [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> > +                           [wp1]"r"(&dptr[z0][d + 1 * nsize])
> >               );
> >
> >               for (z = z0 - 1; z >= 0; z--) {
> > @@ -260,8 +257,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                                     "vxor.vv  v4, v4, v6\n"
> >                                     ".option  pop\n"
> >                                     : :
> > -                                   [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> > -                                   [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> > +                                   [wd0]"r"(&dptr[z][d + 0 * nsize]),
> > +                                   [wd1]"r"(&dptr[z][d + 1 * nsize]),
> >                                     [x1d]"r"(0x1d)
> >                       );
> >               }
> > @@ -278,10 +275,10 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                             "vse8.v   v5, (%[wq1])\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&p[d + NSIZE * 0]),
> > -                           [wq0]"r"(&q[d + NSIZE * 0]),
> > -                           [wp1]"r"(&p[d + NSIZE * 1]),
> > -                           [wq1]"r"(&q[d + NSIZE * 1])
> > +                           [wp0]"r"(&p[d + nsize * 0]),
> > +                           [wq0]"r"(&q[d + nsize * 0]),
> > +                           [wp1]"r"(&p[d + nsize * 1]),
> > +                           [wq1]"r"(&q[d + nsize * 1])
> >               );
> >       }
> >   }
> > @@ -291,7 +288,7 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
> >   {
> >       u8 **dptr = (u8 **)ptrs;
> >       u8 *p, *q;
> > -     unsigned long vl, d;
> > +     unsigned long vl, d, nsize;
> >       int z, z0;
> >
> >       z0 = stop;              /* P/Q right side optimization */
> > @@ -305,11 +302,13 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
> >                     : "=&r" (vl)
> >       );
> >
> > +     nsize = vl;
> > +
> >       /*
> >        *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
> >        *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
> >        */
> > -     for (d = 0; d < bytes; d += NSIZE * 2) {
> > +     for (d = 0; d < bytes; d += nsize * 2) {
> >                /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> > @@ -319,8 +318,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
> >                             "vmv.v.v  v5, v4\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > -                           [wp1]"r"(&dptr[z0][d + 1 * NSIZE])
> > +                           [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> > +                           [wp1]"r"(&dptr[z0][d + 1 * nsize])
> >               );
> >
> >               /* P/Q data pages */
> > @@ -353,8 +352,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
> >                                     "vxor.vv  v4, v4, v6\n"
> >                                     ".option  pop\n"
> >                                     : :
> > -                                   [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> > -                                   [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> > +                                   [wd0]"r"(&dptr[z][d + 0 * nsize]),
> > +                                   [wd1]"r"(&dptr[z][d + 1 * nsize]),
> >                                     [x1d]"r"(0x1d)
> >                       );
> >               }
> > @@ -407,10 +406,10 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
> >                             "vse8.v   v7, (%[wq1])\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&p[d + NSIZE * 0]),
> > -                           [wq0]"r"(&q[d + NSIZE * 0]),
> > -                           [wp1]"r"(&p[d + NSIZE * 1]),
> > -                           [wq1]"r"(&q[d + NSIZE * 1])
> > +                           [wp0]"r"(&p[d + nsize * 0]),
> > +                           [wq0]"r"(&q[d + nsize * 0]),
> > +                           [wp1]"r"(&p[d + nsize * 1]),
> > +                           [wq1]"r"(&q[d + nsize * 1])
> >               );
> >       }
> >   }
> > @@ -419,7 +418,7 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
> >   {
> >       u8 **dptr = (u8 **)ptrs;
> >       u8 *p, *q;
> > -     unsigned long vl, d;
> > +     unsigned long vl, d, nsize;
> >       int z, z0;
> >
> >       z0 = disks - 3; /* Highest data disk */
> > @@ -433,13 +432,15 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                     : "=&r" (vl)
> >       );
> >
> > +     nsize = vl;
> > +
> >       /*
> >        *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
> >        *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
> >        *v8:wp2,v9:wq2,v10:wd2/w22,v11:w12
> >        *v12:wp3,v13:wq3,v14:wd3/w23,v15:w13
> >        */
> > -     for (d = 0; d < bytes; d += NSIZE * 4) {
> > +     for (d = 0; d < bytes; d += nsize * 4) {
> >               /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> > @@ -453,10 +454,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                             "vmv.v.v  v13, v12\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > -                           [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
> > -                           [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
> > -                           [wp3]"r"(&dptr[z0][d + 3 * NSIZE])
> > +                           [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> > +                           [wp1]"r"(&dptr[z0][d + 1 * nsize]),
> > +                           [wp2]"r"(&dptr[z0][d + 2 * nsize]),
> > +                           [wp3]"r"(&dptr[z0][d + 3 * nsize])
> >               );
> >
> >               for (z = z0 - 1; z >= 0; z--) {
> > @@ -504,10 +505,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                                     "vxor.vv  v12, v12, v14\n"
> >                                     ".option  pop\n"
> >                                     : :
> > -                                   [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> > -                                   [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> > -                                   [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
> > -                                   [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
> > +                                   [wd0]"r"(&dptr[z][d + 0 * nsize]),
> > +                                   [wd1]"r"(&dptr[z][d + 1 * nsize]),
> > +                                   [wd2]"r"(&dptr[z][d + 2 * nsize]),
> > +                                   [wd3]"r"(&dptr[z][d + 3 * nsize]),
> >                                     [x1d]"r"(0x1d)
> >                       );
> >               }
> > @@ -528,14 +529,14 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                             "vse8.v   v13, (%[wq3])\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&p[d + NSIZE * 0]),
> > -                           [wq0]"r"(&q[d + NSIZE * 0]),
> > -                           [wp1]"r"(&p[d + NSIZE * 1]),
> > -                           [wq1]"r"(&q[d + NSIZE * 1]),
> > -                           [wp2]"r"(&p[d + NSIZE * 2]),
> > -                           [wq2]"r"(&q[d + NSIZE * 2]),
> > -                           [wp3]"r"(&p[d + NSIZE * 3]),
> > -                           [wq3]"r"(&q[d + NSIZE * 3])
> > +                           [wp0]"r"(&p[d + nsize * 0]),
> > +                           [wq0]"r"(&q[d + nsize * 0]),
> > +                           [wp1]"r"(&p[d + nsize * 1]),
> > +                           [wq1]"r"(&q[d + nsize * 1]),
> > +                           [wp2]"r"(&p[d + nsize * 2]),
> > +                           [wq2]"r"(&q[d + nsize * 2]),
> > +                           [wp3]"r"(&p[d + nsize * 3]),
> > +                           [wq3]"r"(&q[d + nsize * 3])
> >               );
> >       }
> >   }
> > @@ -545,7 +546,7 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
> >   {
> >       u8 **dptr = (u8 **)ptrs;
> >       u8 *p, *q;
> > -     unsigned long vl, d;
> > +     unsigned long vl, d, nsize;
> >       int z, z0;
> >
> >       z0 = stop;              /* P/Q right side optimization */
> > @@ -559,13 +560,15 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
> >                     : "=&r" (vl)
> >       );
> >
> > +     nsize = vl;
> > +
> >       /*
> >        *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
> >        *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
> >        *v8:wp2,v9:wq2,v10:wd2/w22,v11:w12
> >        *v12:wp3,v13:wq3,v14:wd3/w23,v15:w13
> >        */
> > -     for (d = 0; d < bytes; d += NSIZE * 4) {
> > +     for (d = 0; d < bytes; d += nsize * 4) {
> >                /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> > @@ -579,10 +582,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
> >                             "vmv.v.v  v13, v12\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > -                           [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
> > -                           [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
> > -                           [wp3]"r"(&dptr[z0][d + 3 * NSIZE])
> > +                           [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> > +                           [wp1]"r"(&dptr[z0][d + 1 * nsize]),
> > +                           [wp2]"r"(&dptr[z0][d + 2 * nsize]),
> > +                           [wp3]"r"(&dptr[z0][d + 3 * nsize])
> >               );
> >
> >               /* P/Q data pages */
> > @@ -631,10 +634,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
> >                                     "vxor.vv  v12, v12, v14\n"
> >                                     ".option  pop\n"
> >                                     : :
> > -                                   [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> > -                                   [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> > -                                   [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
> > -                                   [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
> > +                                   [wd0]"r"(&dptr[z][d + 0 * nsize]),
> > +                                   [wd1]"r"(&dptr[z][d + 1 * nsize]),
> > +                                   [wd2]"r"(&dptr[z][d + 2 * nsize]),
> > +                                   [wd3]"r"(&dptr[z][d + 3 * nsize]),
> >                                     [x1d]"r"(0x1d)
> >                       );
> >               }
> > @@ -713,14 +716,14 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
> >                             "vse8.v   v15, (%[wq3])\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&p[d + NSIZE * 0]),
> > -                           [wq0]"r"(&q[d + NSIZE * 0]),
> > -                           [wp1]"r"(&p[d + NSIZE * 1]),
> > -                           [wq1]"r"(&q[d + NSIZE * 1]),
> > -                           [wp2]"r"(&p[d + NSIZE * 2]),
> > -                           [wq2]"r"(&q[d + NSIZE * 2]),
> > -                           [wp3]"r"(&p[d + NSIZE * 3]),
> > -                           [wq3]"r"(&q[d + NSIZE * 3])
> > +                           [wp0]"r"(&p[d + nsize * 0]),
> > +                           [wq0]"r"(&q[d + nsize * 0]),
> > +                           [wp1]"r"(&p[d + nsize * 1]),
> > +                           [wq1]"r"(&q[d + nsize * 1]),
> > +                           [wp2]"r"(&p[d + nsize * 2]),
> > +                           [wq2]"r"(&q[d + nsize * 2]),
> > +                           [wp3]"r"(&p[d + nsize * 3]),
> > +                           [wq3]"r"(&q[d + nsize * 3])
> >               );
> >       }
> >   }
> > @@ -729,7 +732,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
> >   {
> >       u8 **dptr = (u8 **)ptrs;
> >       u8 *p, *q;
> > -     unsigned long vl, d;
> > +     unsigned long vl, d, nsize;
> >       int z, z0;
> >
> >       z0 = disks - 3; /* Highest data disk */
> > @@ -743,6 +746,8 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                     : "=&r" (vl)
> >       );
> >
> > +     nsize = vl;
> > +
> >       /*
> >        *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
> >        *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
> > @@ -753,7 +758,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
> >        *v24:wp6,v25:wq6,v26:wd6/w26,v27:w16
> >        *v28:wp7,v29:wq7,v30:wd7/w27,v31:w17
> >        */
> > -     for (d = 0; d < bytes; d += NSIZE * 8) {
> > +     for (d = 0; d < bytes; d += nsize * 8) {
> >               /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> > @@ -775,14 +780,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                             "vmv.v.v  v29, v28\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > -                           [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
> > -                           [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
> > -                           [wp3]"r"(&dptr[z0][d + 3 * NSIZE]),
> > -                           [wp4]"r"(&dptr[z0][d + 4 * NSIZE]),
> > -                           [wp5]"r"(&dptr[z0][d + 5 * NSIZE]),
> > -                           [wp6]"r"(&dptr[z0][d + 6 * NSIZE]),
> > -                           [wp7]"r"(&dptr[z0][d + 7 * NSIZE])
> > +                           [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> > +                           [wp1]"r"(&dptr[z0][d + 1 * nsize]),
> > +                           [wp2]"r"(&dptr[z0][d + 2 * nsize]),
> > +                           [wp3]"r"(&dptr[z0][d + 3 * nsize]),
> > +                           [wp4]"r"(&dptr[z0][d + 4 * nsize]),
> > +                           [wp5]"r"(&dptr[z0][d + 5 * nsize]),
> > +                           [wp6]"r"(&dptr[z0][d + 6 * nsize]),
> > +                           [wp7]"r"(&dptr[z0][d + 7 * nsize])
> >               );
> >
> >               for (z = z0 - 1; z >= 0; z--) {
> > @@ -862,14 +867,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                                     "vxor.vv  v28, v28, v30\n"
> >                                     ".option  pop\n"
> >                                     : :
> > -                                   [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> > -                                   [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> > -                                   [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
> > -                                   [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
> > -                                   [wd4]"r"(&dptr[z][d + 4 * NSIZE]),
> > -                                   [wd5]"r"(&dptr[z][d + 5 * NSIZE]),
> > -                                   [wd6]"r"(&dptr[z][d + 6 * NSIZE]),
> > -                                   [wd7]"r"(&dptr[z][d + 7 * NSIZE]),
> > +                                   [wd0]"r"(&dptr[z][d + 0 * nsize]),
> > +                                   [wd1]"r"(&dptr[z][d + 1 * nsize]),
> > +                                   [wd2]"r"(&dptr[z][d + 2 * nsize]),
> > +                                   [wd3]"r"(&dptr[z][d + 3 * nsize]),
> > +                                   [wd4]"r"(&dptr[z][d + 4 * nsize]),
> > +                                   [wd5]"r"(&dptr[z][d + 5 * nsize]),
> > +                                   [wd6]"r"(&dptr[z][d + 6 * nsize]),
> > +                                   [wd7]"r"(&dptr[z][d + 7 * nsize]),
> >                                     [x1d]"r"(0x1d)
> >                       );
> >               }
> > @@ -898,22 +903,22 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
> >                             "vse8.v   v29, (%[wq7])\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&p[d + NSIZE * 0]),
> > -                           [wq0]"r"(&q[d + NSIZE * 0]),
> > -                           [wp1]"r"(&p[d + NSIZE * 1]),
> > -                           [wq1]"r"(&q[d + NSIZE * 1]),
> > -                           [wp2]"r"(&p[d + NSIZE * 2]),
> > -                           [wq2]"r"(&q[d + NSIZE * 2]),
> > -                           [wp3]"r"(&p[d + NSIZE * 3]),
> > -                           [wq3]"r"(&q[d + NSIZE * 3]),
> > -                           [wp4]"r"(&p[d + NSIZE * 4]),
> > -                           [wq4]"r"(&q[d + NSIZE * 4]),
> > -                           [wp5]"r"(&p[d + NSIZE * 5]),
> > -                           [wq5]"r"(&q[d + NSIZE * 5]),
> > -                           [wp6]"r"(&p[d + NSIZE * 6]),
> > -                           [wq6]"r"(&q[d + NSIZE * 6]),
> > -                           [wp7]"r"(&p[d + NSIZE * 7]),
> > -                           [wq7]"r"(&q[d + NSIZE * 7])
> > +                           [wp0]"r"(&p[d + nsize * 0]),
> > +                           [wq0]"r"(&q[d + nsize * 0]),
> > +                           [wp1]"r"(&p[d + nsize * 1]),
> > +                           [wq1]"r"(&q[d + nsize * 1]),
> > +                           [wp2]"r"(&p[d + nsize * 2]),
> > +                           [wq2]"r"(&q[d + nsize * 2]),
> > +                           [wp3]"r"(&p[d + nsize * 3]),
> > +                           [wq3]"r"(&q[d + nsize * 3]),
> > +                           [wp4]"r"(&p[d + nsize * 4]),
> > +                           [wq4]"r"(&q[d + nsize * 4]),
> > +                           [wp5]"r"(&p[d + nsize * 5]),
> > +                           [wq5]"r"(&q[d + nsize * 5]),
> > +                           [wp6]"r"(&p[d + nsize * 6]),
> > +                           [wq6]"r"(&q[d + nsize * 6]),
> > +                           [wp7]"r"(&p[d + nsize * 7]),
> > +                           [wq7]"r"(&q[d + nsize * 7])
> >               );
> >       }
> >   }
> > @@ -923,7 +928,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
> >   {
> >       u8 **dptr = (u8 **)ptrs;
> >       u8 *p, *q;
> > -     unsigned long vl, d;
> > +     unsigned long vl, d, nsize;
> >       int z, z0;
> >
> >       z0 = stop;              /* P/Q right side optimization */
> > @@ -937,6 +942,8 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
> >                     : "=&r" (vl)
> >       );
> >
> > +     nsize = vl;
> > +
> >       /*
> >        *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10
> >        *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11
> > @@ -947,7 +954,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
> >        *v24:wp6,v25:wq6,v26:wd6/w26,v27:w16
> >        *v28:wp7,v29:wq7,v30:wd7/w27,v31:w17
> >        */
> > -     for (d = 0; d < bytes; d += NSIZE * 8) {
> > +     for (d = 0; d < bytes; d += nsize * 8) {
> >                /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
> >               asm volatile (".option  push\n"
> >                             ".option  arch,+v\n"
> > @@ -969,14 +976,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
> >                             "vmv.v.v  v29, v28\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > -                           [wp1]"r"(&dptr[z0][d + 1 * NSIZE]),
> > -                           [wp2]"r"(&dptr[z0][d + 2 * NSIZE]),
> > -                           [wp3]"r"(&dptr[z0][d + 3 * NSIZE]),
> > -                           [wp4]"r"(&dptr[z0][d + 4 * NSIZE]),
> > -                           [wp5]"r"(&dptr[z0][d + 5 * NSIZE]),
> > -                           [wp6]"r"(&dptr[z0][d + 6 * NSIZE]),
> > -                           [wp7]"r"(&dptr[z0][d + 7 * NSIZE])
> > +                           [wp0]"r"(&dptr[z0][d + 0 * nsize]),
> > +                           [wp1]"r"(&dptr[z0][d + 1 * nsize]),
> > +                           [wp2]"r"(&dptr[z0][d + 2 * nsize]),
> > +                           [wp3]"r"(&dptr[z0][d + 3 * nsize]),
> > +                           [wp4]"r"(&dptr[z0][d + 4 * nsize]),
> > +                           [wp5]"r"(&dptr[z0][d + 5 * nsize]),
> > +                           [wp6]"r"(&dptr[z0][d + 6 * nsize]),
> > +                           [wp7]"r"(&dptr[z0][d + 7 * nsize])
> >               );
> >
> >               /* P/Q data pages */
> > @@ -1057,14 +1064,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
> >                                     "vxor.vv  v28, v28, v30\n"
> >                                     ".option  pop\n"
> >                                     : :
> > -                                   [wd0]"r"(&dptr[z][d + 0 * NSIZE]),
> > -                                   [wd1]"r"(&dptr[z][d + 1 * NSIZE]),
> > -                                   [wd2]"r"(&dptr[z][d + 2 * NSIZE]),
> > -                                   [wd3]"r"(&dptr[z][d + 3 * NSIZE]),
> > -                                   [wd4]"r"(&dptr[z][d + 4 * NSIZE]),
> > -                                   [wd5]"r"(&dptr[z][d + 5 * NSIZE]),
> > -                                   [wd6]"r"(&dptr[z][d + 6 * NSIZE]),
> > -                                   [wd7]"r"(&dptr[z][d + 7 * NSIZE]),
> > +                                   [wd0]"r"(&dptr[z][d + 0 * nsize]),
> > +                                   [wd1]"r"(&dptr[z][d + 1 * nsize]),
> > +                                   [wd2]"r"(&dptr[z][d + 2 * nsize]),
> > +                                   [wd3]"r"(&dptr[z][d + 3 * nsize]),
> > +                                   [wd4]"r"(&dptr[z][d + 4 * nsize]),
> > +                                   [wd5]"r"(&dptr[z][d + 5 * nsize]),
> > +                                   [wd6]"r"(&dptr[z][d + 6 * nsize]),
> > +                                   [wd7]"r"(&dptr[z][d + 7 * nsize]),
> >                                     [x1d]"r"(0x1d)
> >                       );
> >               }
> > @@ -1195,22 +1202,22 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
> >                             "vse8.v   v31, (%[wq7])\n"
> >                             ".option  pop\n"
> >                             : :
> > -                           [wp0]"r"(&p[d + NSIZE * 0]),
> > -                           [wq0]"r"(&q[d + NSIZE * 0]),
> > -                           [wp1]"r"(&p[d + NSIZE * 1]),
> > -                           [wq1]"r"(&q[d + NSIZE * 1]),
> > -                           [wp2]"r"(&p[d + NSIZE * 2]),
> > -                           [wq2]"r"(&q[d + NSIZE * 2]),
> > -                           [wp3]"r"(&p[d + NSIZE * 3]),
> > -                           [wq3]"r"(&q[d + NSIZE * 3]),
> > -                           [wp4]"r"(&p[d + NSIZE * 4]),
> > -                           [wq4]"r"(&q[d + NSIZE * 4]),
> > -                           [wp5]"r"(&p[d + NSIZE * 5]),
> > -                           [wq5]"r"(&q[d + NSIZE * 5]),
> > -                           [wp6]"r"(&p[d + NSIZE * 6]),
> > -                           [wq6]"r"(&q[d + NSIZE * 6]),
> > -                           [wp7]"r"(&p[d + NSIZE * 7]),
> > -                           [wq7]"r"(&q[d + NSIZE * 7])
> > +                           [wp0]"r"(&p[d + nsize * 0]),
> > +                           [wq0]"r"(&q[d + nsize * 0]),
> > +                           [wp1]"r"(&p[d + nsize * 1]),
> > +                           [wq1]"r"(&q[d + nsize * 1]),
> > +                           [wp2]"r"(&p[d + nsize * 2]),
> > +                           [wq2]"r"(&q[d + nsize * 2]),
> > +                           [wp3]"r"(&p[d + nsize * 3]),
> > +                           [wq3]"r"(&q[d + nsize * 3]),
> > +                           [wp4]"r"(&p[d + nsize * 4]),
> > +                           [wq4]"r"(&q[d + nsize * 4]),
> > +                           [wp5]"r"(&p[d + nsize * 5]),
> > +                           [wq5]"r"(&q[d + nsize * 5]),
> > +                           [wp6]"r"(&p[d + nsize * 6]),
> > +                           [wq6]"r"(&q[d + nsize * 6]),
> > +                           [wp7]"r"(&p[d + nsize * 7]),
> > +                           [wq7]"r"(&q[d + nsize * 7])
> >               );
> >       }
> >   }
> > diff --git a/lib/raid6/rvv.h b/lib/raid6/rvv.h
> > index 94044a1b707b..6d0708a2c8a4 100644
> > --- a/lib/raid6/rvv.h
> > +++ b/lib/raid6/rvv.h
> > @@ -7,6 +7,23 @@
> >    * Definitions for RISC-V RAID-6 code
> >    */
> >
> > +#ifdef __KERNEL__
> > +#include <asm/vector.h>
> > +#else
> > +#define kernel_vector_begin()
> > +#define kernel_vector_end()
> > +#include <sys/auxv.h>
> > +#include <asm/hwcap.h>
> > +#define has_vector() (getauxval(AT_HWCAP) & COMPAT_HWCAP_ISA_V)
> > +#endif
> > +
> > +#include <linux/raid/pq.h>
> > +
> > +static int rvv_has_vector(void)
> > +{
> > +     return has_vector();
> > +}
> > +
> >   #define RAID6_RVV_WRAPPER(_n)                                               \
> >       static void raid6_rvv ## _n ## _gen_syndrome(int disks,         \
> >                                       size_t bytes, void **ptrs)      \
>
>
> Otherwise, looks good:
>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,
Chunyan


* Re: [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion
  2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang
  2025-07-16 13:38   ` Alexandre Ghiti
@ 2025-07-21  7:52   ` Nutty Liu
  1 sibling, 0 replies; 14+ messages in thread
From: Nutty Liu @ 2025-07-21  7:52 UTC (permalink / raw)
  To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	Alexandre Ghiti, Charlie Jenkins, Song Liu, Yu Kuai
  Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang

On 7/11/2025 6:09 PM, Chunyan Zhang wrote:
> These two C files don't reference things defined in simd.h or types.h
> so remove these redundant #inclusions.
>
> Fixes: 6093faaf9593 ("raid6: Add RISC-V SIMD syndrome and recovery calculations")
> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> ---
>   lib/raid6/recov_rvv.c | 2 --
>   lib/raid6/rvv.c       | 3 ---
>   2 files changed, 5 deletions(-)

Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>

Thanks,
Nutty
> diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c
> index f29303795ccf..500da521a806 100644
> --- a/lib/raid6/recov_rvv.c
> +++ b/lib/raid6/recov_rvv.c
> @@ -4,9 +4,7 @@
>    * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
>    */
>   
> -#include <asm/simd.h>
>   #include <asm/vector.h>
> -#include <crypto/internal/simd.h>
>   #include <linux/raid/pq.h>
>   
>   static int rvv_has_vector(void)
> diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> index 7d82efa5b14f..b193ea176d5d 100644
> --- a/lib/raid6/rvv.c
> +++ b/lib/raid6/rvv.c
> @@ -9,11 +9,8 @@
>    *	Copyright 2002-2004 H. Peter Anvin
>    */
>   
> -#include <asm/simd.h>
>   #include <asm/vector.h>
> -#include <crypto/internal/simd.h>
>   #include <linux/raid/pq.h>
> -#include <linux/types.h>
>   #include "rvv.h"
>   
>   #define NSIZE	(riscv_v_vsize / 32) /* NSIZE = vlenb */



Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang
2025-07-16 13:38   ` Alexandre Ghiti
2025-07-21  7:52   ` Nutty Liu
2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation Chunyan Zhang
2025-07-16 13:40   ` Alexandre Ghiti
2025-07-17  2:16     ` Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang
2025-07-16 13:43   ` Alexandre Ghiti
2025-07-17  3:16     ` Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang
2025-07-17  7:04   ` Alexandre Ghiti
2025-07-17  7:39     ` Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 5/5] raid6: test: Add support for RISC-V Chunyan Zhang
