* [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support @ 2025-07-11 10:09 Chunyan Zhang 2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang ` (4 more replies) 0 siblings, 5 replies; 14+ messages in thread From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang The 1st patch is a cleanup; patch 2 is an optimization that takes Palmer's suggestion; patch 3 adds a build-time guard; the last two patches add raid6test support and make the raid6 RVV code buildable in user space. V2: * Addressed comments from v1: - Replaced one load with a move to speed up _gen/xor_syndrome(); - Added a compiler error; - Dropped the NSIZE macro in favor of the runtime vector length; - Modified has_vector() definition for user space; Chunyan Zhang (5): raid6: riscv: Clean up unused header file inclusion raid6: riscv: replace one load with a move to speed up the caculation raid6: riscv: Add a compiler error raid6: riscv: Allow code to be compiled in userspace raid6: test: Add support for RISC-V lib/raid6/recov_rvv.c | 9 +- lib/raid6/rvv.c | 362 ++++++++++++++++++++-------------------- lib/raid6/rvv.h | 17 ++ lib/raid6/test/Makefile | 8 + 4 files changed, 211 insertions(+), 185 deletions(-) -- 2.34.1 ^ permalink raw reply [flat|nested] 14+ messages in thread
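[Editor's note: for readers outside lib/raid6, the computation the RVV routines in this series vectorize is the standard RAID-6 P/Q syndrome over GF(2^8) with the 0x1d reduction polynomial (the "x1d" constant in the asm). Below is a minimal scalar sketch of that recurrence — illustrative only, not the kernel's implementation; function names are invented for this sketch.]

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply by x in GF(2^8) with the 0x1d reduction polynomial used by
 * the Linux raid6 code (the "x1d" constant in the RVV asm). */
static uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0x00));
}

/*
 * Scalar model of gen_syndrome(): ptrs[0..disks-3] are data disks,
 * ptrs[disks-2] is P, ptrs[disks-1] is Q.  P is the plain XOR of the
 * data; Q accumulates sum(g^z * D_z) over GF(2^8) by walking the data
 * disks from highest to lowest, doubling as it goes.  The RVV routines
 * run the same recurrence on vlenb-wide chunks instead of single bytes.
 */
static void gen_syndrome_scalar(int disks, size_t bytes, uint8_t **ptrs)
{
	uint8_t *p = ptrs[disks - 2];
	uint8_t *q = ptrs[disks - 1];
	int z0 = disks - 3;		/* highest data disk */

	for (size_t d = 0; d < bytes; d++) {
		/* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */
		uint8_t wp = ptrs[z0][d];
		uint8_t wq = wp;

		for (int z = z0 - 1; z >= 0; z--) {
			wq = (uint8_t)(gf_mul2(wq) ^ ptrs[z][d]);
			wp ^= ptrs[z][d];
		}
		p[d] = wp;
		q[d] = wq;
	}
}
```

The `wq = wp = ...` first step is the line patch 2 targets: both working registers start from the same memory word.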
* [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion 2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang @ 2025-07-11 10:09 ` Chunyan Zhang 2025-07-16 13:38 ` Alexandre Ghiti 2025-07-21 7:52 ` Nutty Liu 2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation Chunyan Zhang ` (3 subsequent siblings) 4 siblings, 2 replies; 14+ messages in thread From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang These two C files don't reference things defined in simd.h or types.h so remove these redundant #inclusions. Fixes: 6093faaf9593 ("raid6: Add RISC-V SIMD syndrome and recovery calculations") Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> --- lib/raid6/recov_rvv.c | 2 -- lib/raid6/rvv.c | 3 --- 2 files changed, 5 deletions(-) diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c index f29303795ccf..500da521a806 100644 --- a/lib/raid6/recov_rvv.c +++ b/lib/raid6/recov_rvv.c @@ -4,9 +4,7 @@ * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn> */ -#include <asm/simd.h> #include <asm/vector.h> -#include <crypto/internal/simd.h> #include <linux/raid/pq.h> static int rvv_has_vector(void) diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c index 7d82efa5b14f..b193ea176d5d 100644 --- a/lib/raid6/rvv.c +++ b/lib/raid6/rvv.c @@ -9,11 +9,8 @@ * Copyright 2002-2004 H. Peter Anvin */ -#include <asm/simd.h> #include <asm/vector.h> -#include <crypto/internal/simd.h> #include <linux/raid/pq.h> -#include <linux/types.h> #include "rvv.h" #define NSIZE (riscv_v_vsize / 32) /* NSIZE = vlenb */ -- 2.34.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion 2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang @ 2025-07-16 13:38 ` Alexandre Ghiti 2025-07-21 7:52 ` Nutty Liu 1 sibling, 0 replies; 14+ messages in thread From: Alexandre Ghiti @ 2025-07-16 13:38 UTC (permalink / raw) To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang Hi Chunyan, On 7/11/25 12:09, Chunyan Zhang wrote: > These two C files don't reference things defined in simd.h or types.h > so remove these redundant #inclusions. > > Fixes: 6093faaf9593 ("raid6: Add RISC-V SIMD syndrome and recovery calculations") > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> > --- > lib/raid6/recov_rvv.c | 2 -- > lib/raid6/rvv.c | 3 --- > 2 files changed, 5 deletions(-) > > diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c > index f29303795ccf..500da521a806 100644 > --- a/lib/raid6/recov_rvv.c > +++ b/lib/raid6/recov_rvv.c > @@ -4,9 +4,7 @@ > * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn> > */ > > -#include <asm/simd.h> > #include <asm/vector.h> > -#include <crypto/internal/simd.h> > #include <linux/raid/pq.h> > > static int rvv_has_vector(void) > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c > index 7d82efa5b14f..b193ea176d5d 100644 > --- a/lib/raid6/rvv.c > +++ b/lib/raid6/rvv.c > @@ -9,11 +9,8 @@ > * Copyright 2002-2004 H. Peter Anvin > */ > > -#include <asm/simd.h> > #include <asm/vector.h> > -#include <crypto/internal/simd.h> > #include <linux/raid/pq.h> > -#include <linux/types.h> > #include "rvv.h" > > #define NSIZE (riscv_v_vsize / 32) /* NSIZE = vlenb */ Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Thanks, Alex ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion 2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang 2025-07-16 13:38 ` Alexandre Ghiti @ 2025-07-21 7:52 ` Nutty Liu 1 sibling, 0 replies; 14+ messages in thread From: Nutty Liu @ 2025-07-21 7:52 UTC (permalink / raw) To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang On 7/11/2025 6:09 PM, Chunyan Zhang wrote: > These two C files don't reference things defined in simd.h or types.h > so remove these redundant #inclusions. > > Fixes: 6093faaf9593 ("raid6: Add RISC-V SIMD syndrome and recovery calculations") > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> > --- > lib/raid6/recov_rvv.c | 2 -- > lib/raid6/rvv.c | 3 --- > 2 files changed, 5 deletions(-) Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com> Thanks, Nutty > diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c > index f29303795ccf..500da521a806 100644 > --- a/lib/raid6/recov_rvv.c > +++ b/lib/raid6/recov_rvv.c > @@ -4,9 +4,7 @@ > * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn> > */ > > -#include <asm/simd.h> > #include <asm/vector.h> > -#include <crypto/internal/simd.h> > #include <linux/raid/pq.h> > > static int rvv_has_vector(void) > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c > index 7d82efa5b14f..b193ea176d5d 100644 > --- a/lib/raid6/rvv.c > +++ b/lib/raid6/rvv.c > @@ -9,11 +9,8 @@ > * Copyright 2002-2004 H. Peter Anvin > */ > > -#include <asm/simd.h> > #include <asm/vector.h> > -#include <crypto/internal/simd.h> > #include <linux/raid/pq.h> > -#include <linux/types.h> > #include "rvv.h" > > #define NSIZE (riscv_v_vsize / 32) /* NSIZE = vlenb */ ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation 2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang 2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang @ 2025-07-11 10:09 ` Chunyan Zhang 2025-07-16 13:40 ` Alexandre Ghiti 2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang ` (2 subsequent siblings) 4 siblings, 1 reply; 14+ messages in thread From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang Since wp$$==wq$$, it doesn't need to load the same data twice, use move instruction to replace one of the loads to let the program run faster. Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> --- lib/raid6/rvv.c | 60 ++++++++++++++++++++++++------------------------- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c index b193ea176d5d..89da5fc247aa 100644 --- a/lib/raid6/rvv.c +++ b/lib/raid6/rvv.c @@ -44,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** asm volatile (".option push\n" ".option arch,+v\n" "vle8.v v0, (%[wp0])\n" - "vle8.v v1, (%[wp0])\n" + "vmv.v.v v1, v0\n" ".option pop\n" : : [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) @@ -117,7 +117,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, asm volatile (".option push\n" ".option arch,+v\n" "vle8.v v0, (%[wp0])\n" - "vle8.v v1, (%[wp0])\n" + "vmv.v.v v1, v0\n" ".option pop\n" : : [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) @@ -218,9 +218,9 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** asm volatile (".option push\n" ".option arch,+v\n" "vle8.v v0, (%[wp0])\n" - "vle8.v v1, (%[wp0])\n" + "vmv.v.v v1, v0\n" "vle8.v v4, (%[wp1])\n" - "vle8.v v5, (%[wp1])\n" 
+ "vmv.v.v v5, v4\n" ".option pop\n" : : [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), @@ -310,9 +310,9 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, asm volatile (".option push\n" ".option arch,+v\n" "vle8.v v0, (%[wp0])\n" - "vle8.v v1, (%[wp0])\n" + "vmv.v.v v1, v0\n" "vle8.v v4, (%[wp1])\n" - "vle8.v v5, (%[wp1])\n" + "vmv.v.v v5, v4\n" ".option pop\n" : : [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), @@ -440,13 +440,13 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** asm volatile (".option push\n" ".option arch,+v\n" "vle8.v v0, (%[wp0])\n" - "vle8.v v1, (%[wp0])\n" + "vmv.v.v v1, v0\n" "vle8.v v4, (%[wp1])\n" - "vle8.v v5, (%[wp1])\n" + "vmv.v.v v5, v4\n" "vle8.v v8, (%[wp2])\n" - "vle8.v v9, (%[wp2])\n" + "vmv.v.v v9, v8\n" "vle8.v v12, (%[wp3])\n" - "vle8.v v13, (%[wp3])\n" + "vmv.v.v v13, v12\n" ".option pop\n" : : [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), @@ -566,13 +566,13 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, asm volatile (".option push\n" ".option arch,+v\n" "vle8.v v0, (%[wp0])\n" - "vle8.v v1, (%[wp0])\n" + "vmv.v.v v1, v0\n" "vle8.v v4, (%[wp1])\n" - "vle8.v v5, (%[wp1])\n" + "vmv.v.v v5, v4\n" "vle8.v v8, (%[wp2])\n" - "vle8.v v9, (%[wp2])\n" + "vmv.v.v v9, v8\n" "vle8.v v12, (%[wp3])\n" - "vle8.v v13, (%[wp3])\n" + "vmv.v.v v13, v12\n" ".option pop\n" : : [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), @@ -754,21 +754,21 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** asm volatile (".option push\n" ".option arch,+v\n" "vle8.v v0, (%[wp0])\n" - "vle8.v v1, (%[wp0])\n" + "vmv.v.v v1, v0\n" "vle8.v v4, (%[wp1])\n" - "vle8.v v5, (%[wp1])\n" + "vmv.v.v v5, v4\n" "vle8.v v8, (%[wp2])\n" - "vle8.v v9, (%[wp2])\n" + "vmv.v.v v9, v8\n" "vle8.v v12, (%[wp3])\n" - "vle8.v v13, (%[wp3])\n" + "vmv.v.v v13, v12\n" "vle8.v v16, (%[wp4])\n" - "vle8.v v17, (%[wp4])\n" + "vmv.v.v v17, v16\n" "vle8.v v20, (%[wp5])\n" - "vle8.v v21, (%[wp5])\n" + "vmv.v.v v21, v20\n" 
"vle8.v v24, (%[wp6])\n" - "vle8.v v25, (%[wp6])\n" + "vmv.v.v v25, v24\n" "vle8.v v28, (%[wp7])\n" - "vle8.v v29, (%[wp7])\n" + "vmv.v.v v29, v28\n" ".option pop\n" : : [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), @@ -948,21 +948,21 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, asm volatile (".option push\n" ".option arch,+v\n" "vle8.v v0, (%[wp0])\n" - "vle8.v v1, (%[wp0])\n" + "vmv.v.v v1, v0\n" "vle8.v v4, (%[wp1])\n" - "vle8.v v5, (%[wp1])\n" + "vmv.v.v v5, v4\n" "vle8.v v8, (%[wp2])\n" - "vle8.v v9, (%[wp2])\n" + "vmv.v.v v9, v8\n" "vle8.v v12, (%[wp3])\n" - "vle8.v v13, (%[wp3])\n" + "vmv.v.v v13, v12\n" "vle8.v v16, (%[wp4])\n" - "vle8.v v17, (%[wp4])\n" + "vmv.v.v v17, v16\n" "vle8.v v20, (%[wp5])\n" - "vle8.v v21, (%[wp5])\n" + "vmv.v.v v21, v20\n" "vle8.v v24, (%[wp6])\n" - "vle8.v v25, (%[wp6])\n" + "vmv.v.v v25, v24\n" "vle8.v v28, (%[wp7])\n" - "vle8.v v29, (%[wp7])\n" + "vmv.v.v v29, v28\n" ".option pop\n" : : [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), -- 2.34.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
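[Editor's note: the transformation in this patch is semantics-preserving because both destination registers were initialized from the same address, so the second `vle8.v` can become a register-to-register `vmv.v.v`. A scalar C model of the before/after forms — helper names are invented for illustration, with `memcpy` standing in for the vector load/move:]

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Before: wp and wq are each filled by loading the same memory
 * (two vle8.v from the same wp0 address). */
static void init_wp_wq_before(const uint8_t *src, uint8_t *wp,
			      uint8_t *wq, size_t vl)
{
	memcpy(wp, src, vl);	/* vle8.v v0, (%[wp0]) */
	memcpy(wq, src, vl);	/* vle8.v v1, (%[wp0])  -- second load */
}

/* After: one load plus a register copy, avoiding the second memory
 * access.  The final contents of wp and wq are identical either way. */
static void init_wp_wq_after(const uint8_t *src, uint8_t *wp,
			     uint8_t *wq, size_t vl)
{
	memcpy(wp, src, vl);	/* vle8.v v0, (%[wp0]) */
	memcpy(wq, wp, vl);	/* vmv.v.v v1, v0 */
}
```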
* Re: [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation 2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation Chunyan Zhang @ 2025-07-16 13:40 ` Alexandre Ghiti 2025-07-17 2:16 ` Chunyan Zhang 0 siblings, 1 reply; 14+ messages in thread From: Alexandre Ghiti @ 2025-07-16 13:40 UTC (permalink / raw) To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang On 7/11/25 12:09, Chunyan Zhang wrote: > Since wp$$==wq$$, it doesn't need to load the same data twice, use move > instruction to replace one of the loads to let the program run faster. > > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> > --- > lib/raid6/rvv.c | 60 ++++++++++++++++++++++++------------------------- > 1 file changed, 30 insertions(+), 30 deletions(-) > > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c > index b193ea176d5d..89da5fc247aa 100644 > --- a/lib/raid6/rvv.c > +++ b/lib/raid6/rvv.c > @@ -44,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > asm volatile (".option push\n" > ".option arch,+v\n" > "vle8.v v0, (%[wp0])\n" > - "vle8.v v1, (%[wp0])\n" > + "vmv.v.v v1, v0\n" > ".option pop\n" > : : > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) > @@ -117,7 +117,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > asm volatile (".option push\n" > ".option arch,+v\n" > "vle8.v v0, (%[wp0])\n" > - "vle8.v v1, (%[wp0])\n" > + "vmv.v.v v1, v0\n" > ".option pop\n" > : : > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) > @@ -218,9 +218,9 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > asm volatile (".option push\n" > ".option arch,+v\n" > "vle8.v v0, (%[wp0])\n" > - "vle8.v v1, (%[wp0])\n" > + "vmv.v.v v1, v0\n" > "vle8.v v4, (%[wp1])\n" > - "vle8.v v5, (%[wp1])\n" > + "vmv.v.v v5, v4\n" > ".option pop\n" > : : > [wp0]"r"(&dptr[z0][d + 
0 * NSIZE]), > @@ -310,9 +310,9 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > asm volatile (".option push\n" > ".option arch,+v\n" > "vle8.v v0, (%[wp0])\n" > - "vle8.v v1, (%[wp0])\n" > + "vmv.v.v v1, v0\n" > "vle8.v v4, (%[wp1])\n" > - "vle8.v v5, (%[wp1])\n" > + "vmv.v.v v5, v4\n" > ".option pop\n" > : : > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > @@ -440,13 +440,13 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > asm volatile (".option push\n" > ".option arch,+v\n" > "vle8.v v0, (%[wp0])\n" > - "vle8.v v1, (%[wp0])\n" > + "vmv.v.v v1, v0\n" > "vle8.v v4, (%[wp1])\n" > - "vle8.v v5, (%[wp1])\n" > + "vmv.v.v v5, v4\n" > "vle8.v v8, (%[wp2])\n" > - "vle8.v v9, (%[wp2])\n" > + "vmv.v.v v9, v8\n" > "vle8.v v12, (%[wp3])\n" > - "vle8.v v13, (%[wp3])\n" > + "vmv.v.v v13, v12\n" > ".option pop\n" > : : > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > @@ -566,13 +566,13 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > asm volatile (".option push\n" > ".option arch,+v\n" > "vle8.v v0, (%[wp0])\n" > - "vle8.v v1, (%[wp0])\n" > + "vmv.v.v v1, v0\n" > "vle8.v v4, (%[wp1])\n" > - "vle8.v v5, (%[wp1])\n" > + "vmv.v.v v5, v4\n" > "vle8.v v8, (%[wp2])\n" > - "vle8.v v9, (%[wp2])\n" > + "vmv.v.v v9, v8\n" > "vle8.v v12, (%[wp3])\n" > - "vle8.v v13, (%[wp3])\n" > + "vmv.v.v v13, v12\n" > ".option pop\n" > : : > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > @@ -754,21 +754,21 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > asm volatile (".option push\n" > ".option arch,+v\n" > "vle8.v v0, (%[wp0])\n" > - "vle8.v v1, (%[wp0])\n" > + "vmv.v.v v1, v0\n" > "vle8.v v4, (%[wp1])\n" > - "vle8.v v5, (%[wp1])\n" > + "vmv.v.v v5, v4\n" > "vle8.v v8, (%[wp2])\n" > - "vle8.v v9, (%[wp2])\n" > + "vmv.v.v v9, v8\n" > "vle8.v v12, (%[wp3])\n" > - "vle8.v v13, (%[wp3])\n" > + "vmv.v.v v13, v12\n" > "vle8.v v16, (%[wp4])\n" > - "vle8.v v17, (%[wp4])\n" > + "vmv.v.v v17, v16\n" > 
"vle8.v v20, (%[wp5])\n" > - "vle8.v v21, (%[wp5])\n" > + "vmv.v.v v21, v20\n" > "vle8.v v24, (%[wp6])\n" > - "vle8.v v25, (%[wp6])\n" > + "vmv.v.v v25, v24\n" > "vle8.v v28, (%[wp7])\n" > - "vle8.v v29, (%[wp7])\n" > + "vmv.v.v v29, v28\n" > ".option pop\n" > : : > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > @@ -948,21 +948,21 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > asm volatile (".option push\n" > ".option arch,+v\n" > "vle8.v v0, (%[wp0])\n" > - "vle8.v v1, (%[wp0])\n" > + "vmv.v.v v1, v0\n" > "vle8.v v4, (%[wp1])\n" > - "vle8.v v5, (%[wp1])\n" > + "vmv.v.v v5, v4\n" > "vle8.v v8, (%[wp2])\n" > - "vle8.v v9, (%[wp2])\n" > + "vmv.v.v v9, v8\n" > "vle8.v v12, (%[wp3])\n" > - "vle8.v v13, (%[wp3])\n" > + "vmv.v.v v13, v12\n" > "vle8.v v16, (%[wp4])\n" > - "vle8.v v17, (%[wp4])\n" > + "vmv.v.v v17, v16\n" > "vle8.v v20, (%[wp5])\n" > - "vle8.v v21, (%[wp5])\n" > + "vmv.v.v v21, v20\n" > "vle8.v v24, (%[wp6])\n" > - "vle8.v v25, (%[wp6])\n" > + "vmv.v.v v25, v24\n" > "vle8.v v28, (%[wp7])\n" > - "vle8.v v29, (%[wp7])\n" > + "vmv.v.v v29, v28\n" > ".option pop\n" > : : > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), Out of curiosity, did you notice a gain? Anyway: Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Thanks, Alex ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation 2025-07-16 13:40 ` Alexandre Ghiti @ 2025-07-17 2:16 ` Chunyan Zhang 0 siblings, 0 replies; 14+ messages in thread From: Chunyan Zhang @ 2025-07-17 2:16 UTC (permalink / raw) To: Alexandre Ghiti Cc: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins, Song Liu, Yu Kuai, linux-riscv, linux-raid, linux-kernel On Wed, 16 Jul 2025 at 21:40, Alexandre Ghiti <alex@ghiti.fr> wrote: > > On 7/11/25 12:09, Chunyan Zhang wrote: > > Since wp$$==wq$$, it doesn't need to load the same data twice, use move > > instruction to replace one of the loads to let the program run faster. > > > > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> > > --- > > lib/raid6/rvv.c | 60 ++++++++++++++++++++++++------------------------- > > 1 file changed, 30 insertions(+), 30 deletions(-) > > > > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c > > index b193ea176d5d..89da5fc247aa 100644 > > --- a/lib/raid6/rvv.c > > +++ b/lib/raid6/rvv.c > > @@ -44,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > > asm volatile (".option push\n" > > ".option arch,+v\n" > > "vle8.v v0, (%[wp0])\n" > > - "vle8.v v1, (%[wp0])\n" > > + "vmv.v.v v1, v0\n" > > ".option pop\n" > > : : > > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) > > @@ -117,7 +117,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > > asm volatile (".option push\n" > > ".option arch,+v\n" > > "vle8.v v0, (%[wp0])\n" > > - "vle8.v v1, (%[wp0])\n" > > + "vmv.v.v v1, v0\n" > > ".option pop\n" > > : : > > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) > > @@ -218,9 +218,9 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > > asm volatile (".option push\n" > > ".option arch,+v\n" > > "vle8.v v0, (%[wp0])\n" > > - "vle8.v v1, (%[wp0])\n" > > + "vmv.v.v v1, v0\n" > > "vle8.v v4, (%[wp1])\n" > > - "vle8.v v5, (%[wp1])\n" > > + "vmv.v.v v5, v4\n" > > ".option 
pop\n" > > : : > > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > @@ -310,9 +310,9 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > > asm volatile (".option push\n" > > ".option arch,+v\n" > > "vle8.v v0, (%[wp0])\n" > > - "vle8.v v1, (%[wp0])\n" > > + "vmv.v.v v1, v0\n" > > "vle8.v v4, (%[wp1])\n" > > - "vle8.v v5, (%[wp1])\n" > > + "vmv.v.v v5, v4\n" > > ".option pop\n" > > : : > > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > @@ -440,13 +440,13 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > > asm volatile (".option push\n" > > ".option arch,+v\n" > > "vle8.v v0, (%[wp0])\n" > > - "vle8.v v1, (%[wp0])\n" > > + "vmv.v.v v1, v0\n" > > "vle8.v v4, (%[wp1])\n" > > - "vle8.v v5, (%[wp1])\n" > > + "vmv.v.v v5, v4\n" > > "vle8.v v8, (%[wp2])\n" > > - "vle8.v v9, (%[wp2])\n" > > + "vmv.v.v v9, v8\n" > > "vle8.v v12, (%[wp3])\n" > > - "vle8.v v13, (%[wp3])\n" > > + "vmv.v.v v13, v12\n" > > ".option pop\n" > > : : > > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > @@ -566,13 +566,13 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > > asm volatile (".option push\n" > > ".option arch,+v\n" > > "vle8.v v0, (%[wp0])\n" > > - "vle8.v v1, (%[wp0])\n" > > + "vmv.v.v v1, v0\n" > > "vle8.v v4, (%[wp1])\n" > > - "vle8.v v5, (%[wp1])\n" > > + "vmv.v.v v5, v4\n" > > "vle8.v v8, (%[wp2])\n" > > - "vle8.v v9, (%[wp2])\n" > > + "vmv.v.v v9, v8\n" > > "vle8.v v12, (%[wp3])\n" > > - "vle8.v v13, (%[wp3])\n" > > + "vmv.v.v v13, v12\n" > > ".option pop\n" > > : : > > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > @@ -754,21 +754,21 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > > asm volatile (".option push\n" > > ".option arch,+v\n" > > "vle8.v v0, (%[wp0])\n" > > - "vle8.v v1, (%[wp0])\n" > > + "vmv.v.v v1, v0\n" > > "vle8.v v4, (%[wp1])\n" > > - "vle8.v v5, (%[wp1])\n" > > + "vmv.v.v v5, v4\n" > > "vle8.v v8, (%[wp2])\n" > > - "vle8.v v9, (%[wp2])\n" > > + "vmv.v.v v9, v8\n" > > 
"vle8.v v12, (%[wp3])\n" > > - "vle8.v v13, (%[wp3])\n" > > + "vmv.v.v v13, v12\n" > > "vle8.v v16, (%[wp4])\n" > > - "vle8.v v17, (%[wp4])\n" > > + "vmv.v.v v17, v16\n" > > "vle8.v v20, (%[wp5])\n" > > - "vle8.v v21, (%[wp5])\n" > > + "vmv.v.v v21, v20\n" > > "vle8.v v24, (%[wp6])\n" > > - "vle8.v v25, (%[wp6])\n" > > + "vmv.v.v v25, v24\n" > > "vle8.v v28, (%[wp7])\n" > > - "vle8.v v29, (%[wp7])\n" > > + "vmv.v.v v29, v28\n" > > ".option pop\n" > > : : > > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > @@ -948,21 +948,21 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > > asm volatile (".option push\n" > > ".option arch,+v\n" > > "vle8.v v0, (%[wp0])\n" > > - "vle8.v v1, (%[wp0])\n" > > + "vmv.v.v v1, v0\n" > > "vle8.v v4, (%[wp1])\n" > > - "vle8.v v5, (%[wp1])\n" > > + "vmv.v.v v5, v4\n" > > "vle8.v v8, (%[wp2])\n" > > - "vle8.v v9, (%[wp2])\n" > > + "vmv.v.v v9, v8\n" > > "vle8.v v12, (%[wp3])\n" > > - "vle8.v v13, (%[wp3])\n" > > + "vmv.v.v v13, v12\n" > > "vle8.v v16, (%[wp4])\n" > > - "vle8.v v17, (%[wp4])\n" > > + "vmv.v.v v17, v16\n" > > "vle8.v v20, (%[wp5])\n" > > - "vle8.v v21, (%[wp5])\n" > > + "vmv.v.v v21, v20\n" > > "vle8.v v24, (%[wp6])\n" > > - "vle8.v v25, (%[wp6])\n" > > + "vmv.v.v v25, v24\n" > > "vle8.v v28, (%[wp7])\n" > > - "vle8.v v29, (%[wp7])\n" > > + "vmv.v.v v29, v28\n" > > ".option pop\n" > > : : > > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > > Out of curiosity, did you notice a gain? Yes, I can see ~3% gain on my BPI-F3. > > Anyway: > > Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> > > Thanks, > > Alex > ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH V2 3/5] raid6: riscv: Add a compiler error 2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang 2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang 2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation Chunyan Zhang @ 2025-07-11 10:09 ` Chunyan Zhang 2025-07-16 13:43 ` Alexandre Ghiti 2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang 2025-07-11 10:09 ` [PATCH V2 5/5] raid6: test: Add support for RISC-V Chunyan Zhang 4 siblings, 1 reply; 14+ messages in thread From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang The code like "u8 **dptr = (u8 **)ptrs" just won't work when built with a compiler that can use vector instructions. So add an error for that. Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> --- lib/raid6/rvv.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c index 89da5fc247aa..015f3ee4da25 100644 --- a/lib/raid6/rvv.c +++ b/lib/raid6/rvv.c @@ -20,6 +20,10 @@ static int rvv_has_vector(void) return has_vector(); } +#ifdef __riscv_vector +#error "This code must be built without compiler support for vector" +#endif + static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs) { u8 **dptr = (u8 **)ptrs; -- 2.34.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH V2 3/5] raid6: riscv: Add a compiler error 2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang @ 2025-07-16 13:43 ` Alexandre Ghiti 2025-07-17 3:16 ` Chunyan Zhang 0 siblings, 1 reply; 14+ messages in thread From: Alexandre Ghiti @ 2025-07-16 13:43 UTC (permalink / raw) To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang First, the patch title should be something like: "raid6: riscv: Prevent compiler with vector support to build already vectorized code" Or something similar. On 7/11/25 12:09, Chunyan Zhang wrote: > The code like "u8 **dptr = (u8 **)ptrs" just won't work when built with Why wouldn't this code ^ work? I guess preventing the compiler to vectorize the code is to avoid the inline assembly code to break what the compiler could have vectorized no? > a compiler that can use vector instructions. So add an error for that. > > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> > --- > lib/raid6/rvv.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c > index 89da5fc247aa..015f3ee4da25 100644 > --- a/lib/raid6/rvv.c > +++ b/lib/raid6/rvv.c > @@ -20,6 +20,10 @@ static int rvv_has_vector(void) > return has_vector(); > } > > +#ifdef __riscv_vector > +#error "This code must be built without compiler support for vector" > +#endif > + > static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs) > { > u8 **dptr = (u8 **)ptrs; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V2 3/5] raid6: riscv: Add a compiler error 2025-07-16 13:43 ` Alexandre Ghiti @ 2025-07-17 3:16 ` Chunyan Zhang 0 siblings, 0 replies; 14+ messages in thread From: Chunyan Zhang @ 2025-07-17 3:16 UTC (permalink / raw) To: Alexandre Ghiti Cc: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins, Song Liu, Yu Kuai, linux-riscv, linux-raid, linux-kernel Hi Alex, On Wed, 16 Jul 2025 at 21:43, Alexandre Ghiti <alex@ghiti.fr> wrote: > > First, the patch title should be something like: Yeah, I've also recognized the phrase is not right when rereading after the patch was sent. > > "raid6: riscv: Prevent compiler with vector support to build already > vectorized code" > > Or something similar. > > On 7/11/25 12:09, Chunyan Zhang wrote: > > The code like "u8 **dptr = (u8 **)ptrs" just won't work when built with > > > Why wouldn't this code ^ work? I actually didn't quite get this compiler issue ^_^|| > > I guess preventing the compiler to vectorize the code is to avoid the > inline assembly code to break what the compiler could have vectorized no? > This states the issue clearly, I will cook a new patchset. Thanks for the review, Chunyan > > > a compiler that can use vector instructions. So add an error for that. > > > > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> > > --- > > lib/raid6/rvv.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c > > index 89da5fc247aa..015f3ee4da25 100644 > > --- a/lib/raid6/rvv.c > > +++ b/lib/raid6/rvv.c > > @@ -20,6 +20,10 @@ static int rvv_has_vector(void) > > return has_vector(); > > } > > > > +#ifdef __riscv_vector > > +#error "This code must be built without compiler support for vector" > > +#endif > > + > > static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs) > > { > > u8 **dptr = (u8 **)ptrs; ^ permalink raw reply [flat|nested] 14+ messages in thread
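[Editor's note: to summarize the discussion above — the inline asm in rvv.c uses v0-v31 without listing them as clobbers, so it is only safe when the compiler itself keeps no live values in vector registers. Per the RISC-V C API conventions, GCC and Clang predefine `__riscv_vector` whenever they are allowed to emit vector instructions (e.g. `-march=...v`), which is what the guard keys off. A minimal sketch of the pattern:]

```c
/*
 * Guard sketch: __riscv_vector is predefined by the compiler when it may
 * auto-vectorize.  The raid6 RVV asm clobbers vector registers without
 * declaring them, so such a build is refused outright at compile time
 * instead of risking a silent miscompilation.
 */
#ifdef __riscv_vector
#error "This code must be built without compiler support for vector"
#endif

static int rvv_guard_passed(void)
{
	return 1;	/* only compiled when the guard above is satisfied */
}
```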
* [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace 2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang ` (2 preceding siblings ...) 2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang @ 2025-07-11 10:09 ` Chunyan Zhang 2025-07-17 7:04 ` Alexandre Ghiti 2025-07-11 10:09 ` [PATCH V2 5/5] raid6: test: Add support for RISC-V Chunyan Zhang 4 siblings, 1 reply; 14+ messages in thread From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang To support userspace raid6test, this patch adds __KERNEL__ ifdef for kernel header inclusions also userspace wrapper definitions to allow code to be compiled in userspace. This patch also drops the NSIZE macro, instead of using the vector length, which can work for both kernel and user space. Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> --- lib/raid6/recov_rvv.c | 7 +- lib/raid6/rvv.c | 297 +++++++++++++++++++++--------------------- lib/raid6/rvv.h | 17 +++ 3 files changed, 170 insertions(+), 151 deletions(-) diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c index 500da521a806..8f2be833c015 100644 --- a/lib/raid6/recov_rvv.c +++ b/lib/raid6/recov_rvv.c @@ -4,13 +4,8 @@ * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn> */ -#include <asm/vector.h> #include <linux/raid/pq.h> - -static int rvv_has_vector(void) -{ - return has_vector(); -} +#include "rvv.h" static void __raid6_2data_recov_rvv(int bytes, u8 *p, u8 *q, u8 *dp, u8 *dq, const u8 *pbmul, diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c index 015f3ee4da25..75c9dafedb28 100644 --- a/lib/raid6/rvv.c +++ b/lib/raid6/rvv.c @@ -9,17 +9,8 @@ * Copyright 2002-2004 H. 
Peter Anvin */ -#include <asm/vector.h> -#include <linux/raid/pq.h> #include "rvv.h" -#define NSIZE (riscv_v_vsize / 32) /* NSIZE = vlenb */ - -static int rvv_has_vector(void) -{ - return has_vector(); -} - #ifdef __riscv_vector #error "This code must be built without compiler support for vector" #endif @@ -28,7 +19,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** { u8 **dptr = (u8 **)ptrs; u8 *p, *q; - unsigned long vl, d; + unsigned long vl, d, nsize; int z, z0; z0 = disks - 3; /* Highest data disk */ @@ -42,8 +33,10 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** : "=&r" (vl) ); + nsize = vl; + /* v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 */ - for (d = 0; d < bytes; d += NSIZE * 1) { + for (d = 0; d < bytes; d += nsize * 1) { /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ asm volatile (".option push\n" ".option arch,+v\n" @@ -51,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** "vmv.v.v v1, v0\n" ".option pop\n" : : - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) + [wp0]"r"(&dptr[z0][d + 0 * nsize]) ); for (z = z0 - 1 ; z >= 0 ; z--) { @@ -75,7 +68,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** "vxor.vv v0, v0, v2\n" ".option pop\n" : : - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), + [wd0]"r"(&dptr[z][d + 0 * nsize]), [x1d]"r"(0x1d) ); } @@ -90,8 +83,8 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** "vse8.v v1, (%[wq0])\n" ".option pop\n" : : - [wp0]"r"(&p[d + NSIZE * 0]), - [wq0]"r"(&q[d + NSIZE * 0]) + [wp0]"r"(&p[d + nsize * 0]), + [wq0]"r"(&q[d + nsize * 0]) ); } } @@ -101,7 +94,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, { u8 **dptr = (u8 **)ptrs; u8 *p, *q; - unsigned long vl, d; + unsigned long vl, d, nsize; int z, z0; z0 = stop; /* P/Q right side optimization */ @@ -115,8 +108,10 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int 
stop, : "=&r" (vl) ); + nsize = vl; + /* v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 */ - for (d = 0 ; d < bytes ; d += NSIZE * 1) { + for (d = 0 ; d < bytes ; d += nsize * 1) { /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ asm volatile (".option push\n" ".option arch,+v\n" @@ -124,7 +119,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, "vmv.v.v v1, v0\n" ".option pop\n" : : - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) + [wp0]"r"(&dptr[z0][d + 0 * nsize]) ); /* P/Q data pages */ @@ -149,7 +144,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, "vxor.vv v0, v0, v2\n" ".option pop\n" : : - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), + [wd0]"r"(&dptr[z][d + 0 * nsize]), [x1d]"r"(0x1d) ); } @@ -189,8 +184,8 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, "vse8.v v3, (%[wq0])\n" ".option pop\n" : : - [wp0]"r"(&p[d + NSIZE * 0]), - [wq0]"r"(&q[d + NSIZE * 0]) + [wp0]"r"(&p[d + nsize * 0]), + [wq0]"r"(&q[d + nsize * 0]) ); } } @@ -199,7 +194,7 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** { u8 **dptr = (u8 **)ptrs; u8 *p, *q; - unsigned long vl, d; + unsigned long vl, d, nsize; int z, z0; z0 = disks - 3; /* Highest data disk */ @@ -213,11 +208,13 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** : "=&r" (vl) ); + nsize = vl; + /* * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 */ - for (d = 0; d < bytes; d += NSIZE * 2) { + for (d = 0; d < bytes; d += nsize * 2) { /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ asm volatile (".option push\n" ".option arch,+v\n" @@ -227,8 +224,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** "vmv.v.v v5, v4\n" ".option pop\n" : : - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]) + [wp0]"r"(&dptr[z0][d + 0 * nsize]), + [wp1]"r"(&dptr[z0][d + 1 * nsize]) ); for (z = z0 - 1; z >= 0; z--) { @@ -260,8 +257,8 @@ 
static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** "vxor.vv v4, v4, v6\n" ".option pop\n" : : - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), + [wd0]"r"(&dptr[z][d + 0 * nsize]), + [wd1]"r"(&dptr[z][d + 1 * nsize]), [x1d]"r"(0x1d) ); } @@ -278,10 +275,10 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** "vse8.v v5, (%[wq1])\n" ".option pop\n" : : - [wp0]"r"(&p[d + NSIZE * 0]), - [wq0]"r"(&q[d + NSIZE * 0]), - [wp1]"r"(&p[d + NSIZE * 1]), - [wq1]"r"(&q[d + NSIZE * 1]) + [wp0]"r"(&p[d + nsize * 0]), + [wq0]"r"(&q[d + nsize * 0]), + [wp1]"r"(&p[d + nsize * 1]), + [wq1]"r"(&q[d + nsize * 1]) ); } } @@ -291,7 +288,7 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, { u8 **dptr = (u8 **)ptrs; u8 *p, *q; - unsigned long vl, d; + unsigned long vl, d, nsize; int z, z0; z0 = stop; /* P/Q right side optimization */ @@ -305,11 +302,13 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, : "=&r" (vl) ); + nsize = vl; + /* * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 */ - for (d = 0; d < bytes; d += NSIZE * 2) { + for (d = 0; d < bytes; d += nsize * 2) { /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ asm volatile (".option push\n" ".option arch,+v\n" @@ -319,8 +318,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, "vmv.v.v v5, v4\n" ".option pop\n" : : - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]) + [wp0]"r"(&dptr[z0][d + 0 * nsize]), + [wp1]"r"(&dptr[z0][d + 1 * nsize]) ); /* P/Q data pages */ @@ -353,8 +352,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, "vxor.vv v4, v4, v6\n" ".option pop\n" : : - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), + [wd0]"r"(&dptr[z][d + 0 * nsize]), + [wd1]"r"(&dptr[z][d + 1 * nsize]), [x1d]"r"(0x1d) ); } @@ -407,10 +406,10 @@ static void 
raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, "vse8.v v7, (%[wq1])\n" ".option pop\n" : : - [wp0]"r"(&p[d + NSIZE * 0]), - [wq0]"r"(&q[d + NSIZE * 0]), - [wp1]"r"(&p[d + NSIZE * 1]), - [wq1]"r"(&q[d + NSIZE * 1]) + [wp0]"r"(&p[d + nsize * 0]), + [wq0]"r"(&q[d + nsize * 0]), + [wp1]"r"(&p[d + nsize * 1]), + [wq1]"r"(&q[d + nsize * 1]) ); } } @@ -419,7 +418,7 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** { u8 **dptr = (u8 **)ptrs; u8 *p, *q; - unsigned long vl, d; + unsigned long vl, d, nsize; int z, z0; z0 = disks - 3; /* Highest data disk */ @@ -433,13 +432,15 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** : "=&r" (vl) ); + nsize = vl; + /* * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 * v8:wp2, v9:wq2, v10:wd2/w22, v11:w12 * v12:wp3, v13:wq3, v14:wd3/w23, v15:w13 */ - for (d = 0; d < bytes; d += NSIZE * 4) { + for (d = 0; d < bytes; d += nsize * 4) { /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ asm volatile (".option push\n" ".option arch,+v\n" @@ -453,10 +454,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** "vmv.v.v v13, v12\n" ".option pop\n" : : - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]) + [wp0]"r"(&dptr[z0][d + 0 * nsize]), + [wp1]"r"(&dptr[z0][d + 1 * nsize]), + [wp2]"r"(&dptr[z0][d + 2 * nsize]), + [wp3]"r"(&dptr[z0][d + 3 * nsize]) ); for (z = z0 - 1; z >= 0; z--) { @@ -504,10 +505,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** "vxor.vv v12, v12, v14\n" ".option pop\n" : : - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), + [wd0]"r"(&dptr[z][d + 0 * nsize]), + [wd1]"r"(&dptr[z][d + 1 * nsize]), + [wd2]"r"(&dptr[z][d + 2 * nsize]), + 
[wd3]"r"(&dptr[z][d + 3 * nsize]), [x1d]"r"(0x1d) ); } @@ -528,14 +529,14 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** "vse8.v v13, (%[wq3])\n" ".option pop\n" : : - [wp0]"r"(&p[d + NSIZE * 0]), - [wq0]"r"(&q[d + NSIZE * 0]), - [wp1]"r"(&p[d + NSIZE * 1]), - [wq1]"r"(&q[d + NSIZE * 1]), - [wp2]"r"(&p[d + NSIZE * 2]), - [wq2]"r"(&q[d + NSIZE * 2]), - [wp3]"r"(&p[d + NSIZE * 3]), - [wq3]"r"(&q[d + NSIZE * 3]) + [wp0]"r"(&p[d + nsize * 0]), + [wq0]"r"(&q[d + nsize * 0]), + [wp1]"r"(&p[d + nsize * 1]), + [wq1]"r"(&q[d + nsize * 1]), + [wp2]"r"(&p[d + nsize * 2]), + [wq2]"r"(&q[d + nsize * 2]), + [wp3]"r"(&p[d + nsize * 3]), + [wq3]"r"(&q[d + nsize * 3]) ); } } @@ -545,7 +546,7 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, { u8 **dptr = (u8 **)ptrs; u8 *p, *q; - unsigned long vl, d; + unsigned long vl, d, nsize; int z, z0; z0 = stop; /* P/Q right side optimization */ @@ -559,13 +560,15 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, : "=&r" (vl) ); + nsize = vl; + /* * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 * v8:wp2, v9:wq2, v10:wd2/w22, v11:w12 * v12:wp3, v13:wq3, v14:wd3/w23, v15:w13 */ - for (d = 0; d < bytes; d += NSIZE * 4) { + for (d = 0; d < bytes; d += nsize * 4) { /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ asm volatile (".option push\n" ".option arch,+v\n" @@ -579,10 +582,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, "vmv.v.v v13, v12\n" ".option pop\n" : : - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]) + [wp0]"r"(&dptr[z0][d + 0 * nsize]), + [wp1]"r"(&dptr[z0][d + 1 * nsize]), + [wp2]"r"(&dptr[z0][d + 2 * nsize]), + [wp3]"r"(&dptr[z0][d + 3 * nsize]) ); /* P/Q data pages */ @@ -631,10 +634,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, "vxor.vv v12, 
v12, v14\n" ".option pop\n" : : - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), + [wd0]"r"(&dptr[z][d + 0 * nsize]), + [wd1]"r"(&dptr[z][d + 1 * nsize]), + [wd2]"r"(&dptr[z][d + 2 * nsize]), + [wd3]"r"(&dptr[z][d + 3 * nsize]), [x1d]"r"(0x1d) ); } @@ -713,14 +716,14 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, "vse8.v v15, (%[wq3])\n" ".option pop\n" : : - [wp0]"r"(&p[d + NSIZE * 0]), - [wq0]"r"(&q[d + NSIZE * 0]), - [wp1]"r"(&p[d + NSIZE * 1]), - [wq1]"r"(&q[d + NSIZE * 1]), - [wp2]"r"(&p[d + NSIZE * 2]), - [wq2]"r"(&q[d + NSIZE * 2]), - [wp3]"r"(&p[d + NSIZE * 3]), - [wq3]"r"(&q[d + NSIZE * 3]) + [wp0]"r"(&p[d + nsize * 0]), + [wq0]"r"(&q[d + nsize * 0]), + [wp1]"r"(&p[d + nsize * 1]), + [wq1]"r"(&q[d + nsize * 1]), + [wp2]"r"(&p[d + nsize * 2]), + [wq2]"r"(&q[d + nsize * 2]), + [wp3]"r"(&p[d + nsize * 3]), + [wq3]"r"(&q[d + nsize * 3]) ); } } @@ -729,7 +732,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** { u8 **dptr = (u8 **)ptrs; u8 *p, *q; - unsigned long vl, d; + unsigned long vl, d, nsize; int z, z0; z0 = disks - 3; /* Highest data disk */ @@ -743,6 +746,8 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** : "=&r" (vl) ); + nsize = vl; + /* * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 @@ -753,7 +758,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** * v24:wp6, v25:wq6, v26:wd6/w26, v27:w16 * v28:wp7, v29:wq7, v30:wd7/w27, v31:w17 */ - for (d = 0; d < bytes; d += NSIZE * 8) { + for (d = 0; d < bytes; d += nsize * 8) { /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ asm volatile (".option push\n" ".option arch,+v\n" @@ -775,14 +780,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** "vmv.v.v v29, v28\n" ".option pop\n" : : - [wp0]"r"(&dptr[z0][d + 0 * 
NSIZE]), - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]), - [wp4]"r"(&dptr[z0][d + 4 * NSIZE]), - [wp5]"r"(&dptr[z0][d + 5 * NSIZE]), - [wp6]"r"(&dptr[z0][d + 6 * NSIZE]), - [wp7]"r"(&dptr[z0][d + 7 * NSIZE]) + [wp0]"r"(&dptr[z0][d + 0 * nsize]), + [wp1]"r"(&dptr[z0][d + 1 * nsize]), + [wp2]"r"(&dptr[z0][d + 2 * nsize]), + [wp3]"r"(&dptr[z0][d + 3 * nsize]), + [wp4]"r"(&dptr[z0][d + 4 * nsize]), + [wp5]"r"(&dptr[z0][d + 5 * nsize]), + [wp6]"r"(&dptr[z0][d + 6 * nsize]), + [wp7]"r"(&dptr[z0][d + 7 * nsize]) ); for (z = z0 - 1; z >= 0; z--) { @@ -862,14 +867,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** "vxor.vv v28, v28, v30\n" ".option pop\n" : : - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), - [wd4]"r"(&dptr[z][d + 4 * NSIZE]), - [wd5]"r"(&dptr[z][d + 5 * NSIZE]), - [wd6]"r"(&dptr[z][d + 6 * NSIZE]), - [wd7]"r"(&dptr[z][d + 7 * NSIZE]), + [wd0]"r"(&dptr[z][d + 0 * nsize]), + [wd1]"r"(&dptr[z][d + 1 * nsize]), + [wd2]"r"(&dptr[z][d + 2 * nsize]), + [wd3]"r"(&dptr[z][d + 3 * nsize]), + [wd4]"r"(&dptr[z][d + 4 * nsize]), + [wd5]"r"(&dptr[z][d + 5 * nsize]), + [wd6]"r"(&dptr[z][d + 6 * nsize]), + [wd7]"r"(&dptr[z][d + 7 * nsize]), [x1d]"r"(0x1d) ); } @@ -898,22 +903,22 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** "vse8.v v29, (%[wq7])\n" ".option pop\n" : : - [wp0]"r"(&p[d + NSIZE * 0]), - [wq0]"r"(&q[d + NSIZE * 0]), - [wp1]"r"(&p[d + NSIZE * 1]), - [wq1]"r"(&q[d + NSIZE * 1]), - [wp2]"r"(&p[d + NSIZE * 2]), - [wq2]"r"(&q[d + NSIZE * 2]), - [wp3]"r"(&p[d + NSIZE * 3]), - [wq3]"r"(&q[d + NSIZE * 3]), - [wp4]"r"(&p[d + NSIZE * 4]), - [wq4]"r"(&q[d + NSIZE * 4]), - [wp5]"r"(&p[d + NSIZE * 5]), - [wq5]"r"(&q[d + NSIZE * 5]), - [wp6]"r"(&p[d + NSIZE * 6]), - [wq6]"r"(&q[d + NSIZE * 6]), - [wp7]"r"(&p[d + NSIZE * 7]), - 
[wq7]"r"(&q[d + NSIZE * 7]) + [wp0]"r"(&p[d + nsize * 0]), + [wq0]"r"(&q[d + nsize * 0]), + [wp1]"r"(&p[d + nsize * 1]), + [wq1]"r"(&q[d + nsize * 1]), + [wp2]"r"(&p[d + nsize * 2]), + [wq2]"r"(&q[d + nsize * 2]), + [wp3]"r"(&p[d + nsize * 3]), + [wq3]"r"(&q[d + nsize * 3]), + [wp4]"r"(&p[d + nsize * 4]), + [wq4]"r"(&q[d + nsize * 4]), + [wp5]"r"(&p[d + nsize * 5]), + [wq5]"r"(&q[d + nsize * 5]), + [wp6]"r"(&p[d + nsize * 6]), + [wq6]"r"(&q[d + nsize * 6]), + [wp7]"r"(&p[d + nsize * 7]), + [wq7]"r"(&q[d + nsize * 7]) ); } } @@ -923,7 +928,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, { u8 **dptr = (u8 **)ptrs; u8 *p, *q; - unsigned long vl, d; + unsigned long vl, d, nsize; int z, z0; z0 = stop; /* P/Q right side optimization */ @@ -937,6 +942,8 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, : "=&r" (vl) ); + nsize = vl; + /* * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 @@ -947,7 +954,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, * v24:wp6, v25:wq6, v26:wd6/w26, v27:w16 * v28:wp7, v29:wq7, v30:wd7/w27, v31:w17 */ - for (d = 0; d < bytes; d += NSIZE * 8) { + for (d = 0; d < bytes; d += nsize * 8) { /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ asm volatile (".option push\n" ".option arch,+v\n" @@ -969,14 +976,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, "vmv.v.v v29, v28\n" ".option pop\n" : : - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]), - [wp4]"r"(&dptr[z0][d + 4 * NSIZE]), - [wp5]"r"(&dptr[z0][d + 5 * NSIZE]), - [wp6]"r"(&dptr[z0][d + 6 * NSIZE]), - [wp7]"r"(&dptr[z0][d + 7 * NSIZE]) + [wp0]"r"(&dptr[z0][d + 0 * nsize]), + [wp1]"r"(&dptr[z0][d + 1 * nsize]), + [wp2]"r"(&dptr[z0][d + 2 * nsize]), + [wp3]"r"(&dptr[z0][d + 3 * nsize]), + [wp4]"r"(&dptr[z0][d + 4 * nsize]), + [wp5]"r"(&dptr[z0][d + 5 
* nsize]), + [wp6]"r"(&dptr[z0][d + 6 * nsize]), + [wp7]"r"(&dptr[z0][d + 7 * nsize]) ); /* P/Q data pages */ @@ -1057,14 +1064,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, "vxor.vv v28, v28, v30\n" ".option pop\n" : : - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), - [wd4]"r"(&dptr[z][d + 4 * NSIZE]), - [wd5]"r"(&dptr[z][d + 5 * NSIZE]), - [wd6]"r"(&dptr[z][d + 6 * NSIZE]), - [wd7]"r"(&dptr[z][d + 7 * NSIZE]), + [wd0]"r"(&dptr[z][d + 0 * nsize]), + [wd1]"r"(&dptr[z][d + 1 * nsize]), + [wd2]"r"(&dptr[z][d + 2 * nsize]), + [wd3]"r"(&dptr[z][d + 3 * nsize]), + [wd4]"r"(&dptr[z][d + 4 * nsize]), + [wd5]"r"(&dptr[z][d + 5 * nsize]), + [wd6]"r"(&dptr[z][d + 6 * nsize]), + [wd7]"r"(&dptr[z][d + 7 * nsize]), [x1d]"r"(0x1d) ); } @@ -1195,22 +1202,22 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, "vse8.v v31, (%[wq7])\n" ".option pop\n" : : - [wp0]"r"(&p[d + NSIZE * 0]), - [wq0]"r"(&q[d + NSIZE * 0]), - [wp1]"r"(&p[d + NSIZE * 1]), - [wq1]"r"(&q[d + NSIZE * 1]), - [wp2]"r"(&p[d + NSIZE * 2]), - [wq2]"r"(&q[d + NSIZE * 2]), - [wp3]"r"(&p[d + NSIZE * 3]), - [wq3]"r"(&q[d + NSIZE * 3]), - [wp4]"r"(&p[d + NSIZE * 4]), - [wq4]"r"(&q[d + NSIZE * 4]), - [wp5]"r"(&p[d + NSIZE * 5]), - [wq5]"r"(&q[d + NSIZE * 5]), - [wp6]"r"(&p[d + NSIZE * 6]), - [wq6]"r"(&q[d + NSIZE * 6]), - [wp7]"r"(&p[d + NSIZE * 7]), - [wq7]"r"(&q[d + NSIZE * 7]) + [wp0]"r"(&p[d + nsize * 0]), + [wq0]"r"(&q[d + nsize * 0]), + [wp1]"r"(&p[d + nsize * 1]), + [wq1]"r"(&q[d + nsize * 1]), + [wp2]"r"(&p[d + nsize * 2]), + [wq2]"r"(&q[d + nsize * 2]), + [wp3]"r"(&p[d + nsize * 3]), + [wq3]"r"(&q[d + nsize * 3]), + [wp4]"r"(&p[d + nsize * 4]), + [wq4]"r"(&q[d + nsize * 4]), + [wp5]"r"(&p[d + nsize * 5]), + [wq5]"r"(&q[d + nsize * 5]), + [wp6]"r"(&p[d + nsize * 6]), + [wq6]"r"(&q[d + nsize * 6]), + [wp7]"r"(&p[d + nsize * 7]), + [wq7]"r"(&q[d + nsize * 7]) ); 
} } diff --git a/lib/raid6/rvv.h b/lib/raid6/rvv.h index 94044a1b707b..6d0708a2c8a4 100644 --- a/lib/raid6/rvv.h +++ b/lib/raid6/rvv.h @@ -7,6 +7,23 @@ * Definitions for RISC-V RAID-6 code */ +#ifdef __KERNEL__ +#include <asm/vector.h> +#else +#define kernel_vector_begin() +#define kernel_vector_end() +#include <sys/auxv.h> +#include <asm/hwcap.h> +#define has_vector() (getauxval(AT_HWCAP) & COMPAT_HWCAP_ISA_V) +#endif + +#include <linux/raid/pq.h> + +static int rvv_has_vector(void) +{ + return has_vector(); +} + #define RAID6_RVV_WRAPPER(_n) \ static void raid6_rvv ## _n ## _gen_syndrome(int disks, \ size_t bytes, void **ptrs) \ -- 2.34.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace 2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang @ 2025-07-17 7:04 ` Alexandre Ghiti 2025-07-17 7:39 ` Chunyan Zhang 0 siblings, 1 reply; 14+ messages in thread From: Alexandre Ghiti @ 2025-07-17 7:04 UTC (permalink / raw) To: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang On 7/11/25 12:09, Chunyan Zhang wrote: > To support userspace raid6test, this patch adds __KERNEL__ ifdef for kernel > header inclusions also userspace wrapper definitions to allow code to be > compiled in userspace. > > This patch also drops the NSIZE macro, instead of using the vector length, > which can work for both kernel and user space. > > Signed-off-by: Chunyan Zhang<zhangchunyan@iscas.ac.cn> > --- > lib/raid6/recov_rvv.c | 7 +- > lib/raid6/rvv.c | 297 +++++++++++++++++++++--------------------- > lib/raid6/rvv.h | 17 +++ > 3 files changed, 170 insertions(+), 151 deletions(-) > > diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c > index 500da521a806..8f2be833c015 100644 > --- a/lib/raid6/recov_rvv.c > +++ b/lib/raid6/recov_rvv.c > @@ -4,13 +4,8 @@ > * Author: Chunyan Zhang<zhangchunyan@iscas.ac.cn> > */ > > -#include <asm/vector.h> > #include <linux/raid/pq.h> > - > -static int rvv_has_vector(void) > -{ > - return has_vector(); > -} > +#include "rvv.h" > > static void __raid6_2data_recov_rvv(int bytes, u8 *p, u8 *q, u8 *dp, > u8 *dq, const u8 *pbmul, > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c > index 015f3ee4da25..75c9dafedb28 100644 > --- a/lib/raid6/rvv.c > +++ b/lib/raid6/rvv.c > @@ -9,17 +9,8 @@ > * Copyright 2002-2004 H. 
Peter Anvin > */ > > -#include <asm/vector.h> > -#include <linux/raid/pq.h> > #include "rvv.h" > > -#define NSIZE (riscv_v_vsize / 32) /* NSIZE = vlenb */ > - > -static int rvv_has_vector(void) > -{ > - return has_vector(); > -} > - > #ifdef __riscv_vector > #error "This code must be built without compiler support for vector" > #endif > @@ -28,7 +19,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > { > u8 **dptr = (u8 **)ptrs; > u8 *p, *q; > - unsigned long vl, d; > + unsigned long vl, d, nsize; > int z, z0; > > z0 = disks - 3; /* Highest data disk */ > @@ -42,8 +33,10 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > : "=&r" (vl) > ); > > + nsize = vl; > + > /*v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 */ > - for (d = 0; d < bytes; d += NSIZE * 1) { > + for (d = 0; d < bytes; d += nsize * 1) { > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ You missed a few NSIZE in comments > asm volatile (".option push\n" > ".option arch,+v\n" > @@ -51,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vmv.v.v v1, v0\n" > ".option pop\n" > : : > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) > + [wp0]"r"(&dptr[z0][d + 0 * nsize]) > ); > > for (z = z0 - 1 ; z >= 0 ; z--) { > @@ -75,7 +68,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vxor.vv v0, v0, v2\n" > ".option pop\n" > : : > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > [x1d]"r"(0x1d) > ); > } > @@ -90,8 +83,8 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vse8.v v1, (%[wq0])\n" > ".option pop\n" > : : > - [wp0]"r"(&p[d + NSIZE * 0]), > - [wq0]"r"(&q[d + NSIZE * 0]) > + [wp0]"r"(&p[d + nsize * 0]), > + [wq0]"r"(&q[d + nsize * 0]) > ); > } > } > @@ -101,7 +94,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > { > u8 **dptr = (u8 **)ptrs; > u8 *p, *q; > - unsigned long vl, d; > + 
unsigned long vl, d, nsize; > int z, z0; > > z0 = stop; /* P/Q right side optimization */ > @@ -115,8 +108,10 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > : "=&r" (vl) > ); > > + nsize = vl; > + > /*v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 */ > - for (d = 0 ; d < bytes ; d += NSIZE * 1) { > + for (d = 0 ; d < bytes ; d += nsize * 1) { > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > asm volatile (".option push\n" > ".option arch,+v\n" > @@ -124,7 +119,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > "vmv.v.v v1, v0\n" > ".option pop\n" > : : > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) > + [wp0]"r"(&dptr[z0][d + 0 * nsize]) > ); > > /* P/Q data pages */ > @@ -149,7 +144,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > "vxor.vv v0, v0, v2\n" > ".option pop\n" > : : > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > [x1d]"r"(0x1d) > ); > } > @@ -189,8 +184,8 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > "vse8.v v3, (%[wq0])\n" > ".option pop\n" > : : > - [wp0]"r"(&p[d + NSIZE * 0]), > - [wq0]"r"(&q[d + NSIZE * 0]) > + [wp0]"r"(&p[d + nsize * 0]), > + [wq0]"r"(&q[d + nsize * 0]) > ); > } > } > @@ -199,7 +194,7 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > { > u8 **dptr = (u8 **)ptrs; > u8 *p, *q; > - unsigned long vl, d; > + unsigned long vl, d, nsize; > int z, z0; > > z0 = disks - 3; /* Highest data disk */ > @@ -213,11 +208,13 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > : "=&r" (vl) > ); > > + nsize = vl; > + > /* > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > */ > - for (d = 0; d < bytes; d += NSIZE * 2) { > + for (d = 0; d < bytes; d += nsize * 2) { > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > asm volatile (".option push\n" > ".option arch,+v\n" > @@ -227,8 +224,8 @@ static void 
raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vmv.v.v v5, v4\n" > ".option pop\n" > : : > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]) > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > + [wp1]"r"(&dptr[z0][d + 1 * nsize]) > ); > > for (z = z0 - 1; z >= 0; z--) { > @@ -260,8 +257,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vxor.vv v4, v4, v6\n" > ".option pop\n" > : : > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > [x1d]"r"(0x1d) > ); > } > @@ -278,10 +275,10 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vse8.v v5, (%[wq1])\n" > ".option pop\n" > : : > - [wp0]"r"(&p[d + NSIZE * 0]), > - [wq0]"r"(&q[d + NSIZE * 0]), > - [wp1]"r"(&p[d + NSIZE * 1]), > - [wq1]"r"(&q[d + NSIZE * 1]) > + [wp0]"r"(&p[d + nsize * 0]), > + [wq0]"r"(&q[d + nsize * 0]), > + [wp1]"r"(&p[d + nsize * 1]), > + [wq1]"r"(&q[d + nsize * 1]) > ); > } > } > @@ -291,7 +288,7 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > { > u8 **dptr = (u8 **)ptrs; > u8 *p, *q; > - unsigned long vl, d; > + unsigned long vl, d, nsize; > int z, z0; > > z0 = stop; /* P/Q right side optimization */ > @@ -305,11 +302,13 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > : "=&r" (vl) > ); > > + nsize = vl; > + > /* > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > */ > - for (d = 0; d < bytes; d += NSIZE * 2) { > + for (d = 0; d < bytes; d += nsize * 2) { > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > asm volatile (".option push\n" > ".option arch,+v\n" > @@ -319,8 +318,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > "vmv.v.v v5, v4\n" > ".option pop\n" > : : > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]) > + 
[wp0]"r"(&dptr[z0][d + 0 * nsize]), > + [wp1]"r"(&dptr[z0][d + 1 * nsize]) > ); > > /* P/Q data pages */ > @@ -353,8 +352,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > "vxor.vv v4, v4, v6\n" > ".option pop\n" > : : > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > [x1d]"r"(0x1d) > ); > } > @@ -407,10 +406,10 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > "vse8.v v7, (%[wq1])\n" > ".option pop\n" > : : > - [wp0]"r"(&p[d + NSIZE * 0]), > - [wq0]"r"(&q[d + NSIZE * 0]), > - [wp1]"r"(&p[d + NSIZE * 1]), > - [wq1]"r"(&q[d + NSIZE * 1]) > + [wp0]"r"(&p[d + nsize * 0]), > + [wq0]"r"(&q[d + nsize * 0]), > + [wp1]"r"(&p[d + nsize * 1]), > + [wq1]"r"(&q[d + nsize * 1]) > ); > } > } > @@ -419,7 +418,7 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > { > u8 **dptr = (u8 **)ptrs; > u8 *p, *q; > - unsigned long vl, d; > + unsigned long vl, d, nsize; > int z, z0; > > z0 = disks - 3; /* Highest data disk */ > @@ -433,13 +432,15 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > : "=&r" (vl) > ); > > + nsize = vl; > + > /* > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > *v8:wp2,v9:wq2,v10:wd2/w22,v11:w12 > *v12:wp3,v13:wq3,v14:wd3/w23,v15:w13 > */ > - for (d = 0; d < bytes; d += NSIZE * 4) { > + for (d = 0; d < bytes; d += nsize * 4) { > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > asm volatile (".option push\n" > ".option arch,+v\n" > @@ -453,10 +454,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vmv.v.v v13, v12\n" > ".option pop\n" > : : > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), > - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), > - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]) > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > + [wp1]"r"(&dptr[z0][d + 
1 * nsize]), > + [wp2]"r"(&dptr[z0][d + 2 * nsize]), > + [wp3]"r"(&dptr[z0][d + 3 * nsize]) > ); > > for (z = z0 - 1; z >= 0; z--) { > @@ -504,10 +505,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vxor.vv v12, v12, v14\n" > ".option pop\n" > : : > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), > - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > + [wd2]"r"(&dptr[z][d + 2 * nsize]), > + [wd3]"r"(&dptr[z][d + 3 * nsize]), > [x1d]"r"(0x1d) > ); > } > @@ -528,14 +529,14 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vse8.v v13, (%[wq3])\n" > ".option pop\n" > : : > - [wp0]"r"(&p[d + NSIZE * 0]), > - [wq0]"r"(&q[d + NSIZE * 0]), > - [wp1]"r"(&p[d + NSIZE * 1]), > - [wq1]"r"(&q[d + NSIZE * 1]), > - [wp2]"r"(&p[d + NSIZE * 2]), > - [wq2]"r"(&q[d + NSIZE * 2]), > - [wp3]"r"(&p[d + NSIZE * 3]), > - [wq3]"r"(&q[d + NSIZE * 3]) > + [wp0]"r"(&p[d + nsize * 0]), > + [wq0]"r"(&q[d + nsize * 0]), > + [wp1]"r"(&p[d + nsize * 1]), > + [wq1]"r"(&q[d + nsize * 1]), > + [wp2]"r"(&p[d + nsize * 2]), > + [wq2]"r"(&q[d + nsize * 2]), > + [wp3]"r"(&p[d + nsize * 3]), > + [wq3]"r"(&q[d + nsize * 3]) > ); > } > } > @@ -545,7 +546,7 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > { > u8 **dptr = (u8 **)ptrs; > u8 *p, *q; > - unsigned long vl, d; > + unsigned long vl, d, nsize; > int z, z0; > > z0 = stop; /* P/Q right side optimization */ > @@ -559,13 +560,15 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > : "=&r" (vl) > ); > > + nsize = vl; > + > /* > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > *v8:wp2,v9:wq2,v10:wd2/w22,v11:w12 > *v12:wp3,v13:wq3,v14:wd3/w23,v15:w13 > */ > - for (d = 0; d < bytes; d += NSIZE * 4) { > + for (d = 0; d < bytes; d += nsize * 4) { > /* wq$$ = wp$$ = *(unative_t 
*)&dptr[z0][d+$$*NSIZE]; */ > asm volatile (".option push\n" > ".option arch,+v\n" > @@ -579,10 +582,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > "vmv.v.v v13, v12\n" > ".option pop\n" > : : > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), > - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), > - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]) > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > + [wp1]"r"(&dptr[z0][d + 1 * nsize]), > + [wp2]"r"(&dptr[z0][d + 2 * nsize]), > + [wp3]"r"(&dptr[z0][d + 3 * nsize]) > ); > > /* P/Q data pages */ > @@ -631,10 +634,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > "vxor.vv v12, v12, v14\n" > ".option pop\n" > : : > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), > - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > + [wd2]"r"(&dptr[z][d + 2 * nsize]), > + [wd3]"r"(&dptr[z][d + 3 * nsize]), > [x1d]"r"(0x1d) > ); > } > @@ -713,14 +716,14 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > "vse8.v v15, (%[wq3])\n" > ".option pop\n" > : : > - [wp0]"r"(&p[d + NSIZE * 0]), > - [wq0]"r"(&q[d + NSIZE * 0]), > - [wp1]"r"(&p[d + NSIZE * 1]), > - [wq1]"r"(&q[d + NSIZE * 1]), > - [wp2]"r"(&p[d + NSIZE * 2]), > - [wq2]"r"(&q[d + NSIZE * 2]), > - [wp3]"r"(&p[d + NSIZE * 3]), > - [wq3]"r"(&q[d + NSIZE * 3]) > + [wp0]"r"(&p[d + nsize * 0]), > + [wq0]"r"(&q[d + nsize * 0]), > + [wp1]"r"(&p[d + nsize * 1]), > + [wq1]"r"(&q[d + nsize * 1]), > + [wp2]"r"(&p[d + nsize * 2]), > + [wq2]"r"(&q[d + nsize * 2]), > + [wp3]"r"(&p[d + nsize * 3]), > + [wq3]"r"(&q[d + nsize * 3]) > ); > } > } > @@ -729,7 +732,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > { > u8 **dptr = (u8 **)ptrs; > u8 *p, *q; > - unsigned long vl, d; > + unsigned long vl, d, nsize; > int z, z0; > > z0 = disks - 3; /* Highest 
data disk */ > @@ -743,6 +746,8 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > : "=&r" (vl) > ); > > + nsize = vl; > + > /* > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > @@ -753,7 +758,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > *v24:wp6,v25:wq6,v26:wd6/w26,v27:w16 > *v28:wp7,v29:wq7,v30:wd7/w27,v31:w17 > */ > - for (d = 0; d < bytes; d += NSIZE * 8) { > + for (d = 0; d < bytes; d += nsize * 8) { > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > asm volatile (".option push\n" > ".option arch,+v\n" > @@ -775,14 +780,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vmv.v.v v29, v28\n" > ".option pop\n" > : : > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), > - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), > - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]), > - [wp4]"r"(&dptr[z0][d + 4 * NSIZE]), > - [wp5]"r"(&dptr[z0][d + 5 * NSIZE]), > - [wp6]"r"(&dptr[z0][d + 6 * NSIZE]), > - [wp7]"r"(&dptr[z0][d + 7 * NSIZE]) > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > + [wp1]"r"(&dptr[z0][d + 1 * nsize]), > + [wp2]"r"(&dptr[z0][d + 2 * nsize]), > + [wp3]"r"(&dptr[z0][d + 3 * nsize]), > + [wp4]"r"(&dptr[z0][d + 4 * nsize]), > + [wp5]"r"(&dptr[z0][d + 5 * nsize]), > + [wp6]"r"(&dptr[z0][d + 6 * nsize]), > + [wp7]"r"(&dptr[z0][d + 7 * nsize]) > ); > > for (z = z0 - 1; z >= 0; z--) { > @@ -862,14 +867,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vxor.vv v28, v28, v30\n" > ".option pop\n" > : : > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), > - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), > - [wd4]"r"(&dptr[z][d + 4 * NSIZE]), > - [wd5]"r"(&dptr[z][d + 5 * NSIZE]), > - [wd6]"r"(&dptr[z][d + 6 * NSIZE]), > - [wd7]"r"(&dptr[z][d + 7 * NSIZE]), > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > + [wd1]"r"(&dptr[z][d + 1 * 
nsize]), > + [wd2]"r"(&dptr[z][d + 2 * nsize]), > + [wd3]"r"(&dptr[z][d + 3 * nsize]), > + [wd4]"r"(&dptr[z][d + 4 * nsize]), > + [wd5]"r"(&dptr[z][d + 5 * nsize]), > + [wd6]"r"(&dptr[z][d + 6 * nsize]), > + [wd7]"r"(&dptr[z][d + 7 * nsize]), > [x1d]"r"(0x1d) > ); > } > @@ -898,22 +903,22 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > "vse8.v v29, (%[wq7])\n" > ".option pop\n" > : : > - [wp0]"r"(&p[d + NSIZE * 0]), > - [wq0]"r"(&q[d + NSIZE * 0]), > - [wp1]"r"(&p[d + NSIZE * 1]), > - [wq1]"r"(&q[d + NSIZE * 1]), > - [wp2]"r"(&p[d + NSIZE * 2]), > - [wq2]"r"(&q[d + NSIZE * 2]), > - [wp3]"r"(&p[d + NSIZE * 3]), > - [wq3]"r"(&q[d + NSIZE * 3]), > - [wp4]"r"(&p[d + NSIZE * 4]), > - [wq4]"r"(&q[d + NSIZE * 4]), > - [wp5]"r"(&p[d + NSIZE * 5]), > - [wq5]"r"(&q[d + NSIZE * 5]), > - [wp6]"r"(&p[d + NSIZE * 6]), > - [wq6]"r"(&q[d + NSIZE * 6]), > - [wp7]"r"(&p[d + NSIZE * 7]), > - [wq7]"r"(&q[d + NSIZE * 7]) > + [wp0]"r"(&p[d + nsize * 0]), > + [wq0]"r"(&q[d + nsize * 0]), > + [wp1]"r"(&p[d + nsize * 1]), > + [wq1]"r"(&q[d + nsize * 1]), > + [wp2]"r"(&p[d + nsize * 2]), > + [wq2]"r"(&q[d + nsize * 2]), > + [wp3]"r"(&p[d + nsize * 3]), > + [wq3]"r"(&q[d + nsize * 3]), > + [wp4]"r"(&p[d + nsize * 4]), > + [wq4]"r"(&q[d + nsize * 4]), > + [wp5]"r"(&p[d + nsize * 5]), > + [wq5]"r"(&q[d + nsize * 5]), > + [wp6]"r"(&p[d + nsize * 6]), > + [wq6]"r"(&q[d + nsize * 6]), > + [wp7]"r"(&p[d + nsize * 7]), > + [wq7]"r"(&q[d + nsize * 7]) > ); > } > } > @@ -923,7 +928,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > { > u8 **dptr = (u8 **)ptrs; > u8 *p, *q; > - unsigned long vl, d; > + unsigned long vl, d, nsize; > int z, z0; > > z0 = stop; /* P/Q right side optimization */ > @@ -937,6 +942,8 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > : "=&r" (vl) > ); > > + nsize = vl; > + > /* > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > @@ -947,7 +954,7 @@ static void 
raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > *v24:wp6,v25:wq6,v26:wd6/w26,v27:w16 > *v28:wp7,v29:wq7,v30:wd7/w27,v31:w17 > */ > - for (d = 0; d < bytes; d += NSIZE * 8) { > + for (d = 0; d < bytes; d += nsize * 8) { > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > asm volatile (".option push\n" > ".option arch,+v\n" > @@ -969,14 +976,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > "vmv.v.v v29, v28\n" > ".option pop\n" > : : > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), > - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), > - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]), > - [wp4]"r"(&dptr[z0][d + 4 * NSIZE]), > - [wp5]"r"(&dptr[z0][d + 5 * NSIZE]), > - [wp6]"r"(&dptr[z0][d + 6 * NSIZE]), > - [wp7]"r"(&dptr[z0][d + 7 * NSIZE]) > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > + [wp1]"r"(&dptr[z0][d + 1 * nsize]), > + [wp2]"r"(&dptr[z0][d + 2 * nsize]), > + [wp3]"r"(&dptr[z0][d + 3 * nsize]), > + [wp4]"r"(&dptr[z0][d + 4 * nsize]), > + [wp5]"r"(&dptr[z0][d + 5 * nsize]), > + [wp6]"r"(&dptr[z0][d + 6 * nsize]), > + [wp7]"r"(&dptr[z0][d + 7 * nsize]) > ); > > /* P/Q data pages */ > @@ -1057,14 +1064,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > "vxor.vv v28, v28, v30\n" > ".option pop\n" > : : > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), > - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), > - [wd4]"r"(&dptr[z][d + 4 * NSIZE]), > - [wd5]"r"(&dptr[z][d + 5 * NSIZE]), > - [wd6]"r"(&dptr[z][d + 6 * NSIZE]), > - [wd7]"r"(&dptr[z][d + 7 * NSIZE]), > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > + [wd2]"r"(&dptr[z][d + 2 * nsize]), > + [wd3]"r"(&dptr[z][d + 3 * nsize]), > + [wd4]"r"(&dptr[z][d + 4 * nsize]), > + [wd5]"r"(&dptr[z][d + 5 * nsize]), > + [wd6]"r"(&dptr[z][d + 6 * nsize]), > + [wd7]"r"(&dptr[z][d + 7 * nsize]), > [x1d]"r"(0x1d) > ); > } > @@ -1195,22 +1202,22 @@ static void 
raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > "vse8.v v31, (%[wq7])\n" > ".option pop\n" > : : > - [wp0]"r"(&p[d + NSIZE * 0]), > - [wq0]"r"(&q[d + NSIZE * 0]), > - [wp1]"r"(&p[d + NSIZE * 1]), > - [wq1]"r"(&q[d + NSIZE * 1]), > - [wp2]"r"(&p[d + NSIZE * 2]), > - [wq2]"r"(&q[d + NSIZE * 2]), > - [wp3]"r"(&p[d + NSIZE * 3]), > - [wq3]"r"(&q[d + NSIZE * 3]), > - [wp4]"r"(&p[d + NSIZE * 4]), > - [wq4]"r"(&q[d + NSIZE * 4]), > - [wp5]"r"(&p[d + NSIZE * 5]), > - [wq5]"r"(&q[d + NSIZE * 5]), > - [wp6]"r"(&p[d + NSIZE * 6]), > - [wq6]"r"(&q[d + NSIZE * 6]), > - [wp7]"r"(&p[d + NSIZE * 7]), > - [wq7]"r"(&q[d + NSIZE * 7]) > + [wp0]"r"(&p[d + nsize * 0]), > + [wq0]"r"(&q[d + nsize * 0]), > + [wp1]"r"(&p[d + nsize * 1]), > + [wq1]"r"(&q[d + nsize * 1]), > + [wp2]"r"(&p[d + nsize * 2]), > + [wq2]"r"(&q[d + nsize * 2]), > + [wp3]"r"(&p[d + nsize * 3]), > + [wq3]"r"(&q[d + nsize * 3]), > + [wp4]"r"(&p[d + nsize * 4]), > + [wq4]"r"(&q[d + nsize * 4]), > + [wp5]"r"(&p[d + nsize * 5]), > + [wq5]"r"(&q[d + nsize * 5]), > + [wp6]"r"(&p[d + nsize * 6]), > + [wq6]"r"(&q[d + nsize * 6]), > + [wp7]"r"(&p[d + nsize * 7]), > + [wq7]"r"(&q[d + nsize * 7]) > ); > } > } > diff --git a/lib/raid6/rvv.h b/lib/raid6/rvv.h > index 94044a1b707b..6d0708a2c8a4 100644 > --- a/lib/raid6/rvv.h > +++ b/lib/raid6/rvv.h > @@ -7,6 +7,23 @@ > * Definitions for RISC-V RAID-6 code > */ > > +#ifdef __KERNEL__ > +#include <asm/vector.h> > +#else > +#define kernel_vector_begin() > +#define kernel_vector_end() > +#include <sys/auxv.h> > +#include <asm/hwcap.h> > +#define has_vector() (getauxval(AT_HWCAP) & COMPAT_HWCAP_ISA_V) > +#endif > + > +#include <linux/raid/pq.h> > + > +static int rvv_has_vector(void) > +{ > + return has_vector(); > +} > + > #define RAID6_RVV_WRAPPER(_n) \ > static void raid6_rvv ## _n ## _gen_syndrome(int disks, \ > size_t bytes, void **ptrs) \ Otherwise, looks good: Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Thanks, Alex ^ permalink raw reply [flat|nested] 14+ 
messages in thread
* Re: [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace 2025-07-17 7:04 ` Alexandre Ghiti @ 2025-07-17 7:39 ` Chunyan Zhang 0 siblings, 0 replies; 14+ messages in thread From: Chunyan Zhang @ 2025-07-17 7:39 UTC (permalink / raw) To: Alexandre Ghiti Cc: Chunyan Zhang, Paul Walmsley, Palmer Dabbelt, Albert Ou, Charlie Jenkins, Song Liu, Yu Kuai, linux-riscv, linux-raid, linux-kernel Hi Alex, On Thu, 17 Jul 2025 at 15:04, Alexandre Ghiti <alex@ghiti.fr> wrote: > > On 7/11/25 12:09, Chunyan Zhang wrote: > > To support userspace raid6test, this patch adds __KERNEL__ ifdef for kernel > > header inclusions also userspace wrapper definitions to allow code to be > > compiled in userspace. > > > > This patch also drops the NSIZE macro, instead of using the vector length, > > which can work for both kernel and user space. > > > > Signed-off-by: Chunyan Zhang<zhangchunyan@iscas.ac.cn> > > --- > > lib/raid6/recov_rvv.c | 7 +- > > lib/raid6/rvv.c | 297 +++++++++++++++++++++--------------------- > > lib/raid6/rvv.h | 17 +++ > > 3 files changed, 170 insertions(+), 151 deletions(-) > > > > diff --git a/lib/raid6/recov_rvv.c b/lib/raid6/recov_rvv.c > > index 500da521a806..8f2be833c015 100644 > > --- a/lib/raid6/recov_rvv.c > > +++ b/lib/raid6/recov_rvv.c > > @@ -4,13 +4,8 @@ > > * Author: Chunyan Zhang<zhangchunyan@iscas.ac.cn> > > */ > > > > -#include <asm/vector.h> > > #include <linux/raid/pq.h> > > - > > -static int rvv_has_vector(void) > > -{ > > - return has_vector(); > > -} > > +#include "rvv.h" > > > > static void __raid6_2data_recov_rvv(int bytes, u8 *p, u8 *q, u8 *dp, > > u8 *dq, const u8 *pbmul, > > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c > > index 015f3ee4da25..75c9dafedb28 100644 > > --- a/lib/raid6/rvv.c > > +++ b/lib/raid6/rvv.c > > @@ -9,17 +9,8 @@ > > * Copyright 2002-2004 H. 
Peter Anvin > > */ > > > > -#include <asm/vector.h> > > -#include <linux/raid/pq.h> > > #include "rvv.h" > > > > -#define NSIZE (riscv_v_vsize / 32) /* NSIZE = vlenb */ > > - > > -static int rvv_has_vector(void) > > -{ > > - return has_vector(); > > -} > > - > > #ifdef __riscv_vector > > #error "This code must be built without compiler support for vector" > > #endif > > @@ -28,7 +19,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > > { > > u8 **dptr = (u8 **)ptrs; > > u8 *p, *q; > > - unsigned long vl, d; > > + unsigned long vl, d, nsize; > > int z, z0; > > > > z0 = disks - 3; /* Highest data disk */ > > @@ -42,8 +33,10 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > > : "=&r" (vl) > > ); > > > > + nsize = vl; > > + > > /*v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 */ > > - for (d = 0; d < bytes; d += NSIZE * 1) { > > + for (d = 0; d < bytes; d += nsize * 1) { > > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > > > You missed a few NSIZE in comments These comments come from int.uc and neon.uc. 
I left NSIZE in the comments on purpose, my thought was that would make this code more readable through matching to the int.uc or neon.uc :) > > > > asm volatile (".option push\n" > > ".option arch,+v\n" > > @@ -51,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vmv.v.v v1, v0\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) > > + [wp0]"r"(&dptr[z0][d + 0 * nsize]) > > ); > > > > for (z = z0 - 1 ; z >= 0 ; z--) { > > @@ -75,7 +68,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vxor.vv v0, v0, v2\n" > > ".option pop\n" > > : : > > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > > [x1d]"r"(0x1d) > > ); > > } > > @@ -90,8 +83,8 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vse8.v v1, (%[wq0])\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&p[d + NSIZE * 0]), > > - [wq0]"r"(&q[d + NSIZE * 0]) > > + [wp0]"r"(&p[d + nsize * 0]), > > + [wq0]"r"(&q[d + nsize * 0]) > > ); > > } > > } > > @@ -101,7 +94,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > > { > > u8 **dptr = (u8 **)ptrs; > > u8 *p, *q; > > - unsigned long vl, d; > > + unsigned long vl, d, nsize; > > int z, z0; > > > > z0 = stop; /* P/Q right side optimization */ > > @@ -115,8 +108,10 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > > : "=&r" (vl) > > ); > > > > + nsize = vl; > > + > > /*v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 */ > > - for (d = 0 ; d < bytes ; d += NSIZE * 1) { > > + for (d = 0 ; d < bytes ; d += nsize * 1) { > > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > > asm volatile (".option push\n" > > ".option arch,+v\n" > > @@ -124,7 +119,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > > "vmv.v.v v1, v0\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) > > + [wp0]"r"(&dptr[z0][d + 0 * nsize]) > > ); > 
> > > /* P/Q data pages */ > > @@ -149,7 +144,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > > "vxor.vv v0, v0, v2\n" > > ".option pop\n" > > : : > > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > > [x1d]"r"(0x1d) > > ); > > } > > @@ -189,8 +184,8 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, > > "vse8.v v3, (%[wq0])\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&p[d + NSIZE * 0]), > > - [wq0]"r"(&q[d + NSIZE * 0]) > > + [wp0]"r"(&p[d + nsize * 0]), > > + [wq0]"r"(&q[d + nsize * 0]) > > ); > > } > > } > > @@ -199,7 +194,7 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > > { > > u8 **dptr = (u8 **)ptrs; > > u8 *p, *q; > > - unsigned long vl, d; > > + unsigned long vl, d, nsize; > > int z, z0; > > > > z0 = disks - 3; /* Highest data disk */ > > @@ -213,11 +208,13 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > > : "=&r" (vl) > > ); > > > > + nsize = vl; > > + > > /* > > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > > */ > > - for (d = 0; d < bytes; d += NSIZE * 2) { > > + for (d = 0; d < bytes; d += nsize * 2) { > > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > > asm volatile (".option push\n" > > ".option arch,+v\n" > > @@ -227,8 +224,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vmv.v.v v5, v4\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]) > > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > > + [wp1]"r"(&dptr[z0][d + 1 * nsize]) > > ); > > > > for (z = z0 - 1; z >= 0; z--) { > > @@ -260,8 +257,8 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vxor.vv v4, v4, v6\n" > > ".option pop\n" > > : : > > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > > 
+ [wd1]"r"(&dptr[z][d + 1 * nsize]), > > [x1d]"r"(0x1d) > > ); > > } > > @@ -278,10 +275,10 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vse8.v v5, (%[wq1])\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&p[d + NSIZE * 0]), > > - [wq0]"r"(&q[d + NSIZE * 0]), > > - [wp1]"r"(&p[d + NSIZE * 1]), > > - [wq1]"r"(&q[d + NSIZE * 1]) > > + [wp0]"r"(&p[d + nsize * 0]), > > + [wq0]"r"(&q[d + nsize * 0]), > > + [wp1]"r"(&p[d + nsize * 1]), > > + [wq1]"r"(&q[d + nsize * 1]) > > ); > > } > > } > > @@ -291,7 +288,7 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > > { > > u8 **dptr = (u8 **)ptrs; > > u8 *p, *q; > > - unsigned long vl, d; > > + unsigned long vl, d, nsize; > > int z, z0; > > > > z0 = stop; /* P/Q right side optimization */ > > @@ -305,11 +302,13 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > > : "=&r" (vl) > > ); > > > > + nsize = vl; > > + > > /* > > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > > */ > > - for (d = 0; d < bytes; d += NSIZE * 2) { > > + for (d = 0; d < bytes; d += nsize * 2) { > > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > > asm volatile (".option push\n" > > ".option arch,+v\n" > > @@ -319,8 +318,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > > "vmv.v.v v5, v4\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]) > > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > > + [wp1]"r"(&dptr[z0][d + 1 * nsize]) > > ); > > > > /* P/Q data pages */ > > @@ -353,8 +352,8 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > > "vxor.vv v4, v4, v6\n" > > ".option pop\n" > > : : > > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > > [x1d]"r"(0x1d) > > ); > > } > > @@ -407,10 +406,10 @@ static void 
raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, > > "vse8.v v7, (%[wq1])\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&p[d + NSIZE * 0]), > > - [wq0]"r"(&q[d + NSIZE * 0]), > > - [wp1]"r"(&p[d + NSIZE * 1]), > > - [wq1]"r"(&q[d + NSIZE * 1]) > > + [wp0]"r"(&p[d + nsize * 0]), > > + [wq0]"r"(&q[d + nsize * 0]), > > + [wp1]"r"(&p[d + nsize * 1]), > > + [wq1]"r"(&q[d + nsize * 1]) > > ); > > } > > } > > @@ -419,7 +418,7 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > > { > > u8 **dptr = (u8 **)ptrs; > > u8 *p, *q; > > - unsigned long vl, d; > > + unsigned long vl, d, nsize; > > int z, z0; > > > > z0 = disks - 3; /* Highest data disk */ > > @@ -433,13 +432,15 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > > : "=&r" (vl) > > ); > > > > + nsize = vl; > > + > > /* > > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > > *v8:wp2,v9:wq2,v10:wd2/w22,v11:w12 > > *v12:wp3,v13:wq3,v14:wd3/w23,v15:w13 > > */ > > - for (d = 0; d < bytes; d += NSIZE * 4) { > > + for (d = 0; d < bytes; d += nsize * 4) { > > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > > asm volatile (".option push\n" > > ".option arch,+v\n" > > @@ -453,10 +454,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vmv.v.v v13, v12\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), > > - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), > > - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]) > > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > > + [wp1]"r"(&dptr[z0][d + 1 * nsize]), > > + [wp2]"r"(&dptr[z0][d + 2 * nsize]), > > + [wp3]"r"(&dptr[z0][d + 3 * nsize]) > > ); > > > > for (z = z0 - 1; z >= 0; z--) { > > @@ -504,10 +505,10 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vxor.vv v12, v12, v14\n" > > ".option pop\n" > > : : > > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > > - 
[wd1]"r"(&dptr[z][d + 1 * NSIZE]), > > - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), > > - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), > > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > > + [wd2]"r"(&dptr[z][d + 2 * nsize]), > > + [wd3]"r"(&dptr[z][d + 3 * nsize]), > > [x1d]"r"(0x1d) > > ); > > } > > @@ -528,14 +529,14 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vse8.v v13, (%[wq3])\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&p[d + NSIZE * 0]), > > - [wq0]"r"(&q[d + NSIZE * 0]), > > - [wp1]"r"(&p[d + NSIZE * 1]), > > - [wq1]"r"(&q[d + NSIZE * 1]), > > - [wp2]"r"(&p[d + NSIZE * 2]), > > - [wq2]"r"(&q[d + NSIZE * 2]), > > - [wp3]"r"(&p[d + NSIZE * 3]), > > - [wq3]"r"(&q[d + NSIZE * 3]) > > + [wp0]"r"(&p[d + nsize * 0]), > > + [wq0]"r"(&q[d + nsize * 0]), > > + [wp1]"r"(&p[d + nsize * 1]), > > + [wq1]"r"(&q[d + nsize * 1]), > > + [wp2]"r"(&p[d + nsize * 2]), > > + [wq2]"r"(&q[d + nsize * 2]), > > + [wp3]"r"(&p[d + nsize * 3]), > > + [wq3]"r"(&q[d + nsize * 3]) > > ); > > } > > } > > @@ -545,7 +546,7 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > > { > > u8 **dptr = (u8 **)ptrs; > > u8 *p, *q; > > - unsigned long vl, d; > > + unsigned long vl, d, nsize; > > int z, z0; > > > > z0 = stop; /* P/Q right side optimization */ > > @@ -559,13 +560,15 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > > : "=&r" (vl) > > ); > > > > + nsize = vl; > > + > > /* > > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > > *v8:wp2,v9:wq2,v10:wd2/w22,v11:w12 > > *v12:wp3,v13:wq3,v14:wd3/w23,v15:w13 > > */ > > - for (d = 0; d < bytes; d += NSIZE * 4) { > > + for (d = 0; d < bytes; d += nsize * 4) { > > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > > asm volatile (".option push\n" > > ".option arch,+v\n" > > @@ -579,10 +582,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > > "vmv.v.v v13, v12\n" > > ".option 
pop\n" > > : : > > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), > > - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), > > - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]) > > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > > + [wp1]"r"(&dptr[z0][d + 1 * nsize]), > > + [wp2]"r"(&dptr[z0][d + 2 * nsize]), > > + [wp3]"r"(&dptr[z0][d + 3 * nsize]) > > ); > > > > /* P/Q data pages */ > > @@ -631,10 +634,10 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > > "vxor.vv v12, v12, v14\n" > > ".option pop\n" > > : : > > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > > - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), > > - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), > > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > > + [wd2]"r"(&dptr[z][d + 2 * nsize]), > > + [wd3]"r"(&dptr[z][d + 3 * nsize]), > > [x1d]"r"(0x1d) > > ); > > } > > @@ -713,14 +716,14 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, > > "vse8.v v15, (%[wq3])\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&p[d + NSIZE * 0]), > > - [wq0]"r"(&q[d + NSIZE * 0]), > > - [wp1]"r"(&p[d + NSIZE * 1]), > > - [wq1]"r"(&q[d + NSIZE * 1]), > > - [wp2]"r"(&p[d + NSIZE * 2]), > > - [wq2]"r"(&q[d + NSIZE * 2]), > > - [wp3]"r"(&p[d + NSIZE * 3]), > > - [wq3]"r"(&q[d + NSIZE * 3]) > > + [wp0]"r"(&p[d + nsize * 0]), > > + [wq0]"r"(&q[d + nsize * 0]), > > + [wp1]"r"(&p[d + nsize * 1]), > > + [wq1]"r"(&q[d + nsize * 1]), > > + [wp2]"r"(&p[d + nsize * 2]), > > + [wq2]"r"(&q[d + nsize * 2]), > > + [wp3]"r"(&p[d + nsize * 3]), > > + [wq3]"r"(&q[d + nsize * 3]) > > ); > > } > > } > > @@ -729,7 +732,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > > { > > u8 **dptr = (u8 **)ptrs; > > u8 *p, *q; > > - unsigned long vl, d; > > + unsigned long vl, d, nsize; > > int z, z0; > > > > z0 = disks - 3; /* Highest data disk */ > > @@ -743,6 +746,8 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned 
long bytes, void ** > > : "=&r" (vl) > > ); > > > > + nsize = vl; > > + > > /* > > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > > @@ -753,7 +758,7 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > > *v24:wp6,v25:wq6,v26:wd6/w26,v27:w16 > > *v28:wp7,v29:wq7,v30:wd7/w27,v31:w17 > > */ > > - for (d = 0; d < bytes; d += NSIZE * 8) { > > + for (d = 0; d < bytes; d += nsize * 8) { > > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > > asm volatile (".option push\n" > > ".option arch,+v\n" > > @@ -775,14 +780,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vmv.v.v v29, v28\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), > > - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), > > - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]), > > - [wp4]"r"(&dptr[z0][d + 4 * NSIZE]), > > - [wp5]"r"(&dptr[z0][d + 5 * NSIZE]), > > - [wp6]"r"(&dptr[z0][d + 6 * NSIZE]), > > - [wp7]"r"(&dptr[z0][d + 7 * NSIZE]) > > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > > + [wp1]"r"(&dptr[z0][d + 1 * nsize]), > > + [wp2]"r"(&dptr[z0][d + 2 * nsize]), > > + [wp3]"r"(&dptr[z0][d + 3 * nsize]), > > + [wp4]"r"(&dptr[z0][d + 4 * nsize]), > > + [wp5]"r"(&dptr[z0][d + 5 * nsize]), > > + [wp6]"r"(&dptr[z0][d + 6 * nsize]), > > + [wp7]"r"(&dptr[z0][d + 7 * nsize]) > > ); > > > > for (z = z0 - 1; z >= 0; z--) { > > @@ -862,14 +867,14 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vxor.vv v28, v28, v30\n" > > ".option pop\n" > > : : > > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > > - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), > > - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), > > - [wd4]"r"(&dptr[z][d + 4 * NSIZE]), > > - [wd5]"r"(&dptr[z][d + 5 * NSIZE]), > > - [wd6]"r"(&dptr[z][d + 6 * NSIZE]), > > - [wd7]"r"(&dptr[z][d + 7 * NSIZE]), > > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > > + [wd1]"r"(&dptr[z][d + 
1 * nsize]), > > + [wd2]"r"(&dptr[z][d + 2 * nsize]), > > + [wd3]"r"(&dptr[z][d + 3 * nsize]), > > + [wd4]"r"(&dptr[z][d + 4 * nsize]), > > + [wd5]"r"(&dptr[z][d + 5 * nsize]), > > + [wd6]"r"(&dptr[z][d + 6 * nsize]), > > + [wd7]"r"(&dptr[z][d + 7 * nsize]), > > [x1d]"r"(0x1d) > > ); > > } > > @@ -898,22 +903,22 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void ** > > "vse8.v v29, (%[wq7])\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&p[d + NSIZE * 0]), > > - [wq0]"r"(&q[d + NSIZE * 0]), > > - [wp1]"r"(&p[d + NSIZE * 1]), > > - [wq1]"r"(&q[d + NSIZE * 1]), > > - [wp2]"r"(&p[d + NSIZE * 2]), > > - [wq2]"r"(&q[d + NSIZE * 2]), > > - [wp3]"r"(&p[d + NSIZE * 3]), > > - [wq3]"r"(&q[d + NSIZE * 3]), > > - [wp4]"r"(&p[d + NSIZE * 4]), > > - [wq4]"r"(&q[d + NSIZE * 4]), > > - [wp5]"r"(&p[d + NSIZE * 5]), > > - [wq5]"r"(&q[d + NSIZE * 5]), > > - [wp6]"r"(&p[d + NSIZE * 6]), > > - [wq6]"r"(&q[d + NSIZE * 6]), > > - [wp7]"r"(&p[d + NSIZE * 7]), > > - [wq7]"r"(&q[d + NSIZE * 7]) > > + [wp0]"r"(&p[d + nsize * 0]), > > + [wq0]"r"(&q[d + nsize * 0]), > > + [wp1]"r"(&p[d + nsize * 1]), > > + [wq1]"r"(&q[d + nsize * 1]), > > + [wp2]"r"(&p[d + nsize * 2]), > > + [wq2]"r"(&q[d + nsize * 2]), > > + [wp3]"r"(&p[d + nsize * 3]), > > + [wq3]"r"(&q[d + nsize * 3]), > > + [wp4]"r"(&p[d + nsize * 4]), > > + [wq4]"r"(&q[d + nsize * 4]), > > + [wp5]"r"(&p[d + nsize * 5]), > > + [wq5]"r"(&q[d + nsize * 5]), > > + [wp6]"r"(&p[d + nsize * 6]), > > + [wq6]"r"(&q[d + nsize * 6]), > > + [wp7]"r"(&p[d + nsize * 7]), > > + [wq7]"r"(&q[d + nsize * 7]) > > ); > > } > > } > > @@ -923,7 +928,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > > { > > u8 **dptr = (u8 **)ptrs; > > u8 *p, *q; > > - unsigned long vl, d; > > + unsigned long vl, d, nsize; > > int z, z0; > > > > z0 = stop; /* P/Q right side optimization */ > > @@ -937,6 +942,8 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > > : "=&r" (vl) > > ); > > > > + 
nsize = vl; > > + > > /* > > *v0:wp0,v1:wq0,v2:wd0/w20,v3:w10 > > *v4:wp1,v5:wq1,v6:wd1/w21,v7:w11 > > @@ -947,7 +954,7 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > > *v24:wp6,v25:wq6,v26:wd6/w26,v27:w16 > > *v28:wp7,v29:wq7,v30:wd7/w27,v31:w17 > > */ > > - for (d = 0; d < bytes; d += NSIZE * 8) { > > + for (d = 0; d < bytes; d += nsize * 8) { > > /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ > > asm volatile (".option push\n" > > ".option arch,+v\n" > > @@ -969,14 +976,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > > "vmv.v.v v29, v28\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), > > - [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), > > - [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), > > - [wp3]"r"(&dptr[z0][d + 3 * NSIZE]), > > - [wp4]"r"(&dptr[z0][d + 4 * NSIZE]), > > - [wp5]"r"(&dptr[z0][d + 5 * NSIZE]), > > - [wp6]"r"(&dptr[z0][d + 6 * NSIZE]), > > - [wp7]"r"(&dptr[z0][d + 7 * NSIZE]) > > + [wp0]"r"(&dptr[z0][d + 0 * nsize]), > > + [wp1]"r"(&dptr[z0][d + 1 * nsize]), > > + [wp2]"r"(&dptr[z0][d + 2 * nsize]), > > + [wp3]"r"(&dptr[z0][d + 3 * nsize]), > > + [wp4]"r"(&dptr[z0][d + 4 * nsize]), > > + [wp5]"r"(&dptr[z0][d + 5 * nsize]), > > + [wp6]"r"(&dptr[z0][d + 6 * nsize]), > > + [wp7]"r"(&dptr[z0][d + 7 * nsize]) > > ); > > > > /* P/Q data pages */ > > @@ -1057,14 +1064,14 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > > "vxor.vv v28, v28, v30\n" > > ".option pop\n" > > : : > > - [wd0]"r"(&dptr[z][d + 0 * NSIZE]), > > - [wd1]"r"(&dptr[z][d + 1 * NSIZE]), > > - [wd2]"r"(&dptr[z][d + 2 * NSIZE]), > > - [wd3]"r"(&dptr[z][d + 3 * NSIZE]), > > - [wd4]"r"(&dptr[z][d + 4 * NSIZE]), > > - [wd5]"r"(&dptr[z][d + 5 * NSIZE]), > > - [wd6]"r"(&dptr[z][d + 6 * NSIZE]), > > - [wd7]"r"(&dptr[z][d + 7 * NSIZE]), > > + [wd0]"r"(&dptr[z][d + 0 * nsize]), > > + [wd1]"r"(&dptr[z][d + 1 * nsize]), > > + [wd2]"r"(&dptr[z][d + 2 * nsize]), > > + [wd3]"r"(&dptr[z][d + 3 * 
nsize]), > > + [wd4]"r"(&dptr[z][d + 4 * nsize]), > > + [wd5]"r"(&dptr[z][d + 5 * nsize]), > > + [wd6]"r"(&dptr[z][d + 6 * nsize]), > > + [wd7]"r"(&dptr[z][d + 7 * nsize]), > > [x1d]"r"(0x1d) > > ); > > } > > @@ -1195,22 +1202,22 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, > > "vse8.v v31, (%[wq7])\n" > > ".option pop\n" > > : : > > - [wp0]"r"(&p[d + NSIZE * 0]), > > - [wq0]"r"(&q[d + NSIZE * 0]), > > - [wp1]"r"(&p[d + NSIZE * 1]), > > - [wq1]"r"(&q[d + NSIZE * 1]), > > - [wp2]"r"(&p[d + NSIZE * 2]), > > - [wq2]"r"(&q[d + NSIZE * 2]), > > - [wp3]"r"(&p[d + NSIZE * 3]), > > - [wq3]"r"(&q[d + NSIZE * 3]), > > - [wp4]"r"(&p[d + NSIZE * 4]), > > - [wq4]"r"(&q[d + NSIZE * 4]), > > - [wp5]"r"(&p[d + NSIZE * 5]), > > - [wq5]"r"(&q[d + NSIZE * 5]), > > - [wp6]"r"(&p[d + NSIZE * 6]), > > - [wq6]"r"(&q[d + NSIZE * 6]), > > - [wp7]"r"(&p[d + NSIZE * 7]), > > - [wq7]"r"(&q[d + NSIZE * 7]) > > + [wp0]"r"(&p[d + nsize * 0]), > > + [wq0]"r"(&q[d + nsize * 0]), > > + [wp1]"r"(&p[d + nsize * 1]), > > + [wq1]"r"(&q[d + nsize * 1]), > > + [wp2]"r"(&p[d + nsize * 2]), > > + [wq2]"r"(&q[d + nsize * 2]), > > + [wp3]"r"(&p[d + nsize * 3]), > > + [wq3]"r"(&q[d + nsize * 3]), > > + [wp4]"r"(&p[d + nsize * 4]), > > + [wq4]"r"(&q[d + nsize * 4]), > > + [wp5]"r"(&p[d + nsize * 5]), > > + [wq5]"r"(&q[d + nsize * 5]), > > + [wp6]"r"(&p[d + nsize * 6]), > > + [wq6]"r"(&q[d + nsize * 6]), > > + [wp7]"r"(&p[d + nsize * 7]), > > + [wq7]"r"(&q[d + nsize * 7]) > > ); > > } > > } > > diff --git a/lib/raid6/rvv.h b/lib/raid6/rvv.h > > index 94044a1b707b..6d0708a2c8a4 100644 > > --- a/lib/raid6/rvv.h > > +++ b/lib/raid6/rvv.h > > @@ -7,6 +7,23 @@ > > * Definitions for RISC-V RAID-6 code > > */ > > > > +#ifdef __KERNEL__ > > +#include <asm/vector.h> > > +#else > > +#define kernel_vector_begin() > > +#define kernel_vector_end() > > +#include <sys/auxv.h> > > +#include <asm/hwcap.h> > > +#define has_vector() (getauxval(AT_HWCAP) & COMPAT_HWCAP_ISA_V) > > +#endif > > + > > 
+#include <linux/raid/pq.h> > > + > > +static int rvv_has_vector(void) > > +{ > > + return has_vector(); > > +} > > + > > #define RAID6_RVV_WRAPPER(_n) \ > > static void raid6_rvv ## _n ## _gen_syndrome(int disks, \ > > size_t bytes, void **ptrs) \ > > > Otherwise, looks good: > > Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Thanks, Chunyan ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH V2 5/5] raid6: test: Add support for RISC-V 2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang ` (3 preceding siblings ...) 2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang @ 2025-07-11 10:09 ` Chunyan Zhang 4 siblings, 0 replies; 14+ messages in thread From: Chunyan Zhang @ 2025-07-11 10:09 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti, Charlie Jenkins, Song Liu, Yu Kuai Cc: linux-riscv, linux-raid, linux-kernel, Chunyan Zhang From: Chunyan Zhang <zhang.lyra@gmail.com> Add RISC-V code to be compiled to allow the userspace raid6test program to be built and run on RISC-V. Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com> --- lib/raid6/test/Makefile | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/lib/raid6/test/Makefile b/lib/raid6/test/Makefile index 8f2dd2210ba8..09bbe2b14cce 100644 --- a/lib/raid6/test/Makefile +++ b/lib/raid6/test/Makefile @@ -35,6 +35,11 @@ ifeq ($(ARCH),aarch64) HAS_NEON = yes endif +ifeq ($(findstring riscv,$(ARCH)),riscv) + CFLAGS += -I../../../arch/riscv/include -DCONFIG_RISCV=1 + HAS_RVV = yes +endif + ifeq ($(findstring ppc,$(ARCH)),ppc) CFLAGS += -I../../../arch/powerpc/include HAS_ALTIVEC := $(shell printf '$(pound)include <altivec.h>\nvector int a;\n' |\ @@ -63,6 +68,9 @@ else ifeq ($(HAS_ALTIVEC),yes) vpermxor1.o vpermxor2.o vpermxor4.o vpermxor8.o else ifeq ($(ARCH),loongarch64) OBJS += loongarch_simd.o recov_loongarch_simd.o +else ifeq ($(HAS_RVV),yes) + OBJS += rvv.o recov_rvv.o + CFLAGS += -DCONFIG_RISCV_ISA_V=1 endif .c.o: -- 2.34.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads: [~2025-07-21  7:52 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang
2025-07-16 13:38   ` Alexandre Ghiti
2025-07-21  7:52   ` Nutty Liu
2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation Chunyan Zhang
2025-07-16 13:40   ` Alexandre Ghiti
2025-07-17  2:16     ` Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang
2025-07-16 13:43   ` Alexandre Ghiti
2025-07-17  3:16     ` Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang
2025-07-17  7:04   ` Alexandre Ghiti
2025-07-17  7:39     ` Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 5/5] raid6: test: Add support for RISC-V Chunyan Zhang