From: Chunyan Zhang <zhang.lyra@gmail.com>
To: Alexandre Ghiti <alex@ghiti.fr>
Cc: Chunyan Zhang <zhangchunyan@iscas.ac.cn>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Charlie Jenkins <charlie@rivosinc.com>,
Song Liu <song@kernel.org>, Yu Kuai <yukuai3@huawei.com>,
linux-riscv@lists.infradead.org, linux-raid@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation
Date: Thu, 17 Jul 2025 10:16:46 +0800
Message-ID: <CAAfSe-snJ3Z_p0UyS85AMiPWsCo976XAJREGN3V_UgisOKG3Sg@mail.gmail.com>
In-Reply-To: <8865f36d-c8a9-454b-aa55-741a82ca96b4@ghiti.fr>
On Wed, 16 Jul 2025 at 21:40, Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> On 7/11/25 12:09, Chunyan Zhang wrote:
> > Since wp$$ == wq$$ at this point, there is no need to load the same data
> > twice. Replace one of the two loads with a vector move instruction so
> > that the program runs faster.
> >
> > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> > ---
> > lib/raid6/rvv.c | 60 ++++++++++++++++++++++++-------------------------
> > 1 file changed, 30 insertions(+), 30 deletions(-)
> >
> > diff --git a/lib/raid6/rvv.c b/lib/raid6/rvv.c
> > index b193ea176d5d..89da5fc247aa 100644
> > --- a/lib/raid6/rvv.c
> > +++ b/lib/raid6/rvv.c
> > @@ -44,7 +44,7 @@ static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **
> > asm volatile (".option push\n"
> > ".option arch,+v\n"
> > "vle8.v v0, (%[wp0])\n"
> > - "vle8.v v1, (%[wp0])\n"
> > + "vmv.v.v v1, v0\n"
> > ".option pop\n"
> > : :
> > [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> > @@ -117,7 +117,7 @@ static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop,
> > asm volatile (".option push\n"
> > ".option arch,+v\n"
> > "vle8.v v0, (%[wp0])\n"
> > - "vle8.v v1, (%[wp0])\n"
> > + "vmv.v.v v1, v0\n"
> > ".option pop\n"
> > : :
> > [wp0]"r"(&dptr[z0][d + 0 * NSIZE])
> > @@ -218,9 +218,9 @@ static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **
> > asm volatile (".option push\n"
> > ".option arch,+v\n"
> > "vle8.v v0, (%[wp0])\n"
> > - "vle8.v v1, (%[wp0])\n"
> > + "vmv.v.v v1, v0\n"
> > "vle8.v v4, (%[wp1])\n"
> > - "vle8.v v5, (%[wp1])\n"
> > + "vmv.v.v v5, v4\n"
> > ".option pop\n"
> > : :
> > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -310,9 +310,9 @@ static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop,
> > asm volatile (".option push\n"
> > ".option arch,+v\n"
> > "vle8.v v0, (%[wp0])\n"
> > - "vle8.v v1, (%[wp0])\n"
> > + "vmv.v.v v1, v0\n"
> > "vle8.v v4, (%[wp1])\n"
> > - "vle8.v v5, (%[wp1])\n"
> > + "vmv.v.v v5, v4\n"
> > ".option pop\n"
> > : :
> > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -440,13 +440,13 @@ static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **
> > asm volatile (".option push\n"
> > ".option arch,+v\n"
> > "vle8.v v0, (%[wp0])\n"
> > - "vle8.v v1, (%[wp0])\n"
> > + "vmv.v.v v1, v0\n"
> > "vle8.v v4, (%[wp1])\n"
> > - "vle8.v v5, (%[wp1])\n"
> > + "vmv.v.v v5, v4\n"
> > "vle8.v v8, (%[wp2])\n"
> > - "vle8.v v9, (%[wp2])\n"
> > + "vmv.v.v v9, v8\n"
> > "vle8.v v12, (%[wp3])\n"
> > - "vle8.v v13, (%[wp3])\n"
> > + "vmv.v.v v13, v12\n"
> > ".option pop\n"
> > : :
> > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -566,13 +566,13 @@ static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop,
> > asm volatile (".option push\n"
> > ".option arch,+v\n"
> > "vle8.v v0, (%[wp0])\n"
> > - "vle8.v v1, (%[wp0])\n"
> > + "vmv.v.v v1, v0\n"
> > "vle8.v v4, (%[wp1])\n"
> > - "vle8.v v5, (%[wp1])\n"
> > + "vmv.v.v v5, v4\n"
> > "vle8.v v8, (%[wp2])\n"
> > - "vle8.v v9, (%[wp2])\n"
> > + "vmv.v.v v9, v8\n"
> > "vle8.v v12, (%[wp3])\n"
> > - "vle8.v v13, (%[wp3])\n"
> > + "vmv.v.v v13, v12\n"
> > ".option pop\n"
> > : :
> > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -754,21 +754,21 @@ static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **
> > asm volatile (".option push\n"
> > ".option arch,+v\n"
> > "vle8.v v0, (%[wp0])\n"
> > - "vle8.v v1, (%[wp0])\n"
> > + "vmv.v.v v1, v0\n"
> > "vle8.v v4, (%[wp1])\n"
> > - "vle8.v v5, (%[wp1])\n"
> > + "vmv.v.v v5, v4\n"
> > "vle8.v v8, (%[wp2])\n"
> > - "vle8.v v9, (%[wp2])\n"
> > + "vmv.v.v v9, v8\n"
> > "vle8.v v12, (%[wp3])\n"
> > - "vle8.v v13, (%[wp3])\n"
> > + "vmv.v.v v13, v12\n"
> > "vle8.v v16, (%[wp4])\n"
> > - "vle8.v v17, (%[wp4])\n"
> > + "vmv.v.v v17, v16\n"
> > "vle8.v v20, (%[wp5])\n"
> > - "vle8.v v21, (%[wp5])\n"
> > + "vmv.v.v v21, v20\n"
> > "vle8.v v24, (%[wp6])\n"
> > - "vle8.v v25, (%[wp6])\n"
> > + "vmv.v.v v25, v24\n"
> > "vle8.v v28, (%[wp7])\n"
> > - "vle8.v v29, (%[wp7])\n"
> > + "vmv.v.v v29, v28\n"
> > ".option pop\n"
> > : :
> > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
> > @@ -948,21 +948,21 @@ static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop,
> > asm volatile (".option push\n"
> > ".option arch,+v\n"
> > "vle8.v v0, (%[wp0])\n"
> > - "vle8.v v1, (%[wp0])\n"
> > + "vmv.v.v v1, v0\n"
> > "vle8.v v4, (%[wp1])\n"
> > - "vle8.v v5, (%[wp1])\n"
> > + "vmv.v.v v5, v4\n"
> > "vle8.v v8, (%[wp2])\n"
> > - "vle8.v v9, (%[wp2])\n"
> > + "vmv.v.v v9, v8\n"
> > "vle8.v v12, (%[wp3])\n"
> > - "vle8.v v13, (%[wp3])\n"
> > + "vmv.v.v v13, v12\n"
> > "vle8.v v16, (%[wp4])\n"
> > - "vle8.v v17, (%[wp4])\n"
> > + "vmv.v.v v17, v16\n"
> > "vle8.v v20, (%[wp5])\n"
> > - "vle8.v v21, (%[wp5])\n"
> > + "vmv.v.v v21, v20\n"
> > "vle8.v v24, (%[wp6])\n"
> > - "vle8.v v25, (%[wp6])\n"
> > + "vmv.v.v v25, v24\n"
> > "vle8.v v28, (%[wp7])\n"
> > - "vle8.v v29, (%[wp7])\n"
> > + "vmv.v.v v29, v28\n"
> > ".option pop\n"
> > : :
> > [wp0]"r"(&dptr[z0][d + 0 * NSIZE]),
>
>
> Out of curiosity, did you notice a gain?
Yes, I see a ~3% gain on my BPI-F3.
>
> Anyway:
>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
>
> Thanks,
>
> Alex
>
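For readers following the thread, here is a minimal scalar sketch of why the
move is safe. It is not the code from lib/raid6/rvv.c. It assumes the usual
RAID-6 layout (data blocks first, then P, then Q) and the GF(2^8) polynomial
0x11d. The names gf_mul2() and gen_syndrome_scalar() are hypothetical and were
chosen for illustration only. The point it shows is that wp and wq both start
from the same bytes of the highest data disk, so a single load followed by a
register copy yields the same result as two loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply one byte by 2 in GF(2^8), reduction polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
	return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Scalar model of syndrome generation (illustrative, hypothetical name).
 * dptr[0 .. disks-3] are data blocks, dptr[disks-2] is P, dptr[disks-1] is Q.
 */
static void gen_syndrome_scalar(int disks, size_t bytes, uint8_t **dptr)
{
	assert(disks >= 3);

	int z0 = disks - 3;		/* highest data disk */
	uint8_t *p = dptr[z0 + 1];
	uint8_t *q = dptr[z0 + 2];

	for (size_t d = 0; d < bytes; d++) {
		uint8_t wp = dptr[z0][d];	/* the single load */
		uint8_t wq = wp;		/* copy instead of a second load */

		for (int z = z0 - 1; z >= 0; z--) {
			uint8_t wd = dptr[z][d];

			wp ^= wd;			/* P: plain XOR */
			wq = gf_mul2(wq) ^ wd;		/* Q: Horner step */
		}
		p[d] = wp;
		q[d] = wq;
	}
}
```

In the vector version, the copy on the "wq = wp" line corresponds to the
vmv.v.v that replaces the second vle8.v from the same address.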
Thread overview: 14+ messages
2025-07-11 10:09 [PATCH V2 0/5] Add an optimization also raid6test for RISC-V support Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 1/5] raid6: riscv: Clean up unused header file inclusion Chunyan Zhang
2025-07-16 13:38 ` Alexandre Ghiti
2025-07-21 7:52 ` Nutty Liu
2025-07-11 10:09 ` [PATCH V2 2/5] raid6: riscv: replace one load with a move to speed up the caculation Chunyan Zhang
2025-07-16 13:40 ` Alexandre Ghiti
2025-07-17 2:16 ` Chunyan Zhang [this message]
2025-07-11 10:09 ` [PATCH V2 3/5] raid6: riscv: Add a compiler error Chunyan Zhang
2025-07-16 13:43 ` Alexandre Ghiti
2025-07-17 3:16 ` Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 4/5] raid6: riscv: Allow code to be compiled in userspace Chunyan Zhang
2025-07-17 7:04 ` Alexandre Ghiti
2025-07-17 7:39 ` Chunyan Zhang
2025-07-11 10:09 ` [PATCH V2 5/5] raid6: test: Add support for RISC-V Chunyan Zhang