From: Drew Fustini <fustini@kernel.org>
To: Palmer Dabbelt <palmer@dabbelt.com>
Cc: rkrcmar@ventanamicro.com, Bjorn Topel <bjorn@rivosinc.com>,
Alexandre Ghiti <alex@ghiti.fr>,
Paul Walmsley <paul.walmsley@sifive.com>,
samuel.holland@sifive.com, dfustini@tenstorrent.com,
andybnac@gmail.com, Conor Dooley <conor.dooley@microchip.com>,
linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
linux-riscv-bounces@lists.infradead.org
Subject: Re: [PATCH] riscv: Add sysctl to control discard of vstate during syscall
Date: Fri, 1 Aug 2025 14:41:51 -0700
Message-ID: <aI00nzzma4gXrmh/@x1>
In-Reply-To: <mhng-E49DDC7D-A330-4626-A122-4146AADDBB33@Palmers-Mini.rwc.dabbelt.com>

On Wed, Jul 30, 2025 at 06:05:59PM -0700, Palmer Dabbelt wrote:
> My first guess here would be that trashing the V register state is still
> faster on the machines that triggered this patch, it's just that the way
> we're trashing it is slow. We're doing some wacky things in there (VILL,
> LMUL, clearing to -1), so it's not surprising that some implementations are
> slow on these routines.
>
> This came up during the original patch and we decided to just go with this
> way (which is recommended by the ISA) until someone could demonstrate it's
> slow, so sounds like it's time to go revisit those.
>
> So I'd start with something like
>
> diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> index b61786d43c20..1fba33e62d2b 100644
> --- a/arch/riscv/include/asm/vector.h
> +++ b/arch/riscv/include/asm/vector.h
> @@ -287,7 +287,6 @@ static inline void __riscv_v_vstate_discard(void)
> "vmv.v.i v8, -1\n\t"
> "vmv.v.i v16, -1\n\t"
> "vmv.v.i v24, -1\n\t"
> - "vsetvl %0, x0, %1\n\t"
> ".option pop\n\t"
> : "=&r" (vl) : "r" (vtype_inval));
>
> to try and see if we're tripping over bad implementation behavior, in which
> case we can just hide this all in the kernel. Then we can split out these
> performance issues from other things like lazy save/restore and a
> V-preserving uABI, as it stands this is all sort of getting mixed up.

Thank you for your insights and the suggestion of removing vsetvl.

Using our v6.16-rc1 branch [1], the average duration of getppid() is
198 ns with the existing upstream behavior in __riscv_v_vstate_discard():

  debian@tt-blackhole:~$ ./null_syscall --vsetvli
  vsetvli complete
  iterations: 1000000000
  duration: 198 seconds
  avg latency: 198.10 ns

I removed 'vsetvl' as you suggested, but the average duration decreased
only slightly, to 197.5 ns, so it seems that the other instructions
account for most of the time on the X280 cores:

  debian@tt-blackhole:~$ ./null_syscall --vsetvli
  vsetvli complete
  iterations: 1000000000
  duration: 197 seconds
  avg latency: 197.53 ns

This compares to a duration of 150 ns when using this patch with
abi.riscv_v_vstate_discard=0, which skips all of the clobbering assembly.

Do you have any other suggestions for the __riscv_v_vstate_discard()
inline assembly that might be worth testing on the X280 cores?

Thanks,
Drew

[1] https://github.com/tenstorrent/linux/tree/tt-blackhole-v6.16-rc1
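
For readers who want to reproduce this kind of measurement, the following
is a minimal sketch of a null-syscall timing loop in C. It is not the
null_syscall program used above, whose source is not shown in this
message. The helper names now_ns() and avg_getppid_ns() are hypothetical,
and the --vsetvli option (which presumably touches vector state before
the loop) is not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: read the monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: average wall-clock latency of getppid(), in ns,
 * over 'iters' back-to-back calls. getppid() does almost no work in the
 * kernel, so the result is dominated by syscall entry and exit cost.
 */
static double avg_getppid_ns(uint64_t iters)
{
	uint64_t start, end, i;

	assert(iters > 0);

	start = now_ns();
	for (i = 0; i < iters; i++)
		(void)getppid();
	end = now_ns();

	return (double)(end - start) / (double)iters;
}
```

Calling avg_getppid_ns(1000000000) would mirror the iteration count in
the logs above. Running it with and without abi.riscv_v_vstate_discard=0
and subtracting the two averages gives the per-syscall cost of the
discard, which is about 48 ns in the numbers reported here.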