From: Catalin Marinas <catalin.marinas@arm.com>
To: Mark Brown <broonie@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>,
Zhang Lei <zhang.lei@jp.fujitsu.com>,
Andre Przywara <andre.przywara@arm.com>,
Will Deacon <will@kernel.org>,
kvmarm@lists.cs.columbia.edu,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 7/7] arm64/sve: Don't zero non-FPSIMD register state on syscall by default
Date: Tue, 19 Jul 2022 18:35:37 +0100
Message-ID: <YtbraaDE0eJhRHkx@arm.com>
In-Reply-To: <20220620124158.482039-8-broonie@kernel.org>
Hi Mark,
On Mon, Jun 20, 2022 at 01:41:58PM +0100, Mark Brown wrote:
> The documented syscall ABI specifies that the SVE state not shared with
> FPSIMD is undefined after a syscall. Currently we implement this by
> always flushing this register state to zero, ensuring consistent
> behaviour but introducing some overhead in the case where we can return
> directly to userspace without otherwise needing to update the register
> state. Take advantage of the flexibility offered by the documented ABI
> and instead leave the SVE registers untouched in the case where we can
> return directly to userspace.
Do you have some rough numbers to quantify the gain? I suspect the
vector length doesn't matter much.
Where does the zeroing happen now? IIRC it's only done on a subsequent
trap to SVE and that's a lot more expensive (unless the code has changed
since last time I looked).
So if it's the actual subsequent trap that adds the overhead, maybe
zeroing the regs while leaving TIF_SVE on won't be that bad.
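For readers following the thread, here is a minimal user-space sketch of what "zeroing the state not shared with FPSIMD" amounts to. It is a model under stated assumptions, not the kernel's code: the struct layout and the names sve_model and flush_non_fpsimd are hypothetical. It relies only on the architectural facts that the low 128 bits of each Z register alias the corresponding FPSIMD V register, and that the predicate registers and FFR have no FPSIMD counterpart.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SVE_VL_MAX_BYTES  256  /* architectural maximum vector length: 2048 bits */
#define FPSIMD_VREG_BYTES 16   /* low 128 bits of Zn alias Vn */

/* Hypothetical model of one task's live SVE registers (not a kernel struct). */
struct sve_model {
    unsigned char z[32][SVE_VL_MAX_BYTES];      /* Z0-Z31 */
    unsigned char p[16][SVE_VL_MAX_BYTES / 8];  /* P0-P15, one bit per Z byte */
    unsigned char ffr[SVE_VL_MAX_BYTES / 8];    /* first-fault register */
};

/*
 * Zero everything that is not shared with FPSIMD at the given vector
 * length: the bytes of each Z register above the low 128 bits, all
 * predicate registers, and FFR. The low 128 bits are left untouched.
 */
void flush_non_fpsimd(struct sve_model *s, size_t vl_bytes)
{
    assert(vl_bytes >= FPSIMD_VREG_BYTES && vl_bytes <= SVE_VL_MAX_BYTES);
    assert(vl_bytes % FPSIMD_VREG_BYTES == 0);

    for (int i = 0; i < 32; i++)
        memset(&s->z[i][FPSIMD_VREG_BYTES], 0, vl_bytes - FPSIMD_VREG_BYTES);
    for (int i = 0; i < 16; i++)
        memset(s->p[i], 0, vl_bytes / 8);
    memset(s->ffr, 0, vl_bytes / 8);
}
```

In this model the work done is proportional to the vector length, but it is a handful of register-sized stores either way, which is why the cost of a later SVE access trap is the more interesting number to measure.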
> Since this is a user visible change a new sysctl abi.sve_syscall_clear_regs
> is provided which will restore the current behaviour of flushing the
> unshared register state unconditionally when enabled. This can be
> enabled for testing or to work around problems with applications that
> have been relying on the current flushing behaviour.
>
> The sysctl is disabled by default since it is anticipated that the risk
> of disruption to userspace is low. As well as being within the
> documented ABI this new behaviour mirrors the standard function call ABI
> for SVE in the AAPCS, which should mean that compiler-generated code is
> unlikely to rely on the current behaviour; the main risk is from
> hand-coded assembly which directly invokes syscalls. The new behaviour is
> also what is currently implemented by qemu user mode emulation.
IIRC both Will and Mark R commented in the past that they'd like the
current de-facto ABI to become the official one. I'll let them comment.
> @@ -183,7 +217,7 @@ static inline void fp_user_discard(void)
> if (!system_supports_sve())
> return;
>
> - if (test_thread_flag(TIF_SVE)) {
> + if (sve_syscall_regs_clear && test_thread_flag(TIF_SVE)) {
> unsigned int sve_vq_minus_one;
>
> sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
If we leave TIF_SVE on, does it mean that we incur an overhead on
context switching? E.g. something like hackbench with lots of syscalls
communicating between threads would unnecessarily context switch the SVE
state. Maybe there's something handling this but IIUC fpsimd_save()
seems to only check TIF_SVE.
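To put a rough size on that concern, here is a small sketch of how much register data a save path would move if it keyed only on TIF_SVE. It is an illustration, not the kernel's fpsimd_save(): the function names sve_regs_bytes and save_bytes are hypothetical, and the model counts only the vector, predicate and FFR registers, ignoring FPSR/FPCR and any bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FPSIMD_REGS_BYTES (32 * 16)  /* V0-V31, 128 bits each */

/* Bytes of register data in Z0-Z31, P0-P15 and FFR at a given vector length. */
size_t sve_regs_bytes(size_t vl_bytes)
{
    assert(vl_bytes >= 16 && vl_bytes % 16 == 0);
    return 32 * vl_bytes          /* Z registers: VL bytes each */
         + 16 * (vl_bytes / 8)    /* P registers: VL/8 bytes each */
         + vl_bytes / 8;          /* FFR: VL/8 bytes */
}

/* Hypothetical stand-in for a save path that looks only at TIF_SVE. */
size_t save_bytes(bool tif_sve, size_t vl_bytes)
{
    return tif_sve ? sve_regs_bytes(vl_bytes) : FPSIMD_REGS_BYTES;
}
```

Under these assumptions, a task with a 512-bit vector length (64 bytes) would save 2184 bytes per context switch with TIF_SVE left set, against 512 bytes for FPSIMD-only state.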
--
Catalin
Thread overview:
2022-06-20 12:41 [PATCH v2 0/7] arm64/sve: Clean up KVM integration and optimise syscalls Mark Brown
2022-06-20 12:41 ` [PATCH v2 1/7] KVM: arm64: Discard any SVE state when entering KVM guests Mark Brown
2022-06-20 12:41 ` [PATCH v2 2/7] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE Mark Brown
2022-07-11 9:40 ` Marc Zyngier
2022-07-11 11:39 ` Mark Brown
2022-07-11 14:33 ` Marc Zyngier
2022-07-11 15:53 ` Mark Brown
2022-07-20 9:40 ` Marc Zyngier
2022-07-20 13:51 ` Mark Brown
2022-06-20 12:41 ` [PATCH v2 3/7] arm64/fpsimd: Have KVM explicitly say which FP registers to save Mark Brown
2022-06-20 12:41 ` [PATCH v2 4/7] arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM Mark Brown
2022-06-20 12:41 ` [PATCH v2 5/7] arm64/fpsimd: Load FP state based on recorded data type Mark Brown
2022-06-20 12:41 ` [PATCH v2 6/7] arm64/sve: Leave SVE enabled on syscall if we don't context switch Mark Brown
2022-06-20 12:41 ` [PATCH v2 7/7] arm64/sve: Don't zero non-FPSIMD register state on syscall by default Mark Brown
2022-07-19 17:35 ` Catalin Marinas [this message]
2022-07-19 19:35 ` Mark Brown
2022-07-20 9:20 ` Will Deacon
2022-07-20 12:32 ` Mark Brown
2022-07-20 9:29 ` Marc Zyngier
2022-07-20 14:31 ` Mark Brown