From: Marc Zyngier <maz@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, broonie@kernel.org,
catalin.marinas@arm.com, eauger@redhat.com, fweimer@redhat.com,
jeremy.linton@arm.com, oliver.upton@linux.dev,
pbonzini@redhat.com, stable@vger.kernel.org,
wilco.dijkstra@arm.com, will@kernel.org
Subject: Re: [PATCH] KVM: arm64/sve: Ensure SVE is trapped after guest exit
Date: Tue, 21 Jan 2025 11:20:04 +0000 [thread overview]
Message-ID: <86r04wv2fv.wl-maz@kernel.org> (raw)
In-Reply-To: <20250121100026.3974971-1-mark.rutland@arm.com>
Hi Mark,
On Tue, 21 Jan 2025 10:00:26 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
>
> There is a period of time after returning from a KVM_RUN ioctl where
> userspace may use SVE without trapping, but the kernel can unexpectedly
> discard the live SVE state. Eric Auger has observed this causing QEMU
> crashes where SVE is used by memmove():
>
> https://issues.redhat.com/browse/RHEL-68997
>
> The only state discarded is the user SVE state of the task which issued
> the KVM_RUN ioctl. Other tasks are unaffected, plain FPSIMD state is
> unaffected, and kernel state is unaffected.
>
> This happens because fpsimd_kvm_prepare() incorrectly manipulates the
> FPSIMD/SVE state. When the vCPU is loaded, fpsimd_kvm_prepare()
> unconditionally clears TIF_SVE but does not reconfigure CPACR_EL1.ZEN to
> trap userspace SVE usage. If the vCPU does not use FPSIMD/SVE and hyp
> does not save the host's FPSIMD/SVE state, the kernel may return to
> userspace with TIF_SVE clear while SVE is still enabled in
> CPACR_EL1.ZEN. Subsequent userspace usage of SVE will not be trapped,
> and the next save of userspace FPSIMD/SVE state will only store the
> FPSIMD portion due to TIF_SVE being clear, discarding any SVE state.
>
> The broken logic was originally introduced in commit:
>
> 93ae6b01bafee8fa ("KVM: arm64: Discard any SVE state when entering KVM guests")
>
> ... though at the time fp_user_discard() would reconfigure CPACR_EL1.ZEN
> to trap subsequent SVE usage, masking the issue until that logic was
> removed in commit:
>
> 8c845e2731041f0f ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
>
> Avoid this issue by reconfiguring CPACR_EL1.ZEN when clearing
> TIF_SVE. At the same time, add a comment to explain why
> current->thread.fp_type must be set even though the FPSIMD state is not
> foreign. A similar issue exists when SME is enabled, and will require
> further rework. As SME currently depends on BROKEN, a BUILD_BUG() and
> comment are added for now, and this issue will need to be fixed properly
> in a follow-up patch.
>
> Commit 93ae6b01bafee8fa also introduced an unintended ptrace ABI change.
> Unconditionally clearing TIF_SVE regardless of whether the state is
> foreign discards saved SVE state created by ptrace after syscall entry.
> Avoid this by only clearing TIF_SVE when the FPSIMD/SVE state is not
> foreign. When the state is foreign, KVM hyp code does not need to save
> any host state, and so this will not affect KVM.
>
> There appear to be further issues with unintentional SVE state
> discarding, largely impacting ptrace and signal handling, which will
> need to be addressed in separate patches.
>
> Reported-by: Eric Auger <eauger@redhat.com>
> Reported-by: Wilco Dijkstra <wilco.dijkstra@arm.com>
> Cc: stable@vger.kernel.org
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Florian Weimer <fweimer@redhat.com>
> Cc: Jeremy Linton <jeremy.linton@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Will Deacon <will@kernel.org>
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> ---
> arch/arm64/kernel/fpsimd.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> I believe there are some other issues in this area, but I'm sending this
> out on its own because I beleive the other issues are more complex while
> this is self-contained, and people are actively hitting this case in
> production.
>
> I intend to follow-up with fixes for the other cases I mention in the
> commit message, and for the SME case with the BUILD_BUG_ON().
>
> Mark.
>
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 8c4c1a2186cc5..e4053a90ed240 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -1711,8 +1711,24 @@ void fpsimd_kvm_prepare(void)
> */
> get_cpu_fpsimd_context();
>
> - if (test_and_clear_thread_flag(TIF_SVE)) {
> - sve_to_fpsimd(current);
> + if (!test_thread_flag(TIF_FOREIGN_FPSTATE) &&
> + test_and_clear_thread_flag(TIF_SVE)) {
> + sve_user_disable();
I'm pretty happy with this fix. However...
> +
> + /*
> + * The KVM hyp code doesn't set fp_type when saving the host's
> + * FPSIMD state. Set fp_type here in case the hyp code saves
> + * the host state.
Should KVM do that? The comment seems to indicate that this is
papering over yet another bug...
> + *
> + * If hyp code does not save the host state, then the host
> + * state remains live on the CPU and saved fp_type is
> + * irrelevant until it is overwritten by a later call to
> + * fpsimd_save_user_state().
I'm not sure I understand this. If fp_type is irrelevant, surely it is
*forever* irrelevant, not until something else happens. Or am I
missing something?
> + *
> + * This is *NOT* sufficient when CONFIG_ARM64_SME=y, where
> + * fp_type can be FP_STATE_SVE regardless of TIF_SVE.
> + */
> + BUILD_BUG_ON(IS_ENABLED(CONFIG_ARM64_SME));
I'd rather not have this build-time failure, as this is very likely to
annoy a lot of people. Instead, just make SME unselectable with KVM:
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 100570a048c5e..88bedf95a3662 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2271,6 +2271,7 @@ config ARM64_SME
default y
depends on ARM64_SVE
depends on BROKEN
+ depends on !KVM
help
The Scalable Matrix Extension (SME) is an extension to the AArch64
execution state which utilises a substantial subset of the SVE
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
next prev parent reply other threads:[~2025-01-21 11:21 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-21 10:00 [PATCH] KVM: arm64/sve: Ensure SVE is trapped after guest exit Mark Rutland
2025-01-21 11:20 ` Marc Zyngier [this message]
2025-01-21 13:33 ` Mark Brown
2025-01-21 15:37 ` Mark Rutland
2025-01-22 11:46 ` Marc Zyngier
2025-01-22 11:55 ` Mark Rutland
2025-01-21 13:59 ` Mark Brown
2025-01-21 15:32 ` Eric Auger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86r04wv2fv.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=broonie@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=eauger@redhat.com \
--cc=fweimer@redhat.com \
--cc=jeremy.linton@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=mark.rutland@arm.com \
--cc=oliver.upton@linux.dev \
--cc=pbonzini@redhat.com \
--cc=stable@vger.kernel.org \
--cc=wilco.dijkstra@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).