From: Yury Khrustalev <yury.khrustalev@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>, <broonie@kernel.org>,
<catalin.marinas@arm.com>, <daniel.kiss@arm.com>,
<david.spickett@arm.com>, <luis.machado@arm.com>,
<maz@kernel.org>, <richard.sandiford@arm.com>,
<sander.desmalen@arm.com>, <tabba@google.com>,
<tamas.petz@arm.com>, <tkjos@google.com>, <will@kernel.org>
Subject: Re: [PATCH 13/20] arm64/fpsimd: Make clone() compatible with ZA lazy saving
Date: Wed, 7 May 2025 16:57:18 +0100 [thread overview]
Message-ID: <aBuC3ncUQeq-Ki5p@arm.com> (raw)
In-Reply-To: <20250506152523.1107431-14-mark.rutland@arm.com>
On Tue, May 06, 2025 at 04:25:16PM +0100, Mark Rutland wrote:
> Linux is intended to be compatible with userspace written to Arm's
> AAPCS64 procedure call standard [1,2]. For the Scalable Matrix Extension
> (SME), AAPCS64 was extended with a "ZA lazy saving scheme", where SME's
> ZA tile is lazily callee-saved and caller-restored. In this scheme,
> TPIDR2_EL0 indicates whether the ZA tile is live or has been saved by
> pointing to a "TPIDR2 block" in memory, which has a "za_save_buffer"
> pointer. This scheme has been implemented in GCC and LLVM, with
> necessary runtime support implemented in glibc and bionic.
>
> AAPCS64 does not specify how the ZA lazy saving scheme is expected to
> interact with thread creation mechanisms such as fork() and
> pthread_create(), which would be implemented in terms of the Linux clone
> syscall. The behaviour implemented by Linux and glibc/bionic doesn't
> always compose safely, as explained below.
>
> Currently the clone syscall is implemented such that PSTATE.ZA and the
> ZA tile are always inherited by the new task, and TPIDR2_EL0 is
> inherited unless the 'flags' argument includes CLONE_SETTLS,
> in which case TPIDR2_EL0 is set to 0/NULL. This doesn't make much sense:
>
> (a) TPIDR2_EL0 is part of the calling convention, and changes as control
> is passed between functions. It is *NOT* used for thread local
> storage, despite superficial similarity to TPIDR_EL0, which is is
> used as the TLS register.
>
> (b) TPIDR2_EL0 and PSTATE.ZA are tightly coupled in the procedure call
> standard, and some combinations of states are illegal. In general,
> manipulating the two independently is not guaranteed to be safe.
>
> In practice, code which is compliant with the procedure call standard
> may issue a clone syscall while in the "ZA dormant" state, where
> PSTATE.ZA==1 and TPIDR2_EL0 is non-null and indicates that ZA needs to
> be saved. This can cause a variety of problems, including:
>
> * If the implementation of pthread_create() passes CLONE_SETTLS, the
> new thread will start with PSTATE.ZA==1 and TPIDR2==NULL. Per the
> procedure call standard this is not a legitimate state for most
> functions. This can cause data corruption (e.g. as code may rely on
> PSTATE.ZA being 0 to guarantee that an SMSTART ZA instruction will
> zero the ZA tile contents), and may result in other undefined
> behaviour.
>
> * If the implementation of pthread_create() does not pass CLONE_SETTLS, the
> new thread will start with PSTATE.ZA==1 and TPIDR2 pointing to a
> TPIDR2 block on the parent thread's stack. This can result in a
> variety of problems, e.g.
>
> - The child may write back to the parent's za_save_buffer, corrupting
> its contents.
>
> - The child may read from the TPIDR2 block after the parent has reused
> this memory for something else, and consequently the child may abort
> or clobber arbitrary memory.
>
> Ideally we'd require that userspace ensures that a task is in the "ZA
> off" state (with PSTATE.ZA==0 and TPIDR2_EL0==NULL) prior to issuing a
> clone syscall, and have the kernel force this state for new threads.
> Unfortunately, contemporary C libraries do not do this, and simply
> forcing this state within the implementation of clone would break
> fork().
>
> Instead, we can bodge around this by considering the CLONE_VM flag, and
> mainpulate PSTATE.ZA and TPIDR2_EL0 as a pair. CLONE_VM indicates that
> the new task will run in the same address space as its parent, and in
> that case it doesn't make sense to inherit a stale pointer to the
> parent's TPIDR2 block:
>
> * For fork(), CLONE_VM will not be set, and it is safe to inherit both
> PSTATE.ZA and TPIDR2_EL0 as the new task will have its own copy of the
> address space, and cannot clobber its parent's stack.
>
> * For pthread_create() and vfork(), CLONE_VM will be set, and discarding
> PSTATE.ZA and TPIDR2_EL0 for the new task doesn't break any existing
> assumptions in userspace.
Acked-by: Yury Khrustalev <yury.khrustalev@arm.com>
Kind regards,
Yury
next prev parent reply other threads:[~2025-05-07 17:26 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-06 15:25 [PATCH 00/20] arm64: FPSIMD/SVE/SME fixes + re-eanble SME Mark Rutland
2025-05-06 15:25 ` [PATCH 01/20] kselftest/arm64: fp-ptrace: Fix expected FPMR value when PSTATE.SM is changed Mark Rutland
2025-05-06 15:25 ` [PATCH 02/20] arm64/fpsimd: Do not discard modified SVE state Mark Rutland
2025-05-06 15:25 ` [PATCH 03/20] arm64/fpsimd: signal: Clear PSTATE.SM when restoring FPSIMD frame only Mark Rutland
2025-05-07 12:46 ` Will Deacon
2025-05-07 14:01 ` Mark Rutland
2025-05-07 14:39 ` Will Deacon
2025-05-06 15:25 ` [PATCH 04/20] arm64/fpsimd: signal: Mandate SVE payload for streaming-mode state Mark Rutland
2025-05-07 0:59 ` Mark Brown
2025-05-07 12:59 ` Will Deacon
2025-05-07 14:21 ` Mark Rutland
2025-05-07 14:29 ` Will Deacon
2025-05-07 15:02 ` Mark Rutland
2025-05-07 16:14 ` Will Deacon
2025-05-06 15:25 ` [PATCH 05/20] arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVE Mark Rutland
2025-05-06 15:25 ` [PATCH 06/20] arm64/fpsimd: Clarify sve_sync_*() functions Mark Rutland
2025-05-06 15:25 ` [PATCH 07/20] arm64/fpsimd: Factor out {sve,sme}_state_size() helpers Mark Rutland
2025-05-06 15:25 ` [PATCH 08/20] arm64/fpsimd: Add task_smstop_sm() Mark Rutland
2025-05-06 15:25 ` [PATCH 09/20] arm64/fpsimd: signal: Use SMSTOP behaviour in setup_return() Mark Rutland
2025-05-06 15:25 ` [PATCH 10/20] arm64/fpsimd: Remove redundant task->mm check Mark Rutland
2025-05-06 15:25 ` [PATCH 11/20] arm64/fpsimd: Consistently preserve FPSIMD state during clone() Mark Rutland
2025-05-06 15:25 ` [PATCH 12/20] arm64/fpsimd: Clear PSTATE.SM " Mark Rutland
2025-05-06 15:25 ` [PATCH 13/20] arm64/fpsimd: Make clone() compatible with ZA lazy saving Mark Rutland
2025-05-07 14:58 ` Will Deacon
2025-05-07 15:22 ` Mark Rutland
2025-05-07 16:11 ` Will Deacon
2025-05-07 17:21 ` Mark Rutland
2025-05-07 15:57 ` Yury Khrustalev [this message]
2025-05-06 15:25 ` [PATCH 14/20] arm64/fpsimd: ptrace/prctl: Ensure VL changes do not resurrect stale data Mark Rutland
2025-05-06 15:25 ` [PATCH 15/20] arm64/fpsimd: ptrace/prctl: Ensure VL changes leave task in a valid state Mark Rutland
2025-05-07 16:12 ` Will Deacon
2025-05-07 17:10 ` Mark Rutland
2025-05-08 10:31 ` Will Deacon
2025-05-06 15:25 ` [PATCH 16/20] arm64/fpsimd: ptrace: Save task state before generating SVE header Mark Rutland
2025-05-06 15:25 ` [PATCH 17/20] arm64/fpsimd: ptrace: Do not present register data for inactive mode Mark Rutland
2025-05-06 15:25 ` [PATCH 18/20] arm64/fpsimd: ptrace: Mandate SVE payload for streaming-mode state Mark Rutland
2025-05-07 1:09 ` Mark Brown
2025-05-06 15:25 ` [PATCH 19/20] arm64/fpsimd: ptrace: Gracefully handle errors Mark Rutland
2025-05-07 16:32 ` Will Deacon
2025-05-08 12:12 ` Mark Rutland
2025-05-06 15:25 ` [PATCH 20/20] arm64/fpsimd: Allow CONFIG_ARM64_SME to be selected Mark Rutland
2025-05-07 1:48 ` [PATCH 00/20] arm64: FPSIMD/SVE/SME fixes + re-eanble SME Mark Brown
2025-05-07 9:56 ` Mark Rutland
2025-05-07 11:26 ` Mark Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aBuC3ncUQeq-Ki5p@arm.com \
--to=yury.khrustalev@arm.com \
--cc=broonie@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=daniel.kiss@arm.com \
--cc=david.spickett@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=luis.machado@arm.com \
--cc=mark.rutland@arm.com \
--cc=maz@kernel.org \
--cc=richard.sandiford@arm.com \
--cc=sander.desmalen@arm.com \
--cc=tabba@google.com \
--cc=tamas.petz@arm.com \
--cc=tkjos@google.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).