From: Catalin Marinas <catalin.marinas@arm.com>
To: Dave Martin <Dave.Martin@arm.com>
Cc: linux-arch@vger.kernel.org, libc-alpha@sourceware.org,
"Ard Biesheuvel" <ard.biesheuvel@linaro.org>,
"Szabolcs Nagy" <szabolcs.nagy@arm.com>,
"Richard Sandiford" <richard.sandiford@arm.com>,
"Will Deacon" <will.deacon@arm.com>,
"Alex Bennée" <alex.bennee@linaro.org>,
kvmarm@lists.cs.columbia.edu,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 11/28] arm64/sve: Core task context handling
Date: Thu, 5 Oct 2017 12:28:35 +0100 [thread overview]
Message-ID: <20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com> (raw)
In-Reply-To: <20171003113302.GR3611@e103592.cambridge.arm.com>
On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:
> On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:
> > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:
> > > +/*
> > > + * Handle SVE state across fork():
> > > + *
> > > + * dst and src must not end up with aliases of the same sve_state.
> > > + * Because a task cannot fork except in a syscall, we can discard SVE
> > > + * state for dst here, so long as we take care to retain the FPSIMD
> > > + * subset of the state if SVE is in use. Reallocation of the SVE state
> > > + * will be deferred until dst tries to use SVE.
> > > + */
> > > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)
> > > +{
> > > + if (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {
> > > + WARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));
> > > + sve_to_fpsimd(dst);
> > > + }
> > > +
> > > + dst->thread.sve_state = NULL;
> > > +}
> >
> > I first thought the thread flags are not visible in dst yet since
> > dup_task_struct() calls arch_dup_task_struct() before
> > setup_thread_stack(). However, at the end of the last year we enabled
> > CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying
> > on this.
> >
> > Anyway, IIUC we don't need sve_to_fpsimd() here. The
> > arch_dup_task_struct() already called fpsimd_preserve_current_state()
> > for src, so the FPSIMD state (which we care about) is transferred during
> > the *dst = *src assignment. So you'd only need the last statement,
> > possibly with a different function name like fpsimd_erase_sve (and maybe
> > make the function static inline in the header).
>
> Regarding the intended meanings of the thread flags, does this help?
>
> --8<--
>
> TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise
> unchanged:
>
> * If a task is running and !TIF_FOREIGN_FPSTATE, then the the CPU
> registers of the CPU the task is running on contain the authoritative
> FPSIMD/SVE state of the task. The backing memory may be stale.
>
> * Otherwise (i.e., task not running, or task running and
> TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is
> authoritative. If additionally per_cpu(fpsimd_last_state,
> task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then
> task->fpsimd_state.cpu's registers are also up to date for task, but
> not authorititive: the current FPSIMD/SVE state may be read from
> them, but they must not be written.
>
> The FPSIMD/SVE backing memory is selected by TIF_SVE:
>
> * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are
> stored in task->thread.sve_state, formatted appropriately for vector
> length task->thread.sve_vl. task->thread.sve_state must point to a
> valid buffer at least sve_state_size(task) bytes in size.
>
> * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are
> logically zero[*] but not stored anywhere; Pn, FFR are not stored and
> have unspecified values from userspace's point of view.
> task->thread.sve_state does not need to be non-null, valid or any
> particular size: it must not be dereferenced.
>
> In practice I don't exploit the "unspecifiedness" much. The Zn high
> bits, Pn and FFR are all zeroed when setting TIF_SVE again:
> sve_alloc() is the common path for this.
>
> * FPSR and FPCR are always stored in task->fpsimd_state irrespctive of
> whether TIF_SVE is clear or set, since these are not vector length
> dependent.
This looks fine. I think we need to make sure (with a warning) that
task_fpsimd_save() (and probably load) is always called in a
non-preemptible context.
> [*] theoretically unspecified, which is what I've written in sve.txt.
> However, on any FPSIMD Vn write the upper bits of the corresponding Zn
> must become logically zero in order to conform to the SVE programmer's
> model. It's not feasible to track which Vn have been written
> individually because that would involve trapping every FPSIMD insn until
> all possible Vn have been written. So the kernel ensures that the Zn
> high bits become zero.
>
> Maybe we should just guarantee "zero-or-preserved" behaviour for
> userspace. This may close down some future optimisation opportunities,
> but maybe it's better to be simple.
Does it work if we leave it as "unspecified" in the document but just do
zero-or-preserved in the kernel code?
Just wondering, as an optimisation for do_sve_acc() - instead of
sve_alloc() and fpsimd_to_sve(), can we not force the loading of the
FPSIMD regs on the return to user via TIF_FOREIGN_FPSTATE? This would
ensure the zeroing of the top SVE bits and we only need to allocate the
SVE state on the saving path. This means enabling SVE for user and
setting TIF_SVE without having the backing storage allocated.
--
Catalin
next prev parent reply other threads:[~2017-10-05 11:28 UTC|newest]
Thread overview: 128+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-31 17:00 [PATCH v2 00/28] ARM Scalable Vector Extension (SVE) Dave Martin
2017-08-31 17:00 ` [PATCH v2 01/28] regset: Add support for dynamically sized regsets Dave Martin
2017-08-31 17:00 ` [PATCH v2 02/28] arm64: KVM: Hide unsupported AArch64 CPU features from guests Dave Martin
2017-09-13 14:37 ` Alex Bennée
2017-09-13 14:37 ` Alex Bennée
2017-09-15 0:04 ` Dave Martin
2017-08-31 17:00 ` [PATCH v2 03/28] arm64: efi: Add missing Kconfig dependency on KERNEL_MODE_NEON Dave Martin
2017-08-31 17:00 ` [PATCH v2 04/28] arm64: Port deprecated instruction emulation to new sysctl interface Dave Martin
2017-08-31 17:00 ` [PATCH v2 05/28] arm64: fpsimd: Simplify uses of {set,clear}_ti_thread_flag() Dave Martin
2017-08-31 17:00 ` [PATCH v2 06/28] arm64/sve: System register and exception syndrome definitions Dave Martin
2017-09-13 14:48 ` Alex Bennée
2017-09-13 14:48 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 07/28] arm64/sve: Low-level SVE architectural state manipulation functions Dave Martin
2017-09-13 15:39 ` Alex Bennée
2017-09-13 15:39 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 08/28] arm64/sve: Kconfig update and conditional compilation support Dave Martin
2017-08-31 17:00 ` [PATCH v2 09/28] arm64/sve: Signal frame and context structure definition Dave Martin
2017-09-13 13:36 ` Catalin Marinas
2017-09-13 21:33 ` Dave Martin
2017-08-31 17:00 ` [PATCH v2 10/28] arm64/sve: Low-level CPU setup Dave Martin
2017-09-13 13:32 ` Catalin Marinas
2017-09-13 19:21 ` Dave Martin
2017-09-13 19:21 ` Dave Martin
2017-10-05 10:47 ` Dave Martin
2017-10-05 11:04 ` Suzuki K Poulose
2017-10-05 11:22 ` Dave Martin
2017-08-31 17:00 ` [PATCH v2 11/28] arm64/sve: Core task context handling Dave Martin
2017-09-13 14:33 ` Catalin Marinas
2017-09-14 19:55 ` Dave Martin
2017-09-20 13:58 ` Catalin Marinas
2017-10-03 11:11 ` Dave Martin
2017-10-03 11:11 ` Dave Martin
2017-10-04 17:29 ` Catalin Marinas
2017-10-03 11:33 ` Dave Martin
2017-10-05 11:28 ` Catalin Marinas [this message]
2017-10-06 13:10 ` Dave Martin
2017-10-06 13:36 ` Catalin Marinas
2017-10-06 15:15 ` Dave Martin
2017-10-06 15:15 ` Dave Martin
2017-10-06 15:33 ` Catalin Marinas
2017-09-13 17:26 ` Catalin Marinas
2017-09-13 19:17 ` Dave Martin
2017-09-13 22:21 ` Catalin Marinas
2017-09-14 19:40 ` Dave Martin
2017-09-19 17:13 ` Catalin Marinas
2017-08-31 17:00 ` [PATCH v2 12/28] arm64/sve: Support vector length resetting for new processes Dave Martin
2017-09-14 8:47 ` Alex Bennée
2017-09-14 8:47 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 13/28] arm64/sve: Signal handling support Dave Martin
2017-09-14 9:30 ` Alex Bennée
2017-09-14 9:30 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 14/28] arm64/sve: Backend logic for setting the vector length Dave Martin
2017-09-13 17:29 ` Catalin Marinas
2017-09-13 19:06 ` Dave Martin
2017-09-13 22:11 ` Catalin Marinas
2017-10-05 16:42 ` Dave Martin
2017-10-05 16:53 ` Catalin Marinas
2017-10-05 17:04 ` Dave Martin
2017-09-20 10:57 ` Alan Hayward
2017-09-20 10:59 ` Alan Hayward
2017-09-20 11:09 ` Dave Martin
2017-09-20 18:08 ` Alan Hayward
2017-09-21 11:19 ` Dave Martin
2017-09-21 11:57 ` Alan Hayward
2017-08-31 17:00 ` [PATCH v2 15/28] arm64: cpufeature: Move sys_caps_initialised declarations Dave Martin
2017-09-14 9:33 ` Alex Bennée
2017-09-14 9:33 ` Alex Bennée
2017-09-14 9:35 ` Suzuki K Poulose
2017-08-31 17:00 ` [PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths Dave Martin
2017-09-14 9:45 ` Alex Bennée
2017-09-14 9:45 ` Alex Bennée
2017-09-28 14:22 ` Dave Martin
2017-09-28 17:32 ` Alex Bennée
2017-09-28 17:32 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 17/28] arm64/sve: Preserve SVE registers around kernel-mode NEON use Dave Martin
2017-09-14 10:52 ` Alex Bennée
2017-09-14 10:52 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 18/28] arm64/sve: Preserve SVE registers around EFI runtime service calls Dave Martin
2017-08-31 17:00 ` Dave Martin
2017-09-14 11:01 ` Alex Bennée
2017-09-14 11:01 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 19/28] arm64/sve: ptrace and ELF coredump support Dave Martin
2017-09-06 16:21 ` Okamoto, Takayuki
2017-09-06 16:21 ` Okamoto, Takayuki
2017-09-06 18:16 ` Dave Martin
2017-09-07 5:11 ` Okamoto, Takayuki
2017-09-07 5:11 ` Okamoto, Takayuki
2017-09-08 13:11 ` Dave Martin
2017-09-14 12:57 ` Alex Bennée
2017-09-14 12:57 ` Alex Bennée
2017-09-28 14:57 ` Dave Martin
2017-09-29 12:46 ` Dave Martin
2017-08-31 17:00 ` [PATCH v2 20/28] arm64/sve: Add prctl controls for userspace vector length management Dave Martin
2017-09-14 13:02 ` Alex Bennée
2017-09-14 13:02 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 21/28] arm64/sve: Add sysctl to set the default vector length for new processes Dave Martin
2017-09-14 13:05 ` Alex Bennée
2017-09-14 13:05 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 22/28] arm64/sve: KVM: Prevent guests from using SVE Dave Martin
2017-09-14 13:28 ` Alex Bennée
2017-09-14 13:28 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 23/28] arm64/sve: KVM: Treat guest SVE use as undefined instruction execution Dave Martin
2017-09-14 13:30 ` Alex Bennée
2017-09-14 13:30 ` Alex Bennée
2017-09-14 13:31 ` Alex Bennée
2017-09-14 13:31 ` Alex Bennée
2017-09-29 13:00 ` Dave Martin
2017-09-29 14:43 ` Alex Bennée
2017-09-29 14:43 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 24/28] arm64/sve: KVM: Hide SVE from CPU features exposed to guests Dave Martin
2017-09-14 13:32 ` Alex Bennée
2017-09-14 13:32 ` Alex Bennée
2017-08-31 17:00 ` [PATCH v2 25/28] arm64/sve: Detect SVE and activate runtime support Dave Martin
2017-08-31 17:00 ` [PATCH v2 26/28] arm64/sve: Add documentation Dave Martin
2017-10-05 16:39 ` Szabolcs Nagy
2017-10-05 17:02 ` Dave Martin
2017-10-06 15:43 ` Szabolcs Nagy
2017-10-06 17:37 ` Dave Martin
2017-10-09 9:34 ` Alex Bennée
2017-10-09 9:34 ` Alex Bennée
2017-10-09 9:49 ` Dave Martin
2017-10-09 14:07 ` Alex Bennée
2017-10-09 14:07 ` Alex Bennée
2017-10-09 16:20 ` Dave Martin
2017-10-09 16:20 ` Dave Martin
2017-08-31 17:00 ` [RFC PATCH v2 27/28] arm64: signal: Report signal frame size to userspace via auxv Dave Martin
2017-08-31 17:00 ` Dave Martin
2017-08-31 17:01 ` [RFC PATCH v2 28/28] arm64/sve: signal: Include SVE when computing AT_MINSIGSTKSZ Dave Martin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com \
--to=catalin.marinas@arm.com \
--cc=Dave.Martin@arm.com \
--cc=alex.bennee@linaro.org \
--cc=ard.biesheuvel@linaro.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=libc-alpha@sourceware.org \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=richard.sandiford@arm.com \
--cc=szabolcs.nagy@arm.com \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).