Re: [PATCH v5 1/2] arm64/sve: Don't disable SVE on syscalls return

From: Mark Brown <broonie@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Julien Grall <julien@xen.org>,
	Zhang Lei <zhang.lei@jp.fujitsu.com>,
	Julien Grall <julien.grall@arm.com>,
	Will Deacon <will@kernel.org>, Dave Martin <Dave.Martin@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	Daniel Kiss <Daniel.Kiss@arm.com>
Subject: Re: [PATCH v5 1/2] arm64/sve: Don't disable SVE on syscalls return
Date: Fri, 13 Nov 2020 20:13:28 +0000	[thread overview]
Message-ID: <20201113201328.GG4828@sirena.org.uk> (raw)
In-Reply-To: <20201113184855.GF3212@gaia>

[-- Attachment #1.1: Type: text/plain, Size: 4118 bytes --]

On Fri, Nov 13, 2020 at 06:48:56PM +0000, Catalin Marinas wrote:
> On Fri, Nov 06, 2020 at 07:35:52PM +0000, Mark Brown wrote:
> > From: Julien Grall <julien.grall@arm.com>

> > We could instead handle flushing the SVE state in do_el0_svc() however
> > doing this reduces the potential for further optimisations such as
> > initializing the SVE registers directly from the FPSIMD state when
> > taking a SVE access trap and has some potential edge cases if we
> > schedule before we return to userspace after do_el0_svc().

> Ah, you covered the do_el0_svc() topic here already. Is this potential
> optimisation prevented because TIF_SVE has two meanings: SVE access
> enabled and sve_state valid (trying to page this code in)? For example,
> task_fpsimd_load() uses TIF_SVE to check whether to load from sve_state
> or from fpsimd. fpsimd_bind_task_to_cpu() uses TIF_SVE to enable the
> user access.

Yes, there's some overloading with the storage for the SVE register file
and the new flag is effectively about the storage.

> Maybe that's what TIF_SVE_NEEDS_FLUSH tries to address but I'd rather
> have two clearly defined flags and better separate their meaning:

> - TIF_SVE_ACC - it tells us whether the user access was enabled or it
>   needs to be enabled on return to user (maybe we can skip this and just
>   enable it directly in places like do_sve_acc).
> - TIF_SVE_STATE_VALID - the validity of the kernel sve_state data.

That would do the trick as well, it's close to what we have now but
explained slightly differently - the main difference is that you can set
each flag independently at the minute since both flags mean that SVE
should be enabled on return to userspace.  

IIRC I did start down a very similar route at one point but found it was
making some of the decision making code more complex with more sites
needing to check multiple flags, it was a while ago but one of the
things causing issues was that we don't immediately save the state on
kernel entry.  Looking at your suggested naming that might be a naming
thing with the "where is the state?" flag that was causing issues
though, thinking again we might want to use something like
TIF_SVE_FULL_STATE or similar instead.

> I'll list what I think the requirement are for the SVE state and which
> (not necessarily how the current code behaves):

This sounds about right to me.

> 3. On syscall entry we have to discard the hardware SVE state if access
>    was enabled in user.

To be a bit more explicit that's SVE state that's not shared with
FPSIMD.  The documented ABI doesn't *require* us to discard the extra
state but it's safer and cleaner to do that than doing it only
sometimes.

> 5. On return to user from syscall, we just restore the FPSIMD state (I
>    think we still need to do something with the predicates but sve_state
>    is invalid, so we just initialise them). As an optimisation, we want
>    to leave SVE access on.

The predicate registers are like the unshared bits of the Z registers and
explicitly documented as being unspecified on return from syscall so the
handling is the same, they're reset to ensure that we're not leaking
anything.

> The patch may try to address part of the above but I find that the
> addition of TIF_SVE_NEEDS_FLUSH makes this state machine more
> complicated than necessary (well, based on my mental model; possibly I'm
> missing some other cases).

Yeah.  The mental model I came up with was that both flags say that we
should leave SVE on for userspace and which flag is set dictates where
we get the state of the Z registers from and hence where the state we
need to save should go, my main focus with the last version was trying
to document that I guess not 100% successfully.  I found that made
things simpler since it detangles them for the most part and we don't
need to have a "where did I put my data?" decision at as many of the
sites that look at it.

I'll wait for more review but in the absence of other suggestions I'll
have another run at splitting the storage and trapping flags.

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel