linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Mark Brown <broonie@kernel.org>
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	kvm@vger.kernel.org, James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Zenghui Yu <yuzenghui@huawei.com>,
	James Clark <james.clark@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>
Subject: Re: [PATCH 5/5] KVM: arm64: Exclude FP ownership from kvm_vcpu_arch
Date: Sat, 09 Mar 2024 11:01:56 +0000	[thread overview]
Message-ID: <871q8jr7mj.wl-maz@kernel.org> (raw)
In-Reply-To: <3d731a55-300e-4a14-89bb-4effbd16b781@sirena.org.uk>

On Thu, 07 Mar 2024 14:26:30 +0000,
Mark Brown <broonie@kernel.org> wrote:
> 
> On Thu, Mar 07, 2024 at 11:10:40AM +0000, Marc Zyngier wrote:
> > Mark Brown <broonie@kernel.org> wrote:
> > > On Wed, Mar 06, 2024 at 09:43:13AM +0000, Marc Zyngier wrote:
> > > > Mark Brown <broonie@kernel.org> wrote:
> > > > > On Sat, Mar 02, 2024 at 11:19:35AM +0000, Marc Zyngier wrote:
> 
> > > > > The SME patch series proposes adding an additional state to this
> > > > > enumeration which would say if the registers are stored in a format
> > > > > suitable for exchange with userspace, that would make this state part of
> > > > > the vCPU state.  With the addition of SME we can have two vector lengths
> > > > > in play so the series proposes picking the larger to be the format for
> > > > > userspace registers.
> 
> > > > What does this addition have anything to do with the ownership of the
> > > > physical register file? Not a lot, it seems.
> 
> > > > Specially as there better be no state resident on the CPU when
> > > > userspace messes up with it.
> 
> > > If we have a situation where the state might be stored in memory in
> > > multiple formats it seems reasonable to consider the metadata which
> > > indicates which format is currently in use as part of the state.
> 
> > There is no reason why the state should be in multiple formats
> > *simultaneously*. All the FP/SIMD/SVE/SME state is largely
> > overlapping, and we only need to correctly invalidate the state that
> > isn't relevant to writes from userspace.
> 
> I agree that we don't want to store multiple copies, the proposed
> implementation for SME does that - when converting between guest native
> and userspace formats we discard the original format after conversion
> and updates the metadata which says which format is stored.  That
> metadata is modeled as adding a new owner.

What conversion? If userspace writes to FP, the upper bits of the SVE
registers should get zeroed. If it writes to SVE, it writes using the
current VL for the current streaming/non-streaming mode.

If the current userspace API doesn't give us the tools to do so, we
rev it.

>
> > > > > We could store this separately to fp_state/owner but it'd still be a
> > > > > value stored in the vCPU.
> 
> > > > I totally disagree.
> 
> > > Where would you expect to see the state stored?
> 
> > Sorry, that came out wrong. I expect *some* vcpu state to describe the
> > current use of the FP/vector registers, and that's about it. Not the
> > ownership information.
> 
> Ah, I think the distinction here is that you're modeling the state
> tracking updated by this patch as purely tracking the registers in which
> case yes, we should just add a separate variable in the vCPU for
> tracking the format of the data.  I was modeling the state tracking as
> covering both the state in the registers and the state of the storage
> backing them (since the guest state being in the registers means that
> the state in memory is invalid).
> 
> > > > > Storing in a format suitable for userspace
> > > > > usage all the time when we've got SME would most likely result in
> > > > > performance overhead
> 
> > > > What performance overhead? Why should we care?
> > > 
> > > Since in situations where we're not using the larger VL we would need to
> > > load and store the registers using a vector length other than the
> 
> ...
> 
> > > and would instead need to manually compute the memory locations where
> > > values are stored.  As well as the extra instructions when using the
> > > smaller vector length we'd also be working with sparser data likely over
> > > more cache lines.
> 
> > Are you talking about a context switch? or userspace accesses? I don't
> > give a damn about the latter, as it statistically never happens. The
> > former is of course of interest, but you still don't explain why the
> > above is a problem.
> 
> Well, there will be a cost in any rewriting so I'm trying to shift it
> away from context switch to userspace access since as you say they are
> negligably frequent.

So we're in violent agreement.

> > Nothing prevent you from storing the registers using the *current* VL,
> > since there is no data sharing between the SVE registers and the
> > streaming-SVE ones. All you need to do is to make sure you don't mix
> > the two.
> 
> You seem to be suggesting modeling the streaming mode registers as a
> completely different set of registers in the KVM ABI.

Where are you reading this? I *never* said anything of the sort. There
is only one set of SVE registers. The fact that they change size and
get randomly zeroed when userspace flips bits is a property of the
architecture. The HW *can* implements them as two sets of registers if
SME is a separate accelerator, but that's not architectural.

All I'm saying is that you can have a *single* register file, spanning
both FPSIMD and SVE, using the maximum VL achievable in the guest, and
be done with it. Yes, you'll have to refactor the code so that FPSIMD
lands at the right spot in the SVE register file. Big deal.

> That would
> certainly be another option, though it's a bit unclear from an
> architecture point of view and it didn't seem to entirely fit with the
> handling of the FPSIMD registers when SVE is in use.  From an
> architecture point of view the streaming mode registers aren't really a
> separate set of registers.  When in streaming mode it is not possible to
> observe the non-streaming register state and vice versa, and entering or
> exiting streaming mode zeros the register state so functionally you just
> have floating point registers.
> 
> You'd need to handle what happens with access to the inactive register
> set from userspace, the simplest thing to implement would be to read the
> logical value of zero and either discard or return an error on writes.
> 
> > > We would also need to consider if we need to zero the holes in the data
> > > when saving, we'd only potentially be leaking information from the guest
> > > but it might cause nasty surprises given that transitioning to/from
> > > streaming mode is expected to zero values.  If we do need to zero then
> > > that would be additional work that would need doing.
> 
> > The zeroing is mandated by the architecture, AFAIU. That's not optional.
> 
> Yes, we need to zero at some point - we could just do it on userspace
> read though I think rather than having to do it when saving.

As long as the guest only observes an architecturally compliant state,
I'm find with that.

> 
> > > Spending more effort to implement something which also has more runtime
> > > performance overhead for the case of saving and restoring guest state
> > > which I expect to be vastly more common than the VMM accessing the guest
> > > registers just doesn't seem like an appealing choice.
> 
> > I don't buy the runtime performance aspect at all. As long as you have
> > the space to dump the largest possible VL, you can always dump it in
> > the native format.
> 
> Sure, my point was that you appeared to be asking to dump in a
> non-native format.  

Again, what makes you think that? You keep putting words in my mouth.

>
> > > If we were storing the data in the native format for the guest then that
> > > format will change if streaming mode is changed via a write to SVCR.
> > > This would mean that the host would need to understand that when writing
> > > values SVCR needs to be written before the Z and P registers.  To be
> > > clear I don't think this is a good idea.
> 
> > The architecture is crystal clear: you flip SVCR.SM, you loose all
> > data in both Z and P regs. If userspace doesn't understand the
> > architecture, that's their problem. The only thing we need to provide
> > is a faithful emulation of the architecture.
> 
> It is not clear to me that the intention behind the KVM register ABI is
> that it should have all the ordering requirements that the architecture
> has, and decisions like hiding the V registers when SVE is active do
> take us away from just a natural architectural point of view.

My intention is that userspace accesses should be as close as possible
as that of a guest. If the current ABI doesn't allow this, I'm all for
changing it. Won't be the first time, won't be the last.

> My
> understanding was that it was intended that userspace should be able to
> do something like just enumerate all the registers, save them and then
> later on restore them without really understanding them.

And this statement doesn't get in the way of anything above. We own
the ABI, and can change it as we see fit when SME is enabled.

	M.

-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2024-03-09 11:02 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-02 11:19 [PATCH 0/5] KVM: arm64: Move host-specific data out of kvm_vcpu_arch Marc Zyngier
2024-03-02 11:19 ` [PATCH 1/5] KVM: arm64: Add accessor for per-CPU state Marc Zyngier
2024-03-04 12:05   ` Suzuki K Poulose
2024-03-09 13:00     ` Marc Zyngier
2024-03-11  4:50   ` Dongli Zhang
2024-03-11 17:13     ` Marc Zyngier
2024-03-02 11:19 ` [PATCH 2/5] KVM: arm64: Exclude host_debug_data from vcpu_arch Marc Zyngier
2024-03-02 11:19 ` [PATCH 3/5] KVM: arm64: Exclude mdcr_el2_host from kvm_vcpu_arch Marc Zyngier
2024-03-02 11:19 ` [PATCH 4/5] KVM: arm64: Exclude host_fpsimd_state pointer " Marc Zyngier
2024-03-04 20:45   ` Mark Brown
2024-03-02 11:19 ` [PATCH 5/5] KVM: arm64: Exclude FP ownership " Marc Zyngier
2024-03-04 19:10   ` Mark Brown
2024-03-06  9:43     ` Marc Zyngier
2024-03-06 22:19       ` Mark Brown
2024-03-07 11:10         ` Marc Zyngier
2024-03-07 14:26           ` Mark Brown
2024-03-09 11:01             ` Marc Zyngier [this message]
2024-03-11 18:42               ` Mark Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871q8jr7mj.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=anshuman.khandual@arm.com \
    --cc=broonie@kernel.org \
    --cc=james.clark@arm.com \
    --cc=james.morse@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=oliver.upton@linux.dev \
    --cc=suzuki.poulose@arm.com \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).