All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Martin <Dave.Martin@arm.com>
To: Julien Thierry <julien.thierry@arm.com>
Cc: Okamoto Takayuki <tokamoto@jp.fujitsu.com>,
	Christoffer Dall <cdall@kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Zhang Lei <zhang.lei@jp.fujitsu.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v5 22/26] KVM: arm64/sve: Add pseudo-register for the guest's vector lengths
Date: Fri, 1 Mar 2019 14:55:04 +0000	[thread overview]
Message-ID: <20190301145503.GI3567@e103592.cambridge.arm.com> (raw)
In-Reply-To: <a8082d89-84bd-06fa-5781-96e11376bb66@arm.com>

[FYI Peter, your views on the proposal outlined torward the end of the
mail would be appreciated.]

On Fri, Mar 01, 2019 at 01:28:19PM +0000, Julien Thierry wrote:
> 
> 
> On 26/02/2019 12:13, Dave Martin wrote:
> > On Thu, Feb 21, 2019 at 05:48:59PM +0000, Julien Thierry wrote:
> >> Hi Dave,
> >>
> >> On 18/02/2019 19:52, Dave Martin wrote:
> >>> This patch adds a new pseudo-register KVM_REG_ARM64_SVE_VLS to
> >>> allow userspace to set and query the set of vector lengths visible
> >>> to the guest, along with corresponding storage in struct
> >>> kvm_vcpu_arch.
> >>>
> >>> In the future, multiple register slices per SVE register may be
> >>> visible through the ioctl interface.  Once the set of slices has
> >>> been determined we would not be able to allow the vector length set
> >>> to be changed any more, in order to avoid userspace seeing
> >>> inconsistent sets of registers.  For this reason, this patch adds
> >>> support to track vcpu finalization explicitly, and enforce proper
> >>> sequencing of ioctls.
> >>>
> >>> The new pseudo-register is not exposed yet.  Subsequent patches
> >>> will allow SVE to be turned on for guest vcpus, making it visible.
> >>>
> >>> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> >>>
> >>> ---
> >>>
> >>> Changes since v4:
> >>>
> >>>  * Add a UAPI header comment indicating the pseudo-register status of
> >>>    KVM_REG_ARM64_SVE_VLS.
> >>>
> >>>  * Get rid of the sve_vqs[] array from struct kvm_vcpu_arch.  This
> >>>    array is pointless, because its contents must match the host's
> >>>    internal sve_vq_map anyway, up to vcpu->arch.sve_max_vl.
> >>>
> >>>    The ioctl register accessors are slow-path code, so we can decode
> >>>    or reconstruct sve_vqs[] on demand instead, for exchange with
> >>>    userspace.
> >>>
> >>>  * For compatibility with potential future architecture extensions,
> >>>    enabling vector lengths above 256 bytes for the guest is explicitly
> >>>    disallowed now (because the code for exposing additional bits
> >>>    through ioctl is currently missing).  This can be addressed later
> >>>    if/when needed.
> >>>
> >>> Note:
> >>>
> >>>  * I defensively pass an array by pointer here, to help avoid
> >>>    accidentally breaking assumptions during future maintenance.
> >>>
> >>>    Due to (over?)zealous constification, this causes the following
> >>>    sparse warning.  I think this is sparse getting confused: I am not
> >>>    relying on any kernel-specific semantics here, and GCC generates no
> >>>    warning.
> >>>
> >>> +arch/arm64/kvm/guest.c:33: warning: incorrect type in argument 1 (different modifiers)
> >>> +arch/arm64/kvm/guest.c:33:    expected unsigned long long const [usertype] ( *const vqs )[8]
> >>> +arch/arm64/kvm/guest.c:33:    got unsigned long long [usertype] ( * )[8]
> >>>
> >>> ---
> > 
> > [...]
> > 
> >>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > 
> > [...]
> > 
> >>> +static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >>> +{
> >>> +	unsigned int max_vq, vq;
> >>> +	u64 vqs[DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)];
> >>> +
> >>> +	if (WARN_ON(!sve_vl_valid(vcpu->arch.sve_max_vl)))
> >>> +		return -EINVAL;
> >>> +
> >>> +	memset(vqs, 0, sizeof(vqs));
> >>> +
> >>> +	max_vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
> >>> +	for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
> >>> +		if (sve_vq_available(vq))
> >>> +			vqs[vq_word(vq)] |= vq_mask(vq);
> >>> +
> >>> +	BUILD_BUG_ON(sizeof(vqs) != KVM_REG_SIZE(reg->id));
> >>
> >> reg->id is not know at build time. From my understanding of
> >> BUILD_BUG_ON(), things actually ends up evaluated at runtime but I'm not
> >> sure what happens when doing sizeof(char[1- 2*0]) at runtime.
> >>
> >> Anyway, I don't think this is intended.
> > 
> > There's no runtime check: BUILD_BUG_ON() will cause compilation to fail
> > if the required condition doesn't fall out from optimisation.
> > 
> > Because of the way this function is called, reg->id is always
> > KVM_REG_ARM64_SVE_VLS, so inlining and constant propagation will make
> > this check pass (and compile to nothing).
> > 
> > We can assume a certain amount of inlining: the kernel officially can't
> > be built without optimisation.  But the precise build configuration can
> > sometimes have an effect here -- so it may not be better to rely on this
> > working for this slow-path code.
> > 
> > I'll convert these to if (WARN_ON()) return -EIO or something, or drop
> > them if I can get rid of them in other refactoring.
> > 
> > The fact that struct kvm_one_reg lacks any "size" field is rather
> > unfortunate, since it provides no explicit way for userspace to tell us
> > the size of its buffer.

I dropped the checks completely now, since they seem rather
unnecessary.

[...]

> >>>  #define SVE_REG_SLICE_SHIFT	0
> >>>  #define SVE_REG_SLICE_BITS	5
> >>>  #define SVE_REG_ID_SHIFT	(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> >>> @@ -297,9 +372,21 @@ static int sve_reg_region(struct sve_state_region *region,
> >>>  static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >>>  {
> >>>  	struct sve_state_region region;
> >>> +	int ret;
> >>>  	char __user *uptr = (char __user *)reg->addr;
> >>>  
> >>> -	if (!vcpu_has_sve(vcpu) || sve_reg_region(&region, vcpu, reg))
> >>> +	if (!vcpu_has_sve(vcpu))
> >>> +		return -ENOENT;
> >>> +
> >>> +	if (reg->id == KVM_REG_ARM64_SVE_VLS)
> >>> +		return get_sve_vls(vcpu, reg);
> >>> +
> >>> +	/* Finalize the number of slices per SVE register: */
> >>> +	ret = kvm_arm_vcpu_finalize(vcpu);
> >>
> >> Having this here feels a bit random...
> > 
> > I'll have another think about this.
> > 
> > KVM_REG_ARM64_SVE_VLS certainly has nothing to do with anything except
> > SVE, and putting it in the SVE coproc space allows us to demux SVE-
> > related stuff from everything else neatly.
> > 
> > The whole "coprocessor" concept is nonsense for arm64 anyway except
> > for the sysregs (where the reg ID encodings map directly onto the
> > architecture and also align with AArch32).  So, the copro number is
> > pretty meaningless except as a demux key for related groups of
> > registers, which is how I use it here.
> > 
> > There's also no well-defined distinction between real and fake
> > registers, and no natural home for miscellaneous fakes -- all existing
> > blocks are for specific purposes such as recording the state of emulated
> > firmware APIs etc.  The SVE registers as exposed here already have no
> > direct equivalent in the architecture either, due to the padding up to
> > 2048 bits, and the presentation of FFR as addressable P-register etc.
> > 
> >> I'd suggest considering the pseudo-register outside of the SVE co-proc,
> >> as part of a set of registers that do not finalize a vcpu when accessed.

(Without a compelling alternative, I've kept KVM_ARM_ARM64_SVE_VLS in the
SVE copro for now, though this could still be changed if people feel
strongly about it.)

> > (Also, the dependency is not one-way:  _SVE_VLS is not independent
> > of finalization: quite the reverse -- it can only be written _before_
> > finalization.)
> > 
> >> All other registers (even non-sve ones) would finalize the vcpu when
> >> accessed from userland.
> > 
> > So, we could consider register to be in two classes as you suggest:
> > 
> >  * early registers
> >  * regular registers (i.e., everything else)
> > 
> > where early registers can only be written until the first access to a
> > regular register, or the first KVM_GET_REG_LIST or KVM_RUN.
> > 
> 
> That would be my personal preference, yes.

For comparison, we currently have

 * early registers (currently KVM_REG_ARM64_SVE_VLS)
 * late registers (currently KVM_REG_ARM64_SVE_{Z,P,FFR}*)
 * relaxed registers (everything else)

plus a couple of "late ioctls" that implcitly finalize (KVM_RUN,
KVM_GET_REG_LIST).

A write to an early register cannot follow a late ioctl or any access to
a late register.

I'm not convinced that's worse, but it's different.

> > This is a stricter rule than I have today: i.e., _SVE_VLS would really
> > need to be written before touching anything else.  But that may be a
> > good thing for avoiding future ABI drift.
> > 
> 
> I did notice this would be more restrictive, but I don't believe this
> should introduce shortcoming in ways to set up a VCPU. I might have
> missed some case of course.
> 
> > This may be premature factoring though.  It's not clear whether many new
> > registers like KVM_REG_ARM64_SVE_VLS will exist, or that they would have
> > compatible sequencing requirements.
> > 
> > Thoughts?
> > 
> 
> It might be premature, but I really don't like seeing this
> "vcpu_finalize" in the middle of SVE code. I think it is very easy to
> miss that touching some "sve" register suddenly finalizes the vcpu.
> I fear that some other feature/extention might need early vcpu
> finalization and that it might somehow conflict with sve depending on
> which triggers finalization first.
> 
> > 
> > Independently of this, I'll have a look at whether there's
> > a straightforward way to simplify the code flow.
> > 
> 
> I might be over-thinking it, but if there is a way to move that
> finalization call I'd prefer that.

I know what you mean, but there's not really another clear place to put
it either, right now.  Making finalization a side-effect of only KVM_RUN
and KVM_GET_REG_LIST would mean that if userspace is squirting in the
state of a migrated vcpu, a dummy KVM_GET_REG_LIST call would need to be
inserted between setting KVM_REG_ARM64_SVE_VLS and setting the other SVE
regs.

One thing we could to is get rid of the implicit finalization behaviour
completely, and require an explicit ioctl KVM_ARM_VCPU_FINALIZE to do
this.  This makes a clear barrier between reg writes that manipulate the
"physical" configuration of the vcpu, and reg reads/writes that merely
manipulate the vcpu's runtime state.  Say:

	KVM_ARM_VCPU_INIT(features[] = KVM_ARM_VCPU_SVE) -> ok
	KVM_RUN -> -EPERM (too early)
	KVM_GET_REG_LIST -> -EPERM (too early)
	...
	KVM_SET_ONE_REG(KVM_REG_ARM64_SVE_VLS) -> ok
	KVM_SET_ONE_REG(KVM_REG_ARM64_SVE_ZREG(0, 0)) -> -EPERM (too early)
	...
	KVM_RUN -> -EPERM (too early)
	...
	KVM_ARM_VCPU_FINALIZE -> ok
	KVM_ARM_VCPU_FINALIZE -> -EPERM (too late)
	...
	KVM_REG_REG_LIST -> ok
	KVM_SET_ONE_REG(KVM_REG_ARM64_SVE_VLS) -> -EPERM (too late)
	KVM_SET_ONE_REG(KVM_REG_ARM64_SVE_ZREG(0, 0)) -> ok
	KVM_RUN -> ok
	KVM_ARM_VCPU_FINALIZE -> -EPERM (too late)

This would change all existing kvm_arm_vcpu_finalize() calls to

	if (!kvm_arm_vcpu_finalized())
		return -EPERM;

(or similar).
	
Without an affected vcpu feature enabled, for backwards compatibility
we would not require the KVM_ARM_VCPU_FINALIZE call (or perhaps even
forbid it, since userspace that wants to backwards compatible cannot
use the new ioctl anyway.  I'm in two minds about this.  Either way,
calling _FINALIZE on an old kvm is harmless, so long as userspace knows
to ignore this failure case.)

The backwards-compatible flow would be:

	KVM_ARM_VCPU_INIT(features[] = 0) -> ok
	...
	KVM_ARM_VCPU_FINALIZE -> ok (or -EPERM)
	...
	KVM_RUN -> ok
	KVM_ARM_VCPU_FINALIZE -> -EPERM (too late)


Thoughts?

I may not have time to rework this for -rc1, and being a late ABI
change I'd like to get others' views on it before heading too far into
the tunnel.

Cheers
---Dave

WARNING: multiple messages have this Message-ID (diff)
From: Dave Martin <Dave.Martin@arm.com>
To: Julien Thierry <julien.thierry@arm.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Okamoto Takayuki <tokamoto@jp.fujitsu.com>,
	Christoffer Dall <cdall@kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Marc Zyngier <marc.zyngier@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Zhang Lei <zhang.lei@jp.fujitsu.com>,
	kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v5 22/26] KVM: arm64/sve: Add pseudo-register for the guest's vector lengths
Date: Fri, 1 Mar 2019 14:55:04 +0000	[thread overview]
Message-ID: <20190301145503.GI3567@e103592.cambridge.arm.com> (raw)
In-Reply-To: <a8082d89-84bd-06fa-5781-96e11376bb66@arm.com>

[FYI Peter, your views on the proposal outlined torward the end of the
mail would be appreciated.]

On Fri, Mar 01, 2019 at 01:28:19PM +0000, Julien Thierry wrote:
> 
> 
> On 26/02/2019 12:13, Dave Martin wrote:
> > On Thu, Feb 21, 2019 at 05:48:59PM +0000, Julien Thierry wrote:
> >> Hi Dave,
> >>
> >> On 18/02/2019 19:52, Dave Martin wrote:
> >>> This patch adds a new pseudo-register KVM_REG_ARM64_SVE_VLS to
> >>> allow userspace to set and query the set of vector lengths visible
> >>> to the guest, along with corresponding storage in struct
> >>> kvm_vcpu_arch.
> >>>
> >>> In the future, multiple register slices per SVE register may be
> >>> visible through the ioctl interface.  Once the set of slices has
> >>> been determined we would not be able to allow the vector length set
> >>> to be changed any more, in order to avoid userspace seeing
> >>> inconsistent sets of registers.  For this reason, this patch adds
> >>> support to track vcpu finalization explicitly, and enforce proper
> >>> sequencing of ioctls.
> >>>
> >>> The new pseudo-register is not exposed yet.  Subsequent patches
> >>> will allow SVE to be turned on for guest vcpus, making it visible.
> >>>
> >>> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> >>>
> >>> ---
> >>>
> >>> Changes since v4:
> >>>
> >>>  * Add a UAPI header comment indicating the pseudo-register status of
> >>>    KVM_REG_ARM64_SVE_VLS.
> >>>
> >>>  * Get rid of the sve_vqs[] array from struct kvm_vcpu_arch.  This
> >>>    array is pointless, because its contents must match the host's
> >>>    internal sve_vq_map anyway, up to vcpu->arch.sve_max_vl.
> >>>
> >>>    The ioctl register accessors are slow-path code, so we can decode
> >>>    or reconstruct sve_vqs[] on demand instead, for exchange with
> >>>    userspace.
> >>>
> >>>  * For compatibility with potential future architecture extensions,
> >>>    enabling vector lengths above 256 bytes for the guest is explicitly
> >>>    disallowed now (because the code for exposing additional bits
> >>>    through ioctl is currently missing).  This can be addressed later
> >>>    if/when needed.
> >>>
> >>> Note:
> >>>
> >>>  * I defensively pass an array by pointer here, to help avoid
> >>>    accidentally breaking assumptions during future maintenance.
> >>>
> >>>    Due to (over?)zealous constification, this causes the following
> >>>    sparse warning.  I think this is sparse getting confused: I am not
> >>>    relying on any kernel-specific semantics here, and GCC generates no
> >>>    warning.
> >>>
> >>> +arch/arm64/kvm/guest.c:33: warning: incorrect type in argument 1 (different modifiers)
> >>> +arch/arm64/kvm/guest.c:33:    expected unsigned long long const [usertype] ( *const vqs )[8]
> >>> +arch/arm64/kvm/guest.c:33:    got unsigned long long [usertype] ( * )[8]
> >>>
> >>> ---
> > 
> > [...]
> > 
> >>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > 
> > [...]
> > 
> >>> +static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >>> +{
> >>> +	unsigned int max_vq, vq;
> >>> +	u64 vqs[DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)];
> >>> +
> >>> +	if (WARN_ON(!sve_vl_valid(vcpu->arch.sve_max_vl)))
> >>> +		return -EINVAL;
> >>> +
> >>> +	memset(vqs, 0, sizeof(vqs));
> >>> +
> >>> +	max_vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
> >>> +	for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
> >>> +		if (sve_vq_available(vq))
> >>> +			vqs[vq_word(vq)] |= vq_mask(vq);
> >>> +
> >>> +	BUILD_BUG_ON(sizeof(vqs) != KVM_REG_SIZE(reg->id));
> >>
> >> reg->id is not know at build time. From my understanding of
> >> BUILD_BUG_ON(), things actually ends up evaluated at runtime but I'm not
> >> sure what happens when doing sizeof(char[1- 2*0]) at runtime.
> >>
> >> Anyway, I don't think this is intended.
> > 
> > There's no runtime check: BUILD_BUG_ON() will cause compilation to fail
> > if the required condition doesn't fall out from optimisation.
> > 
> > Because of the way this function is called, reg->id is always
> > KVM_REG_ARM64_SVE_VLS, so inlining and constant propagation will make
> > this check pass (and compile to nothing).
> > 
> > We can assume a certain amount of inlining: the kernel officially can't
> > be built without optimisation.  But the precise build configuration can
> > sometimes have an effect here -- so it may not be better to rely on this
> > working for this slow-path code.
> > 
> > I'll convert these to if (WARN_ON()) return -EIO or something, or drop
> > them if I can get rid of them in other refactoring.
> > 
> > The fact that struct kvm_one_reg lacks any "size" field is rather
> > unfortunate, since it provides no explicit way for userspace to tell us
> > the size of its buffer.

I dropped the checks completely now, since they seem rather
unnecessary.

[...]

> >>>  #define SVE_REG_SLICE_SHIFT	0
> >>>  #define SVE_REG_SLICE_BITS	5
> >>>  #define SVE_REG_ID_SHIFT	(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> >>> @@ -297,9 +372,21 @@ static int sve_reg_region(struct sve_state_region *region,
> >>>  static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >>>  {
> >>>  	struct sve_state_region region;
> >>> +	int ret;
> >>>  	char __user *uptr = (char __user *)reg->addr;
> >>>  
> >>> -	if (!vcpu_has_sve(vcpu) || sve_reg_region(&region, vcpu, reg))
> >>> +	if (!vcpu_has_sve(vcpu))
> >>> +		return -ENOENT;
> >>> +
> >>> +	if (reg->id == KVM_REG_ARM64_SVE_VLS)
> >>> +		return get_sve_vls(vcpu, reg);
> >>> +
> >>> +	/* Finalize the number of slices per SVE register: */
> >>> +	ret = kvm_arm_vcpu_finalize(vcpu);
> >>
> >> Having this here feels a bit random...
> > 
> > I'll have another think about this.
> > 
> > KVM_REG_ARM64_SVE_VLS certainly has nothing to do with anything except
> > SVE, and putting it in the SVE coproc space allows us to demux SVE-
> > related stuff from everything else neatly.
> > 
> > The whole "coprocessor" concept is nonsense for arm64 anyway except
> > for the sysregs (where the reg ID encodings map directly onto the
> > architecture and also align with AArch32).  So, the copro number is
> > pretty meaningless except as a demux key for related groups of
> > registers, which is how I use it here.
> > 
> > There's also no well-defined distinction between real and fake
> > registers, and no natural home for miscellaneous fakes -- all existing
> > blocks are for specific purposes such as recording the state of emulated
> > firmware APIs etc.  The SVE registers as exposed here already have no
> > direct equivalent in the architecture either, due to the padding up to
> > 2048 bits, and the presentation of FFR as addressable P-register etc.
> > 
> >> I'd suggest considering the pseudo-register outside of the SVE co-proc,
> >> as part of a set of registers that do not finalize a vcpu when accessed.

(Without a compelling alternative, I've kept KVM_ARM_ARM64_SVE_VLS in the
SVE copro for now, though this could still be changed if people feel
strongly about it.)

> > (Also, the dependency is not one-way:  _SVE_VLS is not independent
> > of finalization: quite the reverse -- it can only be written _before_
> > finalization.)
> > 
> >> All other registers (even non-sve ones) would finalize the vcpu when
> >> accessed from userland.
> > 
> > So, we could consider register to be in two classes as you suggest:
> > 
> >  * early registers
> >  * regular registers (i.e., everything else)
> > 
> > where early registers can only be written until the first access to a
> > regular register, or the first KVM_GET_REG_LIST or KVM_RUN.
> > 
> 
> That would be my personal preference, yes.

For comparison, we currently have

 * early registers (currently KVM_REG_ARM64_SVE_VLS)
 * late registers (currently KVM_REG_ARM64_SVE_{Z,P,FFR}*)
 * relaxed registers (everything else)

plus a couple of "late ioctls" that implcitly finalize (KVM_RUN,
KVM_GET_REG_LIST).

A write to an early register cannot follow a late ioctl or any access to
a late register.

I'm not convinced that's worse, but it's different.

> > This is a stricter rule than I have today: i.e., _SVE_VLS would really
> > need to be written before touching anything else.  But that may be a
> > good thing for avoiding future ABI drift.
> > 
> 
> I did notice this would be more restrictive, but I don't believe this
> should introduce shortcoming in ways to set up a VCPU. I might have
> missed some case of course.
> 
> > This may be premature factoring though.  It's not clear whether many new
> > registers like KVM_REG_ARM64_SVE_VLS will exist, or that they would have
> > compatible sequencing requirements.
> > 
> > Thoughts?
> > 
> 
> It might be premature, but I really don't like seeing this
> "vcpu_finalize" in the middle of SVE code. I think it is very easy to
> miss that touching some "sve" register suddenly finalizes the vcpu.
> I fear that some other feature/extention might need early vcpu
> finalization and that it might somehow conflict with sve depending on
> which triggers finalization first.
> 
> > 
> > Independently of this, I'll have a look at whether there's
> > a straightforward way to simplify the code flow.
> > 
> 
> I might be over-thinking it, but if there is a way to move that
> finalization call I'd prefer that.

I know what you mean, but there's not really another clear place to put
it either, right now.  Making finalization a side-effect of only KVM_RUN
and KVM_GET_REG_LIST would mean that if userspace is squirting in the
state of a migrated vcpu, a dummy KVM_GET_REG_LIST call would need to be
inserted between setting KVM_REG_ARM64_SVE_VLS and setting the other SVE
regs.

One thing we could to is get rid of the implicit finalization behaviour
completely, and require an explicit ioctl KVM_ARM_VCPU_FINALIZE to do
this.  This makes a clear barrier between reg writes that manipulate the
"physical" configuration of the vcpu, and reg reads/writes that merely
manipulate the vcpu's runtime state.  Say:

	KVM_ARM_VCPU_INIT(features[] = KVM_ARM_VCPU_SVE) -> ok
	KVM_RUN -> -EPERM (too early)
	KVM_GET_REG_LIST -> -EPERM (too early)
	...
	KVM_SET_ONE_REG(KVM_REG_ARM64_SVE_VLS) -> ok
	KVM_SET_ONE_REG(KVM_REG_ARM64_SVE_ZREG(0, 0)) -> -EPERM (too early)
	...
	KVM_RUN -> -EPERM (too early)
	...
	KVM_ARM_VCPU_FINALIZE -> ok
	KVM_ARM_VCPU_FINALIZE -> -EPERM (too late)
	...
	KVM_REG_REG_LIST -> ok
	KVM_SET_ONE_REG(KVM_REG_ARM64_SVE_VLS) -> -EPERM (too late)
	KVM_SET_ONE_REG(KVM_REG_ARM64_SVE_ZREG(0, 0)) -> ok
	KVM_RUN -> ok
	KVM_ARM_VCPU_FINALIZE -> -EPERM (too late)

This would change all existing kvm_arm_vcpu_finalize() calls to

	if (!kvm_arm_vcpu_finalized())
		return -EPERM;

(or similar).
	
Without an affected vcpu feature enabled, for backwards compatibility
we would not require the KVM_ARM_VCPU_FINALIZE call (or perhaps even
forbid it, since userspace that wants to backwards compatible cannot
use the new ioctl anyway.  I'm in two minds about this.  Either way,
calling _FINALIZE on an old kvm is harmless, so long as userspace knows
to ignore this failure case.)

The backwards-compatible flow would be:

	KVM_ARM_VCPU_INIT(features[] = 0) -> ok
	...
	KVM_ARM_VCPU_FINALIZE -> ok (or -EPERM)
	...
	KVM_RUN -> ok
	KVM_ARM_VCPU_FINALIZE -> -EPERM (too late)


Thoughts?

I may not have time to rework this for -rc1, and being a late ABI
change I'd like to get others' views on it before heading too far into
the tunnel.

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-03-01 14:55 UTC|newest]

Thread overview: 189+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-18 19:52 [PATCH v5 00/26] KVM: arm64: SVE guest support Dave Martin
2019-02-18 19:52 ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 01/26] KVM: Documentation: Document arm64 core registers in detail Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-21 11:48   ` Julien Grall
2019-02-21 11:48     ` Julien Grall
2019-02-26 12:05     ` Dave Martin
2019-02-26 12:05       ` Dave Martin
2019-02-21 11:57   ` Peter Maydell
2019-02-21 11:57     ` Peter Maydell
2019-02-18 19:52 ` [PATCH v5 02/26] arm64: fpsimd: Always set TIF_FOREIGN_FPSTATE on task state flush Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-21 12:39   ` Julien Grall
2019-02-26 12:06     ` Dave Martin
2019-02-26 12:06       ` Dave Martin
2019-02-26 12:35       ` Julien Grall
2019-02-26 12:35         ` Julien Grall
2019-02-18 19:52 ` [PATCH v5 03/26] KVM: arm64: Delete orphaned declaration for __fpsimd_enabled() Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 04/26] KVM: arm64: Refactor kvm_arm_num_regs() for easier maintenance Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 05/26] KVM: arm64: Add missing #include of <linux/bitmap.h> to kvm_host.h Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-20 15:23   ` Mark Rutland
2019-02-20 15:23     ` Mark Rutland
2019-02-26 12:06     ` Dave Martin
2019-02-26 12:06       ` Dave Martin
2019-02-26 12:31       ` Mark Rutland
2019-02-26 12:33         ` Dave Martin
2019-02-26 12:33           ` Dave Martin
2019-02-26 12:40           ` Mark Rutland
2019-02-26 12:40             ` Mark Rutland
2019-02-18 19:52 ` [PATCH v5 06/26] arm64/sve: Check SVE virtualisability Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-20 11:12   ` Julien Thierry
2019-02-20 11:12     ` Julien Thierry
2019-02-26 12:06     ` Dave Martin
2019-02-26 12:06       ` Dave Martin
2019-03-01 12:39       ` Julien Thierry
2019-03-01 12:39         ` Julien Thierry
2019-03-01 14:44         ` Dave Martin
2019-03-01 14:44           ` Dave Martin
2019-02-21 13:36   ` Julien Grall
2019-02-21 13:36     ` Julien Grall
2019-02-26 12:06     ` Dave Martin
2019-02-26 12:06       ` Dave Martin
2019-02-26 15:43       ` Julien Grall
2019-02-26 15:43         ` Julien Grall
2019-02-18 19:52 ` [PATCH v5 07/26] arm64/sve: Clarify role of the VQ map maintenance functions Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-20 11:43   ` Julien Thierry
2019-02-20 11:43     ` Julien Thierry
2019-02-26 12:06     ` Dave Martin
2019-02-26 12:06       ` Dave Martin
2019-02-21 13:46   ` Julien Grall
2019-02-21 13:46     ` Julien Grall
2019-02-26 12:07     ` Dave Martin
2019-02-26 12:07       ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 08/26] arm64/sve: Enable SVE state tracking for non-task contexts Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-22 15:26   ` Julien Grall
2019-02-22 15:26     ` Julien Grall
2019-02-26 12:07     ` Dave Martin
2019-02-26 12:07       ` Dave Martin
2019-02-26 15:49       ` Julien Grall
2019-02-26 15:49         ` Julien Grall
2019-02-26 15:58         ` Dave Martin
2019-02-26 15:58           ` Dave Martin
2019-02-26 15:59           ` Julien Grall
2019-02-26 15:59             ` Julien Grall
2019-02-26 16:03             ` Dave Martin
2019-02-26 16:03               ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 09/26] KVM: arm64: Add a vcpu flag to control SVE visibility for the guest Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 10/26] KVM: arm64: Propagate vcpu into read_id_reg() Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 11/26] KVM: arm64: Extend reset_unknown() to handle mixed RES0/UNKNOWN registers Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-20 13:33   ` Julien Thierry
2019-02-20 13:33     ` Julien Thierry
2019-02-26 12:07     ` Dave Martin
2019-02-26 12:07       ` Dave Martin
2019-02-22 16:04   ` Julien Grall
2019-02-22 16:04     ` Julien Grall
2019-02-18 19:52 ` [PATCH v5 12/26] KVM: arm64: Support runtime sysreg visibility filtering Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-20 14:33   ` Julien Thierry
2019-02-20 14:33     ` Julien Thierry
2019-02-26 12:07     ` Dave Martin
2019-02-26 12:07       ` Dave Martin
2019-02-20 15:37   ` Mark Rutland
2019-02-20 15:37     ` Mark Rutland
2019-02-26 12:12     ` Dave Martin
2019-02-26 12:12       ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 13/26] KVM: arm64/sve: System register context switch and access support Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-20 16:48   ` Julien Thierry
2019-02-20 16:48     ` Julien Thierry
2019-02-26 16:32   ` Julien Grall
2019-02-26 16:32     ` Julien Grall
2019-02-26 17:01     ` Dave Martin
2019-02-26 17:01       ` Dave Martin
2019-02-27 12:02       ` Julien Grall
2019-02-27 12:02         ` Julien Grall
2019-02-27 13:50         ` Dave Martin
2019-02-27 13:50           ` Dave Martin
2019-02-27 14:17           ` Julien Grall
2019-02-27 14:17             ` Julien Grall
2019-02-27 14:38             ` Dave Martin
2019-02-27 14:38               ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 14/26] KVM: arm64/sve: Context switch the SVE registers Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-20 16:19   ` Mark Rutland
2019-02-20 16:19     ` Mark Rutland
2019-02-26 12:13     ` Dave Martin
2019-02-26 12:13       ` Dave Martin
2019-02-20 16:46   ` Julien Thierry
2019-02-20 16:46     ` Julien Thierry
2019-02-26 12:13     ` Dave Martin
2019-02-26 12:13       ` Dave Martin
2019-02-26 16:56       ` Julien Grall
2019-02-26 16:56         ` Julien Grall
2019-02-27 13:37         ` Dave Martin
2019-02-27 13:37           ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 15/26] KVM: Allow 2048-bit register access via ioctl interface Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 16/26] KVM: arm64: Add missing #include of <linux/string.h> in guest.c Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 17/26] KVM: arm64: Reject ioctl access to FPSIMD V-regs on SVE vcpus Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-21 12:06   ` Julien Thierry
2019-02-21 12:06     ` Julien Thierry
2019-02-26 12:13     ` Dave Martin
2019-02-26 12:13       ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 18/26] KVM: arm64/sve: Add SVE support to register access ioctl interface Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-21 15:23   ` Julien Thierry
2019-02-21 15:23     ` Julien Thierry
2019-02-26 12:13     ` Dave Martin
2019-02-26 12:13       ` Dave Martin
2019-03-01 13:03       ` Julien Thierry
2019-03-01 13:03         ` Julien Thierry
2019-03-01 14:45         ` Dave Martin
2019-03-01 14:45           ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 19/26] KVM: arm64: Enumerate SVE register indices for KVM_GET_REG_LIST Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-21 16:28   ` Julien Thierry
2019-02-21 16:28     ` Julien Thierry
2019-02-18 19:52 ` [PATCH v5 20/26] arm64/sve: In-kernel vector length availability query interface Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 21/26] KVM: arm/arm64: Add hook to finalize the vcpu configuration Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 22/26] KVM: arm64/sve: Add pseudo-register for the guest's vector lengths Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-21 17:48   ` Julien Thierry
2019-02-21 17:48     ` Julien Thierry
2019-02-26 12:13     ` Dave Martin
2019-02-26 12:13       ` Dave Martin
2019-03-01 13:28       ` Julien Thierry
2019-03-01 13:28         ` Julien Thierry
2019-03-01 14:55         ` Dave Martin [this message]
2019-03-01 14:55           ` Dave Martin
2019-03-07 13:47           ` Marc Zyngier
2019-03-07 13:47             ` Marc Zyngier
2019-03-07 15:30             ` Dave Martin
2019-03-07 15:30               ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 23/26] KVM: arm64/sve: Allow userspace to enable SVE for vcpus Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-22  9:05   ` Julien Thierry
2019-02-22  9:05     ` Julien Thierry
2019-02-26 12:13     ` Dave Martin
2019-02-26 12:13       ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 24/26] KVM: arm64: Add a capabillity to advertise SVE support Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-22  9:10   ` Julien Thierry
2019-02-22  9:10     ` Julien Thierry
2019-02-26 12:14     ` Dave Martin
2019-02-26 12:14       ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 25/26] KVM: Document errors for KVM_GET_ONE_REG and KVM_SET_ONE_REG Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-18 19:52 ` [PATCH v5 26/26] KVM: arm64/sve: Document KVM API extensions for SVE Dave Martin
2019-02-18 19:52   ` Dave Martin
2019-02-20 15:47 ` [PATCH v5 00/26] KVM: arm64: SVE guest support Dave Martin
2019-02-20 15:47   ` Dave Martin
2019-03-03  2:40 ` Zhang, Lei
2019-03-05  9:47   ` Dave Martin
2019-03-05  9:47     ` Dave Martin
2019-03-08  7:06     ` Zhang, Lei
2019-03-08  7:06       ` Zhang, Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190301145503.GI3567@e103592.cambridge.arm.com \
    --to=dave.martin@arm.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=cdall@kernel.org \
    --cc=julien.thierry@arm.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=marc.zyngier@arm.com \
    --cc=tokamoto@jp.fujitsu.com \
    --cc=will.deacon@arm.com \
    --cc=zhang.lei@jp.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.