From: Dave Martin <Dave.Martin@arm.com>
To: Daniel Kiss <Daniel.Kiss@arm.com>
Cc: Marc Zyngier <maz@kernel.org>, Mark Brown <broonie@kernel.org>,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>
Subject: Re: [PATCHv2] kvm: arm64: Add SVE support for nVHE.
Date: Mon, 8 Feb 2021 14:46:31 +0000 [thread overview]
Message-ID: <20210208144629.GD21837@arm.com> (raw)
In-Reply-To: <2C9C9BA7-6872-4420-9EDB-BCCEBD96BF6C@arm.com>
On Fri, Feb 05, 2021 at 12:12:51AM +0000, Daniel Kiss wrote:
>
>
> > On 4 Feb 2021, at 18:36, Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Tue, Feb 02, 2021 at 07:52:54PM +0100, Daniel Kiss wrote:
> >> CPUs that support SVE are architecturally required to support the
> >> Virtualization Host Extensions (VHE), so far the kernel supported
> >> SVE alongside KVM with VHE enabled. In same cases it is desired to
> >> run nVHE config even when VHE is available.
> >> This patch add support for SVE for nVHE configuration too.
> >>
> >> Tested on FVP with a Linux guest VM that run with a different VL than
> >> the host system.
> >>
> >> Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
[...]
> >> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> >> index 3e081d556e81..8f29b468e989 100644
> >> --- a/arch/arm64/kvm/fpsimd.c
> >> +++ b/arch/arm64/kvm/fpsimd.c
> >> @@ -42,6 +42,16 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
> >> if (ret)
> >> goto error;
> >>
> >> + if (!has_vhe() && vcpu->arch.sve_state) {
> >> + void *sve_state_end = vcpu->arch.sve_state +
> >> + SVE_SIG_REGS_SIZE(
> >> + sve_vq_from_vl(vcpu->arch.sve_max_vl));
> >> + ret = create_hyp_mappings(vcpu->arch.sve_state,
> >> + sve_state_end,
> >> + PAGE_HYP);
> >> + if (ret)
> >> + goto error;
> >> + }
> >> vcpu->arch.host_thread_info = kern_hyp_va(ti);
> >> vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
> >> error:
> >> @@ -109,10 +119,22 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> >> local_irq_save(flags);
> >>
> >> if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
> >> + if (guest_has_sve) {
> >> + if (has_vhe())
> >> + __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL12);
> >> + else {
> >> + __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL1);
> >> + /*
> >> + * vcpu could set ZCR_EL1 to a shorter VL then the max VL but
> >> + * the context is still valid there. Save the whole context.
> >> + * In nVHE case we need to reset the ZCR_EL1 for that
> >> + * because the save will be done in EL1.
> >> + */
> >> + write_sysreg_s(sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1,
> >> + SYS_ZCR_EL1);
> >
> > This still doesn't feel right. We're already in EL1 here I think, in
> > which case ZCR_EL1 has an immediate effect on what state the
> > architecture guarantees to preserve: if we need to change ZCR_EL1, it's
> > because it might be wrong. If it's wrong, it might be too small. And
> > if it's too small, we may have already lost some SVE register bits that
> > the guest cares about.
> "On taking an exception from an Exception level that is more constrained
> to a target Exception level that is less constrained, or on writing a larger value
> to ZCR_ELx.LEN, then the previously inaccessible bits of these registers that
> become accessible have a value of either zero or the value they had before
> executing at the more constrained size.”
> If the CPU zeros the register when ZCR is written or exception is taken my reading
> of the above is that the register content maybe lost when we land in EL2.
> No code shall not count on the register content after reduces the VL in ZCR.
>
> I see my comment also not clear enough.
> Maybe we shouldn’t save the guest’s sve_max_vl here, would enough to save up to
> the actual VL.
Maybe you're right, but I may be missing some information here.
Can you sketch out more explicitly how it works, showing how all the
bits the host cares about (and only those bits) are saved/restored for
the host, and how all the bits the guest cares about (and only those
bits) are saved/restored for the guest?
Various optimisations are possible, but there is a risk of breaking
assumptions elsewhere. For example, the KVM_{SET,GET}_ONE_REG code
makes assmuptions about the layout of the data in
vcpu->arch.sve_state.
The architectural rules about when SVE register bits are also complex,
with many interactions. We also don't want to aggressively optimise in
a way that might be hard to apply to nested virt.
My instinct is to keep it simple while this patch matures, and continue
to save/restore based on vcpu->arch.sve_max_vl. This keeps a clear
split between the emulated "hardware" (which doesn't change while the
guest runs), and changeable runtime state (like the guest's ZCR_EL1).
I'm happy to review proposed optimisations, but I think those should be
separated out as later patches (or a separate series). My experience
is that it's much easier to get this wrong than to get it right!
When it's wrong, it can be a nightmare to debug.
> > I think that we need to handle this on our way out of hyp instead,
> > _before_ returning back to EL1.
> >
> > When at EL2 exiting back to the host: if and only if
> > KVM_ARM64_FP_ENABLED is set then save the guest's ZCR_EL1 and ZCR_EL1 to
> > match the guest's sve_max_vl (possibly by just cloning ZCR_EL2).
> >
> >> + }
> >> + }
> >> fpsimd_save_and_flush_cpu_state();
> >> -
> >> - if (guest_has_sve)
> >> - __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL12);
> >> } else if (host_has_sve) {
> >> /*
> >> * The FPSIMD/SVE state in the CPU has not been touched, and we
[...]
> >> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> >> index f3d0e9eca56c..df9e912d1278 100644
> >> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> >> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> >> @@ -45,6 +45,18 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
> >> if (!update_fp_enabled(vcpu)) {
> >> val |= CPTR_EL2_TFP;
> >> __activate_traps_fpsimd32(vcpu);
> >> + } else {
> >> + if (vcpu_has_sve(vcpu)) {
> >> + /*
> >> + * The register access will not be trapped so restore
> >> + * ZCR_EL1/ZCR_EL2 because those were set for the host.
> >> + */
> >> + write_sysreg_s(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR_EL1);
> >> + write_sysreg_s(
> >> + sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1,
> >> + SYS_ZCR_EL2);
> >> + val &= ~CPTR_EL2_TZ;
> >> + }
> >> }
> >>
> >> write_sysreg(val, cptr_el2);
> >> @@ -110,6 +122,17 @@ static void __load_host_stage2(void)
> >> write_sysreg(0, vttbr_el2);
> >> }
> >>
> >> +static void __restore_host_sve_state(struct kvm_vcpu *vcpu)
> >> +{
> >> + /*
> >> + * If the guest uses SVE, the ZCR_EL2 was configured for the guest.
> >> + * Host might save the context in EL1 but for that the ZCR_EL2 need
> >> + * to be reset to the host's default.
> >
> > This isn't just about saving the guest context correctly. The host have
> > be using larger vectors than the guest's sve_max_vl allows.
> Right.
Can you clarify the comment please (unless you've done so already)?
> >
> >> + */
> >> + if (vcpu_has_sve(vcpu) && (vcpu->arch.flags |= KVM_ARM64_FP_ENABLED))
> >> + write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> >
> > I'm not sure if it's worth having a special #define for this, but it
> > would be a good idea at least to put comments here and in el2_setup.h to
> > remind people that the ZCR_EL2 settings need to match. Otherwise this
> > might get mis-maintained in the future.
> I will add a comment to el2_setup.h
Can you put comments in both places please? Maintainers of either bit
of code need to be aware of the other code.
[...]
> >> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> >> index 47f3f035f3ea..17cc5e87adcd 100644
> >> --- a/arch/arm64/kvm/reset.c
> >> +++ b/arch/arm64/kvm/reset.c
> >> @@ -74,10 +74,6 @@ static int kvm_vcpu_enable_sve(struct kvm_vcpu *vcpu)
> >> if (!system_supports_sve())
> >> return -EINVAL;
> >>
> >> - /* Verify that KVM startup enforced this when SVE was detected: */
> >> - if (WARN_ON(!has_vhe()))
> >> - return -EINVAL;
> >> -
> >> vcpu->arch.sve_max_vl = kvm_sve_max_vl;
> >>
> >> /*
> >> @@ -113,7 +109,7 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu)
> >> buf = kzalloc(SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl)), GFP_KERNEL);
> >> if (!buf)
> >> return -ENOMEM;
> >> -
> >> + __vcpu_sys_reg(vcpu, ZCR_EL1) = sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1;
> >
> > What's this for?
> >
> > __vcpu_sys_reg(vcpu, ZCR_EL1) should already be being reset sensibly
> > somewhere. If not, that would be a bug…
> It is not needed indeed.
Ah, OK.
Cheers
---Dave
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
WARNING: multiple messages have this Message-ID (diff)
From: Dave Martin <Dave.Martin@arm.com>
To: Daniel Kiss <Daniel.Kiss@arm.com>
Cc: Suzuki Poulose <Suzuki.Poulose@arm.com>,
Marc Zyngier <maz@kernel.org>, Mark Brown <broonie@kernel.org>,
James Morse <James.Morse@arm.com>,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
"julien.thierry.kdev@gmail.com" <julien.thierry.kdev@gmail.com>
Subject: Re: [PATCHv2] kvm: arm64: Add SVE support for nVHE.
Date: Mon, 8 Feb 2021 14:46:31 +0000 [thread overview]
Message-ID: <20210208144629.GD21837@arm.com> (raw)
In-Reply-To: <2C9C9BA7-6872-4420-9EDB-BCCEBD96BF6C@arm.com>
On Fri, Feb 05, 2021 at 12:12:51AM +0000, Daniel Kiss wrote:
>
>
> > On 4 Feb 2021, at 18:36, Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Tue, Feb 02, 2021 at 07:52:54PM +0100, Daniel Kiss wrote:
> >> CPUs that support SVE are architecturally required to support the
> >> Virtualization Host Extensions (VHE), so far the kernel supported
> >> SVE alongside KVM with VHE enabled. In same cases it is desired to
> >> run nVHE config even when VHE is available.
> >> This patch add support for SVE for nVHE configuration too.
> >>
> >> Tested on FVP with a Linux guest VM that run with a different VL than
> >> the host system.
> >>
> >> Signed-off-by: Daniel Kiss <daniel.kiss@arm.com>
[...]
> >> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> >> index 3e081d556e81..8f29b468e989 100644
> >> --- a/arch/arm64/kvm/fpsimd.c
> >> +++ b/arch/arm64/kvm/fpsimd.c
> >> @@ -42,6 +42,16 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
> >> if (ret)
> >> goto error;
> >>
> >> + if (!has_vhe() && vcpu->arch.sve_state) {
> >> + void *sve_state_end = vcpu->arch.sve_state +
> >> + SVE_SIG_REGS_SIZE(
> >> + sve_vq_from_vl(vcpu->arch.sve_max_vl));
> >> + ret = create_hyp_mappings(vcpu->arch.sve_state,
> >> + sve_state_end,
> >> + PAGE_HYP);
> >> + if (ret)
> >> + goto error;
> >> + }
> >> vcpu->arch.host_thread_info = kern_hyp_va(ti);
> >> vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
> >> error:
> >> @@ -109,10 +119,22 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
> >> local_irq_save(flags);
> >>
> >> if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
> >> + if (guest_has_sve) {
> >> + if (has_vhe())
> >> + __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL12);
> >> + else {
> >> + __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL1);
> >> + /*
> >> + * vcpu could set ZCR_EL1 to a shorter VL then the max VL but
> >> + * the context is still valid there. Save the whole context.
> >> + * In nVHE case we need to reset the ZCR_EL1 for that
> >> + * because the save will be done in EL1.
> >> + */
> >> + write_sysreg_s(sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1,
> >> + SYS_ZCR_EL1);
> >
> > This still doesn't feel right. We're already in EL1 here I think, in
> > which case ZCR_EL1 has an immediate effect on what state the
> > architecture guarantees to preserve: if we need to change ZCR_EL1, it's
> > because it might be wrong. If it's wrong, it might be too small. And
> > if it's too small, we may have already lost some SVE register bits that
> > the guest cares about.
> "On taking an exception from an Exception level that is more constrained
> to a target Exception level that is less constrained, or on writing a larger value
> to ZCR_ELx.LEN, then the previously inaccessible bits of these registers that
> become accessible have a value of either zero or the value they had before
> executing at the more constrained size.”
> If the CPU zeros the register when ZCR is written or exception is taken my reading
> of the above is that the register content maybe lost when we land in EL2.
> No code shall not count on the register content after reduces the VL in ZCR.
>
> I see my comment also not clear enough.
> Maybe we shouldn’t save the guest’s sve_max_vl here, would enough to save up to
> the actual VL.
Maybe you're right, but I may be missing some information here.
Can you sketch out more explicitly how it works, showing how all the
bits the host cares about (and only those bits) are saved/restored for
the host, and how all the bits the guest cares about (and only those
bits) are saved/restored for the guest?
Various optimisations are possible, but there is a risk of breaking
assumptions elsewhere. For example, the KVM_{SET,GET}_ONE_REG code
makes assmuptions about the layout of the data in
vcpu->arch.sve_state.
The architectural rules about when SVE register bits are also complex,
with many interactions. We also don't want to aggressively optimise in
a way that might be hard to apply to nested virt.
My instinct is to keep it simple while this patch matures, and continue
to save/restore based on vcpu->arch.sve_max_vl. This keeps a clear
split between the emulated "hardware" (which doesn't change while the
guest runs), and changeable runtime state (like the guest's ZCR_EL1).
I'm happy to review proposed optimisations, but I think those should be
separated out as later patches (or a separate series). My experience
is that it's much easier to get this wrong than to get it right!
When it's wrong, it can be a nightmare to debug.
> > I think that we need to handle this on our way out of hyp instead,
> > _before_ returning back to EL1.
> >
> > When at EL2 exiting back to the host: if and only if
> > KVM_ARM64_FP_ENABLED is set then save the guest's ZCR_EL1 and ZCR_EL1 to
> > match the guest's sve_max_vl (possibly by just cloning ZCR_EL2).
> >
> >> + }
> >> + }
> >> fpsimd_save_and_flush_cpu_state();
> >> -
> >> - if (guest_has_sve)
> >> - __vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_s(SYS_ZCR_EL12);
> >> } else if (host_has_sve) {
> >> /*
> >> * The FPSIMD/SVE state in the CPU has not been touched, and we
[...]
> >> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> >> index f3d0e9eca56c..df9e912d1278 100644
> >> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> >> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> >> @@ -45,6 +45,18 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
> >> if (!update_fp_enabled(vcpu)) {
> >> val |= CPTR_EL2_TFP;
> >> __activate_traps_fpsimd32(vcpu);
> >> + } else {
> >> + if (vcpu_has_sve(vcpu)) {
> >> + /*
> >> + * The register access will not be trapped so restore
> >> + * ZCR_EL1/ZCR_EL2 because those were set for the host.
> >> + */
> >> + write_sysreg_s(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR_EL1);
> >> + write_sysreg_s(
> >> + sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1,
> >> + SYS_ZCR_EL2);
> >> + val &= ~CPTR_EL2_TZ;
> >> + }
> >> }
> >>
> >> write_sysreg(val, cptr_el2);
> >> @@ -110,6 +122,17 @@ static void __load_host_stage2(void)
> >> write_sysreg(0, vttbr_el2);
> >> }
> >>
> >> +static void __restore_host_sve_state(struct kvm_vcpu *vcpu)
> >> +{
> >> + /*
> >> + * If the guest uses SVE, the ZCR_EL2 was configured for the guest.
> >> + * Host might save the context in EL1 but for that the ZCR_EL2 need
> >> + * to be reset to the host's default.
> >
> > This isn't just about saving the guest context correctly. The host have
> > be using larger vectors than the guest's sve_max_vl allows.
> Right.
Can you clarify the comment please (unless you've done so already)?
> >
> >> + */
> >> + if (vcpu_has_sve(vcpu) && (vcpu->arch.flags |= KVM_ARM64_FP_ENABLED))
> >> + write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
> >
> > I'm not sure if it's worth having a special #define for this, but it
> > would be a good idea at least to put comments here and in el2_setup.h to
> > remind people that the ZCR_EL2 settings need to match. Otherwise this
> > might get mis-maintained in the future.
> I will add a comment to el2_setup.h
Can you put comments in both places please? Maintainers of either bit
of code need to be aware of the other code.
[...]
> >> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> >> index 47f3f035f3ea..17cc5e87adcd 100644
> >> --- a/arch/arm64/kvm/reset.c
> >> +++ b/arch/arm64/kvm/reset.c
> >> @@ -74,10 +74,6 @@ static int kvm_vcpu_enable_sve(struct kvm_vcpu *vcpu)
> >> if (!system_supports_sve())
> >> return -EINVAL;
> >>
> >> - /* Verify that KVM startup enforced this when SVE was detected: */
> >> - if (WARN_ON(!has_vhe()))
> >> - return -EINVAL;
> >> -
> >> vcpu->arch.sve_max_vl = kvm_sve_max_vl;
> >>
> >> /*
> >> @@ -113,7 +109,7 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu)
> >> buf = kzalloc(SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl)), GFP_KERNEL);
> >> if (!buf)
> >> return -ENOMEM;
> >> -
> >> + __vcpu_sys_reg(vcpu, ZCR_EL1) = sve_vq_from_vl(vcpu->arch.sve_max_vl) - 1;
> >
> > What's this for?
> >
> > __vcpu_sys_reg(vcpu, ZCR_EL1) should already be being reset sensibly
> > somewhere. If not, that would be a bug…
> It is not needed indeed.
Ah, OK.
Cheers
---Dave
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2021-02-08 14:46 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-02-02 18:52 [PATCHv2] kvm: arm64: Add SVE support for nVHE Daniel Kiss
2021-02-02 18:52 ` Daniel Kiss
2021-02-04 17:36 ` Dave Martin
2021-02-04 17:36 ` Dave Martin
2021-02-05 0:12 ` Daniel Kiss
2021-02-05 0:12 ` Daniel Kiss
2021-02-08 14:46 ` Dave Martin [this message]
2021-02-08 14:46 ` Dave Martin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210208144629.GD21837@arm.com \
--to=dave.martin@arm.com \
--cc=Daniel.Kiss@arm.com \
--cc=broonie@kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.