linux-hyperv.vger.kernel.org archive mirror
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Sean Christopherson <seanjc@google.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Anirudh Rayabharam <anrayabh@linux.microsoft.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Michael Kelley <mikelley@microsoft.com>,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 03/26] x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
Date: Tue, 23 Aug 2022 18:54:05 +0200
Message-ID: <87tu62x5n6.fsf@redhat.com>
In-Reply-To: <YwTrlgeqoAqyH0KF@google.com>

Sean Christopherson <seanjc@google.com> writes:

> We're talking about nested VMX, i.e. exposing TSC_SCALING to L1.  QEMU's CLX
> definition doesn't include TSC_SCALING.  In fact, none of QEMU's predefined CPU
> models supports TSC_SCALING, precisely because KVM didn't support exposing the
> feature to L1 until relatively recently.
>
> $ git grep VMX_SECONDARY_EXEC_TSC_SCALING
> target/i386/cpu.h:#define VMX_SECONDARY_EXEC_TSC_SCALING              0x02000000
> target/i386/kvm/kvm.c:    if (f[FEAT_VMX_SECONDARY_CTLS] &  VMX_SECONDARY_EXEC_TSC_SCALING) {

(sorry for my persistence but I still believe there are issues which we
won't be able to solve if we take the suggested approach).

You got me. The "vmx-tsc-scaling" feature is indeed not set for
named CPU models, so my example was flawed. Let's swap it with
VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL /
VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL, which a bunch of named models
have. So I do the same,

'-cpu CascadeLake-Server,hv-evmcs'

on both the source host which knows about these eVMCS fields and the
destination host which doesn't.

First problem: CPUID. On the source host, we will have
CPUID.0x4000000A.EBX BIT(0) = 1, and BIT(0) = 0 on the destination. I don't
think we migrate CPUID data (I may be wrong, though).
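
To make the CPUID point concrete, here is a minimal sketch of the
comparison a VMM would have to do if it did look at the source's CPUID
data. The helper names and the macro for EBX bit 0 are hypothetical,
chosen for illustration only; the leaf number is the one discussed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hyper-V nested virtualization features leaf discussed above. */
#define HYPERV_CPUID_NESTED_FEATURES 0x4000000Au

/* Hypothetical name for EBX bit 0 of that leaf (assumption, not a real macro). */
#define EVMCS_EBX_NEW_FIELDS (UINT32_C(1) << 0)

/* Does the given EBX value of leaf 0x4000000A enumerate the new fields? */
static bool evmcs_new_fields_enumerated(uint32_t ebx)
{
    return (ebx & EVMCS_EBX_NEW_FIELDS) != 0;
}

/*
 * True when source and destination disagree on the bit, i.e. the guest
 * would see CPUID change underneath it across migration.
 */
static bool evmcs_cpuid_mismatch(uint32_t src_ebx, uint32_t dst_ebx)
{
    return evmcs_new_fields_enumerated(src_ebx) !=
           evmcs_new_fields_enumerated(dst_ebx);
}
```

With src_ebx = 1 and dst_ebx = 0 the mismatch helper returns true, which
is exactly the source/destination pair from the example.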

Second, assuming VMX feature MSRs are actually migrated, we must fail on
the destination because VMX_VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL is
trying to get set. We can do this in KVM but note: currently, KVM
filters guest reads but not host reads, so when you're trying to migrate
from a non-fixed KVM, VMX_VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL are
actually present! So how do we distinguish in KVM between these two
cases, i.e. how do we know if
VMX_VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL were filtered out on the
source (old KVM) or not (new KVM)?
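
A rough sketch of the destination-side check I have in mind, assuming
the usual VMX capability MSR layout where the upper 32 bits hold the
allowed-1 settings. The function names are hypothetical and this is not
existing KVM code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Control bits as defined in the Linux kernel's asm/vmx.h. */
#define VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL  UINT32_C(0x00001000)
#define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL UINT32_C(0x00002000)

/* Upper 32 bits of a VMX control capability MSR: allowed-1 settings. */
static uint32_t vmx_ctrl_allowed1(uint64_t msr)
{
    return (uint32_t)(msr >> 32);
}

/*
 * Hypothetical check on the destination: accept the migrated MSR value
 * only if every allowed-1 bit it carries is also supported here.
 */
static bool vmx_ctrl_msr_acceptable(uint64_t src_msr,
                                    uint32_t dst_supported_allowed1)
{
    return (vmx_ctrl_allowed1(src_msr) & ~dst_supported_allowed1) == 0;
}
```

Note that this check alone cannot tell the two cases apart: an old KVM
which reports the bit to the host while hiding it from the guest
produces the same MSR value as a new KVM which really supports it.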

...
>
> Because it's completely unnecessary, adds non-trivial maintenance burden to KVM,
> and requires explicit documentation to explain to userspace what "hv-evmcs-2022"
> means.
>
> It's unnecessary because if the user is concerned about eVMCS features showing up
> in the future, then they should do:
>
>   -cpu CascadeLake-Server,hv-evmcs,-vmx-tsc-scaling,-<any other VMX features not eVMCS-friendly>
>
> If QEMU wants to make that more user friendly, then define CascadeLake-Server-eVMCS
> or whatever so that the features that are unlikely be supported for eVMCS are off by
> default.

I completely agree that what I'm trying to achieve here could've been
done in QEMU from day 1 but we now have what we have: KVM silently
filtering out certain VMX features and zero indication to the userspace
VMM whether filtering is being done or not (besides the
CPUID.0x4000000A.EBX BIT(0) bit, but I'm not even sure we analyze the
source's CPUID data upon migration).

>  This is no different than QEMU not including nested TSC_SCALING in any of
> the predefined models; the developers _know_ KVM doesn't widely support TSC_SCALING,
> so it was omitted, even though a real CLX CPU is guaranteed to support TSC_SCALING.
>

Out of curiosity, what happens if someone sends the following patch to
QEMU:

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1db1278a599b..2278f4522b44 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3191,6 +3191,12 @@ static const X86CPUDefinition builtin_x86_defs[] = {
                   { "vmx-xsaves", "on" },
                   { /* end of list */ }
               },
+            { .version = 6,
+              .note = "ARCH_CAPABILITIES, EPT switching, XSAVES, no TSX, TSC_SCALING",
+              .props = (PropValue[]) {
+                  { "vmx-tsc-scaling", "on" },
+                  { /* end of list */ }
+              },
             },
             { /* end of list */ }
         }

Will Paolo remember about eVMCS and reject it?

> It's non-trivial maintenance for KVM because it would require defining new versions
> every time an eVMCS field is added, allowing userspace to specify and restrict
> features based on arbitrary versions, and do all of that without conflicting with
> whatever PV enumeration Microsoft adds.

The update at hand comes with a feature bit so no matter what we do, we
will need a new QEMU flag to support this feature bit. My suggestion was
just that we stretch its definition a bit and encode not only
PERF_GLOBAL_CTRL but all fields which were added. At the same time we
can switch to filtering host reads and failing host writes for what's
missing (and to do so we'll likely need to invert the logic and
explicitly list what eVMCS supports) so we're better prepared for the
next update.
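
Roughly, the inverted logic could look like the sketch below. The helper
names are hypothetical and the supported mask is passed in as a
parameter, so nothing here claims to be the real list of controls eVMCS
supports.

```c
#include <stdint.h>

/*
 * Hypothetical host-read filter: report only the allowed-1 bits which are
 * on the explicit "eVMCS supports this" list. A control which gets defined
 * later is hidden by default until someone adds it to the list.
 */
static uint32_t evmcs_filter_host_read(uint32_t allowed1,
                                       uint32_t evmcs_supported)
{
    return allowed1 & evmcs_supported;
}

/*
 * Hypothetical host-write check: return 0 when every requested allowed-1
 * bit is on the list, -1 otherwise so the write (and the migration) fails.
 */
static int evmcs_check_host_write(uint32_t requested_allowed1,
                                  uint32_t evmcs_supported)
{
    return (requested_allowed1 & ~evmcs_supported) ? -1 : 0;
}
```

The point of listing what is supported, instead of what is not, is that
forgetting to update the list errs on the side of hiding a feature
rather than exposing one eVMCS can't handle.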

-- 
Vitaly

