All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Cc: Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Ilias Stamatis <ilstam@amazon.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	mail@anirudhrb.com, kumarpraveen@linux.microsoft.com,
	wei.liu@kernel.org, robert.bradford@intel.com,
	liuwe@microsoft.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: nVMX: Don't expose TSC scaling to L1 when on Hyper-V
Date: Wed, 22 Jun 2022 16:35:27 +0200	[thread overview]
Message-ID: <87r13gyde8.fsf@redhat.com> (raw)
In-Reply-To: <YrMenI1mTbqA9MaR@anrayabh-desk>

Anirudh Rayabharam <anrayabh@linux.microsoft.com> writes:

> On Wed, Jun 22, 2022 at 10:00:29AM +0200, Vitaly Kuznetsov wrote:
>> Sean Christopherson <seanjc@google.com> writes:
>> 
>> > On Tue, Jun 14, 2022, Anirudh Rayabharam wrote:
>> >> On Mon, Jun 13, 2022 at 04:57:49PM +0000, Sean Christopherson wrote:
>> 
>> ...
>> 
>> >> > 
>> >> > Any reason not to use the already sanitized vmcs_config?  I can't think of any
>> >> > reason why the nested path should blindly use the raw MSR values from hardware.
>> >> 
>> >> vmcs_config has the sanitized exec controls. But how do we construct MSR
>> >> values using them?
>> >
>> > I was thinking we could use the sanitized controls for the allowed-1 bits, and then
>> > take the required-1 bits from the CPU.  And then if we wanted to avoid the redundant
>> > RDMSRs in a follow-up patch we could add required-1 fields to vmcs_config.
>> >
>> > Hastily constructed and compile-tested only, proceed with caution :-)
>> 
>> Independently from "[PATCH 00/11] KVM: VMX: Support TscScaling and
>> EnclsExitingBitmap whith eVMCS" which is supposed to fix the particular
>> TSC scaling issue, I like the idea to make nested_vmx_setup_ctls_msrs()
>> use both allowed-1 and required-1 bits from vmcs_config. I'll pick up
>> the suggested patch and try to construct something for required-1 bits.
>
> I tried this patch today but it causes some regression which causes
> /dev/kvm to be unavailable in L1. I didn't get a chance to look into it
> closely but I am guessing it has something to do with the fact that
> vmcs_config reflects the config that L0 chose to use rather than what is
> available to use. So constructing allowed-1 MSR bits based on what bits
> are set in exec controls maybe isn't correct.

I've tried to pick it up but it's actually much harder than I think. The
patch has some minor issues ('&vmcs_config.nested' needs to be switched
to '&vmcs_conf->nested' in nested_vmx_setup_ctls_msrs()), but the main
problem is that the set of controls nested_vmx_setup_ctls_msrs() needs
is NOT a subset of vmcs_config (setup_vmcs_config()). I was able to
identify at least:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5e14e4c40007..8076352174ad 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2483,8 +2483,14 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
              CPU_BASED_INVLPG_EXITING |
              CPU_BASED_RDPMC_EXITING;
 
-       opt = CPU_BASED_TPR_SHADOW |
+       opt = CPU_BASED_INTR_WINDOW_EXITING |
+             CPU_BASED_NMI_WINDOW_EXITING |
+             CPU_BASED_TPR_SHADOW |
+             CPU_BASED_USE_IO_BITMAPS |
              CPU_BASED_USE_MSR_BITMAPS |
+             CPU_BASED_MONITOR_TRAP_FLAG |
+             CPU_BASED_RDTSC_EXITING |
+             CPU_BASED_PAUSE_EXITING |
              CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
              CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
        if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
@@ -2582,6 +2588,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 #endif
        opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
              VM_EXIT_LOAD_IA32_PAT |
+             VM_EXIT_SAVE_IA32_PAT |
              VM_EXIT_LOAD_IA32_EFER |
              VM_EXIT_CLEAR_BNDCFGS |
              VM_EXIT_PT_CONCEAL_PIP |
@@ -2604,7 +2611,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
                _pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
 
        min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
-       opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
+       opt =
+#ifdef CONFIG_X86_64
+             VM_ENTRY_IA32E_MODE |
+#endif
+             VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
              VM_ENTRY_LOAD_IA32_PAT |
              VM_ENTRY_LOAD_IA32_EFER |
              VM_ENTRY_LOAD_BNDCFGS |

but it is 1) not sufficient because some controls are smartly filtered
out just because we don't want them for L1 -- and this doesn't mean that
L2 doesn't need them and 2) because if we add some 'opt' controls to
setup_vmcs_config() we need to filter them out somewhere else.

I'm starting to think we may just want to store raw VMX MSR values in
vmcs_config first, then sanitize them (eVMCS, vmx preemtoion timer bug,
perf_ctrl bug,..) and then do the adjust_vmx_controls() magic. 

I'm not giving up yet but don't expect something small and backportable
to stable :-) 

-- 
Vitaly


  reply	other threads:[~2022-06-22 14:35 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-13 16:16 [PATCH] KVM: nVMX: Don't expose TSC scaling to L1 when on Hyper-V Anirudh Rayabharam
2022-06-13 16:41 ` Sean Christopherson
2022-06-13 16:49 ` Paolo Bonzini
2022-06-13 16:57   ` Sean Christopherson
2022-06-14 15:28     ` Anirudh Rayabharam
2022-06-14 16:00       ` Sean Christopherson
2022-06-22  8:00         ` Vitaly Kuznetsov
2022-06-22 13:52           ` Anirudh Rayabharam
2022-06-22 14:35             ` Vitaly Kuznetsov [this message]
2022-06-22 16:19               ` Anirudh Rayabharam
2022-06-22 16:48                 ` Vitaly Kuznetsov
2022-06-23 10:17                   ` Anirudh Rayabharam
2022-06-23 11:49                     ` Vitaly Kuznetsov
2022-06-28 10:30                       ` Anirudh Rayabharam
2022-06-14  4:55   ` Anirudh Rayabharam
2022-06-14 12:16     ` Paolo Bonzini
2022-06-14 15:13       ` Anirudh Rayabharam
2022-06-14 17:28         ` Paolo Bonzini
2022-06-14 15:17     ` Anirudh Rayabharam
2022-06-14 12:12   ` Vitaly Kuznetsov
2022-06-14 12:19 ` Vitaly Kuznetsov
2022-06-14 15:01   ` Vitaly Kuznetsov
2022-06-15 11:30     ` Vitaly Kuznetsov
2022-06-14 17:20   ` Paolo Bonzini
2022-06-15  9:01     ` Anirudh Rayabharam
2022-06-15  9:36       ` Vitaly Kuznetsov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r13gyde8.fsf@redhat.com \
    --to=vkuznets@redhat.com \
    --cc=anrayabh@linux.microsoft.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=ilstam@amazon.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=kumarpraveen@linux.microsoft.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liuwe@microsoft.com \
    --cc=mail@anirudhrb.com \
    --cc=mingo@redhat.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=robert.bradford@intel.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=wanpengli@tencent.com \
    --cc=wei.liu@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.