From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Cc: Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Ilias Stamatis <ilstam@amazon.com>,
Maxim Levitsky <mlevitsk@redhat.com>,
mail@anirudhrb.com, kumarpraveen@linux.microsoft.com,
wei.liu@kernel.org, robert.bradford@intel.com,
liuwe@microsoft.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: nVMX: Don't expose TSC scaling to L1 when on Hyper-V
Date: Wed, 22 Jun 2022 16:35:27 +0200 [thread overview]
Message-ID: <87r13gyde8.fsf@redhat.com> (raw)
In-Reply-To: <YrMenI1mTbqA9MaR@anrayabh-desk>
Anirudh Rayabharam <anrayabh@linux.microsoft.com> writes:
> On Wed, Jun 22, 2022 at 10:00:29AM +0200, Vitaly Kuznetsov wrote:
>> Sean Christopherson <seanjc@google.com> writes:
>>
>> > On Tue, Jun 14, 2022, Anirudh Rayabharam wrote:
>> >> On Mon, Jun 13, 2022 at 04:57:49PM +0000, Sean Christopherson wrote:
>>
>> ...
>>
>> >> >
>> >> > Any reason not to use the already sanitized vmcs_config? I can't think of any
>> >> > reason why the nested path should blindly use the raw MSR values from hardware.
>> >>
>> >> vmcs_config has the sanitized exec controls. But how do we construct MSR
>> >> values using them?
>> >
>> > I was thinking we could use the sanitized controls for the allowed-1 bits, and then
>> > take the required-1 bits from the CPU. And then if we wanted to avoid the redundant
>> > RDMSRs in a follow-up patch we could add required-1 fields to vmcs_config.
>> >
>> > Hastily constructed and compile-tested only, proceed with caution :-)
>>
>> Independently from "[PATCH 00/11] KVM: VMX: Support TscScaling and
>> EnclsExitingBitmap whith eVMCS" which is supposed to fix the particular
>> TSC scaling issue, I like the idea to make nested_vmx_setup_ctls_msrs()
>> use both allowed-1 and required-1 bits from vmcs_config. I'll pick up
>> the suggested patch and try to construct something for required-1 bits.
>
> I tried this patch today but it causes some regression which causes
> /dev/kvm to be unavailable in L1. I didn't get a chance to look into it
> closely but I am guessing it has something to do with the fact that
> vmcs_config reflects the config that L0 chose to use rather than what is
> available to use. So constructing allowed-1 MSR bits based on what bits
> are set in exec controls maybe isn't correct.
I've tried to pick it up but it's actually much harder than I think. The
patch has some minor issues ('&vmcs_config.nested' needs to be switched
to '&vmcs_conf->nested' in nested_vmx_setup_ctls_msrs()), but the main
problem is that the set of controls nested_vmx_setup_ctls_msrs() needs
is NOT a subset of vmcs_config (setup_vmcs_config()). I was able to
identify at least:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5e14e4c40007..8076352174ad 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2483,8 +2483,14 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
CPU_BASED_INVLPG_EXITING |
CPU_BASED_RDPMC_EXITING;
- opt = CPU_BASED_TPR_SHADOW |
+ opt = CPU_BASED_INTR_WINDOW_EXITING |
+ CPU_BASED_NMI_WINDOW_EXITING |
+ CPU_BASED_TPR_SHADOW |
+ CPU_BASED_USE_IO_BITMAPS |
CPU_BASED_USE_MSR_BITMAPS |
+ CPU_BASED_MONITOR_TRAP_FLAG |
+ CPU_BASED_RDTSC_EXITING |
+ CPU_BASED_PAUSE_EXITING |
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
@@ -2582,6 +2588,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
#endif
opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
VM_EXIT_LOAD_IA32_PAT |
+ VM_EXIT_SAVE_IA32_PAT |
VM_EXIT_LOAD_IA32_EFER |
VM_EXIT_CLEAR_BNDCFGS |
VM_EXIT_PT_CONCEAL_PIP |
@@ -2604,7 +2611,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
- opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
+ opt =
+#ifdef CONFIG_X86_64
+ VM_ENTRY_IA32E_MODE |
+#endif
+ VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
VM_ENTRY_LOAD_IA32_PAT |
VM_ENTRY_LOAD_IA32_EFER |
VM_ENTRY_LOAD_BNDCFGS |
but it is 1) not sufficient because some controls are smartly filtered
out just because we don't want them for L1 -- and this doesn't mean that
L2 doesn't need them and 2) because if we add some 'opt' controls to
setup_vmcs_config() we need to filter them out somewhere else.
I'm starting to think we may just want to store raw VMX MSR values in
vmcs_config first, then sanitize them (eVMCS, vmx preemtoion timer bug,
perf_ctrl bug,..) and then do the adjust_vmx_controls() magic.
I'm not giving up yet but don't expect something small and backportable
to stable :-)
--
Vitaly
next prev parent reply other threads:[~2022-06-22 14:35 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-13 16:16 [PATCH] KVM: nVMX: Don't expose TSC scaling to L1 when on Hyper-V Anirudh Rayabharam
2022-06-13 16:41 ` Sean Christopherson
2022-06-13 16:49 ` Paolo Bonzini
2022-06-13 16:57 ` Sean Christopherson
2022-06-14 15:28 ` Anirudh Rayabharam
2022-06-14 16:00 ` Sean Christopherson
2022-06-22 8:00 ` Vitaly Kuznetsov
2022-06-22 13:52 ` Anirudh Rayabharam
2022-06-22 14:35 ` Vitaly Kuznetsov [this message]
2022-06-22 16:19 ` Anirudh Rayabharam
2022-06-22 16:48 ` Vitaly Kuznetsov
2022-06-23 10:17 ` Anirudh Rayabharam
2022-06-23 11:49 ` Vitaly Kuznetsov
2022-06-28 10:30 ` Anirudh Rayabharam
2022-06-14 4:55 ` Anirudh Rayabharam
2022-06-14 12:16 ` Paolo Bonzini
2022-06-14 15:13 ` Anirudh Rayabharam
2022-06-14 17:28 ` Paolo Bonzini
2022-06-14 15:17 ` Anirudh Rayabharam
2022-06-14 12:12 ` Vitaly Kuznetsov
2022-06-14 12:19 ` Vitaly Kuznetsov
2022-06-14 15:01 ` Vitaly Kuznetsov
2022-06-15 11:30 ` Vitaly Kuznetsov
2022-06-14 17:20 ` Paolo Bonzini
2022-06-15 9:01 ` Anirudh Rayabharam
2022-06-15 9:36 ` Vitaly Kuznetsov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87r13gyde8.fsf@redhat.com \
--to=vkuznets@redhat.com \
--cc=anrayabh@linux.microsoft.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=hpa@zytor.com \
--cc=ilstam@amazon.com \
--cc=jmattson@google.com \
--cc=joro@8bytes.org \
--cc=kumarpraveen@linux.microsoft.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=liuwe@microsoft.com \
--cc=mail@anirudhrb.com \
--cc=mingo@redhat.com \
--cc=mlevitsk@redhat.com \
--cc=pbonzini@redhat.com \
--cc=robert.bradford@intel.com \
--cc=seanjc@google.com \
--cc=tglx@linutronix.de \
--cc=wanpengli@tencent.com \
--cc=wei.liu@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).