* [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening
@ 2026-01-23 22:15 Sean Christopherson
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-01-23 22:15 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson
Fix a bug where KVM will clear IBT and SHSTK bits after nested VMX MSRs
have been configured, e.g. if the kernel is built with CONFIG_X86_CET=y
but CONFIG_X86_KERNEL_IBT=n. The late clearing results in kvm-intel.ko
refusing to load as the CPU compatibility checks generate their VMCS configs
with IBT=n and SHSTK=n, ultimately causing a mismatch on the CET entry
and exit controls.
Patch 2 hardens against similar bugs in the future by adding a flag and
WARNs to yell if KVM sets or clears feature flags outside of the dedicated
flow.
Patch 3 adds (very, very) long overdue printing of the mismatching offsets
in the VMCS configs.
Sean Christopherson (3):
KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps
KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
arch/x86/kvm/cpuid.c | 29 +++++++++++++++++++++++++++--
arch/x86/kvm/cpuid.h | 7 ++++++-
arch/x86/kvm/svm/svm.c | 4 +++-
arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++--
arch/x86/kvm/x86.c | 14 --------------
arch/x86/kvm/x86.h | 2 ++
6 files changed, 56 insertions(+), 20 deletions(-)
base-commit: e81f7c908e1664233974b9f20beead78cde6343a
--
2.52.0.457.g6b5491de43-goog
* [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
2026-01-23 22:15 [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening Sean Christopherson
@ 2026-01-23 22:15 ` Sean Christopherson
2026-01-27 7:42 ` Chao Gao
2026-01-27 15:12 ` Xiaoyao Li
2026-01-23 22:15 ` [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps Sean Christopherson
2026-01-23 22:15 ` [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch Sean Christopherson
2 siblings, 2 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-01-23 22:15 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson

Explicitly finalize kvm_cpu_caps as part of each vendor's setup flow to
fix a bug where clearing SHSTK and IBT due to lack of CET XFEATURE support
makes kvm-intel.ko unloadable when nested=1. The late clearing results in
nested_vmx_setup_{entry,exit}_ctls() clearing VM_{ENTRY,EXIT}_LOAD_CET_STATE
when nested_vmx_setup_ctls_msrs() runs during the CPU compatibility checks,
ultimately leading to a mismatched VMCS config due to the reference config
having the CET bits set, but every CPU's "local" config having the bits
cleared.

Note, kvm_caps.supported_{xcr0,xss} are unconditionally initialized by
kvm_x86_vendor_init(), before calling into vendor code, and not referenced
between ops->hardware_setup() and their current/old location.

Fixes: 69cc3e886582 ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER")
Cc: stable@vger.kernel.org
Cc: Mathias Krause <minipli@grsecurity.net>
Cc: John Allen <john.allen@amd.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Binbin Wu <binbin.wu@linux.intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c   | 21 +++++++++++++++++++--
 arch/x86/kvm/cpuid.h   |  3 ++-
 arch/x86/kvm/svm/svm.c |  4 +++-
 arch/x86/kvm/vmx/vmx.c |  4 +++-
 arch/x86/kvm/x86.c     | 14 --------------
 arch/x86/kvm/x86.h     |  2 ++
 6 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 575244af9c9f..267e59b405c1 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -826,7 +826,7 @@ do { \
 /* DS is defined by ptrace-abi.h on 32-bit builds. */
 #undef DS
 
-void kvm_set_cpu_caps(void)
+void kvm_initialize_cpu_caps(void)
 {
 	memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
 
@@ -1289,7 +1289,24 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_RDPID);
 	}
 }
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_set_cpu_caps);
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_initialize_cpu_caps);
+
+void kvm_finalize_cpu_caps(void)
+{
+	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
+		kvm_caps.supported_xss = 0;
+
+	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
+		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
+
+	if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
+		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+		kvm_cpu_cap_clear(X86_FEATURE_IBT);
+		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
+	}
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_finalize_cpu_caps);
 
 #undef F
 #undef SCATTERED_F
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index d3f5ae15a7ca..3b0b4b1adb97 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -8,7 +8,8 @@
 #include <uapi/asm/kvm_para.h>
 
 extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
-void kvm_set_cpu_caps(void);
+void kvm_initialize_cpu_caps(void);
+void kvm_finalize_cpu_caps(void);
 
 void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu);
 struct kvm_cpuid_entry2 *kvm_find_cpuid_entry2(struct kvm_cpuid_entry2 *entries,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7803d2781144..0c23fcaedcc5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5305,7 +5305,7 @@ static __init void svm_adjust_mmio_mask(void)
 
 static __init void svm_set_cpu_caps(void)
 {
-	kvm_set_cpu_caps();
+	kvm_initialize_cpu_caps();
 
 	kvm_caps.supported_perf_cap = 0;
 
@@ -5387,6 +5387,8 @@ static __init void svm_set_cpu_caps(void)
 	 */
 	kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
 	kvm_cpu_cap_clear(X86_FEATURE_MSR_IMM);
+
+	kvm_finalize_cpu_caps();
 }
 
 static __init int svm_hardware_setup(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 27acafd03381..7d373e32ea9c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8173,7 +8173,7 @@ static __init u64 vmx_get_perf_capabilities(void)
 
 static __init void vmx_set_cpu_caps(void)
 {
-	kvm_set_cpu_caps();
+	kvm_initialize_cpu_caps();
 
 	/* CPUID 0x1 */
 	if (nested)
@@ -8230,6 +8230,8 @@ static __init void vmx_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
 		kvm_cpu_cap_clear(X86_FEATURE_IBT);
 	}
+
+	kvm_finalize_cpu_caps();
 }
 
 static bool vmx_is_io_intercepted(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8acfdfc583a1..36385e6aebfa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -220,7 +220,6 @@ static DEFINE_PER_CPU(struct kvm_user_return_msrs, user_return_msrs);
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE)
 
-#define XFEATURE_MASK_CET_ALL	(XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)
 /*
  * Note, KVM supports exposing PT to the guest, but does not support context
  * switching PT via XSTATE (KVM's PT virtualization relies on perf; swapping
@@ -10138,19 +10137,6 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	if (!tdp_enabled)
 		kvm_caps.supported_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT;
 
-	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-		kvm_caps.supported_xss = 0;
-
-	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
-	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
-		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
-
-	if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
-		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
-		kvm_cpu_cap_clear(X86_FEATURE_IBT);
-		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
-	}
-
 	if (kvm_caps.has_tsc_control) {
 		/*
 		 * Make sure the user can only configure tsc_khz values that
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 70e81f008030..9edfac5d5ffb 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -483,6 +483,8 @@ extern struct kvm_host_values kvm_host;
 extern bool enable_pmu;
 extern bool enable_mediated_pmu;
 
+#define XFEATURE_MASK_CET_ALL	(XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)
+
 /*
  * Get a filtered version of KVM's supported XCR0 that strips out dynamic
  * features for which the current process doesn't (yet) have permission to use.
--
2.52.0.457.g6b5491de43-goog
* Re: [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
@ 2026-01-27 7:42 ` Chao Gao
2026-01-27 15:12 ` Xiaoyao Li
1 sibling, 0 replies; 11+ messages in thread
From: Chao Gao @ 2026-01-27 7:42 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Binbin Wu, Xiaoyao Li, Jim Mattson

On Fri, Jan 23, 2026 at 02:15:40PM -0800, Sean Christopherson wrote:
>Explicitly finalize kvm_cpu_caps as part of each vendor's setup flow to
>fix a bug where clearing SHSTK and IBT due to lack of CET XFEATURE support
>makes kvm-intel.ko unloadable when nested=1. The late clearing results in
>nested_vmx_setup_{entry,exit}_ctls() clearing VM_{ENTRY,EXIT}_LOAD_CET_STATE
>when nested_vmx_setup_ctls_msrs() runs during the CPU compatibility checks,
>ultimately leading to a mismatched VMCS config due to the reference config
>having the CET bits set, but every CPU's "local" config having the bits
>cleared.
>
>Note, kvm_caps.supported_{xcr0,xss} are unconditionally initialized by
>kvm_x86_vendor_init(), before calling into vendor code, and not referenced
>between ops->hardware_setup() and their current/old location.
>
>Fixes: 69cc3e886582 ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER")
>Cc: stable@vger.kernel.org
>Cc: Mathias Krause <minipli@grsecurity.net>
>Cc: John Allen <john.allen@amd.com>
>Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
>Cc: Chao Gao <chao.gao@intel.com>
>Cc: Binbin Wu <binbin.wu@linux.intel.com>
>Cc: Xiaoyao Li <xiaoyao.li@intel.com>
>Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Chao Gao <chao.gao@intel.com>
* Re: [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
2026-01-27 7:42 ` Chao Gao
@ 2026-01-27 15:12 ` Xiaoyao Li
2026-01-27 16:19 ` Sean Christopherson
1 sibling, 1 reply; 11+ messages in thread
From: Xiaoyao Li @ 2026-01-27 15:12 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Jim Mattson

On 1/24/2026 6:15 AM, Sean Christopherson wrote:
...
> +void kvm_finalize_cpu_caps(void)

It also finalizes the kvm_caps, at least kvm_caps.supported_xss, which seems
not consistent with the name.

Even more, just look at the function body, the name
"kvm_finalize_supported_xss" seems to fit better while clearing SHSTK and
IBT just the side effect of the finalized kvm_caps.supported_xss.

> +{
> +	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
> +		kvm_caps.supported_xss = 0;
> +
> +	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
> +	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
> +		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
> +
> +	if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
> +		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
> +		kvm_cpu_cap_clear(X86_FEATURE_IBT);
> +		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
> +	}
> +}
* Re: [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
2026-01-27 15:12 ` Xiaoyao Li
@ 2026-01-27 16:19 ` Sean Christopherson
0 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-01-27 16:19 UTC (permalink / raw)
To: Xiaoyao Li
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Chao Gao, Binbin Wu, Jim Mattson

On Tue, Jan 27, 2026, Xiaoyao Li wrote:
> On 1/24/2026 6:15 AM, Sean Christopherson wrote:
> ...
> > +void kvm_finalize_cpu_caps(void)
>
> It also finalizes the kvm_caps,

No, it just happens to update supported_xss as well.

> at least kvm_caps.supported_xss, which seems not consistent with the name.

I agree, but I don't see a clearly better option. E.g. kvm_finalize_cpu_caps()
could be pedantic and only touch cpu_caps:

	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES) ||
	    (kvm_host.xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
		kvm_cpu_cap_clear(X86_FEATURE_IBT);
	}

but then we have duplicate logic, and the connection between supported_xss and
SHSTK/IBT is lost.

The only viable alternative I can think of would be to provide a dedicated
kvm_set_xss_caps() and then do:

	kvm_set_xss_caps();

	kvm_finalize_cpu_caps();

where kvm_finalize_cpu_caps() just clears kvm_is_configuring_cpu_caps. Or I
suppose it could be:

	kvm_set_xss_caps();

	kvm_is_configuring_cpu_caps = false;

though I think I'd prefer to keep kvm_finalize_cpu_caps() and make it an inline.

Hmm, the more I look at that option, the more I like it? It's kinda silly,
especially if we end up with a whole pile of helpers, e.g.

	kvm_set_xss_caps();
	kvm_set_blah_caps();
	kvm_set_loblaw_caps();

	kvm_finalize_cpu_caps();

But at least for now, I definitely don't hate it.

> Even more, just look at the function body, the name
> "kvm_finalize_supported_xss" seems to fit better while clearing SHSTK and
> IBT just the side effect of the finalized kvm_caps.supported_xss.

No, I definitely want kvm_finalize_cpu_caps() somewhere, so that we end up with
kvm_initialize_cpu_caps() + kvm_finalize_cpu_caps(). The function happens to
only modify CET caps and thus only touches supported_xss as a side effect, but
the intent is very much that it will serve as the one and only place where KVM
makes "final" adjustments that are common to VMX and SVM.

But as above, I'm not opposed to having both. And it does provide a leaner diff
for the stable@ fix (though that's largely irrelevant since only 6.18 needs the
fix).

So this for patch 1 (not yet tested)?

From: Sean Christopherson <seanjc@google.com>
Date: Tue, 27 Jan 2026 08:14:27 -0800
Subject: [PATCH] KVM: x86: Configuring supported XSS from {svm,vmx}_set_cpu_caps()

Explicitly configure KVM's supported XSS as part of each vendor's setup
flow to fix a bug where clearing SHSTK and IBT in kvm_cpu_caps, e.g. due
to lack of CET XFEATURE support, makes kvm-intel.ko unloadable when nested
VMX is enabled, i.e. when nested=1. The late clearing results in
nested_vmx_setup_{entry,exit}_ctls() clearing VM_{ENTRY,EXIT}_LOAD_CET_STATE
when nested_vmx_setup_ctls_msrs() runs during the CPU compatibility checks,
ultimately leading to a mismatched VMCS config due to the reference config
having the CET bits set, but every CPU's "local" config having the bits
cleared.

Note, kvm_caps.supported_{xcr0,xss} are unconditionally initialized by
kvm_x86_vendor_init(), before calling into vendor code, and not referenced
between ops->hardware_setup() and their current/old location.

Fixes: 69cc3e886582 ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER")
Cc: stable@vger.kernel.org
Cc: Mathias Krause <minipli@grsecurity.net>
Cc: John Allen <john.allen@amd.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Binbin Wu <binbin.wu@linux.intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c |  2 ++
 arch/x86/kvm/vmx/vmx.c |  2 ++
 arch/x86/kvm/x86.c     | 30 +++++++++++++++++-------------
 arch/x86/kvm/x86.h     |  2 ++
 4 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7803d2781144..c00a696dacfc 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5387,6 +5387,8 @@ static __init void svm_set_cpu_caps(void)
 	 */
 	kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
 	kvm_cpu_cap_clear(X86_FEATURE_MSR_IMM);
+
+	kvm_setup_xss_caps();
 }
 
 static __init int svm_hardware_setup(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 27acafd03381..9f85c3829890 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8230,6 +8230,8 @@ static __init void vmx_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
 		kvm_cpu_cap_clear(X86_FEATURE_IBT);
 	}
+
+	kvm_setup_xss_caps();
 }
 
 static bool vmx_is_io_intercepted(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8acfdfc583a1..cac1d6a67b49 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9965,6 +9965,23 @@ static struct notifier_block pvclock_gtod_notifier = {
 };
 #endif
 
+void kvm_setup_xss_caps(void)
+{
+	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
+		kvm_caps.supported_xss = 0;
+
+	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
+		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
+
+	if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
+		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+		kvm_cpu_cap_clear(X86_FEATURE_IBT);
+		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
+	}
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_setup_xss_caps);
+
 static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
 {
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
@@ -10138,19 +10155,6 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 	if (!tdp_enabled)
 		kvm_caps.supported_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT;
 
-	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-		kvm_caps.supported_xss = 0;
-
-	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
-	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
-		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
-
-	if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
-		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
-		kvm_cpu_cap_clear(X86_FEATURE_IBT);
-		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
-	}
-
 	if (kvm_caps.has_tsc_control) {
 		/*
 		 * Make sure the user can only configure tsc_khz values that
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 70e81f008030..94d4f07aaaa0 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -483,6 +483,8 @@ extern struct kvm_host_values kvm_host;
 extern bool enable_pmu;
 extern bool enable_mediated_pmu;
 
+void kvm_setup_xss_caps(void);
+
 /*
  * Get a filtered version of KVM's supported XCR0 that strips out dynamic
  * features for which the current process doesn't (yet) have permission to use.

base-commit: e81f7c908e1664233974b9f20beead78cde6343a
--
* [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps
2026-01-23 22:15 [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening Sean Christopherson
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
@ 2026-01-23 22:15 ` Sean Christopherson
2026-01-27 7:47 ` Chao Gao
2026-01-23 22:15 ` [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch Sean Christopherson
2 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2026-01-23 22:15 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson

Add a flag to track when KVM is actively configuring its CPU caps, and
WARN if a cap is set or cleared if KVM isn't in its configuration stage.
Modifying CPU caps after {svm,vmx}_set_cpu_caps() can be fatal to KVM, as
vendor setup code expects the CPU caps to be frozen at that point, e.g.
will do additional configuration based on the caps.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 8 ++++++++
 arch/x86/kvm/cpuid.h | 4 ++++
 2 files changed, 12 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 267e59b405c1..2f01511135c2 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -36,6 +36,9 @@
 u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_cpu_caps);
 
+bool kvm_is_configuring_cpu_caps __read_mostly;
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_is_configuring_cpu_caps);
+
 struct cpuid_xstate_sizes {
 	u32 eax;
 	u32 ebx;
@@ -830,6 +833,9 @@ void kvm_initialize_cpu_caps(void)
 {
 	memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
 
+	WARN_ON_ONCE(kvm_is_configuring_cpu_caps);
+	kvm_is_configuring_cpu_caps = true;
+
 	BUILD_BUG_ON(sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)) >
 		     sizeof(boot_cpu_data.x86_capability));
 
@@ -1305,6 +1311,8 @@ void kvm_finalize_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_IBT);
 		kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
 	}
+
+	kvm_is_configuring_cpu_caps = false;
 }
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_finalize_cpu_caps);
 
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 3b0b4b1adb97..07175dff24d6 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -8,6 +8,8 @@
 #include <uapi/asm/kvm_para.h>
 
 extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
+extern bool kvm_is_configuring_cpu_caps __read_mostly;
+
 void kvm_initialize_cpu_caps(void);
 void kvm_finalize_cpu_caps(void);
 
@@ -189,6 +191,7 @@ static __always_inline void kvm_cpu_cap_clear(unsigned int x86_feature)
 {
 	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
+	WARN_ON_ONCE(!kvm_is_configuring_cpu_caps);
 	kvm_cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
 }
 
@@ -196,6 +199,7 @@ static __always_inline void kvm_cpu_cap_set(unsigned int x86_feature)
 {
 	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
+	WARN_ON_ONCE(!kvm_is_configuring_cpu_caps);
 	kvm_cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
 }
 
--
2.52.0.457.g6b5491de43-goog
* Re: [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps
2026-01-23 22:15 ` [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps Sean Christopherson
@ 2026-01-27 7:47 ` Chao Gao
0 siblings, 0 replies; 11+ messages in thread
From: Chao Gao @ 2026-01-27 7:47 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Binbin Wu, Xiaoyao Li, Jim Mattson

On Fri, Jan 23, 2026 at 02:15:41PM -0800, Sean Christopherson wrote:
>Add a flag to track when KVM is actively configuring its CPU caps, and
>WARN if a cap is set or cleared if KVM isn't in its configuration stage.
>Modifying CPU caps after {svm,vmx}_set_cpu_caps() can be fatal to KVM, as
>vendor setup code expects the CPU caps to be frozen at that point, e.g.
>will do additional configuration based on the caps.
>
>Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Chao Gao <chao.gao@intel.com>
* [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
2026-01-23 22:15 [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening Sean Christopherson
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
2026-01-23 22:15 ` [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps Sean Christopherson
@ 2026-01-23 22:15 ` Sean Christopherson
2026-01-26 14:57 ` Sean Christopherson
2 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2026-01-23 22:15 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson

When kvm-intel.ko refuses to load due to a mismatched VMCS config, print
all mismatching offsets+values to make it easier to debug goofs during
development, and to make it at least feasible to triage failures that
occur during production. E.g. if a physical core is flaky or is running
with the "wrong" microcode patch loaded, then a CPU can get a legitimate
mismatch even without KVM bugs.

Print the mismatches as 32-bit values as a compromise between hand coding
every field (to provide precise information) and printing individual bytes
(requires more effort to deduce the mismatch bit(s)). All fields in the
VMCS config are either 32-bit or 64-bit values, i.e. in many cases,
printing 32-bit values will be 100% precise, and in the others it's close
enough, especially when considering that MSR values are split into EDX:EAX
anyways.

E.g. on mismatch CET entry/exit controls, KVM will print:

  kvm_intel: VMCS config on CPU 0 doesn't match reference config:
    Offset 76 REF = 0x107fffff, CPU0 = 0x007fffff, mismatch = 0x10000000
    Offset 84 REF = 0x0010f3ff, CPU0 = 0x0000f3ff, mismatch = 0x00100000

Opportunistically tweak the wording on the initial error message to say
"mismatch" instead of "inconsistent", as the VMCS config itself isn't
inconsistent, and the wording conflates the cross-CPU compatibility check
with the error_on_inconsistent_vmcs_config knob that treats inconsistent
VMCS configurations as errors (e.g. if a CPU supports CET entry controls
but no CET exit controls).

Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7d373e32ea9c..700a8c47b4ca 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2962,8 +2962,22 @@ int vmx_check_processor_compat(void)
 	}
 	if (nested)
 		nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
+
 	if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config))) {
-		pr_err("Inconsistent VMCS config on CPU %d\n", cpu);
+		u32 *gold = (void *)&vmcs_config;
+		u32 *mine = (void *)&vmcs_conf;
+		int i;
+
+		BUILD_BUG_ON(sizeof(struct vmcs_config) % sizeof(u32));
+
+		pr_err("VMCS config on CPU %d doesn't match reference config:\n", cpu);
+		for (i = 0; i < sizeof(struct vmcs_config) / sizeof(u32); i++) {
+			if (gold[i] == mine[i])
+				continue;
+
+			pr_cont("  Offset %lu REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
+				i * sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
+		}
 		return -EIO;
 	}
 	return 0;
--
2.52.0.457.g6b5491de43-goog
* Re: [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
2026-01-23 22:15 ` [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch Sean Christopherson
@ 2026-01-26 14:57 ` Sean Christopherson
2026-01-27 7:53 ` Chao Gao
0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2026-01-26 14:57 UTC (permalink / raw)
To: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson

On Fri, Jan 23, 2026, Sean Christopherson wrote:
> +			pr_cont("  Offset %lu REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
> +				i * sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);

As pointed out by the kernel bot, sizeof() isn't an unsigned long on 32-bit.
Simplest fix is to force it to an int.

			pr_cont("  Offset %u REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
				i * (int)sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
* Re: [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
2026-01-26 14:57 ` Sean Christopherson
@ 2026-01-27 7:53 ` Chao Gao
2026-01-27 18:59 ` Sean Christopherson
0 siblings, 1 reply; 11+ messages in thread
From: Chao Gao @ 2026-01-27 7:53 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Binbin Wu, Xiaoyao Li, Jim Mattson

On Mon, Jan 26, 2026 at 06:57:26AM -0800, Sean Christopherson wrote:
>On Fri, Jan 23, 2026, Sean Christopherson wrote:
>> +			pr_cont("  Offset %lu REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
>> +				i * sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
>
>As pointed out by the kernel bot, sizeof() isn't an unsigned long on 32-bit.
>Simplest fix is to force it to an int.
>
>			pr_cont("  Offset %u REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
>				i * (int)sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);

Why pr_cont()? The previous line ends with '\n'. so, a plain pr_err() should
work.

Anyway, the code looks good.

Reviewed-by: Chao Gao <chao.gao@intel.com>
* Re: [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
2026-01-27 7:53 ` Chao Gao
@ 2026-01-27 18:59 ` Sean Christopherson
0 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-01-27 18:59 UTC (permalink / raw)
To: Chao Gao
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Binbin Wu, Xiaoyao Li, Jim Mattson

On Tue, Jan 27, 2026, Chao Gao wrote:
> On Mon, Jan 26, 2026 at 06:57:26AM -0800, Sean Christopherson wrote:
> >On Fri, Jan 23, 2026, Sean Christopherson wrote:
> >> +			pr_cont("  Offset %lu REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
> >> +				i * sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
> >
> >As pointed out by the kernel bot, sizeof() isn't an unsigned long on 32-bit.
> >Simplest fix is to force it to an int.
> >
> >			pr_cont("  Offset %u REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
> >				i * (int)sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
>
> Why pr_cont()? The previous line ends with '\n'. so, a plain pr_err() should
> work.

To avoid the "kvm_intel:" formatting. E.g. with pr_cont():

[    5.355958] kvm_intel: VMCS config on CPU 0 doesn't match reference config:
[    5.355986]   Offset 76 REF = 0x107fffff, CPU0 = 0x007fffff, mismatch = 0x10000000
[    5.356019]   Offset 84 REF = 0x0010f3ff, CPU0 = 0x0000f3ff, mismatch = 0x00100000
[    5.356048] kvm: enabling virtualization on CPU0 failed

versus with pr_err():

[    6.527945] kvm_intel: VMCS config on CPU 0 doesn't match reference config:
[    6.527979] kvm_intel:   Offset 76 REF = 0x107fffff, CPU0 = 0x007fffff, mismatch = 0x10000000
[    6.528013] kvm_intel:   Offset 84 REF = 0x0010f3ff, CPU0 = 0x0000f3ff, mismatch = 0x00100000
[    6.528048] kvm: enabling virtualization on CPU0 failed

Ugh, but my use of pr_cont() isn't right, because the '\n' resets to
KERN_DEFAULT, i.e. not captured in the above is that the continuations are
printed at "warn", not "err" as intended.

Ah, and fixing that by shoving the newline into pr_cont():

		pr_err("VMCS config on CPU %d doesn't match reference config:", cpu);
		for (i = 0; i < sizeof(struct vmcs_config) / sizeof(u32); i++) {
			if (gold[i] == mine[i])
				continue;

			pr_cont("\n  Offset %u REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x",
				i * (int)sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
		}
		pr_cont("\n");

avoids generating new timestamps too, which is even more desirable.

[    5.239320] kvm_intel: VMCS config on CPU 0 doesn't match reference config:
  Offset 76 REF = 0x107fffff, CPU0 = 0x007fffff, mismatch = 0x10000000
  Offset 84 REF = 0x0010f3ff, CPU0 = 0x0000f3ff, mismatch = 0x00100000
[    5.239397] kvm: enabling virtualization on CPU0 failed

Unless someone strongly prefers re-printing the timestamp+kvm-intel, I'll go
with the above approach for v2.

Thanks for the reviews!
end of thread, other threads:[~2026-01-27 18:59 UTC | newest]
Thread overview: 11+ messages
2026-01-23 22:15 [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening Sean Christopherson
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
2026-01-27 7:42 ` Chao Gao
2026-01-27 15:12 ` Xiaoyao Li
2026-01-27 16:19 ` Sean Christopherson
2026-01-23 22:15 ` [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps Sean Christopherson
2026-01-27 7:47 ` Chao Gao
2026-01-23 22:15 ` [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch Sean Christopherson
2026-01-26 14:57 ` Sean Christopherson
2026-01-27 7:53 ` Chao Gao
2026-01-27 18:59 ` Sean Christopherson