* [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening
@ 2026-01-23 22:15 Sean Christopherson
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-01-23 22:15 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson
Fix a bug where KVM will clear IBT and SHSTK bits after nested VMX MSRs
have been configured, e.g. if the kernel is built with CONFIG_X86_CET=y
but CONFIG_X86_KERNEL_IBT=n. The late clearing results in kvm-intel.ko
refusing to load as the CPU compatibility checks generate their VMCS configs
with IBT=n and SHSTK=n, ultimately causing a mismatch on the CET entry
and exit controls.
Patch 2 hardens against similar bugs in the future by adding a flag and
WARNs to yell if KVM sets or clears feature flags outside of the dedicated
flow.
Patch 3 adds (very, very) long overdue printing of the mismatching offsets
in the VMCS configs.
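For readers who want the ordering problem in miniature, below is a rough
userspace model, not KVM code. Every name in it (cap_cet, build_nested_ctls(),
load_module(), DEMO_CTL_LOAD_CET_STATE) is made up for illustration, and the
control values are borrowed from the sample output in patch 3. It only shows
why clearing a cap between building the reference config and building the
per-CPU config produces a mismatch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a CET entry/exit control bit; not a KVM define. */
#define DEMO_CTL_LOAD_CET_STATE (UINT32_C(1) << 28)

/* Models "SHSTK/IBT are set in kvm_cpu_caps". */
static bool cap_cet;

/* Models nested control setup: the CET control is kept only if the cap is set. */
static uint32_t build_nested_ctls(uint32_t hw_ctls)
{
	return cap_cet ? hw_ctls : (hw_ctls & ~DEMO_CTL_LOAD_CET_STATE);
}

/* Returns true if the per-CPU config matches the reference config. */
static bool load_module(bool clear_cap_late)
{
	const uint32_t hw_ctls = UINT32_C(0x107fffff);
	uint32_t reference, per_cpu;

	cap_cet = true;			/* vendor code enables the cap */
	if (!clear_cap_late)
		cap_cet = false;	/* fixed flow: finalize before any config is built */

	reference = build_nested_ctls(hw_ctls);	/* built during hardware setup */

	if (clear_cap_late)
		cap_cet = false;	/* buggy flow: common code clears the cap afterwards */

	per_cpu = build_nested_ctls(hw_ctls);	/* built during the compat check */

	return reference == per_cpu;
}
```

In this model load_module(true) reports a mismatch and load_module(false)
does not, which is the whole point of moving the clearing into the vendor
setup flow.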
Sean Christopherson (3):
KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps
KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
arch/x86/kvm/cpuid.c | 29 +++++++++++++++++++++++++++--
arch/x86/kvm/cpuid.h | 7 ++++++-
arch/x86/kvm/svm/svm.c | 4 +++-
arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++--
arch/x86/kvm/x86.c | 14 --------------
arch/x86/kvm/x86.h | 2 ++
6 files changed, 56 insertions(+), 20 deletions(-)
base-commit: e81f7c908e1664233974b9f20beead78cde6343a
--
2.52.0.457.g6b5491de43-goog
* [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
2026-01-23 22:15 [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening Sean Christopherson
@ 2026-01-23 22:15 ` Sean Christopherson
2026-01-27 7:42 ` Chao Gao
2026-01-27 15:12 ` Xiaoyao Li
2026-01-23 22:15 ` [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps Sean Christopherson
2026-01-23 22:15 ` [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch Sean Christopherson
2 siblings, 2 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-01-23 22:15 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson
Explicitly finalize kvm_cpu_caps as part of each vendor's setup flow to
fix a bug where clearing SHSTK and IBT due to lack of CET XFEATURE support
makes kvm-intel.ko unloadable when nested=1. The late clearing results in
nested_vmx_setup_{entry,exit}_ctls() clearing VM_{ENTRY,EXIT}_LOAD_CET_STATE
when nested_vmx_setup_ctls_msrs() runs during the CPU compatibility checks,
ultimately leading to a mismatched VMCS config due to the reference config
having the CET bits set, but every CPU's "local" config having the bits
cleared.
Note, kvm_caps.supported_{xcr0,xss} are unconditionally initialized by
kvm_x86_vendor_init(), before calling into vendor code, and not referenced
between ops->hardware_setup() and their current/old location.
Fixes: 69cc3e886582 ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER")
Cc: stable@vger.kernel.org
Cc: Mathias Krause <minipli@grsecurity.net>
Cc: John Allen <john.allen@amd.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Binbin Wu <binbin.wu@linux.intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/cpuid.c | 21 +++++++++++++++++++--
arch/x86/kvm/cpuid.h | 3 ++-
arch/x86/kvm/svm/svm.c | 4 +++-
arch/x86/kvm/vmx/vmx.c | 4 +++-
arch/x86/kvm/x86.c | 14 --------------
arch/x86/kvm/x86.h | 2 ++
6 files changed, 29 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 575244af9c9f..267e59b405c1 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -826,7 +826,7 @@ do { \
/* DS is defined by ptrace-abi.h on 32-bit builds. */
#undef DS
-void kvm_set_cpu_caps(void)
+void kvm_initialize_cpu_caps(void)
{
memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
@@ -1289,7 +1289,24 @@ void kvm_set_cpu_caps(void)
kvm_cpu_cap_clear(X86_FEATURE_RDPID);
}
}
-EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_set_cpu_caps);
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_initialize_cpu_caps);
+
+void kvm_finalize_cpu_caps(void)
+{
+ if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
+ kvm_caps.supported_xss = 0;
+
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+ !kvm_cpu_cap_has(X86_FEATURE_IBT))
+ kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
+
+ if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
+ kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+ kvm_cpu_cap_clear(X86_FEATURE_IBT);
+ kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
+ }
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_finalize_cpu_caps);
#undef F
#undef SCATTERED_F
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index d3f5ae15a7ca..3b0b4b1adb97 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -8,7 +8,8 @@
#include <uapi/asm/kvm_para.h>
extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
-void kvm_set_cpu_caps(void);
+void kvm_initialize_cpu_caps(void);
+void kvm_finalize_cpu_caps(void);
void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu);
struct kvm_cpuid_entry2 *kvm_find_cpuid_entry2(struct kvm_cpuid_entry2 *entries,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7803d2781144..0c23fcaedcc5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5305,7 +5305,7 @@ static __init void svm_adjust_mmio_mask(void)
static __init void svm_set_cpu_caps(void)
{
- kvm_set_cpu_caps();
+ kvm_initialize_cpu_caps();
kvm_caps.supported_perf_cap = 0;
@@ -5387,6 +5387,8 @@ static __init void svm_set_cpu_caps(void)
*/
kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
kvm_cpu_cap_clear(X86_FEATURE_MSR_IMM);
+
+ kvm_finalize_cpu_caps();
}
static __init int svm_hardware_setup(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 27acafd03381..7d373e32ea9c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8173,7 +8173,7 @@ static __init u64 vmx_get_perf_capabilities(void)
static __init void vmx_set_cpu_caps(void)
{
- kvm_set_cpu_caps();
+ kvm_initialize_cpu_caps();
/* CPUID 0x1 */
if (nested)
@@ -8230,6 +8230,8 @@ static __init void vmx_set_cpu_caps(void)
kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
kvm_cpu_cap_clear(X86_FEATURE_IBT);
}
+
+ kvm_finalize_cpu_caps();
}
static bool vmx_is_io_intercepted(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8acfdfc583a1..36385e6aebfa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -220,7 +220,6 @@ static DEFINE_PER_CPU(struct kvm_user_return_msrs, user_return_msrs);
| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
| XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE)
-#define XFEATURE_MASK_CET_ALL (XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)
/*
* Note, KVM supports exposing PT to the guest, but does not support context
* switching PT via XSTATE (KVM's PT virtualization relies on perf; swapping
@@ -10138,19 +10137,6 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
if (!tdp_enabled)
kvm_caps.supported_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT;
- if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
- kvm_caps.supported_xss = 0;
-
- if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
- !kvm_cpu_cap_has(X86_FEATURE_IBT))
- kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
-
- if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
- kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
- kvm_cpu_cap_clear(X86_FEATURE_IBT);
- kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
- }
-
if (kvm_caps.has_tsc_control) {
/*
* Make sure the user can only configure tsc_khz values that
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 70e81f008030..9edfac5d5ffb 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -483,6 +483,8 @@ extern struct kvm_host_values kvm_host;
extern bool enable_pmu;
extern bool enable_mediated_pmu;
+#define XFEATURE_MASK_CET_ALL (XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)
+
/*
* Get a filtered version of KVM's supported XCR0 that strips out dynamic
* features for which the current process doesn't (yet) have permission to use.
--
2.52.0.457.g6b5491de43-goog
* [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps
2026-01-23 22:15 [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening Sean Christopherson
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
@ 2026-01-23 22:15 ` Sean Christopherson
2026-01-27 7:47 ` Chao Gao
2026-01-23 22:15 ` [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch Sean Christopherson
2 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2026-01-23 22:15 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson
Add a flag to track when KVM is actively configuring its CPU caps, and
WARN if a cap is set or cleared if KVM isn't in its configuration stage.
Modifying CPU caps after {svm,vmx}_set_cpu_caps() can be fatal to KVM, as
vendor setup code expects the CPU caps to be frozen at that point, e.g.
will do additional configuration based on the caps.
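A minimal userspace sketch of the guard pattern described above, under the
assumption that a simple counter can stand in for WARN_ON_ONCE(). The names
(demo_caps, demo_configuring, demo_nr_warnings and the demo_* helpers) are
hypothetical and exist only for this illustration. As in the patch, a
modification outside the window is flagged but still applied.

```c
#include <stdbool.h>
#include <stdint.h>

static uint32_t demo_caps;		/* models one word of kvm_cpu_caps */
static bool demo_configuring;		/* models kvm_is_configuring_cpu_caps */
static unsigned int demo_nr_warnings;	/* stand-in for WARN_ON_ONCE() firing */

static void demo_caps_initialize(void)
{
	demo_caps = 0;
	if (demo_configuring)		/* nested/duplicate initialization is a bug */
		demo_nr_warnings++;
	demo_configuring = true;
}

static void demo_caps_finalize(void)
{
	demo_configuring = false;	/* caps are considered frozen from here on */
}

static void demo_cap_set(unsigned int bit)
{
	if (!demo_configuring)
		demo_nr_warnings++;
	demo_caps |= UINT32_C(1) << bit;
}

static void demo_cap_clear(unsigned int bit)
{
	if (!demo_configuring)
		demo_nr_warnings++;
	demo_caps &= ~(UINT32_C(1) << bit);
}
```

Setting a bit between demo_caps_initialize() and demo_caps_finalize() is
silent; clearing it after demo_caps_finalize() bumps the warning counter.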
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/cpuid.c | 8 ++++++++
arch/x86/kvm/cpuid.h | 4 ++++
2 files changed, 12 insertions(+)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 267e59b405c1..2f01511135c2 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -36,6 +36,9 @@
u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_cpu_caps);
+bool kvm_is_configuring_cpu_caps __read_mostly;
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_is_configuring_cpu_caps);
+
struct cpuid_xstate_sizes {
u32 eax;
u32 ebx;
@@ -830,6 +833,9 @@ void kvm_initialize_cpu_caps(void)
{
memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
+ WARN_ON_ONCE(kvm_is_configuring_cpu_caps);
+ kvm_is_configuring_cpu_caps = true;
+
BUILD_BUG_ON(sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)) >
sizeof(boot_cpu_data.x86_capability));
@@ -1305,6 +1311,8 @@ void kvm_finalize_cpu_caps(void)
kvm_cpu_cap_clear(X86_FEATURE_IBT);
kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
}
+
+ kvm_is_configuring_cpu_caps = false;
}
EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_finalize_cpu_caps);
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 3b0b4b1adb97..07175dff24d6 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -8,6 +8,8 @@
#include <uapi/asm/kvm_para.h>
extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
+extern bool kvm_is_configuring_cpu_caps __read_mostly;
+
void kvm_initialize_cpu_caps(void);
void kvm_finalize_cpu_caps(void);
@@ -189,6 +191,7 @@ static __always_inline void kvm_cpu_cap_clear(unsigned int x86_feature)
{
unsigned int x86_leaf = __feature_leaf(x86_feature);
+ WARN_ON_ONCE(!kvm_is_configuring_cpu_caps);
kvm_cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
}
@@ -196,6 +199,7 @@ static __always_inline void kvm_cpu_cap_set(unsigned int x86_feature)
{
unsigned int x86_leaf = __feature_leaf(x86_feature);
+ WARN_ON_ONCE(!kvm_is_configuring_cpu_caps);
kvm_cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
}
--
2.52.0.457.g6b5491de43-goog
* [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
2026-01-23 22:15 [PATCH 0/3] KVM: x86: CET vs. nVMX fix and hardening Sean Christopherson
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
2026-01-23 22:15 ` [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps Sean Christopherson
@ 2026-01-23 22:15 ` Sean Christopherson
2026-01-26 14:57 ` Sean Christopherson
2 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2026-01-23 22:15 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson
When kvm-intel.ko refuses to load due to a mismatched VMCS config, print
all mismatching offsets+values to make it easier to debug goofs during
development, and to make it at least feasible to triage failures that
occur during production. E.g. if a physical core is flaky or is running
with the "wrong" microcode patch loaded, then a CPU can get a legitimate
mismatch even without KVM bugs.
Print the mismatches as 32-bit values as a compromise between hand coding
every field (to provide precise information) and printing individual bytes
(requires more effort to deduce the mismatch bit(s)). All fields in the
VMCS config are either 32-bit or 64-bit values, i.e. in many cases,
printing 32-bit values will be 100% precise, and in the others it's close
enough, especially when considering that MSR values are split into EDX:EAX
anyways.
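To illustrate the 32-bit granularity trade-off, here is a small userspace
sketch of the same word-by-word comparison. struct demo_config and
diff_words() are invented for this example and are not struct vmcs_config or
any KVM helper; printf() stands in for the kernel's printing.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical config with 32-bit and 64-bit fields only. */
struct demo_config {
	uint32_t a;
	uint32_t b;
	uint64_t msr;
};

/* The word-wise walk below requires the size to be a multiple of 4 bytes. */
_Static_assert(sizeof(struct demo_config) % sizeof(uint32_t) == 0,
	       "config size must be a multiple of 32 bits");

/* Print every differing 32-bit word and return how many words differed. */
static size_t diff_words(const void *ref, const void *cur, size_t size, int cpu)
{
	size_t i, nr_diffs = 0;

	for (i = 0; i < size / sizeof(uint32_t); i++) {
		uint32_t r, c;

		/* memcpy() avoids aliasing problems when reading raw words. */
		memcpy(&r, (const char *)ref + i * sizeof(r), sizeof(r));
		memcpy(&c, (const char *)cur + i * sizeof(c), sizeof(c));
		if (r == c)
			continue;

		printf("  Offset %zu REF = 0x%08x, CPU%d = 0x%08x, mismatch = 0x%08x\n",
		       i * sizeof(uint32_t), (unsigned int)r, cpu,
		       (unsigned int)c, (unsigned int)(r ^ c));
		nr_diffs++;
	}
	return nr_diffs;
}
```

A difference in a 32-bit field is reported exactly; a difference confined to
one half of a 64-bit field shows up as a single word, so the reader has to
map the offset back to the containing field by hand.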
E.g. on mismatched CET entry/exit controls, KVM will print:
kvm_intel: VMCS config on CPU 0 doesn't match reference config:
Offset 76 REF = 0x107fffff, CPU0 = 0x007fffff, mismatch = 0x10000000
Offset 84 REF = 0x0010f3ff, CPU0 = 0x0000f3ff, mismatch = 0x00100000
Opportunistically tweak the wording on the initial error message to say
"mismatch" instead of "inconsistent", as the VMCS config itself isn't
inconsistent, and the wording conflates the cross-CPU compatibility check
with the error_on_inconsistent_vmcs_config knob that treats inconsistent
VMCS configurations as errors (e.g. if a CPU supports CET entry controls
but no CET exit controls).
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/vmx.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7d373e32ea9c..700a8c47b4ca 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2962,8 +2962,22 @@ int vmx_check_processor_compat(void)
}
if (nested)
nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
+
if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config))) {
- pr_err("Inconsistent VMCS config on CPU %d\n", cpu);
+ u32 *gold = (void *)&vmcs_config;
+ u32 *mine = (void *)&vmcs_conf;
+ int i;
+
+ BUILD_BUG_ON(sizeof(struct vmcs_config) % sizeof(u32));
+
+ pr_err("VMCS config on CPU %d doesn't match reference config:\n", cpu);
+ for (i = 0; i < sizeof(struct vmcs_config) / sizeof(u32); i++) {
+ if (gold[i] == mine[i])
+ continue;
+
+ pr_cont(" Offset %lu REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
+ i * sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
+ }
return -EIO;
}
return 0;
--
2.52.0.457.g6b5491de43-goog
* Re: [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
2026-01-23 22:15 ` [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch Sean Christopherson
@ 2026-01-26 14:57 ` Sean Christopherson
2026-01-27 7:53 ` Chao Gao
0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2026-01-26 14:57 UTC (permalink / raw)
To: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Chao Gao, Binbin Wu, Xiaoyao Li, Jim Mattson
On Fri, Jan 23, 2026, Sean Christopherson wrote:
> + pr_cont(" Offset %lu REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
> + i * sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
As pointed out by the kernel bot, sizeof() isn't an unsigned long on 32-bit.
Simplest fix is to force it to an int.
pr_cont(" Offset %u REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
i * (int)sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
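For context, sizeof() yields a size_t, which is typically unsigned int on
32-bit x86 and unsigned long on 64-bit, so "%lu" only matches on the latter.
The userspace sketch below shows the cast approach next to the "%zu"
specifier, which exists precisely for size_t. Both helper names are made up
for the illustration, and the sketch uses an explicit unsigned cast so that
the argument type matches "%u" exactly.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Cast variant: force the product to a fixed-width type that matches %u. */
static int demo_format_offset_cast(char *buf, size_t len, int i)
{
	return snprintf(buf, len, "Offset %u",
			(unsigned int)(i * (int)sizeof(uint32_t)));
}

/* %zu variant: keep the size_t and use the specifier meant for it. */
static int demo_format_offset_zu(char *buf, size_t len, int i)
{
	return snprintf(buf, len, "Offset %zu", (size_t)i * sizeof(uint32_t));
}
```

Both helpers produce the same text for small offsets; which one reads better
in the real code is a matter of taste.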
* Re: [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
@ 2026-01-27 7:42 ` Chao Gao
2026-01-27 15:12 ` Xiaoyao Li
1 sibling, 0 replies; 11+ messages in thread
From: Chao Gao @ 2026-01-27 7:42 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Binbin Wu, Xiaoyao Li, Jim Mattson
On Fri, Jan 23, 2026 at 02:15:40PM -0800, Sean Christopherson wrote:
>Explicitly finalize kvm_cpu_caps as part of each vendor's setup flow to
>fix a bug where clearing SHSTK and IBT due to lack of CET XFEATURE support
>makes kvm-intel.ko unloadable when nested=1. The late clearing results in
>nested_vmx_setup_{entry,exit}_ctls() clearing VM_{ENTRY,EXIT}_LOAD_CET_STATE
>when nested_vmx_setup_ctls_msrs() runs during the CPU compatibility checks,
>ultimately leading to a mismatched VMCS config due to the reference config
>having the CET bits set, but every CPU's "local" config having the bits
>cleared.
>
>Note, kvm_caps.supported_{xcr0,xss} are unconditionally initialized by
>kvm_x86_vendor_init(), before calling into vendor code, and not referenced
>between ops->hardware_setup() and their current/old location.
>
>Fixes: 69cc3e886582 ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER")
>Cc: stable@vger.kernel.org
>Cc: Mathias Krause <minipli@grsecurity.net>
>Cc: John Allen <john.allen@amd.com>
>Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
>Cc: Chao Gao <chao.gao@intel.com>
>Cc: Binbin Wu <binbin.wu@linux.intel.com>
>Cc: Xiaoyao Li <xiaoyao.li@intel.com>
>Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
* Re: [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps
2026-01-23 22:15 ` [PATCH 2/3] KVM: x86: Harden against unexpected adjustments to kvm_cpu_caps Sean Christopherson
@ 2026-01-27 7:47 ` Chao Gao
0 siblings, 0 replies; 11+ messages in thread
From: Chao Gao @ 2026-01-27 7:47 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Binbin Wu, Xiaoyao Li, Jim Mattson
On Fri, Jan 23, 2026 at 02:15:41PM -0800, Sean Christopherson wrote:
>Add a flag to track when KVM is actively configuring its CPU caps, and
>WARN if a cap is set or cleared if KVM isn't in its configuration stage.
>Modifying CPU caps after {svm,vmx}_set_cpu_caps() can be fatal to KVM, as
>vendor setup code expects the CPU caps to be frozen at that point, e.g.
>will do additional configuration based on the caps.
>
>Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
* Re: [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
2026-01-26 14:57 ` Sean Christopherson
@ 2026-01-27 7:53 ` Chao Gao
2026-01-27 18:59 ` Sean Christopherson
0 siblings, 1 reply; 11+ messages in thread
From: Chao Gao @ 2026-01-27 7:53 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Binbin Wu, Xiaoyao Li, Jim Mattson
On Mon, Jan 26, 2026 at 06:57:26AM -0800, Sean Christopherson wrote:
>On Fri, Jan 23, 2026, Sean Christopherson wrote:
>> + pr_cont(" Offset %lu REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
>> + i * sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
>
>As pointed out by the kernel bot, sizeof() isn't an unsigned long on 32-bit.
>Simplest fix is to force it to an int.
>
> pr_cont(" Offset %u REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
> i * (int)sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
Why pr_cont()? The previous line ends with '\n', so a plain pr_err() should work.
Anyway, the code looks good.
Reviewed-by: Chao Gao <chao.gao@intel.com>
* Re: [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
2026-01-23 22:15 ` [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps() Sean Christopherson
2026-01-27 7:42 ` Chao Gao
@ 2026-01-27 15:12 ` Xiaoyao Li
2026-01-27 16:19 ` Sean Christopherson
1 sibling, 1 reply; 11+ messages in thread
From: Xiaoyao Li @ 2026-01-27 15:12 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Mathias Krause, John Allen, Rick Edgecombe,
Chao Gao, Binbin Wu, Jim Mattson
On 1/24/2026 6:15 AM, Sean Christopherson wrote:
...
> +void kvm_finalize_cpu_caps(void)
It also finalizes the kvm_caps, at least kvm_caps.supported_xss, which
seems inconsistent with the name.
Even more, just looking at the function body, the name
"kvm_finalize_supported_xss" seems to fit better, with clearing SHSTK
and IBT being just a side effect of the finalized kvm_caps.supported_xss.
> +{
> + if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
> + kvm_caps.supported_xss = 0;
> +
> + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
> + !kvm_cpu_cap_has(X86_FEATURE_IBT))
> + kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
> +
> + if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
> + kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
> + kvm_cpu_cap_clear(X86_FEATURE_IBT);
> + kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
> + }
> +}
* Re: [PATCH 1/3] KVM: x86: Finalize kvm_cpu_caps setup from {svm,vmx}_set_cpu_caps()
2026-01-27 15:12 ` Xiaoyao Li
@ 2026-01-27 16:19 ` Sean Christopherson
0 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-01-27 16:19 UTC (permalink / raw)
To: Xiaoyao Li
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Chao Gao, Binbin Wu, Jim Mattson
On Tue, Jan 27, 2026, Xiaoyao Li wrote:
> On 1/24/2026 6:15 AM, Sean Christopherson wrote:
> ...
> > +void kvm_finalize_cpu_caps(void)
>
> It also finalizes the kvm_caps,
No, it just happens to update supported_xss as well.
> at least kvm_caps.supported_xss, which seems inconsistent with the name.
I agree, but I don't see a clearly better option. E.g. kvm_finalize_cpu_caps()
could be pedantic and only touch cpu_caps:
if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES) ||
(kvm_host.xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
kvm_cpu_cap_clear(X86_FEATURE_IBT);
}
but then we have duplicate logic, and the connection between supported_xss and
SHSTK/IBT is lost.
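To make the coupling between supported_xss and SHSTK/IBT concrete, here is a
userspace model of the three-step logic. struct demo_caps,
demo_setup_xss_caps() and the DEMO_* masks are invented for the sketch; the
bit positions 11 and 12 are an assumption meant to mirror
XFEATURE_MASK_CET_USER and XFEATURE_MASK_CET_KERNEL.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions for the CET user/kernel XSTATE components. */
#define DEMO_XSS_CET_USER	(UINT64_C(1) << 11)
#define DEMO_XSS_CET_KERNEL	(UINT64_C(1) << 12)
#define DEMO_XSS_CET_ALL	(DEMO_XSS_CET_USER | DEMO_XSS_CET_KERNEL)

struct demo_caps {
	bool xsaves;
	bool shstk;
	bool ibt;
	uint64_t supported_xss;
};

static void demo_setup_xss_caps(struct demo_caps *c)
{
	/* Without XSAVES there is no way to context switch any XSS state. */
	if (!c->xsaves)
		c->supported_xss = 0;

	/* No CET feature at all means the CET components are pointless. */
	if (!c->shstk && !c->ibt)
		c->supported_xss &= ~DEMO_XSS_CET_ALL;

	/* CET is all-or-nothing: a partial component set disables both features. */
	if ((c->supported_xss & DEMO_XSS_CET_ALL) != DEMO_XSS_CET_ALL) {
		c->shstk = false;
		c->ibt = false;
		c->supported_xss &= ~DEMO_XSS_CET_ALL;
	}
}
```

The dependency runs in both directions, which is why splitting the cap
clearing away from the supported_xss update would duplicate the checks.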
The only viable alternative I can think of would be to provide a dedicated
kvm_set_xss_caps() and then do:
kvm_set_xss_caps();
kvm_finalize_cpu_caps();
where kvm_finalize_cpu_caps() just clears kvm_is_configuring_cpu_caps. Or I
suppose it could be:
kvm_set_xss_caps();
kvm_is_configuring_cpu_caps = false;
though I think I'd prefer to keep kvm_finalize_cpu_caps() and make it an inline.
Hmm, the more I look at that option, the more I like it? It's kinda silly,
especially if we end up with a whole pile of helpers, e.g.
kvm_set_xss_caps();
kvm_set_blah_caps();
kvm_set_loblaw_caps();
kvm_finalize_cpu_caps();
But at least for now, I definitely don't hate it.
> Even more, just looking at the function body, the name
> "kvm_finalize_supported_xss" seems to fit better, with clearing SHSTK and
> IBT being just a side effect of the finalized kvm_caps.supported_xss.
No, I definitely want kvm_finalize_cpu_caps() somewhere, so that we end up with
kvm_initialize_cpu_caps() + kvm_finalize_cpu_caps(). The function happens to
only modify CET caps and thus only touches supported_xss as a side effect, but
the intent is very much that it will serve as the one and only place where KVM
makes "final" adjustments that are common to VMX and SVM.
But as above, I'm not opposed to having both. And it does provide a leaner diff
for the stable@ fix (though that's largely irrelevant since only 6.18 needs the
fix).
So this for patch 1 (not yet tested)?
From: Sean Christopherson <seanjc@google.com>
Date: Tue, 27 Jan 2026 08:14:27 -0800
Subject: [PATCH] KVM: x86: Configure supported XSS from {svm,vmx}_set_cpu_caps()
Explicitly configure KVM's supported XSS as part of each vendor's setup
flow to fix a bug where clearing SHSTK and IBT in kvm_cpu_caps, e.g. due
to lack of CET XFEATURE support, makes kvm-intel.ko unloadable when nested
VMX is enabled, i.e. when nested=1. The late clearing results in
nested_vmx_setup_{entry,exit}_ctls() clearing VM_{ENTRY,EXIT}_LOAD_CET_STATE
when nested_vmx_setup_ctls_msrs() runs during the CPU compatibility checks,
ultimately leading to a mismatched VMCS config due to the reference config
having the CET bits set, but every CPU's "local" config having the bits
cleared.
Note, kvm_caps.supported_{xcr0,xss} are unconditionally initialized by
kvm_x86_vendor_init(), before calling into vendor code, and not referenced
between ops->hardware_setup() and their current/old location.
Fixes: 69cc3e886582 ("KVM: x86: Add XSS support for CET_KERNEL and CET_USER")
Cc: stable@vger.kernel.org
Cc: Mathias Krause <minipli@grsecurity.net>
Cc: John Allen <john.allen@amd.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Binbin Wu <binbin.wu@linux.intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/svm.c | 2 ++
arch/x86/kvm/vmx/vmx.c | 2 ++
arch/x86/kvm/x86.c | 30 +++++++++++++++++-------------
arch/x86/kvm/x86.h | 2 ++
4 files changed, 23 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7803d2781144..c00a696dacfc 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5387,6 +5387,8 @@ static __init void svm_set_cpu_caps(void)
*/
kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
kvm_cpu_cap_clear(X86_FEATURE_MSR_IMM);
+
+ kvm_setup_xss_caps();
}
static __init int svm_hardware_setup(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 27acafd03381..9f85c3829890 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8230,6 +8230,8 @@ static __init void vmx_set_cpu_caps(void)
kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
kvm_cpu_cap_clear(X86_FEATURE_IBT);
}
+
+ kvm_setup_xss_caps();
}
static bool vmx_is_io_intercepted(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8acfdfc583a1..cac1d6a67b49 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9965,6 +9965,23 @@ static struct notifier_block pvclock_gtod_notifier = {
};
#endif
+void kvm_setup_xss_caps(void)
+{
+ if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
+ kvm_caps.supported_xss = 0;
+
+ if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+ !kvm_cpu_cap_has(X86_FEATURE_IBT))
+ kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
+
+ if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
+ kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+ kvm_cpu_cap_clear(X86_FEATURE_IBT);
+ kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
+ }
+}
+EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_setup_xss_caps);
+
static inline void kvm_ops_update(struct kvm_x86_init_ops *ops)
{
memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
@@ -10138,19 +10155,6 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
if (!tdp_enabled)
kvm_caps.supported_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT;
- if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
- kvm_caps.supported_xss = 0;
-
- if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
- !kvm_cpu_cap_has(X86_FEATURE_IBT))
- kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
-
- if ((kvm_caps.supported_xss & XFEATURE_MASK_CET_ALL) != XFEATURE_MASK_CET_ALL) {
- kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
- kvm_cpu_cap_clear(X86_FEATURE_IBT);
- kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_ALL;
- }
-
if (kvm_caps.has_tsc_control) {
/*
* Make sure the user can only configure tsc_khz values that
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 70e81f008030..94d4f07aaaa0 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -483,6 +483,8 @@ extern struct kvm_host_values kvm_host;
extern bool enable_pmu;
extern bool enable_mediated_pmu;
+void kvm_setup_xss_caps(void);
+
/*
* Get a filtered version of KVM's supported XCR0 that strips out dynamic
* features for which the current process doesn't (yet) have permission to use.
base-commit: e81f7c908e1664233974b9f20beead78cde6343a
--
* Re: [PATCH 3/3] KVM: VMX: Print out "bad" offsets+value on VMCS config mismatch
2026-01-27 7:53 ` Chao Gao
@ 2026-01-27 18:59 ` Sean Christopherson
0 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-01-27 18:59 UTC (permalink / raw)
To: Chao Gao
Cc: Paolo Bonzini, kvm, linux-kernel, Mathias Krause, John Allen,
Rick Edgecombe, Binbin Wu, Xiaoyao Li, Jim Mattson
On Tue, Jan 27, 2026, Chao Gao wrote:
> On Mon, Jan 26, 2026 at 06:57:26AM -0800, Sean Christopherson wrote:
> >On Fri, Jan 23, 2026, Sean Christopherson wrote:
> >> + pr_cont(" Offset %lu REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
> >> + i * sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
> >
> >As pointed out by the kernel bot, sizeof() isn't an unsigned long on 32-bit.
> >Simplest fix is to force it to an int.
> >
> > pr_cont(" Offset %u REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x\n",
> > i * (int)sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
>
> Why pr_cont()? The previous line ends with '\n', so a plain pr_err() should work.
To avoid the "kvm_intel:" formatting. E.g. with pr_cont():
[ 5.355958] kvm_intel: VMCS config on CPU 0 doesn't match reference config:
[ 5.355986] Offset 76 REF = 0x107fffff, CPU0 = 0x007fffff, mismatch = 0x10000000
[ 5.356019] Offset 84 REF = 0x0010f3ff, CPU0 = 0x0000f3ff, mismatch = 0x00100000
[ 5.356048] kvm: enabling virtualization on CPU0 failed
versus with pr_err():
[ 6.527945] kvm_intel: VMCS config on CPU 0 doesn't match reference config:
[ 6.527979] kvm_intel: Offset 76 REF = 0x107fffff, CPU0 = 0x007fffff, mismatch = 0x10000000
[ 6.528013] kvm_intel: Offset 84 REF = 0x0010f3ff, CPU0 = 0x0000f3ff, mismatch = 0x00100000
[ 6.528048] kvm: enabling virtualization on CPU0 failed
Ugh, but my use of pr_cont() isn't right, because the '\n' resets to KERN_DEFAULT,
i.e. not captured in the above is that the continuations are printed at "warn",
not "err" as intended.
Ah, and fixing that by shoving the newline into pr_cont():
pr_err("VMCS config on CPU %d doesn't match reference config:", cpu);
for (i = 0; i < sizeof(struct vmcs_config) / sizeof(u32); i++) {
if (gold[i] == mine[i])
continue;
pr_cont("\n Offset %u REF = 0x%08x, CPU%u = 0x%08x, mismatch = 0x%08x",
i * (int)sizeof(u32), gold[i], cpu, mine[i], gold[i] ^ mine[i]);
}
pr_cont("\n");
avoids generating new timestamps too, which is even more desirable.
[ 5.239320] kvm_intel: VMCS config on CPU 0 doesn't match reference config:
Offset 76 REF = 0x107fffff, CPU0 = 0x007fffff, mismatch = 0x10000000
Offset 84 REF = 0x0010f3ff, CPU0 = 0x0000f3ff, mismatch = 0x00100000
[ 5.239397] kvm: enabling virtualization on CPU0 failed
Unless someone strongly prefers re-printing the timestamp+kvm-intel, I'll go with
the above approach for v2.
Thanks for the reviews!