[PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST

public inbox for kvm@vger.kernel.org

* [PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT
@ 2025-03-04  6:06 Paolo Bonzini
  2025-03-04  6:06 ` [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks Paolo Bonzini
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Paolo Bonzini @ 2025-03-04  6:06 UTC
  To: linux-kernel, kvm; +Cc: xiaoyao.li, seanjc, yan.y.zhao

This series is my evolution of Yan's patches at
https://patchew.org/linux/20250224070716.31360-1-yan.y.zhao@intel.com/.

The implementation of the quirk is unchanged, but the concepts in kvm_caps
are a bit different.  In particular:

- if a quirk is not applicable to some hardware, it is still included
  in KVM_CAP_DISABLE_QUIRKS2.  This way userspace knows that KVM is
  *aware* of a particular issue.  Even if disabling the quirk has no
  effect because the problem does not exist on that hardware, userspace
  may want to know that it can rely on the problematic behavior not
  being present.  Therefore, KVM_X86_QUIRK_IGNORE_GUEST_PAT is
  simply auto-disabled on TDX machines.

- if instead a quirk cannot be disabled due to hardware limitations, for
  example KVM_X86_QUIRK_IGNORE_GUEST_PAT when the CPU lacks self-snoop,
  the quirk is removed completely from kvm_caps.supported_quirks
  and therefore from KVM_CAP_DISABLE_QUIRKS2.
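
For readers following along, the two cases above can be modelled in plain
userspace C.  This is only a sketch of the bookkeeping, not kernel code:
the Q_* macros, enum platform, struct caps_model and setup_caps() are
hypothetical names, and VALID_QUIRKS is cut down to the two bits that
matter here.  The per-platform adjustments mirror what the later patches
do in svm_hardware_setup() and vmx_hardware_setup().

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for two of the real quirk bits. */
#define Q_CD_NW_CLEARED     (UINT64_C(1) << 1)
#define Q_IGNORE_GUEST_PAT  (UINT64_C(1) << 9)

/* Simplified: the real KVM_X86_VALID_QUIRKS contains more bits. */
#define VALID_QUIRKS        (Q_CD_NW_CLEARED | Q_IGNORE_GUEST_PAT)
#define CONDITIONAL_QUIRKS  (Q_CD_NW_CLEARED | Q_IGNORE_GUEST_PAT)

/* Hypothetical platform selector, not a kernel type. */
enum platform { PLAT_AMD, PLAT_INTEL_SELFSNOOP, PLAT_INTEL_NO_SELFSNOOP };

struct caps_model {
	uint64_t supported_quirks;    /* what KVM_CAP_DISABLE_QUIRKS2 reports */
	uint64_t inapplicable_quirks; /* seeded into each VM's disabled_quirks */
};

static struct caps_model setup_caps(enum platform p)
{
	/* Generic init: everything valid, conditional quirks inapplicable. */
	struct caps_model c = {
		.supported_quirks = VALID_QUIRKS,
		.inapplicable_quirks = CONDITIONAL_QUIRKS,
	};

	switch (p) {
	case PLAT_AMD:
		/* The series treats CD_NW_CLEARED as applicable on AMD. */
		c.inapplicable_quirks &= ~Q_CD_NW_CLEARED;
		break;
	case PLAT_INTEL_NO_SELFSNOOP:
		/* Guest PAT cannot be honored: hide the bit from userspace. */
		c.supported_quirks &= ~Q_IGNORE_GUEST_PAT;
		c.inapplicable_quirks &= ~Q_IGNORE_GUEST_PAT;
		break;
	case PLAT_INTEL_SELFSNOOP:
		/* The quirk applies and userspace may disable it. */
		c.inapplicable_quirks &= ~Q_IGNORE_GUEST_PAT;
		break;
	}

	/* Sanity check: only known quirks can be marked inapplicable. */
	assert((c.inapplicable_quirks & ~VALID_QUIRKS) == 0);
	return c;
}
```

In this model an inapplicable quirk stays in supported_quirks (first
bullet), while a quirk that cannot be disabled drops out of it (second
bullet).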

This series does not introduce a way to query always-disabled quirks;
such a capability could be called, for example, KVM_CAP_DISABLED_QUIRKS.
It could be added if we wanted, for example, to get rid of hypercall
patching; it's a trivial addition.

The main semantic change with respect to v2 is to prevent re-enabling
quirks that have been disabled with KVM_ENABLE_CAP.  This in turn makes
it possible to just use kvm->arch.disabled_quirks for TDX-enabled
guests.

Paolo

Supersedes: <20250301073428.2435768-1-pbonzini@redhat.com>

Paolo Bonzini (3):
  KVM: x86: do not allow re-enabling quirks
  KVM: x86: Allow vendor code to disable quirks
  KVM: x86: remove shadow_memtype_mask

Yan Zhao (3):
  KVM: x86: Introduce supported_quirks to block disabling quirks
  KVM: x86: Introduce Intel specific quirk
    KVM_X86_QUIRK_IGNORE_GUEST_PAT
  KVM: TDX: Always honor guest PAT on TDX enabled guests

 Documentation/virt/kvm/api.rst  | 22 +++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  7 +++++-
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/mmu.h              |  2 +-
 arch/x86/kvm/mmu/mmu.c          | 13 ----------
 arch/x86/kvm/mmu/spte.c         | 19 ++-------------
 arch/x86/kvm/mmu/spte.h         |  1 -
 arch/x86/kvm/svm/svm.c          |  1 +
 arch/x86/kvm/vmx/tdx.c          |  6 +++++
 arch/x86/kvm/vmx/vmx.c          | 43 +++++++++++++++++++++++++++------
 arch/x86/kvm/x86.c              | 13 +++++++---
 arch/x86/kvm/x86.h              |  3 +++
 12 files changed, 87 insertions(+), 44 deletions(-)

-- 
2.43.5



* [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks
  2025-03-04  6:06 [PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
@ 2025-03-04  6:06 ` Paolo Bonzini
  2025-03-05  3:20   ` Yan Zhao
  2025-03-19  1:20   ` Binbin Wu
  2025-03-04  6:06 ` [PATCH v3 2/6] KVM: x86: Allow vendor code to disable quirks Paolo Bonzini
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 14+ messages in thread
From: Paolo Bonzini @ 2025-03-04  6:06 UTC
  To: linux-kernel, kvm; +Cc: xiaoyao.li, seanjc, yan.y.zhao

Allowing arbitrary re-enabling of quirks puts a limit on what the
quirks themselves can do, since you cannot assume that the quirk
prevents a particular state.  More importantly, it also prevents
KVM from disabling a quirk at VM creation time, because userspace
can always go back and re-enable it.
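
The difference between the old and new behavior can be seen in a small
userspace model.  This is a sketch only: struct vm_model and the two
helpers are hypothetical names standing in for kvm->arch.disabled_quirks
and the KVM_CAP_DISABLE_QUIRKS path changed below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VM state in kvm->arch. */
struct vm_model {
	uint64_t disabled_quirks;
};

/* Old behavior: every call replaces the mask, so bits can be cleared. */
static void disable_quirks_old(struct vm_model *vm, uint64_t mask)
{
	assert(vm);
	vm->disabled_quirks = mask;
}

/* New behavior: bits only accumulate; a disabled quirk stays disabled. */
static void disable_quirks_new(struct vm_model *vm, uint64_t mask)
{
	assert(vm);
	vm->disabled_quirks |= mask;
}
```

With the old helper, passing a mask of 0 re-enables everything, including
any quirk that KVM itself disabled at VM creation; with the new one it is
a no-op.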

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 856ceeb4fb35..35d03fcdb8e9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6525,7 +6525,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			break;
 		fallthrough;
 	case KVM_CAP_DISABLE_QUIRKS:
-		kvm->arch.disabled_quirks = cap->args[0];
+		kvm->arch.disabled_quirks |= cap->args[0];
 		r = 0;
 		break;
 	case KVM_CAP_SPLIT_IRQCHIP: {
-- 
2.43.5




* [PATCH v3 2/6] KVM: x86: Allow vendor code to disable quirks
  2025-03-04  6:06 [PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
  2025-03-04  6:06 ` [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks Paolo Bonzini
@ 2025-03-04  6:06 ` Paolo Bonzini
  2025-03-04  8:15   ` Yan Zhao
  2025-03-04  6:06 ` [PATCH v3 3/6] KVM: x86: Introduce supported_quirks to block disabling quirks Paolo Bonzini
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2025-03-04  6:06 UTC
  To: linux-kernel, kvm; +Cc: xiaoyao.li, seanjc, yan.y.zhao

In some cases, the handling of quirks is split between platform-specific
code and generic code, or it is done entirely in generic code, but the
relevant bug does not trigger on some platforms; for example,
this will be the case for "ignore guest PAT".  Allow unaffected vendor
modules to disable handling of a quirk for all VMs via a new entry in
kvm_caps.

Such quirks remain available in KVM_CAP_DISABLE_QUIRKS2, because that API
tells userspace that KVM *knows* that some of its past behavior was bogus
or just undesirable.  In other words, it's plausible for userspace to
refuse to run if a quirk is not listed by KVM_CAP_DISABLE_QUIRKS2, so
preserve that and make it part of the API.

As an example, mark KVM_X86_QUIRK_CD_NW_CLEARED as auto-disabled on
Intel systems.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 3 +++
 arch/x86/kvm/svm/svm.c          | 1 +
 arch/x86/kvm/x86.c              | 2 ++
 arch/x86/kvm/x86.h              | 1 +
 4 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7ebbedc566ff..a4f213d235dd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2420,6 +2420,9 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
 	 KVM_X86_QUIRK_SLOT_ZAP_ALL |		\
 	 KVM_X86_QUIRK_STUFF_FEATURE_MSRS)
 
+#define KVM_X86_CONDITIONAL_QUIRKS		\
+	 KVM_X86_QUIRK_CD_NW_CLEARED
+
 /*
  * KVM previously used a u32 field in kvm_run to indicate the hypercall was
  * initiated from long mode. KVM now sets bit 0 to indicate long mode, but the
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ebaa5a41db07..51cfef44b58d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5426,6 +5426,7 @@ static __init int svm_hardware_setup(void)
 	 */
 	allow_smaller_maxphyaddr = !npt_enabled;
 
+	kvm_caps.inapplicable_quirks &= ~KVM_X86_QUIRK_CD_NW_CLEARED;
 	return 0;
 
 err:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 35d03fcdb8e9..5abea6c73a38 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9775,6 +9775,7 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 		kvm_host.xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
 		kvm_caps.supported_xcr0 = kvm_host.xcr0 & KVM_SUPPORTED_XCR0;
 	}
+	kvm_caps.inapplicable_quirks = KVM_X86_CONDITIONAL_QUIRKS;
 
 	rdmsrl_safe(MSR_EFER, &kvm_host.efer);
 
@@ -12754,6 +12755,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	kvm->arch.apic_bus_cycle_ns = APIC_BUS_CYCLE_NS_DEFAULT;
 	kvm->arch.guest_can_read_msr_platform_info = true;
 	kvm->arch.enable_pmu = enable_pmu;
+	kvm->arch.disabled_quirks = kvm_caps.inapplicable_quirks;
 
 #if IS_ENABLED(CONFIG_HYPERV)
 	spin_lock_init(&kvm->arch.hv_root_tdp_lock);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 8ce6da98b5a2..221778792c3c 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -34,6 +34,7 @@ struct kvm_caps {
 	u64 supported_xcr0;
 	u64 supported_xss;
 	u64 supported_perf_cap;
+	u64 inapplicable_quirks;
 };
 
 struct kvm_host_values {
-- 
2.43.5




* [PATCH v3 3/6] KVM: x86: Introduce supported_quirks to block disabling quirks
  2025-03-04  6:06 [PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
  2025-03-04  6:06 ` [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks Paolo Bonzini
  2025-03-04  6:06 ` [PATCH v3 2/6] KVM: x86: Allow vendor code to disable quirks Paolo Bonzini
@ 2025-03-04  6:06 ` Paolo Bonzini
  2025-03-05  3:23   ` Yan Zhao
  2025-03-04  6:06 ` [PATCH v3 4/6] KVM: x86: Introduce Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2025-03-04  6:06 UTC
  To: linux-kernel, kvm; +Cc: xiaoyao.li, seanjc, yan.y.zhao

From: Yan Zhao <yan.y.zhao@intel.com>

Introduce supported_quirks in kvm_caps; it starts with KVM_X86_VALID_QUIRKS
and bits can be removed to force-enable quirks according to platform-specific
logic.
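
As a rough illustration of the KVM_ENABLE_CAP handling in the hunk below,
here is a userspace model.  It is a sketch under assumptions: struct
vm_model and enable_disable_quirks() are hypothetical names, and the
is_quirks2 flag stands in for the KVM_CAP_DISABLE_QUIRKS2 vs.
KVM_CAP_DISABLE_QUIRKS switch cases.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VM state in kvm->arch. */
struct vm_model {
	uint64_t disabled_quirks;
};

/*
 * Model of the two capabilities after this patch:
 *  - DISABLE_QUIRKS2 rejects any bit outside supported_quirks;
 *  - legacy DISABLE_QUIRKS accepts anything, but unsupported bits are
 *    masked off before being recorded.
 */
static int enable_disable_quirks(struct vm_model *vm, uint64_t supported,
				 bool is_quirks2, uint64_t arg)
{
	assert(vm);

	if (is_quirks2 && (arg & ~supported))
		return -EINVAL;

	vm->disabled_quirks |= arg & supported;
	return 0;
}
```

The masking on the legacy path is what keeps a force-enabled quirk
enabled even when old userspace passes the bit.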

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20250224070832.31394-1-yan.y.zhao@intel.com>
[Remove unsupported quirks at KVM_ENABLE_CAP time. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c | 7 ++++---
 arch/x86/kvm/x86.h | 2 ++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5abea6c73a38..062c1b58b223 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4782,7 +4782,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
 		break;
 	case KVM_CAP_DISABLE_QUIRKS2:
-		r = KVM_X86_VALID_QUIRKS;
+		r = kvm_caps.supported_quirks;
 		break;
 	case KVM_CAP_X86_NOTIFY_VMEXIT:
 		r = kvm_caps.has_notify_vmexit;
@@ -6521,11 +6521,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 	switch (cap->cap) {
 	case KVM_CAP_DISABLE_QUIRKS2:
 		r = -EINVAL;
-		if (cap->args[0] & ~KVM_X86_VALID_QUIRKS)
+		if (cap->args[0] & ~kvm_caps.supported_quirks)
 			break;
 		fallthrough;
 	case KVM_CAP_DISABLE_QUIRKS:
-		kvm->arch.disabled_quirks |= cap->args[0];
+		kvm->arch.disabled_quirks |= cap->args[0] & kvm_caps.supported_quirks;
 		r = 0;
 		break;
 	case KVM_CAP_SPLIT_IRQCHIP: {
@@ -9775,6 +9775,7 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 		kvm_host.xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
 		kvm_caps.supported_xcr0 = kvm_host.xcr0 & KVM_SUPPORTED_XCR0;
 	}
+	kvm_caps.supported_quirks = KVM_X86_VALID_QUIRKS;
 	kvm_caps.inapplicable_quirks = KVM_X86_CONDITIONAL_QUIRKS;
 
 	rdmsrl_safe(MSR_EFER, &kvm_host.efer);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 221778792c3c..287dac35ed5e 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -34,6 +34,8 @@ struct kvm_caps {
 	u64 supported_xcr0;
 	u64 supported_xss;
 	u64 supported_perf_cap;
+
+	u64 supported_quirks;
 	u64 inapplicable_quirks;
 };
 
-- 
2.43.5




* [PATCH v3 4/6] KVM: x86: Introduce Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT
  2025-03-04  6:06 [PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
                   ` (2 preceding siblings ...)
  2025-03-04  6:06 ` [PATCH v3 3/6] KVM: x86: Introduce supported_quirks to block disabling quirks Paolo Bonzini
@ 2025-03-04  6:06 ` Paolo Bonzini
  2025-03-05  3:19   ` Yan Zhao
  2025-03-04  6:06 ` [PATCH v3 5/6] KVM: x86: remove shadow_memtype_mask Paolo Bonzini
  2025-03-04  6:06 ` [PATCH v3 6/6] KVM: TDX: Always honor guest PAT on TDX enabled guests Paolo Bonzini
  5 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2025-03-04  6:06 UTC
  To: linux-kernel, kvm; +Cc: xiaoyao.li, seanjc, yan.y.zhao, Kevin Tian

From: Yan Zhao <yan.y.zhao@intel.com>

Introduce an Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT to have
KVM ignore guest PAT when this quirk is enabled.

On AMD platforms, KVM always honors guest PAT.  On Intel, however, there
are two issues.  First, KVM *cannot* honor guest PAT if the CPU does not
support self-snoop.  Second, UC access on certain Intel platforms can be
very slow[1], and honoring guest PAT on those platforms may break some old
guests that accidentally map video RAM as UC.  Those old guests never had
to cope with the slowness because KVM previously always forced WB; see [2].

So, introduce a quirk that KVM can enable by default on all Intel platforms
to avoid breaking old unmodifiable guests. Newer userspace can disable this
quirk if it wishes KVM to honor guest PAT; disabling the quirk will fail
if self-snoop is not supported, i.e. if KVM cannot obey the wish.

The quirk is a no-op on AMD and also if any assigned devices have
non-coherent DMA.  This is not an issue, as KVM_X86_QUIRK_CD_NW_CLEARED is
another example of a quirk that is sometimes automatically disabled.
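
The resulting memory type decision on VMX can be summarized with a small
userspace model.  This is a sketch, not the kernel function: struct
vm_model, ignore_guest_pat() and get_mt_mask() are hypothetical names
patterned after vmx_ignore_guest_pat() and vmx_get_mt_mask() in the diff
below, and the MT_*/EPT_* constants are assumed stand-ins for
MTRR_TYPE_UNCACHABLE, MTRR_TYPE_WRBACK, VMX_EPT_MT_EPTE_SHIFT and
VMX_EPT_IPAT_BIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define Q_IGNORE_GUEST_PAT  (UINT64_C(1) << 9)

/* Assumed values, standing in for the kernel's MTRR/EPT constants. */
#define MT_UNCACHABLE  0u
#define MT_WRBACK      6u
#define EPT_MT_SHIFT   3
#define EPT_IPAT_BIT   (1u << 6)

/* Hypothetical per-VM state: just what the decision depends on. */
struct vm_model {
	uint64_t disabled_quirks;
	bool has_noncoherent_dma;
};

/* Guest PAT is ignored only if the quirk is active and DMA is coherent. */
static bool ignore_guest_pat(const struct vm_model *vm)
{
	assert(vm);
	return !vm->has_noncoherent_dma &&
	       !(vm->disabled_quirks & Q_IGNORE_GUEST_PAT);
}

static uint8_t get_mt_mask(const struct vm_model *vm, bool is_mmio)
{
	/* Host MMIO is always mapped UC. */
	if (is_mmio)
		return MT_UNCACHABLE << EPT_MT_SHIFT;

	/* Force WB and set "ignore PAT" so the guest cannot override it. */
	if (ignore_guest_pat(vm))
		return (MT_WRBACK << EPT_MT_SHIFT) | EPT_IPAT_BIT;

	/* WB without IPAT: the guest's PAT determines the effective type. */
	return MT_WRBACK << EPT_MT_SHIFT;
}
```

Non-coherent DMA wins over the quirk in this model, which matches the
statement above that the quirk is a no-op in that case.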

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/all/Ztl9NWCOupNfVaCA@yzhao56-desk.sh.intel.com # [1]
Link: https://lore.kernel.org/all/87jzfutmfc.fsf@redhat.com # [2]
Message-ID: <20250224070946.31482-1-yan.y.zhao@intel.com>
[Use supported_quirks/inapplicable_quirks to support both AMD and
 no-self-snoop cases, as well as to remove the shadow_memtype_mask check
 from kvm_mmu_may_ignore_guest_pat(). - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Documentation/virt/kvm/api.rst  | 22 ++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |  6 +++--
 arch/x86/include/uapi/asm/kvm.h |  1 +
 arch/x86/kvm/mmu.h              |  2 +-
 arch/x86/kvm/mmu/mmu.c          | 10 ++++----
 arch/x86/kvm/vmx/vmx.c          | 41 +++++++++++++++++++++++++++------
 arch/x86/kvm/x86.c              |  2 +-
 7 files changed, 69 insertions(+), 15 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 2d75edc9db4f..452439b605af 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8157,6 +8157,28 @@ KVM_X86_QUIRK_STUFF_FEATURE_MSRS    By default, at vCPU creation, KVM sets the
                                     and 0x489), as KVM does now allow them to
                                     be set by userspace (KVM sets them based on
                                     guest CPUID, for safety purposes).
+
+KVM_X86_QUIRK_IGNORE_GUEST_PAT      By default, on Intel platforms, KVM ignores
+                                    guest PAT and forces the effective memory
+                                    type to WB in EPT.  The quirk is not available
+                                    on Intel platforms which are incapable of
+                                    safely honoring guest PAT (i.e., without CPU
+                                    self-snoop, KVM always ignores guest PAT and
+                                    forces effective memory type to WB).  It is
+                                    also ignored on AMD platforms or, on Intel,
+                                    when a VM has non-coherent DMA devices
+                                    assigned; KVM always honors guest PAT in
+                                    such case. The quirk is needed to avoid
+                                    slowdowns on certain Intel Xeon platforms
+                                    (e.g. ICX, SPR) where self-snoop feature is
+                                    supported but UC is slow enough to cause
+                                    issues with some older guests that use
+                                    UC instead of WC to map the video RAM.
+                                    Userspace can disable the quirk to honor
+                                    guest PAT if it knows that there is no such
+                                    guest software, for example if it does not
+                                    expose a bochs graphics device (which is
+                                    known to have had a buggy driver).
 =================================== ============================================
 
 7.32 KVM_CAP_MAX_VCPU_ID
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a4f213d235dd..9b9dde476f3c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2418,10 +2418,12 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
 	 KVM_X86_QUIRK_FIX_HYPERCALL_INSN |	\
 	 KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS |	\
 	 KVM_X86_QUIRK_SLOT_ZAP_ALL |		\
-	 KVM_X86_QUIRK_STUFF_FEATURE_MSRS)
+	 KVM_X86_QUIRK_STUFF_FEATURE_MSRS |	\
+	 KVM_X86_QUIRK_IGNORE_GUEST_PAT)
 
 #define KVM_X86_CONDITIONAL_QUIRKS		\
-	 KVM_X86_QUIRK_CD_NW_CLEARED
+	(KVM_X86_QUIRK_CD_NW_CLEARED |		\
+	 KVM_X86_QUIRK_IGNORE_GUEST_PAT)
 
 /*
  * KVM previously used a u32 field in kvm_run to indicate the hypercall was
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 89cc7a18ef45..dc4d6428dd02 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -441,6 +441,7 @@ struct kvm_sync_regs {
 #define KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS	(1 << 6)
 #define KVM_X86_QUIRK_SLOT_ZAP_ALL		(1 << 7)
 #define KVM_X86_QUIRK_STUFF_FEATURE_MSRS	(1 << 8)
+#define KVM_X86_QUIRK_IGNORE_GUEST_PAT		(1 << 9)
 
 #define KVM_STATE_NESTED_FORMAT_VMX	0
 #define KVM_STATE_NESTED_FORMAT_SVM	1
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 47e64a3c4ce3..f999c15d8d3e 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -232,7 +232,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 	return -(u32)fault & errcode;
 }
 
-bool kvm_mmu_may_ignore_guest_pat(void);
+bool kvm_mmu_may_ignore_guest_pat(struct kvm *kvm);
 
 int kvm_mmu_post_init_vm(struct kvm *kvm);
 void kvm_mmu_pre_destroy_vm(struct kvm *kvm);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e6eb3a262f8d..9d6294f76d19 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4663,17 +4663,19 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
 }
 #endif
 
-bool kvm_mmu_may_ignore_guest_pat(void)
+bool kvm_mmu_may_ignore_guest_pat(struct kvm *kvm)
 {
 	/*
 	 * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
 	 * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
 	 * honor the memtype from the guest's PAT so that guest accesses to
 	 * memory that is DMA'd aren't cached against the guest's wishes.  As a
-	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
-	 * KVM _always_ ignores guest PAT (when EPT is enabled).
+	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA.
+	 * KVM _always_ ignores guest PAT, when EPT is enabled and when quirk
+	 * KVM_X86_QUIRK_IGNORE_GUEST_PAT is enabled or the CPU lacks the
+	 * ability to safely honor guest PAT.
 	 */
-	return shadow_memtype_mask;
+	return kvm_check_has_quirk(kvm, KVM_X86_QUIRK_IGNORE_GUEST_PAT);
 }
 
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 486fbdb4365c..719e79712339 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7599,6 +7599,17 @@ int vmx_vm_init(struct kvm *kvm)
 	return 0;
 }
 
+static inline bool vmx_ignore_guest_pat(struct kvm *kvm)
+{
+	/*
+	 * Non-coherent DMA devices need the guest to flush CPU properly.
+	 * In that case it is not possible to map all guest RAM as WB, so
+	 * always trust guest PAT.
+	 */
+	return !kvm_arch_has_noncoherent_dma(kvm) &&
+	       kvm_check_has_quirk(kvm, KVM_X86_QUIRK_IGNORE_GUEST_PAT);
+}
+
 u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 {
 	/*
@@ -7608,13 +7619,8 @@ u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 	if (is_mmio)
 		return MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;
 
-	/*
-	 * Force WB and ignore guest PAT if the VM does NOT have a non-coherent
-	 * device attached.  Letting the guest control memory types on Intel
-	 * CPUs may result in unexpected behavior, and so KVM's ABI is to trust
-	 * the guest to behave only as a last resort.
-	 */
-	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
+	/* Force WB if ignoring guest PAT */
+	if (vmx_ignore_guest_pat(vcpu->kvm))
 		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
 
 	return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT);
@@ -8506,6 +8512,27 @@ __init int vmx_hardware_setup(void)
 
 	kvm_set_posted_intr_wakeup_handler(pi_wakeup_handler);
 
+	/*
+	 * On Intel CPUs that lack self-snoop feature, letting the guest control
+	 * memory types may result in unexpected behavior. So always ignore guest
+	 * PAT on those CPUs and map VM as writeback, not allowing userspace to
+	 * disable the quirk.
+	 *
+	 * On certain Intel CPUs (e.g. SPR, ICX), though self-snoop feature is
+	 * supported, UC is slow enough to cause issues with some older guests (e.g.
+	 * an old version of bochs driver uses ioremap() instead of ioremap_wc() to
+	 * map the video RAM, causing wayland desktop to fail to get started
+	 * correctly). To avoid breaking those older guests that rely on KVM to force
+	 * memory type to WB, provide KVM_X86_QUIRK_IGNORE_GUEST_PAT to preserve the
+	 * safer (for performance) default behavior.
+	 *
+	 * On top of this, non-coherent DMA devices need the guest to flush CPU
+	 * caches properly.  This also requires honoring guest PAT, and is forced
+	 * independent of the quirk in vmx_ignore_guest_pat().
+	 */
+	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
+		kvm_caps.supported_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT;
+	kvm_caps.inapplicable_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT;
 	return r;
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 062c1b58b223..5b45fca3ddfa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13545,7 +13545,7 @@ static void kvm_noncoherent_dma_assignment_start_or_stop(struct kvm *kvm)
 	 * (or last) non-coherent device is (un)registered to so that new SPTEs
 	 * with the correct "ignore guest PAT" setting are created.
 	 */
-	if (kvm_mmu_may_ignore_guest_pat())
+	if (kvm_mmu_may_ignore_guest_pat(kvm))
 		kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
 }
 
-- 
2.43.5




* [PATCH v3 5/6] KVM: x86: remove shadow_memtype_mask
  2025-03-04  6:06 [PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
                   ` (3 preceding siblings ...)
  2025-03-04  6:06 ` [PATCH v3 4/6] KVM: x86: Introduce Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
@ 2025-03-04  6:06 ` Paolo Bonzini
  2025-03-04 10:51   ` Yan Zhao
  2025-03-04  6:06 ` [PATCH v3 6/6] KVM: TDX: Always honor guest PAT on TDX enabled guests Paolo Bonzini
  5 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2025-03-04  6:06 UTC
  To: linux-kernel, kvm; +Cc: xiaoyao.li, seanjc, yan.y.zhao

The IGNORE_GUEST_PAT quirk is inapplicable, and thus always-disabled,
if shadow_memtype_mask is zero.  As long as vmx_get_mt_mask is not
called for the shadow paging case, there is no need to consult
shadow_memtype_mask and it can be removed altogether.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c  | 15 ---------------
 arch/x86/kvm/mmu/spte.c | 19 ++-----------------
 arch/x86/kvm/mmu/spte.h |  1 -
 arch/x86/kvm/vmx/vmx.c  |  2 ++
 arch/x86/kvm/x86.c      |  4 +++-
 5 files changed, 7 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9d6294f76d19..33c6d1d7e3e5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4663,21 +4663,6 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
 }
 #endif
 
-bool kvm_mmu_may_ignore_guest_pat(struct kvm *kvm)
-{
-	/*
-	 * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
-	 * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
-	 * honor the memtype from the guest's PAT so that guest accesses to
-	 * memory that is DMA'd aren't cached against the guest's wishes.  As a
-	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA.
-	 * KVM _always_ ignores guest PAT, when EPT is enabled and when quirk
-	 * KVM_X86_QUIRK_IGNORE_GUEST_PAT is enabled or the CPU lacks the
-	 * ability to safely honor guest PAT.
-	 */
-	return kvm_check_has_quirk(kvm, KVM_X86_QUIRK_IGNORE_GUEST_PAT);
-}
-
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 #ifdef CONFIG_X86_64
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index a609d5b58b69..f279153a1588 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -37,7 +37,6 @@ u64 __read_mostly shadow_mmio_value;
 u64 __read_mostly shadow_mmio_mask;
 u64 __read_mostly shadow_mmio_access_mask;
 u64 __read_mostly shadow_present_mask;
-u64 __read_mostly shadow_memtype_mask;
 u64 __read_mostly shadow_me_value;
 u64 __read_mostly shadow_me_mask;
 u64 __read_mostly shadow_acc_track_mask;
@@ -203,9 +202,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	if (level > PG_LEVEL_4K)
 		spte |= PT_PAGE_SIZE_MASK;
 
-	if (shadow_memtype_mask)
-		spte |= kvm_x86_call(get_mt_mask)(vcpu, gfn,
-						  kvm_is_mmio_pfn(pfn));
+	spte |= kvm_x86_call(get_mt_mask)(vcpu, gfn, kvm_is_mmio_pfn(pfn));
 	if (host_writable)
 		spte |= shadow_host_writable_mask;
 	else
@@ -460,13 +457,7 @@ void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
 	/* VMX_EPT_SUPPRESS_VE_BIT is needed for W or X violation. */
 	shadow_present_mask	=
 		(has_exec_only ? 0ull : VMX_EPT_READABLE_MASK) | VMX_EPT_SUPPRESS_VE_BIT;
-	/*
-	 * EPT overrides the host MTRRs, and so KVM must program the desired
-	 * memtype directly into the SPTEs.  Note, this mask is just the mask
-	 * of all bits that factor into the memtype, the actual memtype must be
-	 * dynamically calculated, e.g. to ensure host MMIO is mapped UC.
-	 */
-	shadow_memtype_mask	= VMX_EPT_MT_MASK | VMX_EPT_IPAT_BIT;
+
 	shadow_acc_track_mask	= VMX_EPT_RWX_MASK;
 	shadow_host_writable_mask = EPT_SPTE_HOST_WRITABLE;
 	shadow_mmu_writable_mask  = EPT_SPTE_MMU_WRITABLE;
@@ -518,12 +509,6 @@ void kvm_mmu_reset_all_pte_masks(void)
 	shadow_x_mask		= 0;
 	shadow_present_mask	= PT_PRESENT_MASK;
 
-	/*
-	 * For shadow paging and NPT, KVM uses PAT entry '0' to encode WB
-	 * memtype in the SPTEs, i.e. relies on host MTRRs to provide the
-	 * correct memtype (WB is the "weakest" memtype).
-	 */
-	shadow_memtype_mask	= 0;
 	shadow_acc_track_mask	= 0;
 	shadow_me_mask		= 0;
 	shadow_me_value		= 0;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 59746854c0af..249027efff0c 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -187,7 +187,6 @@ extern u64 __read_mostly shadow_mmio_value;
 extern u64 __read_mostly shadow_mmio_mask;
 extern u64 __read_mostly shadow_mmio_access_mask;
 extern u64 __read_mostly shadow_present_mask;
-extern u64 __read_mostly shadow_memtype_mask;
 extern u64 __read_mostly shadow_me_value;
 extern u64 __read_mostly shadow_me_mask;
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 719e79712339..b119dd8a66f1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8438,6 +8438,8 @@ __init int vmx_hardware_setup(void)
 	if (enable_ept)
 		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
 				      cpu_has_vmx_ept_execute_only());
+	else
+		vt_x86_ops.get_mt_mask = NULL;
 
 	/*
 	 * Setup shadow_me_value/shadow_me_mask to include MKTME KeyID
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5b45fca3ddfa..8bf50cecc75c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13544,8 +13544,10 @@ static void kvm_noncoherent_dma_assignment_start_or_stop(struct kvm *kvm)
 	 * due to toggling the "ignore PAT" bit.  Zap all SPTEs when the first
 	 * (or last) non-coherent device is (un)registered to so that new SPTEs
 	 * with the correct "ignore guest PAT" setting are created.
+	 *
+	 * If KVM always honors guest PAT, however, there is nothing to do.
 	 */
-	if (kvm_mmu_may_ignore_guest_pat(kvm))
+	if (kvm_check_has_quirk(kvm, KVM_X86_QUIRK_IGNORE_GUEST_PAT))
 		kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
 }
 
-- 
2.43.5




* [PATCH v3 6/6] KVM: TDX: Always honor guest PAT on TDX enabled guests
  2025-03-04  6:06 [PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
                   ` (4 preceding siblings ...)
  2025-03-04  6:06 ` [PATCH v3 5/6] KVM: x86: remove shadow_memtype_mask Paolo Bonzini
@ 2025-03-04  6:06 ` Paolo Bonzini
  2025-03-05  2:48   ` Yan Zhao
  5 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2025-03-04  6:06 UTC
  To: linux-kernel, kvm; +Cc: xiaoyao.li, seanjc, yan.y.zhao

From: Yan Zhao <yan.y.zhao@intel.com>

Always honor guest PAT in KVM-managed EPTs on TDX-enabled guests by
making the self-snoop feature a hard dependency for TDX and making quirk
KVM_X86_QUIRK_IGNORE_GUEST_PAT not a valid quirk once TDX is enabled.

The quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT only affects the memory type of
KVM-managed EPTs. For the TDX-module-managed private EPT, the memory type
is always forced to WB now.

Honoring guest PAT in KVM-managed EPTs ensures KVM does not invoke
kvm_zap_gfn_range() when attaching/detaching non-coherent DMA devices,
which would cause mirrored EPTs for TDs to be zapped, leading to the
TDX-module-managed private EPT being incorrectly zapped.

As a new feature, TDX always comes with support for self-snoop, and does
not have to worry about unmodifiable but buggy guests. So, simply ignore
KVM_X86_QUIRK_IGNORE_GUEST_PAT on TDX guests just like kvm-amd.ko already
does.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20250224071039.31511-1-yan.y.zhao@intel.com>
[Only apply to TDX guests. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/vmx/tdx.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index b6f6f6e2f02e..89a0e90b7aef 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -624,6 +624,7 @@ int tdx_vm_init(struct kvm *kvm)

 	kvm->arch.has_protected_state = true;
 	kvm->arch.has_private_mem = true;
+	kvm->arch.disabled_quirks |= KVM_X86_QUIRK_IGNORE_GUEST_PAT;

 	/*
 	 * Because guest TD is protected, VMM can't parse the instruction in TD.
@@ -3470,6 +3471,11 @@ int __init tdx_bringup(void)
 		goto success_disable_tdx;
 	}

+	if (!cpu_feature_enabled(X86_FEATURE_SELFSNOOP)) {
+		pr_err("Self-snoop is required for TDX\n");
+		goto success_disable_tdx;
+	}
+
 	if (!cpu_feature_enabled(X86_FEATURE_TDX_HOST_PLATFORM)) {
 		pr_err("tdx: no TDX private KeyIDs available\n");
 		goto success_disable_tdx;
-- 
2.43.5


* Re: [PATCH v3 2/6] KVM: x86: Allow vendor code to disable quirks
  2025-03-04  6:06 ` [PATCH v3 2/6] KVM: x86: Allow vendor code to disable quirks Paolo Bonzini
@ 2025-03-04  8:15   ` Yan Zhao
  0 siblings, 0 replies; 14+ messages in thread
From: Yan Zhao @ 2025-03-04  8:15 UTC
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, xiaoyao.li, seanjc

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 35d03fcdb8e9..5abea6c73a38 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9775,6 +9775,7 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  		kvm_host.xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
>  		kvm_caps.supported_xcr0 = kvm_host.xcr0 & KVM_SUPPORTED_XCR0;
>  	}
> +	kvm_caps.inapplicable_quirks = KVM_X86_CONDITIONAL_QUIRKS;
>  
>  	rdmsrl_safe(MSR_EFER, &kvm_host.efer);
>  
> @@ -12754,6 +12755,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	kvm->arch.apic_bus_cycle_ns = APIC_BUS_CYCLE_NS_DEFAULT;
>  	kvm->arch.guest_can_read_msr_platform_info = true;
>  	kvm->arch.enable_pmu = enable_pmu;
> +	kvm->arch.disabled_quirks = kvm_caps.inapplicable_quirks;
Should be

kvm->arch.disabled_quirks |= kvm_caps.inapplicable_quirks;

Otherwise, it may overwrite the disabled_quirks value set in the vm_init hook.
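
As a minimal standalone sketch of the concern (a userspace model, not kernel
code): the names vm_model, vendor_vm_init, arch_init_assign and arch_init_or
are hypothetical, and the sketch assumes the vendor vm_init hook has already
run by the time kvm_arch_init_vm() reaches this assignment. The bit value for
CD_NW_CLEARED is assumed to match the uapi header.

```c
#include <stdint.h>

/* Illustrative bit values; IGNORE_GUEST_PAT is (1 << 9) as in patch 4/6. */
#define QUIRK_CD_NW_CLEARED     (1ULL << 1)
#define QUIRK_IGNORE_GUEST_PAT  (1ULL << 9)

/* Hypothetical stand-in for the relevant part of struct kvm_arch. */
struct vm_model {
	uint64_t disabled_quirks;
};

/* Models a vendor vm_init hook (e.g. tdx_vm_init) disabling a quirk early. */
static void vendor_vm_init(struct vm_model *vm)
{
	vm->disabled_quirks |= QUIRK_IGNORE_GUEST_PAT;
}

/* Plain assignment: whatever the hook set is discarded. */
static uint64_t arch_init_assign(uint64_t inapplicable)
{
	struct vm_model vm = { 0 };

	vendor_vm_init(&vm);
	vm.disabled_quirks = inapplicable;
	return vm.disabled_quirks;
}

/* OR-ing: the bits set by the hook are kept. */
static uint64_t arch_init_or(uint64_t inapplicable)
{
	struct vm_model vm = { 0 };

	vendor_vm_init(&vm);
	vm.disabled_quirks |= inapplicable;
	return vm.disabled_quirks;
}
```

With inapplicable set to QUIRK_CD_NW_CLEARED, the assigning variant ends up
with only that bit, while the OR-ing variant also keeps
QUIRK_IGNORE_GUEST_PAT.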

>  
>  #if IS_ENABLED(CONFIG_HYPERV)
>  	spin_lock_init(&kvm->arch.hv_root_tdp_lock);
 


* Re: [PATCH v3 5/6] KVM: x86: remove shadow_memtype_mask
  2025-03-04  6:06 ` [PATCH v3 5/6] KVM: x86: remove shadow_memtype_mask Paolo Bonzini
@ 2025-03-04 10:51   ` Yan Zhao
  0 siblings, 0 replies; 14+ messages in thread
From: Yan Zhao @ 2025-03-04 10:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, xiaoyao.li, seanjc

On Tue, Mar 04, 2025 at 01:06:46AM -0500, Paolo Bonzini wrote:
> The IGNORE_GUEST_PAT quirk is inapplicable, and thus always-disabled,
> if shadow_memtype_mask is zero.  As long as vmx_get_mt_mask is not
For the shadow paging case, current KVM always ignores guest PAT, i.e., the
quirk is always enabled.

However, this is probably negligible, as non-coherent DMA is unlikely to
function well with shadow paging anyway, unless I'm missing something.

> called for the shadow paging case, there is no need to consult
> shadow_memtype_mask and it can be removed altogether.
... 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5b45fca3ddfa..8bf50cecc75c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -13544,8 +13544,10 @@ static void kvm_noncoherent_dma_assignment_start_or_stop(struct kvm *kvm)
>  	 * due to toggling the "ignore PAT" bit.  Zap all SPTEs when the first
>  	 * (or last) non-coherent device is (un)registered to so that new SPTEs
>  	 * with the correct "ignore guest PAT" setting are created.
> +	 *
> +	 * If KVM always honors guest PAT, however, there is nothing to do.
>  	 */
> -	if (kvm_mmu_may_ignore_guest_pat(kvm))
> +	if (kvm_check_has_quirk(kvm, KVM_X86_QUIRK_IGNORE_GUEST_PAT))
>  		kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
>  }
>  
> -- 
> 2.43.5
> 
> 


* Re: [PATCH v3 6/6] KVM: TDX: Always honor guest PAT on TDX enabled guests
  2025-03-04  6:06 ` [PATCH v3 6/6] KVM: TDX: Always honor guest PAT on TDX enabled guests Paolo Bonzini
@ 2025-03-05  2:48   ` Yan Zhao
  0 siblings, 0 replies; 14+ messages in thread
From: Yan Zhao @ 2025-03-05  2:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, xiaoyao.li, seanjc

On Tue, Mar 04, 2025 at 01:06:47AM -0500, Paolo Bonzini wrote:
> From: Yan Zhao <yan.y.zhao@intel.com>
> 
> Always honor guest PAT in KVM-managed EPTs on TDX enabled guests by
> making self-snoop feature a hard dependency for TDX and making quirk
> KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT not a valid quirk once TDX is enabled.
> The quirk KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT only affects memory type of
Two left-overs :)

s/KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT/KVM_X86_QUIRK_IGNORE_GUEST_PAT
 


* Re: [PATCH v3 4/6] KVM: x86: Introduce Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT
  2025-03-04  6:06 ` [PATCH v3 4/6] KVM: x86: Introduce Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
@ 2025-03-05  3:19   ` Yan Zhao
  0 siblings, 0 replies; 14+ messages in thread
From: Yan Zhao @ 2025-03-05  3:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, xiaoyao.li, seanjc, Kevin Tian

On Tue, Mar 04, 2025 at 01:06:45AM -0500, Paolo Bonzini wrote:
> From: Yan Zhao <yan.y.zhao@intel.com>
> 
> Introduce an Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT to have
> KVM ignore guest PAT when this quirk is enabled.
> 
> On AMD platforms, KVM always honors guest PAT.  On Intel however there are
> two issues.  First, KVM *cannot* honor guest PAT if CPU feature self-snoop
> is not supported. Second, UC access on certain Intel platforms can be very
> slow[1] and honoring guest PAT on those platforms may break some old
> guests that accidentally specify video RAM as UC. Those old guests may
> never expect the slowness since KVM always forces WB previously. See [2].
> 
> So, introduce a quirk that KVM can enable by default on all Intel platforms
> to avoid breaking old unmodifiable guests. Newer userspace can disable this
> quirk if it wishes KVM to honor guest PAT; disabling the quirk will fail
> if self-snoop is not supported, i.e. if KVM cannot obey the wish.
> 
> The quirk is a no-op on AMD and also if any assigned devices have
> non-coherent DMA.  This is not an issue, as KVM_X86_QUIRK_CD_NW_CLEARED is
> another example of a quirk that is sometimes automatically disabled.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Link: https://lore.kernel.org/all/Ztl9NWCOupNfVaCA@yzhao56-desk.sh.intel.com # [1]
> Link: https://lore.kernel.org/all/87jzfutmfc.fsf@redhat.com # [2]
> Message-ID: <20250224070946.31482-1-yan.y.zhao@intel.com>
> [Use supported_quirks/inapplicable_quirks to support both AMD and
>  no-self-snoop cases, as well as to remove the shadow_memtype_mask check
>  from kvm_mmu_may_ignore_guest_pat(). - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  Documentation/virt/kvm/api.rst  | 22 ++++++++++++++++++
>  arch/x86/include/asm/kvm_host.h |  6 +++--
>  arch/x86/include/uapi/asm/kvm.h |  1 +
>  arch/x86/kvm/mmu.h              |  2 +-
>  arch/x86/kvm/mmu/mmu.c          | 10 ++++----
>  arch/x86/kvm/vmx/vmx.c          | 41 +++++++++++++++++++++++++++------
>  arch/x86/kvm/x86.c              |  2 +-
>  7 files changed, 69 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 2d75edc9db4f..452439b605af 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8157,6 +8157,28 @@ KVM_X86_QUIRK_STUFF_FEATURE_MSRS    By default, at vCPU creation, KVM sets the
>                                      and 0x489), as KVM does now allow them to
>                                      be set by userspace (KVM sets them based on
>                                      guest CPUID, for safety purposes).
> +
> +KVM_X86_QUIRK_IGNORE_GUEST_PAT      By default, on Intel platforms, KVM ignores
> +                                    guest PAT and forces the effective memory
> +                                    type to WB in EPT.  The quirk is not available
> +                                    on Intel platforms which are incapable of
> +                                    safely honoring guest PAT (i.e., without CPU
> +                                    self-snoop, KVM always ignores guest PAT and
> +                                    forces effective memory type to WB).  It is
Not sure if it's necessary to add something like:
The quirk is also not available on Intel platforms which do not enable EPT
(i.e., in the shadow paging case, KVM always ignores guest PAT).

> +                                    also ignored on AMD platforms or, on Intel,
> +                                    when a VM has non-coherent DMA devices
> +                                    assigned; KVM always honors guest PAT in
> +                                    such case. The quirk is needed to avoid
> +                                    slowdowns on certain Intel Xeon platforms
> +                                    (e.g. ICX, SPR) where self-snoop feature is
> +                                    supported but UC is slow enough to cause
> +                                    issues with some older guests that use
> +                                    UC instead of WC to map the video RAM.
> +                                    Userspace can disable the quirk to honor
> +                                    guest PAT if it knows that there is no such
> +                                    guest software, for example if it does not
> +                                    expose a bochs graphics device (which is
> +                                    known to have had a buggy driver).
>  =================================== ============================================
>  
>  7.32 KVM_CAP_MAX_VCPU_ID
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index a4f213d235dd..9b9dde476f3c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2418,10 +2418,12 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
>  	 KVM_X86_QUIRK_FIX_HYPERCALL_INSN |	\
>  	 KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS |	\
>  	 KVM_X86_QUIRK_SLOT_ZAP_ALL |		\
> -	 KVM_X86_QUIRK_STUFF_FEATURE_MSRS)
> +	 KVM_X86_QUIRK_STUFF_FEATURE_MSRS |	\
> +	 KVM_X86_QUIRK_IGNORE_GUEST_PAT)
>  
>  #define KVM_X86_CONDITIONAL_QUIRKS		\
> -	 KVM_X86_QUIRK_CD_NW_CLEARED
> +	(KVM_X86_QUIRK_CD_NW_CLEARED |		\
> +	 KVM_X86_QUIRK_IGNORE_GUEST_PAT)
>  
>  /*
>   * KVM previously used a u32 field in kvm_run to indicate the hypercall was
> diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> index 89cc7a18ef45..dc4d6428dd02 100644
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -441,6 +441,7 @@ struct kvm_sync_regs {
>  #define KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS	(1 << 6)
>  #define KVM_X86_QUIRK_SLOT_ZAP_ALL		(1 << 7)
>  #define KVM_X86_QUIRK_STUFF_FEATURE_MSRS	(1 << 8)
> +#define KVM_X86_QUIRK_IGNORE_GUEST_PAT		(1 << 9)
>  
>  #define KVM_STATE_NESTED_FORMAT_VMX	0
>  #define KVM_STATE_NESTED_FORMAT_SVM	1
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 47e64a3c4ce3..f999c15d8d3e 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -232,7 +232,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>  	return -(u32)fault & errcode;
>  }
>  
> -bool kvm_mmu_may_ignore_guest_pat(void);
> +bool kvm_mmu_may_ignore_guest_pat(struct kvm *kvm);
>  
>  int kvm_mmu_post_init_vm(struct kvm *kvm);
>  void kvm_mmu_pre_destroy_vm(struct kvm *kvm);
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e6eb3a262f8d..9d6294f76d19 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4663,17 +4663,19 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
>  }
>  #endif
>  
> -bool kvm_mmu_may_ignore_guest_pat(void)
> +bool kvm_mmu_may_ignore_guest_pat(struct kvm *kvm)
>  {
>  	/*
>  	 * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
>  	 * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
>  	 * honor the memtype from the guest's PAT so that guest accesses to
>  	 * memory that is DMA'd aren't cached against the guest's wishes.  As a
> -	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
> -	 * KVM _always_ ignores guest PAT (when EPT is enabled).
> +	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
> +	 * KVM _always_ ignores guest PAT, when EPT is enabled and when quirk
> +	 * KVM_X86_QUIRK_IGNORE_GUEST_PAT is enabled or the CPU lacks the
> +	 * ability to safely honor guest PAT.
>  	 */
> -	return shadow_memtype_mask;
> +	return kvm_check_has_quirk(kvm, KVM_X86_QUIRK_IGNORE_GUEST_PAT);
This changes the original logic for shadow paging.
But it may be benign, per the point made in [1].
[1] https://lore.kernel.org/all/Z8bbKCICpzBKyVBT@yzhao56-desk.sh.intel.com/

>  }
>  
>  int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 486fbdb4365c..719e79712339 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7599,6 +7599,17 @@ int vmx_vm_init(struct kvm *kvm)
>  	return 0;
>  }
>  
> +static inline bool vmx_ignore_guest_pat(struct kvm *kvm)
> +{
> +	/*
> +	 * Non-coherent DMA devices need the guest to flush CPU properly.
> +	 * In that case it is not possible to map all guest RAM as WB, so
> +	 * always trust guest PAT.
> +	 */
> +	return !kvm_arch_has_noncoherent_dma(kvm) &&
> +	       kvm_check_has_quirk(kvm, KVM_X86_QUIRK_IGNORE_GUEST_PAT);
> +}
> +
>  u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>  {
>  	/*
> @@ -7608,13 +7619,8 @@ u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>  	if (is_mmio)
>  		return MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;
>  
> -	/*
> -	 * Force WB and ignore guest PAT if the VM does NOT have a non-coherent
> -	 * device attached.  Letting the guest control memory types on Intel
> -	 * CPUs may result in unexpected behavior, and so KVM's ABI is to trust
> -	 * the guest to behave only as a last resort.
> -	 */
> -	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
> +	/* Force WB if ignoring guest PAT */
> +	if (vmx_ignore_guest_pat(vcpu->kvm))
>  		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
>  
>  	return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT);
> @@ -8506,6 +8512,27 @@ __init int vmx_hardware_setup(void)
>  
>  	kvm_set_posted_intr_wakeup_handler(pi_wakeup_handler);
>  
> +	/*
> +	 * On Intel CPUs that lack self-snoop feature, letting the guest control
> +	 * memory types may result in unexpected behavior. So always ignore guest
> +	 * PAT on those CPUs and map VM as writeback, not allowing userspace to
> +	 * disable the quirk.
> +	 *
> +	 * On certain Intel CPUs (e.g. SPR, ICX), though self-snoop feature is
> +	 * supported, UC is slow enough to cause issues with some older guests (e.g.
> +	 * an old version of bochs driver uses ioremap() instead of ioremap_wc() to
> +	 * map the video RAM, causing wayland desktop to fail to get started
> +	 * correctly). To avoid breaking those older guests that rely on KVM to force
> +	 * memory type to WB, provide KVM_X86_QUIRK_IGNORE_GUEST_PAT to preserve the
> +	 * safer (for performance) default behavior.
> +	 *
> +	 * On top of this, non-coherent DMA devices need the guest to flush CPU
> +	 * caches properly.  This also requires honoring guest PAT, and is forced
> +	 * independent of the quirk in vmx_ignore_guest_pat().
> +	 */
> +	if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
What about
	if (!static_cpu_has(X86_FEATURE_SELFSNOOP) || !enable_ept)
?

> +		kvm_caps.supported_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT;
> +	kvm_caps.inapplicable_quirks &= ~KVM_X86_QUIRK_IGNORE_GUEST_PAT;
>  	return r;
>  }
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 062c1b58b223..5b45fca3ddfa 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -13545,7 +13545,7 @@ static void kvm_noncoherent_dma_assignment_start_or_stop(struct kvm *kvm)
>  	 * (or last) non-coherent device is (un)registered to so that new SPTEs
>  	 * with the correct "ignore guest PAT" setting are created.
>  	 */
> -	if (kvm_mmu_may_ignore_guest_pat())
> +	if (kvm_mmu_may_ignore_guest_pat(kvm))
>  		kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
>  }
>  
> -- 
> 2.43.5
> 
> 
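
The interplay of supported_quirks, inapplicable_quirks and the memory type
chosen for EPT can be modelled in a few lines of standalone C. This is only a
sketch under stated assumptions: caps_model, setup_caps, ignore_guest_pat and
ept_mt_mask are hypothetical names, only the one quirk bit is tracked, and
the EPT constants are assumed to match the kernel headers rather than taken
from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUIRK_IGNORE_GUEST_PAT  (1ULL << 9)  /* value from the uapi hunk above */

/* EPT memory-type encoding; assumed to mirror the kernel definitions. */
#define MTRR_TYPE_UNCACHABLE    0ULL
#define MTRR_TYPE_WRBACK        6ULL
#define EPT_MT_SHIFT            3
#define EPT_IPAT_BIT            (1ULL << 6)

struct caps_model {
	uint64_t supported_quirks;     /* quirks userspace may disable */
	uint64_t inapplicable_quirks;  /* quirks auto-disabled at VM creation */
};

/* Models common init followed by the Intel-specific adjustment. */
static struct caps_model setup_caps(bool is_intel, bool has_self_snoop)
{
	struct caps_model caps = {
		.supported_quirks = QUIRK_IGNORE_GUEST_PAT,
		.inapplicable_quirks = QUIRK_IGNORE_GUEST_PAT,
	};

	if (is_intel) {
		/* Without self-snoop the quirk cannot be disabled at all. */
		if (!has_self_snoop)
			caps.supported_quirks &= ~QUIRK_IGNORE_GUEST_PAT;
		/* On Intel the quirk applies, so it is not auto-disabled. */
		caps.inapplicable_quirks &= ~QUIRK_IGNORE_GUEST_PAT;
	}
	return caps;
}

/* Models vmx_ignore_guest_pat(): quirk still enabled, no non-coherent DMA. */
static bool ignore_guest_pat(uint64_t disabled_quirks, bool noncoherent_dma)
{
	return !noncoherent_dma && !(disabled_quirks & QUIRK_IGNORE_GUEST_PAT);
}

/* Models the three outcomes of vmx_get_mt_mask() in the hunk above. */
static uint64_t ept_mt_mask(uint64_t disabled_quirks, bool noncoherent_dma,
			    bool is_mmio)
{
	if (is_mmio)
		return MTRR_TYPE_UNCACHABLE << EPT_MT_SHIFT;
	if (ignore_guest_pat(disabled_quirks, noncoherent_dma))
		return (MTRR_TYPE_WRBACK << EPT_MT_SHIFT) | EPT_IPAT_BIT;
	return MTRR_TYPE_WRBACK << EPT_MT_SHIFT;
}
```

In this model an Intel host with self-snoop yields WB plus the IPAT bit by
default, and plain WB once the quirk is disabled or a non-coherent DMA device
is attached. A host without self-snoop never advertises the quirk as
disableable, and a non-Intel host leaves it in inapplicable_quirks.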


* Re: [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks
  2025-03-04  6:06 ` [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks Paolo Bonzini
@ 2025-03-05  3:20   ` Yan Zhao
  2025-03-19  1:20   ` Binbin Wu
  1 sibling, 0 replies; 14+ messages in thread
From: Yan Zhao @ 2025-03-05  3:20 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, xiaoyao.li, seanjc

LGTM.

On Tue, Mar 04, 2025 at 01:06:42AM -0500, Paolo Bonzini wrote:
> Allowing arbitrary re-enabling of quirks puts a limit on what the
> quirks themselves can do, since you cannot assume that the quirk
> prevents a particular state.  More important, it also prevents
> KVM from disabling a quirk at VM creation time, because userspace
> can always go back and re-enable that.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 856ceeb4fb35..35d03fcdb8e9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6525,7 +6525,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  			break;
>  		fallthrough;
>  	case KVM_CAP_DISABLE_QUIRKS:
> -		kvm->arch.disabled_quirks = cap->args[0];
> +		kvm->arch.disabled_quirks |= cap->args[0];
>  		r = 0;
>  		break;
>  	case KVM_CAP_SPLIT_IRQCHIP: {
> -- 
> 2.43.5
> 
> 


* Re: [PATCH v3 3/6] KVM: x86: Introduce supported_quirks to block disabling quirks
  2025-03-04  6:06 ` [PATCH v3 3/6] KVM: x86: Introduce supported_quirks to block disabling quirks Paolo Bonzini
@ 2025-03-05  3:23   ` Yan Zhao
  0 siblings, 0 replies; 14+ messages in thread
From: Yan Zhao @ 2025-03-05  3:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, xiaoyao.li, seanjc

LGTM.

On Tue, Mar 04, 2025 at 01:06:44AM -0500, Paolo Bonzini wrote:
> From: Yan Zhao <yan.y.zhao@intel.com>
> 
> Introduce supported_quirks in kvm_caps; it starts with KVM_X86_VALID_QUIRKS
> and bits can be removed to force-enable quirks according to platform-specific
> logic.
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Message-ID: <20250224070832.31394-1-yan.y.zhao@intel.com>
> [Remove unsupported quirks at KVM_ENABLE_CAP time. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 7 ++++---
>  arch/x86/kvm/x86.h | 2 ++
>  2 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5abea6c73a38..062c1b58b223 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4782,7 +4782,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  		r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
>  		break;
>  	case KVM_CAP_DISABLE_QUIRKS2:
> -		r = KVM_X86_VALID_QUIRKS;
> +		r = kvm_caps.supported_quirks;
>  		break;
>  	case KVM_CAP_X86_NOTIFY_VMEXIT:
>  		r = kvm_caps.has_notify_vmexit;
> @@ -6521,11 +6521,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  	switch (cap->cap) {
>  	case KVM_CAP_DISABLE_QUIRKS2:
>  		r = -EINVAL;
> -		if (cap->args[0] & ~KVM_X86_VALID_QUIRKS)
> +		if (cap->args[0] & ~kvm_caps.supported_quirks)
>  			break;
>  		fallthrough;
>  	case KVM_CAP_DISABLE_QUIRKS:
> -		kvm->arch.disabled_quirks |= cap->args[0];
> +		kvm->arch.disabled_quirks |= cap->args[0] & kvm_caps.supported_quirks;
>  		r = 0;
>  		break;
>  	case KVM_CAP_SPLIT_IRQCHIP: {
> @@ -9775,6 +9775,7 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
>  		kvm_host.xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
>  		kvm_caps.supported_xcr0 = kvm_host.xcr0 & KVM_SUPPORTED_XCR0;
>  	}
> +	kvm_caps.supported_quirks = KVM_X86_VALID_QUIRKS;
>  	kvm_caps.inapplicable_quirks = KVM_X86_CONDITIONAL_QUIRKS;
>  
>  	rdmsrl_safe(MSR_EFER, &kvm_host.efer);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 221778792c3c..287dac35ed5e 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -34,6 +34,8 @@ struct kvm_caps {
>  	u64 supported_xcr0;
>  	u64 supported_xss;
>  	u64 supported_perf_cap;
> +
> +	u64 supported_quirks;
>  	u64 inapplicable_quirks;
>  };
>  
> -- 
> 2.43.5
> 
> 
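
For reference, the resulting KVM_ENABLE_CAP behaviour can be modelled as
below. This is a hedged userspace sketch, not the kernel implementation:
cap_model, vm_model and enable_cap_model are hypothetical names, and the
supported mask is passed in explicitly instead of being read from kvm_caps.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two capabilities handled in the hunk. */
enum cap_model { CAP_DISABLE_QUIRKS, CAP_DISABLE_QUIRKS2 };

struct vm_model {
	uint64_t disabled_quirks;
};

static int enable_cap_model(struct vm_model *vm, uint64_t supported,
			    enum cap_model cap, uint64_t mask)
{
	/* QUIRKS2 validates its argument: unsupported bits are an error. */
	if (cap == CAP_DISABLE_QUIRKS2 && (mask & ~supported))
		return -EINVAL;

	/*
	 * The legacy capability does not validate, so unsupported bits are
	 * silently dropped. Bits are OR-ed in and therefore never cleared.
	 */
	vm->disabled_quirks |= mask & supported;
	return 0;
}
```

The mask in the OR is redundant for the QUIRKS2 path, which has already
rejected unsupported bits; it matters for the legacy path, which falls
through without any check.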


* Re: [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks
  2025-03-04  6:06 ` [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks Paolo Bonzini
  2025-03-05  3:20   ` Yan Zhao
@ 2025-03-19  1:20   ` Binbin Wu
  1 sibling, 0 replies; 14+ messages in thread
From: Binbin Wu @ 2025-03-19  1:20 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, xiaoyao.li, seanjc, yan.y.zhao



On 3/4/2025 2:06 PM, Paolo Bonzini wrote:
> Allowing arbitrary re-enabling of quirks puts a limit on what the
> quirks themselves can do, since you cannot assume that the quirk
> prevents a particular state.  More important, it also prevents
> KVM from disabling a quirk at VM creation time, because userspace
> can always go back and re-enable that.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   arch/x86/kvm/x86.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 856ceeb4fb35..35d03fcdb8e9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6525,7 +6525,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>   			break;
>   		fallthrough;
>   	case KVM_CAP_DISABLE_QUIRKS:
> -		kvm->arch.disabled_quirks = cap->args[0];
> +		kvm->arch.disabled_quirks |= cap->args[0];
>   		r = 0;
>   		break;
>   	case KVM_CAP_SPLIT_IRQCHIP: {
This change requires updating the KVM selftest monitor_mwait_test.

I cooked up a patch to make the test case pass.
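
A small standalone model of why a single VM no longer works for the test,
assuming the test walks through several disabled-quirk masks in sequence.
QUIRK_A, QUIRK_B and both helper functions are hypothetical and exist only
for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define QUIRK_A (1ULL << 0)  /* hypothetical quirk bits */
#define QUIRK_B (1ULL << 1)

/* Old semantics: each KVM_ENABLE_CAP call replaces the mask. */
static uint64_t last_mask_assign(const uint64_t *cases, size_t n)
{
	uint64_t disabled = 0;

	for (size_t i = 0; i < n; i++)
		disabled = cases[i];
	return disabled;
}

/* New semantics: bits accumulate across calls on the same VM. */
static uint64_t last_mask_sticky(const uint64_t *cases, size_t n)
{
	uint64_t disabled = 0;

	for (size_t i = 0; i < n; i++)
		disabled |= cases[i];
	return disabled;
}
```

For the sequence { 0, QUIRK_A, QUIRK_B }, the last case runs with only
QUIRK_B disabled under the old semantics, but with both quirks disabled under
the new ones, so each combination needs a fresh VM.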

 From 29b22d0a5cb14b418d289d78e2e290f7e0fc1749 Mon Sep 17 00:00:00 2001
From: Binbin Wu <binbin.wu@linux.intel.com>
Date: Tue, 18 Mar 2025 17:31:51 +0800
Subject: [PATCH] KVM: selftests: Test monitor/mwait cases in separate VMs

Test the different combinations of disabled quirks for monitor/mwait in
separate VMs, now that KVM does not allow re-enabling quirks.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
---
  .../selftests/kvm/x86/monitor_mwait_test.c    | 44 ++++++++++++-------
  1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86/monitor_mwait_test.c b/tools/testing/selftests/kvm/x86/monitor_mwait_test.c
index 2b550eff35f1..b583e0523575 100644
--- a/tools/testing/selftests/kvm/x86/monitor_mwait_test.c
+++ b/tools/testing/selftests/kvm/x86/monitor_mwait_test.c
@@ -16,6 +16,8 @@ enum monitor_mwait_testcases {
         MWAIT_DISABLED = BIT(2),
  };

+static int testcase;
+
  /*
   * If both MWAIT and its quirk are disabled, MONITOR/MWAIT should #UD, in all
   * other scenarios KVM should emulate them as nops.
@@ -35,7 +37,7 @@ do {                                                                  \
                                testcase, vector);                       \
  } while (0)

-static void guest_monitor_wait(int testcase)
+static void guest_monitor_wait(void)
  {
         u8 vector;

@@ -54,31 +56,22 @@ static void guest_monitor_wait(int testcase)

  static void guest_code(void)
  {
-       guest_monitor_wait(MWAIT_DISABLED);
-
-       guest_monitor_wait(MWAIT_QUIRK_DISABLED | MWAIT_DISABLED);
-
-       guest_monitor_wait(MISC_ENABLES_QUIRK_DISABLED | MWAIT_DISABLED);
-       guest_monitor_wait(MISC_ENABLES_QUIRK_DISABLED);
-
-       guest_monitor_wait(MISC_ENABLES_QUIRK_DISABLED | MWAIT_QUIRK_DISABLED | MWAIT_DISABLED);
-       guest_monitor_wait(MISC_ENABLES_QUIRK_DISABLED | MWAIT_QUIRK_DISABLED);
-
+       guest_monitor_wait();
         GUEST_DONE();
  }

-int main(int argc, char *argv[])
+static void vm_test_case(int test_case)
  {
         uint64_t disabled_quirks;
         struct kvm_vcpu *vcpu;
         struct kvm_vm *vm;
         struct ucall uc;
-       int testcase;
-
-       TEST_REQUIRE(this_cpu_has(X86_FEATURE_MWAIT));
-       TEST_REQUIRE(kvm_has_cap(KVM_CAP_DISABLE_QUIRKS2));

         vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+
+       testcase = test_case;
+       sync_global_to_guest(vm, testcase);
+
         vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_MWAIT);

         while (1) {
@@ -87,7 +80,7 @@ int main(int argc, char *argv[])

                 switch (get_ucall(vcpu, &uc)) {
                 case UCALL_SYNC:
-                       testcase = uc.args[1];
+                       TEST_ASSERT_EQ(testcase, uc.args[1]);
                         break;
                 case UCALL_ABORT:
                         REPORT_GUEST_ASSERT(uc);
@@ -125,5 +118,22 @@ int main(int argc, char *argv[])

  done:
         kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+       TEST_REQUIRE(this_cpu_has(X86_FEATURE_MWAIT));
+       TEST_REQUIRE(kvm_has_cap(KVM_CAP_DISABLE_QUIRKS2));
+
+       vm_test_case(MWAIT_DISABLED);
+
+       vm_test_case(MWAIT_QUIRK_DISABLED | MWAIT_DISABLED);
+
+       vm_test_case(MISC_ENABLES_QUIRK_DISABLED | MWAIT_DISABLED);
+       vm_test_case(MISC_ENABLES_QUIRK_DISABLED);
+
+       vm_test_case(MISC_ENABLES_QUIRK_DISABLED | MWAIT_QUIRK_DISABLED | MWAIT_DISABLED);
+       vm_test_case(MISC_ENABLES_QUIRK_DISABLED | MWAIT_QUIRK_DISABLED);
+
         return 0;
  }
-- 
2.46.0




end of thread, other threads:[~2025-03-19  1:20 UTC | newest]

Thread overview: 14+ messages
2025-03-04  6:06 [PATCH v3 0/4] KVM: x86: Introduce quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
2025-03-04  6:06 ` [PATCH v3 1/6] KVM: x86: do not allow re-enabling quirks Paolo Bonzini
2025-03-05  3:20   ` Yan Zhao
2025-03-19  1:20   ` Binbin Wu
2025-03-04  6:06 ` [PATCH v3 2/6] KVM: x86: Allow vendor code to disable quirks Paolo Bonzini
2025-03-04  8:15   ` Yan Zhao
2025-03-04  6:06 ` [PATCH v3 3/6] KVM: x86: Introduce supported_quirks to block disabling quirks Paolo Bonzini
2025-03-05  3:23   ` Yan Zhao
2025-03-04  6:06 ` [PATCH v3 4/6] KVM: x86: Introduce Intel specific quirk KVM_X86_QUIRK_IGNORE_GUEST_PAT Paolo Bonzini
2025-03-05  3:19   ` Yan Zhao
2025-03-04  6:06 ` [PATCH v3 5/6] KVM: x86: remove shadow_memtype_mask Paolo Bonzini
2025-03-04 10:51   ` Yan Zhao
2025-03-04  6:06 ` [PATCH v3 6/6] KVM: TDX: Always honor guest PAT on TDX enabled guests Paolo Bonzini
2025-03-05  2:48   ` Yan Zhao
